We’re Hiring! Our key client is growing like crazy and adding a Data Scientist (Direct Hire) to their team.
Pay Rate: $130,000 – $210,000 depending on experience.
Work Type: Hybrid – Houston, TX 77077
Additional Information: Standard Driver's License and Motor Vehicle Record (MVR) are required. Relocation assistance is provided.
Job Overview:
Build, train, and deploy large-scale, self-supervised "foundation" models that learn rich representations of time series and sequential sensor data, in addition to textual and vision data. These models will be fine-tuned for tasks such as anomaly/event detection, predictive maintenance, forecasting, classification, or multi-modal sensor fusion for industrial and scientific applications.
Responsibilities:
* Data/Signal Processing
    * Time Series & Sequential Data: Manage processing, augmentation, and feature engineering for financial, industrial, IoT, medical, or other sensor streams (univariate/multivariate time series).
    * Sensor Data Analysis: Apply expertise with diverse sensor modalities (e.g., accelerometers, temperature, vibration, audio, images), sampling rates, synchronization, and real-world noise/artifact handling.
    * Multi-Modality Learning: Integrate heterogeneous data types (time series, images, text, audio, structured) into robust deep learning architectures and cross-modal representation learning.
* Machine Learning & Foundation Model Expertise
    * Self-supervised and Semi-supervised Learning: Develop time series foundation models using masked modeling, contrastive methods, temporal predictive coding, and multimodal alignment and fusion.
    * Model Architectures: Utilize sequence models (RNNs, GRU/LSTM, TCN), 1D/2D/3D CNNs, Transformers (BERT, ViT, TimeSformer), graph neural networks, diffusion/generative models, and multi-modal/fusion encoders.
    * Transfer Learning & Fine-Tuning at Scale: Implement prompt/adapter-based strategies, temporal domain adaptation, and few-shot learning for specialized tasks.
    * Evaluation Metrics: Apply regression/classification metrics (MSE, F1, AUC), time series similarity measures (DTW, correlation), event detection/segmentation metrics (IoU, accuracy), and business/end-user KPIs.
* Software & Infrastructure
    * Programming: Demonstrate expert-level proficiency in Python (NumPy, SciPy, Pandas) and C++/CUDA for custom kernels and high-performance preprocessing.
    * Deep Learning Frameworks: Work with PyTorch (Lightning, Distributed), TensorFlow/Keras, and JAX/Flax.
    * Large-scale Training: Manage multi-GPU and multi-node clusters, mixed-precision training, ZeRO optimization, and scalable data loaders for long sequences.
    * Data Engineering: Build robust pipelines for ingesting, cleaning, segmenting, and aligning large-scale, time-synchronized multi-sensor datasets.
* Mathematical & Algorithmic Foundations
    * Theoretical Knowledge: Apply Linear Algebra, Probability & Statistics, and Optimization (stochastic, convex/non-convex, Bayesian).
    * Signal Processing: Use Fourier/wavelet analysis, filters (Kalman, Savitzky-Golay), resampling, and noise modeling.
    * Numerical Methods: Utilize ODE/PDE solvers, inverse problems, regularization, and time-frequency methods for complex systems.
* Collaboration & Communication
    * Cross-disciplinary Teamwork: Collaborate with domain experts, engineers, product owners, and end-users from industrial, scientific, or medical backgrounds.
    * Presentation: Provide clear presentations of complex model behaviors (interpretability, attention analysis), uncertainty quantification, and value impact.
Qualifications:
* MS / Ph.D. in Computer Science, Data Science, AI, or related fields.
* 3+ years of relevant experience in Data Science, AI, or related fields.
VEVRAA Federal Contractor – priority referral of Protected Veterans requested. An Equal Opportunity Employer – Qualified applicants are considered without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, status as a protected veteran, or other characteristics protected by law.