Machine Learning Engineer | Python | PyTorch | Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA
Enigma · San Jose, CA · 1 month ago
About This Role
Title: Machine Learning Engineer
Location: San Jose, CA
Responsibilities:
- Productize and optimize models from Research into reliable, performant, and cost-efficient services with clear SLOs (latency, availability, cost).
- Scale training across nodes/GPUs (DDP/FSDP/ZeRO, pipeline/tensor parallelism) and own throughput/time-to-train using profiling and optimization.
- Implement model-efficiency techniques (quantization, distillation, pruning, KV-cache, Flash Attention) for training and inference without materially degrading quality.
- Build and maintain model-serving systems (vLLM/Triton/TGI/ONNX/TensorRT/AITemplate) with batching, streaming, caching, and memory management.
- Integrate with vector/feature stores and data pipelines (FAISS/Milvus/Pinecone/pgvector; Parquet/Delta) as needed for production.
- Define and track performance and cost KPIs; run continuous improvement loops and capacity planning.
- Partner with ML Ops on CI/CD, telemetry/observability, model registries; partner with Scientists on reproducible handoffs and evaluations.
Educational Qualifications:
- Bachelor’s in Computer Science, Electrical/Computer Engineering, or a related field required; Master’s preferred (or equivalent industry experience).
- Strong systems/ML engineering background with exposure to distributed training and inference optimization.
Industry Experience:
- 3–5 years in ML/AI engineering roles owning training and/or serving in production at scale.
- Demonstrated success delivering high-throughput, low-latency ML services with reliability and cost improvements.
- Experience collaborating across Research, Platform/Infra, Data, and Product functions.
Technical Skills:
- Familiarity with deep learning frameworks: PyTorch (primary), TensorFlow.
- Exposure to large model training techniques (DDP, FSDP, ZeRO, pipeline/tensor parallelism); distributed training experience a plus.
- Optimization: experience profiling and optimizing code execution and model inference, including quantization (PTQ/QAT/AWQ/GPTQ), pruning, distillation, KV-cache optimization, and Flash Attention.
- Scalable serving: autoscaling, load balancing, streaming, batching, caching; collaboration with platform engineers.
- Data & storage: SQL/NoSQL, vector stores (FAISS/Milvus/Pinecone/pgvector), Parquet/Delta, object stores.
- Ability to write performant, maintainable code.
- Understanding of the full ML lifecycle: data collection, model training, deployment, inference, optimization, and evaluation.
Job Details
Seniority
Mid-Senior level
Employment
Full-time
Function
Information Technology
Industry
Hospitals and Health Care