Technical Staff Member Optimizing Machine Learning Efficiency jobs in San Mateo – Browse 210 openings on RoboApply Jobs

Technical Staff Member - Optimizing Machine Learning Efficiency

embedding-vc

San Francisco Bay Area

On-site Full-time

About the job

Join us at Moonlake, where we leverage AI to craft immersive world simulations that push the boundaries of technology.

Role Overview

Enhancing Training Efficiency

Implement advanced dataloaders, fusion techniques, activation rematerialization, and gradient checkpointing strategies.
Utilize FSDP/ZeRO/tensor+pipeline parallelism and fine-tune NCCL settings for optimal performance.

Boosting GPU and Kernel Performance

Conduct Nsight profiling and develop Triton/CUDA kernels along with fused operations.
Implement flash-attention-style optimizations, sequence packing, and KV-cache improvements.

Optimizing Inference Processes

Facilitate low-latency serving, continuous batching, and speculative decoding techniques.
Engage in quantization methods (GPTQ/AWQ), model distillation, and pruning practices.

Infrastructure and Reliability Enhancements

Manage SLURM/K8s multi-node jobs and ensure checkpoint hygiene.
Focus on determinism, environment pinning, and effective GPU failure management.

We pride ourselves on being an on-site, collaborative team located in San Mateo.

Similar jobs

Technical Staff Member - Optimizing Machine Learning Efficiency

embedding-vc

Full-time|On-site|San Francisco Bay Area

Join us at Moonlake, where we leverage AI to craft immersive world simulations that push the boundaries of technology.

Role Overview

Enhancing Training Efficiency

Implement advanced dataloaders, fusion techniques, activation rematerialization, and gradient checkpointing strategies.
Utilize FSDP/ZeRO/tensor+pipeline parallelism and fine-tune NCCL settings for opt…

Jan 15, 2026

Technical Staff Member - Advanced Machine Learning Optimization

Moonlake

Full-time|On-site|San Mateo

Join Moonlake, a pioneering company harnessing AI to develop immersive world simulations.

Role Overview

Enhancing Training Efficiency

Implement data loaders, fusion techniques, activation rematerialization, and gradient checkpointing.
Optimize training with FSDP/ZeRO/tensor+pipeline parallelism and NCCL tuning.

Improving GPU and Kernel Performance

Conduct Nsight profiling, develop Triton/CUDA kernels, and create fused operations.
Implement flash-attention style accelerations, sequence packing, and KV-cache optimizations.

Optimizing Inference

Focus on low-latency serving, continuous batching, and speculative decoding strategies.
Apply quantization methods (GPTQ/AWQ), distillation, and pruning techniques.

Infrastructure and Reliability

Manage SLURM/Kubernetes multi-node jobs and ensure checkpoint hygiene.
Maintain determinism, environment pinning, and effectively handle GPU failures.

Our dedicated team thrives on collaboration in our San Mateo office.

Nov 25, 2025

Principal Machine Learning Engineer for Engineering Efficiency

Roblox Corporation

Full-time|On-site|San Mateo, CA, United States

Join Roblox as a Principal Machine Learning Engineer specializing in Engineering Efficiency! In this pivotal role, you will leverage cutting-edge machine learning techniques to enhance our engineering processes and optimize performance. Collaborate with a team of talented engineers to deliver innovative solutions that drive productivity and efficiency across our platforms.

Mar 3, 2026

Technical Staff Member - ML Infrastructure & Performance

Moonlake

Full-time|On-site|San Mateo

Welcome to Moonlake, where we harness AI to create immersive world simulations.

Our Mission: To improve throughput, latency, and cost efficiency, deploying our models 2–10 times faster and more affordably, all while maintaining quality.

Key Responsibilities:

Optimize GPU performance through CUDA/Triton kernels, FlashAttention, paged attention, and CUDA Graphs.
Manage the serving stack, including TensorRT-LLM/Triton Inference Server, vLLM/TGI; implement continuous batching and on-GPU KV reuse; explore speculative decoding/medusa and mixture-of-agents routing.
Enhance parallelism via FSDP/ZeRO, TP/PP/expert parallel, and fine-tune NCCL.
Implement quantization/PEFT techniques such as AWQ/GPTQ/FP8 and LoRA/DoRA serving.
Oversee systems like Ray/k8s/Argo, ensuring observability with Prom/Grafana/OpenTelemetry, autoscaling, A/B testing infrastructure, and canary deployments with rollback capabilities.

Ideal Candidate Profile:

Candidates should have prior experience in infrastructure-heavy startups, particularly at companies like Databricks or Roblox.

We are dedicated to fostering a collaborative, in-person team environment in San Mateo.

Sep 27, 2025

Technical Staff Member - AI Training Infrastructure

Fireworks AI

Full-time|$175K/yr - $220K/yr|On-site|San Mateo, CA

About Us:

At Fireworks AI, we are at the forefront of developing innovative generative AI infrastructure. Our platform is recognized for delivering top-tier models and the industry's fastest, most scalable inference capabilities. As an industry leader in LLM inference speed, we are pushing boundaries with groundbreaking projects, including our own function calling and multimodal models. Fireworks is a Series C startup valued at $4 billion, supported by premier investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our passionate and collaborative team is comprised of seasoned professionals from Meta PyTorch and Google Vertex AI.

The Role:

We are seeking a Training Infrastructure Engineer to design, build, and optimize the infrastructure that underpins our large-scale model training operations. Your contributions will be pivotal in establishing high-performance AI training infrastructure. You'll work closely with AI researchers and engineers to develop robust training pipelines, optimize distributed training workloads, and guarantee the reliability of model development.

Mar 5, 2026

Technical Staff Member - Embodied Agents

embedding-vc

Full-time|On-site|San Francisco Bay Area

Join Moonlake, the forefront of AI technology for crafting immersive world simulations.

Your Role: Design and train advanced embodied agents capable of perceiving their environment through vision, depth, and language; reasoning through memory and planning; and acting with precision in both continuous and discrete control.

We are dedicated to fostering a collaborative, in-person team environment based in San Mateo, CA.

Jan 15, 2026

Senior Machine Learning Engineer - Engine Optimization

Roblox

Full-time|$195.8K/yr - $242.1K/yr|On-site|San Mateo, CA, United States

Roblox is a vibrant platform where millions of users come together to explore, create, play, learn, and connect in immersive 3D experiences crafted by a diverse global community of developers.

At Roblox, we are dedicated to building innovative tools and a robust platform that empower our community to bring their imaginative experiences to life. Our vision is to transform how people unite, no matter where they are in the world or what device they use. We are on a mission to connect a billion individuals with optimism and civility, and we seek exceptional talent to help us achieve this goal.

Joining Roblox means you will be at the forefront of shaping the future of human interaction, tackling unique technical challenges at scale, and creating safer, more respectful shared experiences for all.

Our engine's resource management and streaming systems are crucial for providing a seamless, stable, and responsive experience for Roblox users across a vast array of devices and network conditions. These systems collaboratively manage compute, memory, bandwidth, and rendering quality while delivering dynamic world content in real time as players interact with their environments. The challenges we face include highly dynamic environments, unpredictable user behaviors, and opaque signals stemming from device and OS limitations.

This position offers a unique chance to lead the integration of machine learning into real-time engine optimization. You will develop the ML framework for predictive resource allocation and content fetching, transitioning from heuristic-based logic to adaptive, data-driven decision-making. Your contributions will directly influence stability, visual quality, responsiveness, and content delivery across billions of global play sessions.

Feb 10, 2026

Software Engineer: Machine Learning Infrastructure

Generalist

Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)

About the Role

At Generalist, we are at the forefront of training expansive robot foundation models, leveraging cutting-edge GPU hardware, primarily from Nvidia, to execute distributed training tasks and experimental research. Our operations demand exceptional storage solutions and optimized data loading processes, necessitating the full utilization of cloud infrastructure alongside custom-built solutions.

In this role, you will take charge of our inference infrastructure. Our robotic systems rely on a dedicated fleet of on-premises GPUs designed for demanding real-time computations and latency-sensitive applications within resource-constrained environments.

Your Responsibilities:

Manage and optimize our GPU compute fleets.
Facilitate user-friendly access to GPUs for researchers, ensuring optimal utilization.
Enhance ML data loading, transport, and storage systems in extensively utilized distributed environments.
Oversee the orchestration of our robot inference fleets.

You May Excel in This Position If You:

Have experience managing large GPU fleets for large-scale, distributed training or inference.
Possess significant expertise in using Slurm or Kubernetes for ML workload orchestration.
Have developed high-scale ML data loaders and preparation systems.
Understand the intricacies of ML hardware, storage, and networking systems.
Are familiar with the Nvidia GPU ecosystem.

Feb 12, 2026

Technical Staff Member - ML Infrastructure & Performance

embedding-vc

Full-time|On-site|San Mateo, CA

Join the innovative team at Moonlake, where we harness the power of AI to create real-time interactive content.

Mission: Elevate performance metrics by enhancing throughput, reducing latency, and optimizing costs - deploying our models 2–10 times faster and at lower costs without compromising quality.

Scope of Work:

GPU Performance: Expertise in CUDA/Triton kernels, FlashAttention family, paged attention, and CUDA Graphs.
Serving Stack: Proficiency with TensorRT-LLM/Triton Inference Server, vLLM/TGI; continuous batching; on-GPU KV reuse; speculative decoding/medusa; and mixture-of-agents routing.
Parallelism: Experience with FSDP/ZeRO, TP/PP/expert parallel; NCCL tuning.
Quantization/PEFT: Familiarity with AWQ/GPTQ/FP8; LoRA/DoRA serving.
Systems: Knowledge of Ray/k8s/Argo, observability tools (Prom/Grafana/OpenTelemetry), autoscaling, A/B infrastructure, and canary + rollback.

Tech Signals:

Ideal candidates will have previous experience at infrastructure-heavy startups such as Databricks or Roblox.

We are dedicated to maintaining an on-site, in-person team based in San Mateo.

Dec 12, 2025

Software Engineer: ML Optimization

Generalist

Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)

About the Role

Join our dynamic team, affectionately known as MBMB (More Big More Better), where you will play a crucial role in optimizing our training and on-robot inference stacks. We are seeking bold innovations that drive substantial improvements rather than incremental changes.

Your Responsibilities Will Include:

Maximizing GPU performance through innovative strategies
Deploying machine learning, hardware, and software modifications that yield significant advancements
Enhancing both inference and training stacks for optimal performance

Ideal Candidates Will:

Possess proficiency in the latest machine learning techniques, particularly for training and inference optimizations within transformer and diffusion-based architectures
Have a relentless pursuit of ML optimizations across various domains, including CUDA kernels, ML architecture, frontend and backend network bottlenecks, CPU inefficiencies, NVLink, and communication protocols, as well as optimizations in libraries such as Torch, NumPy, and Python.

Feb 12, 2026

Senior Machine Learning Engineer

zaimler

Full-time|On-site|San Mateo, CA

About zaimler

At zaimler, we recognize that AI agents struggle to make decisions based on fragmented data. Today’s enterprise data is scattered across numerous systems, lacking shared context and structure, which contributes to the challenges faced in enterprise AI. As we transition from copilots to fully autonomous agents, we're establishing a revolutionary infrastructure layer to support this evolution.

zaimler serves as the foundational context infrastructure for the agentic era. Our platform autonomously uncovers domain knowledge, maps intricate relationships, and equips AI agents with the semantic understanding necessary for precise operations at scale. Envision knowledge graphs that facilitate real-time inference, designed specifically for systems that require reasoning capabilities, rather than mere data retrieval.

Founded by industry veterans Biswajit Das and Sofus Macskassy, zaimler is a small yet highly skilled team at the seed stage, collaborating with major enterprises across sectors such as insurance, travel, and technology. If you aspire to develop the infrastructure that will underpin the next decade of AI advancements, we invite you to join our journey.

About the Role

We are in search of a Senior Machine Learning Engineer to enhance our dynamic team, ideally based in the Bay Area or open to relocation. The perfect candidate will possess strong expertise in one or more of the following areas: Knowledge Extraction, Natural Language Understanding, Unsupervised Learning, Information Retrieval, and Fine-tuning Large Language Models (LLMs). In this pivotal role, you will be instrumental in developing and training the models, pipelines, and methodologies that drive our semantic graph systems. We seek an individual with a robust background in machine learning, natural language processing, LLMs, and semantic technologies, along with a proven history of managing complex, large-scale machine learning projects.

Nov 27, 2024

Machine Learning Engineer - Entry Level

parspect

Full-time|Hybrid|San Mateo, California

Join our innovative team at parspect as a Machine Learning Engineer, where you’ll be at the forefront of developing cutting-edge machine learning solutions. This is an exciting opportunity for an entry-level candidate eager to expand their skills in a dynamic and supportive environment.

Apr 29, 2026

Applied Machine Learning Scientist in Cheminformatics (Staff/Principal)

Genesis Therapeutics

Full-time|On-site|San Mateo, CA

About the Team

Become a part of a groundbreaking team at the intersection of artificial intelligence and biochemistry.

At Genesis Molecular AI, we have assembled a close-knit team of distinguished deep learning researchers, software engineers, and pioneers in drug discovery. Our ambitious mission is to develop the next generation of AI foundation models that will enable innovative therapies for patients suffering from serious illnesses.

We go beyond the application of machine learning in biology; we conduct pioneering research at the confluence of machine learning, physics, and computational chemistry, continuously pushing the limits of each discipline. You will collaborate with top-tier multidisciplinary researchers to design and implement scalable generative foundation models, backed by extensive computational resources and large-scale simulations.

About the Role

This exceptional opportunity is tailored for a scientist passionate about leveraging advanced AI to tackle real-world challenges in drug discovery. You will play a pivotal role in connecting our long-term research initiatives with our active drug discovery programs. Your objective will be to construct, assess, monitor, and refine our cutting-edge models, integrating them directly into ongoing drug programs and leading the efforts in model validation, deployment, and analysis to inform the discovery of new medications.

As both a translator and strategist, you will ensure that our research targets the most pressing challenges and that our drug hunters can harness the full capabilities of our leading AI platform. This role necessitates an in-depth understanding of cheminformatics, computational chemistry, and experimental techniques, robust data science skills, and an aptitude for articulating complex concepts to a diverse, multidisciplinary team.

Positions are available at multiple seniority levels: Senior, Staff, and Principal.

Jul 30, 2025

Senior Machine Learning Engineer - Safety Expertise

Roblox

Full-time|$195.8K/yr - $242.1K/yr|On-site|San Mateo, CA, United States

At Roblox, millions of individuals engage daily to explore, create, play, learn, and connect with friends in immersive 3D digital experiences crafted by our global community of developers and creators.

We are dedicated to building tools and platforms that empower our community to realize their imaginative experiences. Our vision is to transform how people connect from anywhere in the world and on any device. Our mission is to unite one billion people with optimism and civility, and we are seeking exceptional talent to help us achieve this goal.

Joining Roblox means you will play a pivotal role in shaping the future of human interaction, tackling unique technical challenges at scale, and contributing to the creation of safer, more civil shared experiences for all.

The Rights Manager and Content Suitability teams, part of our Safety Experience, are in search of a Senior Machine Learning Engineer.

The Safety Experience organization develops tools and systems that empower Roblox users and creators to control their experiences while enabling moderators to uphold our community standards. Our focus includes education, intervention, visibility, and action.

Our initiatives include:

Monitoring and influencing user behavior for enhanced safety.
Building scalable and efficient moderation platforms.
Providing transparency and educational resources for parents and developers.
Empowering users to manage their own safety.
Allowing IP owners to manage their creations on Roblox.
Delivering quick and accurate support through our Customer Care Chatbot.

Your responsibilities will include:

Implementing machine learning solutions for safety-focused systems.
Promoting a culture of technical excellence and inclusivity.
Breaking down long-term product requirements into actionable phases for continuous improvement.
Designing and building large-scale machine learning models with billions of parameters, ensuring they are production-ready.
Facilitating complex technical decisions with empathy and foresight.

Feb 10, 2026

Technical Staff Member - Diffusion Model

embedding-vc

Full-time|On-site|San Francisco Bay Area

Join us at Moonlake, where we harness the power of AI to create immersive world simulations.

Model Development & Architecture

Design and refine innovative 2D/3D/image/video/audio diffusion architectures.
Engage in conditioning tasks involving text, images, poses, layouts, and control signals with multi-modal encoders and guidance strategies.

Training & Optimization

Conduct large-scale diffusion training to enhance model performance.
Focus on improving sample quality while optimizing computational resources through advanced objectives, distillation, and consistency models.

Control & Alignment

Implement cutting-edge techniques such as ControlNet, LoRA, and IP-Adapters to manage style, identity, geometry, and control.
Develop robust editing, inpainting, and personalization pipelines, including DreamBooth and custom subject/style tuning.

Our vibrant team is dedicated to working on-site, currently located in San Mateo.

Jan 15, 2026

Technical Staff Member - Cloud Infrastructure

Fireworks AI

Full-time|On-site|New York, NY; San Mateo, CA

Join Fireworks AI as a Technical Staff Member specializing in Cloud Infrastructure. In this pivotal role, you will be at the forefront of our innovative cloud solutions, collaborating with a dynamic team of professionals dedicated to advancing cloud technologies. Your expertise will contribute to building and maintaining robust infrastructure that supports our evolving business needs.

May 1, 2026

Machine Learning Research Scientist, Foundation Models (Senior/Staff/Principal)

Genesis Therapeutics

Full-time|On-site|San Mateo, CA

Join Our Innovative Team

At Genesis Molecular AI, we are a dedicated group of skilled deep learning researchers, software engineers, and pioneers in drug discovery. Our collective mission is to develop pioneering AI foundation models that will enable revolutionary therapies for patients suffering from severe conditions.

We don't merely implement machine learning in biology; we engage in groundbreaking research at the convergence of machine learning, physics, and computational chemistry, challenging the limits of each discipline. As part of the Genesis AI team, you will collaborate with top-tier multidisciplinary researchers to design and construct generative and discriminative foundation models from an extensive range of molecular data, utilizing state-of-the-art computing resources and large-scale simulations.

Your Role

This position provides a unique platform for a scientific innovator to contribute to the future of generative AI in drug discovery. As an integral member of the Genesis AI team, you will guide and propel our research initiatives focusing on foundation models. You will spearhead significant research projects in areas such as reinforcement learning, innovative model architectures, and advanced pretraining and post-training techniques. Your primary goal will be to develop groundbreaking models and insights crucial for discovering new medications.

This role demands a profound curiosity and a collaborative mindset. You will be an essential team player, working closely with our exceptional engineers and drug discovery specialists to transform complex concepts into reality. Additionally, we encourage you to be an influential voice in the scientific community, and we will actively support your efforts to publish research breakthroughs at premier ML conferences such as NeurIPS, ICML, and ICLR.

Positions are available at several levels of seniority: Senior, Staff, and Principal.

Key Responsibilities

Lead transformative research projects from initiation to execution, addressing core challenges in generative modeling for molecular systems.
Design and develop innovative models and algorithms, leveraging the latest advancements in diffusion models, flow matching, reinforcement learning, large language models, and other cutting-edge fields.

Jul 30, 2025

Technical Staff Member - Code Generation

Moonlake

Full-time|On-site|San Mateo

Welcome to Moonlake, where we leverage AI to craft immersive world simulations.

Mission: Join us as an Applied AI Research Engineer focused on designing and coding intelligent agents (post-training and systems).

Scope of Work:

Design agentic systems: Develop tool catalogs, function calls, program synthesis, repair loops, and control mechanisms such as ReAct, Reflexion, ToT, and LangGraph, along with self-verification and sandboxed execution.
Evaluation mindset: Create comprehensive task suites for multi-step coding, including full-stack LLM engineering, prompt libraries, routing, retrieval, KV-cache management, streaming, and telemetry.
Security and isolation: Implement Docker/firejail, manage network egress controls, maintain secrets hygiene, and ensure dependency pinning for supply-chain integrity.
Strong post-training capabilities: Conduct supervised fine-tuning, preference and trace reinforcement learning (DPO/RLAIF/RLHF), dataset curation, reward shaping, and safety filtering.

Technical Signals:

Experience shipping agents that successfully navigate real repository test suites from start to finish.
Published research in the fields of agentic systems and code generation, contributing to frameworks or open-source evaluations such as LangGraph, AutoGen, Guidance, LEAP, and SWE-bench variants.
Developed datasets from execution traces, demonstrating significant enhancements from data over parameters.

We are committed to maintaining an on-site, collaborative team environment based in San Mateo.

Oct 1, 2025

Technical Staff Member, Cluster Management

Fireworks AI

Full-time|On-site|San Mateo, CA

About Us:

At Fireworks AI, we are pioneering the next generation of generative AI infrastructure. Our innovative platform is designed to deliver the highest quality models with unparalleled speed and scalability in inference. Recognized as a leader in LLM inference speed, we continuously push the boundaries of technology through transformative projects, including our proprietary function calling and multimodal models. As a Series C company valued at $4 billion, we are supported by esteemed investors like Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our team is a dynamic blend of visionaries and builders, with a strong foundation from Meta PyTorch and Google Vertex AI.

The Role:

As a Member of the Technical Staff specializing in Cluster Management at Fireworks AI, you will be pivotal in ensuring our world-scale virtual AI cloud operates reliably, efficiently, and at peak performance. Your expertise in large-scale distributed systems, cloud infrastructure, and operational excellence will be essential as you collaborate with top-tier software engineers and AI specialists to elevate our advanced AI platforms, addressing the rapid growth and evolving needs of our applications. This position is ideal for individuals passionate about building resilient, observable, and automated systems that drive customer success.

Mar 5, 2026

Applied Machine Learning Engineer

Fireworks AI

Full-time|$170K/yr - $240K/yr|On-site|New York, NY; San Mateo, CA

About Us:

At Fireworks AI, we are pioneering the future of generative AI infrastructure. Our platform is recognized for delivering the highest-quality models with the fastest and most scalable inference capabilities in the industry. Proudly benchmarked as a leader in LLM inference speed, we are at the forefront of innovation with projects like function calling and multimodal models. As a Series C company valued at $4 billion, we are supported by prestigious investors including Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our team comprises ambitious builders, many of whom hail from Meta PyTorch and Google Vertex AI.

The Role:

As an Applied Machine Learning Engineer, you will play a crucial role in bridging the gap between advanced AI research and practical applications. Your responsibilities will include developing, fine-tuning, and operationalizing machine learning models that deliver significant business value and enhance user experiences. This hands-on engineering position demands a blend of deep technical expertise and a strong customer-centric approach to create scalable AI solutions.

Key Responsibilities:

Customer Success: Collaborate with the Go-To-Market (GTM) team, including Account Executives and Solutions Architects, to ensure seamless integration and successful deployment of machine learning solutions.
Demo / Proof of Concept (PoC): Develop and present engaging PoCs that showcase the capabilities of our AI technology.
Application Build: Design, develop, and deploy end-to-end AI-powered applications customized to meet customer needs.
Platform Features / Bug Fixes: Contribute to the internal machine learning platform by adding features and resolving issues.
New Model Enablements: Integrate and enable new machine learning models within the existing platform or client environments.
Performance Optimizations: Enhance the performance, efficiency, and scalability of deployed models and applications.
Partnership Enablement: Collaborate closely with partners to facilitate joint AI solutions and ensure effective collaboration.

Mar 5, 2026
