Machine Learning Platform Engineer jobs in San Francisco – Browse 5,611 openings on RoboApply Jobs

Machine Learning Platform Engineer jobs in San Francisco

Open roles matching “Machine Learning Platform Engineer” with location signals for San Francisco. 5,611 active listings on RoboApply Jobs.

1 - 20 of 5,611 Jobs
Foxglove
Full-time|On-site|San Francisco, CA

Join us in creating the backbone of data infrastructure for real-world robotic operations.
As robotics transitions from research labs to real-world applications across factories, warehouses, vehicles, and field deployments, understanding the intricacies of robotic performance becomes critical. When robots encounter failures or unexpected behaviors, data analysis is key to deciphering the underlying issues.
At Foxglove, we are at the forefront of building tools for observability, visualization, and data infrastructure that empower robotics and autonomous systems teams to manage, analyze, and derive insights from vast amounts of multimodal sensor data collected from operational systems and production fleets.
Role Overview
We are seeking a passionate ML Platform Engineer with robust infrastructure expertise to design, deploy, and scale our data platform systems. This platform-centric role will allow you to take charge of the infrastructure layer that facilitates machine learning in production environments, going beyond just the models themselves.
Your responsibilities will encompass ensuring the reliability, scalability, and performance of the ML platform, including areas such as inference serving, pipeline orchestration, training infrastructure, and evaluation frameworks. You will be tackling substantial challenges such as managing petabyte-scale multimodal robotics data and optimizing high-throughput retrieval and embedding pipelines in a hands-on infrastructure capacity.
Key Responsibilities
Design and operationalize production inference infrastructure, focusing on model serving, autoscaling, load balancing, and cost efficiency across cloud environments.
Own the platform architecture for embedding and retrieval pipelines that enable semantic search across multimodal robotics data (image, video, point cloud, and time series).
Develop and sustain the training and evaluation infrastructure that supports rapid model performance iteration, including job orchestration, experiment tracking, and dataset versioning.
Lead decisions on cloud infrastructure (AWS/GCP) that affect latency, throughput, reliability, and scalability.
Establish platform abstractions and internal tools that empower product engineers to deliver ML-enhanced features without managing infrastructure directly.
Assess, integrate, and operationalize third-party ML infrastructure components while establishing clear build vs. buy frameworks for the team.

Apr 2, 2026
Machine Learning Platform Engineer

tvScientific powered by Pinterest

Full-time|$123.7K/yr - $254.7K/yr|Remote|San Francisco, CA, US; Remote, US

tvScientific, powered by Pinterest, develops a connected TV (CTV) advertising platform designed for performance marketers. The platform combines media buying, optimization, measurement, and attribution to automate and improve TV advertising. Built by professionals in programmatic advertising, digital media, and ad verification, tvScientific aims to deliver measurable results for advertisers.
Role overview
As a Machine Learning Platform Engineer, you will join a team that operates where Site Reliability Engineering meets low-latency distributed systems. This team advances Pinterest’s real-time machine learning and measurement infrastructure, focusing on sub-millisecond decision-making and high-throughput data access. Seamless integration with Pinterest’s core stack is central to the work.
What you will do
Design and build systems to keep queries and RPCs fast and reliable, even during periods of heavy demand.
Develop and enhance the foundation of the machine learning training and serving stack.
Address challenges in storage, indexing, streaming, fan-out, and managing backpressure and failures across services and regions.
Collaborate with software engineering, data infrastructure, and SRE teams to ensure systems are observable, debuggable, and ready for production.
Key areas of focus
I/O scheduling and batching
Lock-free or low-contention data structures
Connection pooling and query planning
Kernel and network tuning
On-disk layout and indexing strategies
Circuit-breaking and autoscaling
Incident response and failure management
NixOS
Defining and maintaining SLIs and SLOs
This position is a strong fit for engineers interested in building and operating large-scale infrastructure, particularly those who enjoy working on real-time systems, observability, and reliability.

Apr 23, 2026
Whatnot
Full-time|On-site|San Francisco, CA

Be a Part of the Revolution in E-Commerce with Whatnot!
Whatnot stands as the leading live shopping platform across North America and Europe, where you can buy, sell, and explore the items you cherish. We are transforming the landscape of e-commerce by merging community engagement, shopping, and entertainment into a unique experience tailored just for you. As a remote-first team, we are driven by innovation and firmly rooted in our core values. With operational hubs in the US, UK, Germany, Ireland, and Poland, we are collaboratively crafting the future of online marketplaces.
From fashion and beauty to electronics and collectibles like trading cards, comic books, and live plants, our live auctions cater to a diverse audience.
And this is just the beginning! As one of the fastest-growing marketplaces, we are on the lookout for innovative, forward-thinking problem solvers in all areas of our business. Stay updated with the latest from Whatnot through our news and engineering blogs, and join us in empowering individuals to transform their passions into successful ventures while fostering community through commerce.
The Role
We are seeking passionate builders: intellectually curious, entrepreneurial engineers who are ready to pioneer the future of AI and ML at Whatnot. You will be responsible for designing and scaling the foundational infrastructure that supports machine learning and self-hosted large language model applications throughout the organization. Collaborating closely with machine learning scientists, you will facilitate the deployment of cutting-edge models into production, creating entirely new product experiences. Your work will involve constructing systems that ensure advanced machine learning is reliable and efficient at scale, from low-latency model serving to distributed training and high-throughput GPU inference.
Your Responsibilities:
Lead the infrastructure that powers AI and ML models across vital business domains, enhancing growth, trust and safety, fraud detection, seller tools, and more.
Prototype, deploy, and operationalize innovative ML architectures that significantly influence user experience and marketplace dynamics.
Design and scale inference infrastructure capable of managing large models with minimal latency and maximal throughput.
Construct distributed training and inference pipelines utilizing GPUs, as well as model and data parallelism.
Push the boundaries of your expertise and explore new technologies and methodologies.

Feb 5, 2026
Faire
Full-time|$268K/yr - $368.5K/yr|On-site|San Francisco, CA

About Faire
Faire is a transformative online wholesale marketplace, driven by the conviction that local businesses are the future. Independent retailers around the globe generate more revenue than massive corporations like Walmart and Amazon combined, yet individually, they remain small. At Faire, we harness technology, data, and machine learning to connect this vibrant community of entrepreneurs. Think of your favorite local boutique: we empower them to discover and sell the best products from around the world. With our innovative tools and insights, we aim to level the playing field, enabling small businesses to thrive against larger competitors.
By championing the growth of independent businesses, Faire positively impacts local economies on a global scale. We’re in search of intelligent, resourceful, and passionate individuals to join us in fueling the shop local movement. If you value community, we invite you to be part of ours.
About this Role
As the Senior Staff Machine Learning Platform Engineer, you will spearhead the technical vision and evolution of Faire's ML platform. You will establish standards, influence organization-wide architecture, and lead intricate, cross-functional initiatives that enhance data science velocity at scale. This position is crucial for adapting ML workflows to leverage modern AI productivity tools. You will not only develop models but also design the systems that enable those models to empower tens of thousands of small retailers in competing and growing their local businesses.

Mar 4, 2026
Ambience Healthcare
Full-time|$250K/yr|Hybrid|San Francisco

About Us:
At Ambience Healthcare, we are not just another scribe; we are pioneering an AI intelligence platform that reinvigorates the human touch in healthcare while delivering significant ROI for health systems nationwide.
Our innovative technology enables healthcare providers to concentrate on delivering exceptional care by alleviating the administrative burdens that detract from patient interactions and their most impactful work. Ambience provides real-time, coding-aware documentation and clinical workflow support in ambulatory, emergency, and inpatient settings across leading health systems in North America.
Our team is driven by a relentless pursuit of excellence and extreme ownership, dedicated to crafting the best solutions for our health system partners. We champion transparency, positivity, and thoughtful engagement, holding each other accountable because we understand the significance of the challenges we tackle.
Ambience has earned accolades such as being ranked #1 for Improving the Clinician Experience in the KLAS Research Emerging Solutions Top 20 Report, being recognized by Fast Company as one of the Next Big Things in Tech, and being named one of the best AI companies in healthcare by Inc. We were also selected as a LinkedIn Top Startup in 2024 and 2025. Our esteemed investors include Oak HC/FT, Andreessen Horowitz (a16z), OpenAI Startup Fund, and Kleiner Perkins, and our journey is just beginning.
The Role:
As a Staff Machine Learning Engineer, you will play a crucial role in advancing clinical AI that impacts millions of patient encounters across the largest health systems in the nation. Your contributions will directly influence the speed at which we enhance our AI capabilities through the platform you will oversee.
You will design and implement evaluation and release processes that empower teams to deliver with confidence, create observability tools to identify quality issues proactively, and develop debugging tools that facilitate rapid issue reproduction. Additionally, you’ll work on the chart context retrieval layer that transforms patient history into model-ready inputs.
Our goal is to enable teams to iterate on quality within days, not weeks, ensuring that every enhancement you implement adds value across all product teams each quarter.
Please note that our engineering roles operate in a hybrid model from our San Francisco office (3 days per week).
What You’ll Own:
Evaluation & Release Infrastructure: Developing automated grading systems and release gates that function seamlessly across product teams, creating a unified evaluation dataset with version control to replace fragmented workflows. Implementing production-quality monitoring that includes end-to-end tracing, shared metrics, and automated alerts.
Debugging Tools: Building encounter replay features that reconstruct precise inference inputs (including retrieved chart context, packed prompts, and model versions) to allow teams to troubleshoot issues without sifting through logs. Creating differential views to compare known good states with regressions.

Feb 2, 2026
Whatnot
Full-time|On-site|San Francisco, CA

Join Whatnot as a Machine Learning Platform Engineer, where you'll play a pivotal role in shaping the future of our AI-driven solutions. In this dynamic position, you will collaborate with cross-functional teams to design, implement, and optimize machine learning platforms that drive efficiency and innovation.
Your expertise will be critical in enhancing our data processing capabilities and deploying robust machine learning models at scale. If you are passionate about leveraging cutting-edge technology to solve complex challenges, we want to hear from you!

Mar 3, 2026
tvScientific
Full-time|Remote|San Francisco, CA, US; Remote, US

tvScientific seeks a Machine Learning Platform Engineer to help shape the company’s advertising technology. This position can be based in San Francisco, CA, or performed remotely from anywhere in the United States.
Role overview
This role focuses on building and refining machine learning models that drive the core of tvScientific’s advertising platform. The work combines technical skill with creative problem-solving to support the platform’s effectiveness.
What you will do
Develop and optimize machine learning models to enhance advertising performance
Collaborate with team members to deliver solutions that balance innovation, scalability, and reliability
Apply technical expertise to address challenges at the intersection of technology and creative thinking
Location
Candidates may work from San Francisco, CA, or remotely within the US.

Apr 23, 2026
Pluralis Research
Full-time|On-site|San Francisco

Overview
Pluralis Research is at the forefront of Protocol Learning, innovating a decentralized approach to train and deploy AI models that democratizes access beyond just well-funded corporations. By aggregating computational resources from diverse participants, we incentivize collaboration while safeguarding against centralized control of model weights, paving the way for a truly open and cooperative environment for advanced AI.
We are seeking a talented Machine Learning Training Platform Engineer to design, develop, and scale the core infrastructure that powers our decentralized ML training platform. In this role, you will have ownership over essential systems including infrastructure orchestration, distributed computing, and service integration, facilitating ongoing experimentation and large-scale model training.
Responsibilities
Multi-Cloud Infrastructure: Create resource management systems that provision and orchestrate computing resources across AWS, GCP, and Azure using infrastructure-as-code tools like Pulumi or Terraform. Manage dynamic scaling, state synchronization, and concurrent operations across hundreds of diverse nodes.
Distributed Training Systems: Design fault-tolerant infrastructure for distributed machine learning, including GPU clusters, NVIDIA runtime, S3 checkpointing, large dataset management and streaming, health monitoring, and resilient retry strategies.
Real-World Networking: Develop systems that simulate and manage real-world network conditions (such as bandwidth shaping, latency injection, and packet loss) while accommodating dynamic node churn and ensuring efficient data flow across workers with varying connectivity, as our training occurs on consumer nodes and non-co-located infrastructure.

Apr 1, 2026
AGI, Inc.
Full-time|On-site|San Francisco Office

Innovate Boldly. Shape Tomorrow.
Our Vision
Crafting everyday AGI. Reliable, consumer-friendly agents that transform human-AI synergy for millions. Our software is designed to act as a collaborator, enhancing your daily capabilities.
Why Choose AGI, Inc.?
We are a discreet collective of exceptional founders and AI pioneers, whose expertise spans Stanford, OpenAI, and DeepMind. Our team leads the way in mobile and computer-based agents, scaling these innovations for consumer use.
With a foundation rooted in extensive research on agents, our AI prioritizes trustworthiness and reliability as fundamental principles.
Backed by top-tier investors who previously supported the first wave of AI leaders, we are now positioned to create the next generation: everyday AGI.
If you envision possibilities where others perceive restrictions, continue reading.
Your Role
Training Automation: Design and execute robust CI/CD pipelines tailored for machine learning workflows. Automate nightly and on-demand training sessions encompassing data ingestion, job orchestration, checkpointing, and artifact management, with a focus on reliability.
Evaluation Infrastructure: Develop scalable evaluation frameworks that automatically benchmark models with each merge. Enhance latency and resource efficiency to ensure quick experimentation and immediate detection of performance regressions.
Research Tooling: Create internal SDKs, CLIs, and lightweight UIs (e.g., Streamlit, Retool) empowering researchers to:
Examine trajectories and traces
Visualize model failures
Organize and oversee datasets
Iterate seamlessly
You'll facilitate a user-friendly experimentation process.
Observability & Performance: Enforce comprehensive tracking for:
Model latency, throughput, and error rates
GPU utilization, and more.

Mar 31, 2026
Orchard
Full-time|On-site|San Francisco

Join Orchard as a Machine Learning Engineer and play a pivotal role in transforming data into actionable insights. In this dynamic position, you will leverage your expertise in machine learning algorithms and data analysis to develop innovative solutions that enhance our products and services.
We are looking for a proactive team player who thrives in a fast-paced environment and possesses strong problem-solving skills. You will collaborate with cross-functional teams, engage with large datasets, and contribute to the design and implementation of machine learning models.

Mar 14, 2026
Coinbase, Inc.
Full-time|$186.1K/yr - $225K/yr|Remote|Remote - USA

Are you ready to push the boundaries of what you believe you're capable of? At Coinbase, our vision is to enhance economic freedom globally. This is a grand, ambitious endeavor that challenges us to deliver our best every day as we construct the foundational onchain platform and shape the future of the global financial system.
To drive our mission forward, we are in search of a unique candidate. We seek an individual who is not only passionate about our objective but also believes in the transformative power of cryptocurrency and blockchain technology to revolutionize the financial landscape. We are looking for someone eager to make a significant impact, who thrives under pressure while collaborating with a team of highly skilled professionals, and who actively seeks constructive feedback for continuous improvement. We want a problem-solver who embraces challenges head-on.
Our work culture is intense and not suited for everyone. However, if you aspire to build the future alongside exceptional individuals and are ready to meet high expectations, this is the place for you.
While many positions at Coinbase are remote-first, we are not solely remote. In-person engagements are expected throughout the year. We conduct team and company-wide offsites several times a year to promote collaboration, connection, and alignment. Your attendance is both expected and fully supported.
We are looking for a Senior Machine Learning Platform Engineer to join our Machine Learning Platform team. This team is responsible for developing the core components for feature engineering, as well as training and serving ML models at Coinbase. Our platform plays a crucial role in combating fraud, personalizing user experiences, and analyzing blockchains. You will have the opportunity to leverage your engineering expertise across various aspects of large-scale ML development, including stream processing, distributed training, and highly available online services.

Feb 27, 2026
Affinity
Full-time|Remote|San Francisco, CA; USA (Remote)

Join Affinity as a Senior Machine Learning Engineer to shape the future of our AI Platform. In this role, you will leverage your expertise in machine learning to develop scalable solutions that drive innovation and enhance our platform's capabilities. Collaborate with cross-functional teams to implement advanced algorithms, optimize performance, and contribute to impactful projects that redefine industry standards.

Mar 13, 2026
Handshake
Full-time|On-site|San Francisco, CA

Join Handshake as a Machine Learning Engineer I, where you will have the opportunity to work on cutting-edge machine learning projects that drive our innovative solutions. Collaborate with a talented team to develop algorithms and models that enhance our product offerings and improve user experiences.

Apr 6, 2026
Hive
Full-time|On-site|San Francisco

Join Our Innovative Team at Hive
Hive is at the forefront of cloud-based AI solutions, revolutionizing how organizations understand, search for, and generate content. Trusted by many of the world's largest and most groundbreaking companies, we empower developers with premier pre-trained AI models that handle billions of API requests monthly. Our turnkey software applications leverage proprietary AI models and datasets, driving transformative advancements in content moderation, brand protection, sponsorship measurement, and context-based ad targeting.
With over $120M in funding from prominent investors like General Catalyst, 8VC, Glynn Capital, Bain & Company, and Visa Ventures, Hive is rapidly expanding. Our dynamic team of over 250 employees operates from our San Francisco, Seattle, and Delhi offices. If you are passionate about shaping the future of AI, we invite you to explore opportunities with us!
About the Machine Learning Engineer Role
As we strive to achieve our ambitious vision, we seek exceptional machine learning engineers to join our team. We are looking for enthusiastic developers who are eager to remain at the cutting edge of deep learning technology, designing and deploying state-of-the-art neural network models into production. Our ideal candidates thrive in working with large-scale datasets and demonstrate a keen interest in mastering new technologies across the machine learning spectrum. We value individuals who are proactive and take ownership of their projects, contributing innovative ideas and practical implementations. Experience in building machine learning applications from the ground up and designing scalable, maintainable data pipelines is essential.

Jan 15, 2021
Boomtrain
Full-time|On-site|San Francisco

Join our dynamic Personalization team at Boomtrain as a Machine Learning Engineer. We are in search of a skilled engineer who will play a pivotal role in developing and enhancing our recommendation systems that cater to a variety of customers.
In this role, you will collaborate with a talented team dedicated to designing and implementing innovative models and systems that deliver personalized recommendations. You will have the opportunity to work on complex engineering challenges and contribute to generating hundreds of millions of recommendations daily.
This position offers a unique chance to engage in end-to-end project work and make a significant impact on our personalization initiatives.
Key Responsibilities:
Research and propose advanced recommendation and optimization models to enhance our personalization systems.
Develop and maintain offline model generation pipelines.
Design and maintain online recommendation serving systems.

Jul 21, 2016
Faire
Full-time|$224K/yr - $308K/yr|On-site|San Francisco, CA

About Faire
At Faire, we are revolutionizing the wholesale marketplace with an unwavering commitment to local communities. Our platform empowers independent retailers globally, enabling them to thrive against larger competitors like Walmart and Amazon. By leveraging cutting-edge technology, data insights, and machine learning, we connect these vibrant entrepreneurs with the best products from around the world. We believe that with the right tools, small businesses can elevate their potential and compete on a grand scale.
By nurturing independent businesses, Faire is making a significant positive impact on local economies worldwide. We are in search of intelligent, resourceful, and passionate individuals to join our mission of championing local commerce. If you resonate with our community-driven values, we'd love to welcome you to our team.
About this role
As a Staff Machine Learning Platform Engineer, you will play a pivotal role in shaping, enhancing, and managing a scalable machine learning platform designed to expedite model training, deployment, and governance. You will serve as the vital technical link between our data science and production engineering teams. Joining a small but integral team, you will amplify Faire’s capabilities to support tens of thousands of local businesses in an increasingly competitive retail landscape.

Mar 4, 2026
Pulse
Full-time|On-site|San Francisco

Overview
Pulse is revolutionizing data infrastructure by addressing the critical challenge of extracting accurate, structured information from complex documents on a large scale. Our innovative approach to document understanding integrates intelligent schema mapping with advanced extraction models, outperforming traditional OCR and parsing methods.
As a dynamic and rapidly growing team of engineers based in San Francisco, we empower Fortune 100 companies, Y Combinator startups, public investment firms, and growth-oriented businesses. With the backing of top-tier investors, we are on an exciting growth trajectory.
What sets our technology apart is our cutting-edge multi-stage architecture:
Layout comprehension with specialized component detection models
Low-latency OCR models designed for targeted data extraction
Advanced algorithms for determining reading order in complex formats
Proprietary table structure recognition and parsing capabilities
Fine-tuned vision-language models for interpreting charts, tables, and figures
If you are passionate about the convergence of computer vision, natural language processing, and data infrastructure, your contributions at Pulse will directly influence our customers and shape the future of document intelligence.

Jul 30, 2025
Handshake
Full-time|On-site|San Francisco, CA

Join Handshake as an Associate Machine Learning Engineer and embark on an exciting journey in the world of artificial intelligence and machine learning. In this role, you will collaborate with a talented team to develop innovative solutions that leverage cutting-edge technologies. You'll have the opportunity to contribute to real-world projects, enhancing your skills while driving impactful results.

Apr 2, 2026
Axiom Bio
Full-time|On-site|SF Global HQ

Charter:
Join us as a pivotal member of a groundbreaking team dedicated to revolutionizing the field of toxicology by developing advanced AI systems that will replace traditional lab and animal experiments.
What We Seek:
We are on the lookout for exceptional individuals who can inspire those around them and drive the team towards greatness. Our ideal candidate is someone with high agency, able to identify priorities and take action. We value unique passions and hobbies that may seem niche but reveal a deep commitment and curiosity when explored. Candidates should approach challenges with both intentionality and a sense of wonder, embodying the spirit of exploration akin to an immigrant in a new land or a self-taught coder. A strong desire to learn and grow, coupled with technical excellence and a commitment to mastering one’s craft, is essential. We want those who are willing to tackle daunting challenges and derive satisfaction from the journey as much as the outcome.
Your Responsibilities:
Establish the foundational end-to-end ML/AI system, including wetlab data generation, data cleaning/processing, model architecture, training, inference, and deployment strategies.
Lead innovative research and development initiatives focused on elucidating the interplay between chemistry and biology.
Design and scale large models that are pretrained on paired chemistry and biological imagery.
Conduct applied research aimed at optimizing, aggregating, and pooling embeddings.
Become a thought leader in emerging and underexplored domains, such as molecular graph representations and generative diffusion for biological applications.
Develop entrepreneurial skills alongside engineering expertise by creating impactful solutions that deliver substantial value for scientists.
Deliver outstanding technology and products that redefine industry standards.
Preferred Attributes:...

Nov 14, 2025
UnitX Labs
Full-time|On-site|HQ

Position: Machine Learning Engineer
About Us:
At UnitX, we are pioneering the development of cutting-edge physical AI systems designed to automate repetitive visual tasks within manufacturing environments. Our dynamic startup thrives on a diverse team of experts from renowned institutions such as Stanford, MIT, and Google. To date, we have successfully implemented over 1,000 mission-critical AI systems across more than 190 of the world's top manufacturing production lines. Annually, our AI inspection systems oversee the quality of products valued at $15 billion.
Join us for a unique opportunity to contribute to groundbreaking computer vision technologies that are transforming global manufacturing efficiency.
Your Responsibilities:
Design and implement innovative algorithms to analyze raw sensor data for defect detection, focusing on pixel-level precision in high-resolution image and 3D data segmentation.
Develop robust software solutions that operate continuously on production lines, executing our algorithms in real-time with decision-making latency under 20ms.
Create metrics and tools for comprehensive model performance evaluation, enhancing system visibility and interpretability.
Research and explore novel methodologies, pushing the boundaries of AI technology, including Stable Diffusion and SAM, to deliver critical applications in manufacturing.
Who You Are:
Bachelor's degree in Computer Science, Mathematics, Physics, or a related technical discipline, or equivalent experience showcasing solid mathematical foundations.
A minimum of 2 years of experience developing machine learning models focused on computer vision applications in production settings.
Deep understanding of Deep Learning theories and practical applications, with proficiency in frameworks such as PyTorch or TensorFlow.
Strong Python programming skills for creating efficient, maintainable solutions within extensive codebases.
Excellent communication and decision-making abilities, able to articulate experimental rationale and judiciously navigate between exploration and exploitation strategies.
Demonstrated resilience and adaptability in complex, uncertain environments.
Preferred Qualifications:
Experience with large-scale data processing and algorithm optimization.
Familiarity with tools for machine learning and data visualization.

Apr 3, 2026
