Machine Learning Inference Engineer

Rhoda AI · Palo Alto
On-site · Full-time




About the job

At Rhoda AI, we are pioneering the future of humanoid robotics by building a comprehensive stack: advanced, software-defined hardware together with foundation models and video world models. Our robots are engineered to be versatile, capable of navigating complex real-world scenarios that extend beyond traditional training environments. Our interdisciplinary research team, featuring experts from institutions such as Stanford, Berkeley, and Harvard, works at the forefront of large-scale learning, robotics, and systems engineering. With over $400 million raised, we are investing significantly in research and development, hardware innovation, and scaling our manufacturing capabilities to bring this vision to life.

We are seeking a motivated Machine Learning Inference Engineer to join our team and help develop and operate the inference systems that power our automation stack. You will play a crucial role in ensuring the efficient and reliable execution of large foundation models, integrating closely with our robotics platforms and internal task tooling.

Key Responsibilities:

  • Develop and maintain infrastructure for model inference across both cloud and on-premises environments.

  • Optimize the latency, throughput, and reliability of deployed machine learning models.

  • Design and scale services for serving diverse foundation models in both research and production contexts.

  • Collaborate with research and robotics teams to enhance inference optimization and integration.

  • Create tools for model deployment, versioning, and observability to support rapid iteration cycles.

  • Contribute to the robustness and scalability of the inference stack as model complexity and deployment demands evolve.
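In practice, the deployment work described above often takes the shape of a Kubernetes manifest. The sketch below is illustrative only, not Rhoda AI's actual configuration: a hypothetical Triton Inference Server deployment requesting a GPU, with the image tag, model-repository path, replica count, and port layout all standing in as assumptions.

```yaml
# Illustrative sketch: a minimal GPU-backed Triton Inference Server deployment.
# Image tag, model-repository path, and resource numbers are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: triton-inference
spec:
  replicas: 2
  selector:
    matchLabels:
      app: triton-inference
  template:
    metadata:
      labels:
        app: triton-inference
    spec:
      containers:
        - name: triton
          image: nvcr.io/nvidia/tritonserver:24.05-py3
          args: ["tritonserver", "--model-repository=/models"]
          ports:
            - containerPort: 8000   # HTTP
            - containerPort: 8001   # gRPC
            - containerPort: 8002   # Prometheus metrics
          resources:
            limits:
              nvidia.com/gpu: 1    # schedule onto a GPU node
          readinessProbe:
            httpGet:
              path: /v2/health/ready
              port: 8000
```

A manifest like this is typically paired with a Service and autoscaling policy; the readiness probe keeps traffic off replicas whose models have not finished loading.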

Qualifications:

  • Minimum of 3 years of experience in machine learning infrastructure, MLOps, or backend systems.

  • Proven experience in deploying and managing machine learning inference workloads in production environments.

  • Excellent knowledge of Kubernetes and containerized deployment pipelines.

  • Familiarity with cloud service providers such as AWS and GCP, including GPU orchestration capabilities.

  • Experience with popular ML frameworks including PyTorch and TensorFlow, as well as model serving tools like Triton, TorchServe, and Ray Serve.

  • Strong debugging capabilities and a proactive ownership mindset, comfortable resolving issues across the technology stack.
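Much of the latency and throughput work mentioned above starts with measurement. As a minimal, hypothetical sketch (plain Python, not Rhoda AI code), here is one way per-request latency samples from an inference endpoint might be summarized into the p50/p99 numbers an SLO review needs:

```python
import math
import random


def percentile(samples, pct):
    """Nearest-rank percentile: ceil(pct/100 * n) as a 1-based rank into the sorted list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_latencies(samples):
    """Reduce request latencies (seconds) to the headline numbers for a latency review."""
    return {
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "mean": sum(samples) / len(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    random.seed(0)
    # Simulated per-request latencies for a hypothetical inference endpoint.
    latencies = [random.uniform(0.010, 0.050) for _ in range(1000)]
    stats = summarize_latencies(latencies)
    print({k: round(v, 4) for k, v in stats.items()})
```

Tail percentiles (p99), not means, are usually what matter for interactive inference, since a robot control loop is gated by its slowest requests.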

About Rhoda AI

Rhoda AI is at the cutting edge of robotics, developing a comprehensive platform for humanoid robots that seamlessly integrates advanced hardware and machine learning models. Our mission is to revolutionize physical work through innovative robotics solutions.
