Software Engineer - MLOps

R1, New York
On-site, Full-time

Qualifications

Minimum of 5 years of software engineering experience, with at least 2 years in MLOps. Strong understanding of production ML systems, platform engineering practices, system reliability protocols, and post-training lifecycle management.

About the job

About Phare & R1

At Phare, we are revolutionizing the healthcare industry with our groundbreaking Revenue Operating System. Our innovative platform leverages AI technology to simplify hospital billing and reimbursement, delivering accuracy and fairness. As part of R1, a leading healthcare claims management company serving hundreds of systems nationwide, we blend the agility of a startup with the resources of an established healthcare organization. Join us as we strive to create a more equitable and efficient model for healthcare payments.

The Role

As a Software Engineer focused on MLOps, you will own the production runtime of Phare’s machine learning stack. Your key tasks will include deploying, serving, and scaling models across inference endpoints and managing batch and streaming workflows. You will build robust delivery pipelines with automated rollouts and rollbacks, meet service-level objectives for latency and availability, and implement comprehensive observability. You will use Terraform, Kubernetes, and CI/CD to strengthen our platform and keep ML releases reproducible and auditable.

We are looking for candidates at various seniority levels, from mid-level to staff positions. A minimum of 5 years of software engineering experience, including at least 2 years in MLOps, is required.

This position requires in-person attendance in our SoHo office at least 3 days a week.

About You

You possess a solid background in managing ML systems at scale, where both uptime and efficient feedback loops are crucial alongside accuracy. Your experience includes:

  • Production ML: Proven expertise in deploying and operating models on GPUs in production environments, including APIs and batch/streaming inference.

  • Platform Engineering: Strong proficiency in Docker/Kubernetes, Infrastructure as Code (e.g., Terraform), and CI/CD processes for services and model artifacts, ensuring environment consistency, reproducible releases, and robust model versioning with data lineage.

  • System Reliability: Experience in implementing progressive delivery with automated rollouts/rollbacks, and establishing end-to-end observability (metrics, logs, traces, and model telemetry for drift and regression), coupled with actionable alerting, runbooks, and incident response protocols.

  • Post-Training Lifecycles: Competence in managing model registries, stage gates, and designing scheduled or event-driven retraining processes.

About R1

Phare is at the forefront of innovation in healthcare technology, creating the first Revenue Operating System that streamlines hospital billing and reimbursement using advanced AI capabilities. As part of R1, we benefit from the expansive reach of a major healthcare claims management organization, allowing us to combine the dynamism of a startup with the stability of an established enterprise. Our mission is to transform healthcare payments into a faster and fairer process.
