Research Program Manager - Model Evaluations and Safety

Reflection AI, San Francisco
On-site Full-time

Experience Level

Manager

Qualifications

The ideal candidate will have a strong background in research project management, with a proven ability to navigate complex environments and deliver results. Exceptional communication skills and the ability to work collaboratively with cross-functional teams are essential. A degree in a relevant field is preferred.

About the job

Our Mission

At Reflection AI, we are committed to creating open superintelligence that is accessible to everyone. Our team is dedicated to developing open-weight models tailored for individuals, agents, enterprises, and nation states. Our diverse group of AI experts comes from prestigious organizations such as DeepMind, OpenAI, Google Brain, Meta, Character.AI, and Anthropic.

About the Role

As a Research Program Manager (RPM) at Reflection AI, you will play a pivotal role in leading and collaborating with our research and infrastructure teams to expedite the advancement of cutting-edge model development. You will not merely track projects; you will be a catalyst for clarity in uncertain situations, facilitate decision-making processes, and ensure cohesive integration across multiple teams.

This is a crucial position where you will spearhead the establishment of model evaluations and safety protocols from the ground up. You will define evaluation frameworks, construct the operational infrastructure for model safety, and create processes that integrate evaluations into the model development lifecycle. You will be laying the foundation for how Reflection AI interacts with the broader safety ecosystem. This is quintessential 0-to-1 work.

You will bring a proactive, first-responder mindset: assessing situations, taking the initiative to address challenges head-on, and driving resolutions collaboratively.

What You'll Do

  • Develop the essential infrastructure for model evaluations and safety. Formulate evaluation frameworks, outline tooling requirements, and establish operational processes that will guide our assessment of model capabilities, risks, and readiness for deployment.

  • Establish model safety operations as a core function, including setting workflows, review schedules, and decision-making frameworks that link safety evaluations to the model development and release processes.

  • Collaborate with research and engineering leads throughout the pre-training, mid-training, and post-training phases to integrate safety and evaluation checkpoints into the development workflow in a manner that is thorough yet efficient.

  • Lead the scoping and prioritization of evaluation science and infrastructure investments, partnering with technical leads to determine which aspects to develop internally and which to adopt from external sources.

About Reflection AI

Reflection AI is at the forefront of AI research, dedicated to building open superintelligence. Our mission is to democratize access to advanced AI technologies, ensuring that they are available to everyone, from individuals to large organizations. Our team comprises industry leaders from top-tier tech companies, creating an innovative environment for groundbreaking AI development.
