
Researcher in Alignment Science

OpenAI · San Francisco, CA
Hybrid · Full-time

Experience Level

Entry Level

Qualifications

The ideal candidate will have a strong background in machine learning and artificial intelligence, with hands-on experience in reinforcement learning. A proven ability to design experiments and analyze data critically is essential, as are excellent communication skills and the ability to work collaboratively in a team-oriented environment.

About the job

Team focus

The Alignment Science team at OpenAI works on intent alignment for artificial intelligence: developing models that accurately interpret and follow user requests while maintaining high standards for safety and transparency. As AI models become more advanced, the team prioritizes keeping them honest about their capabilities and limitations, ensuring close alignment with user intent.

Research spans both theoretical and applied domains. The team shares findings publicly and integrates new alignment techniques into OpenAI's deployed models. Recent efforts have targeted model honesty, studying how models admit mistakes, avoid generating false information, and resist manipulation. The team is looking for scalable solutions to improve instruction following and reliability in AI systems.

Quantitative research is a core part of this work, especially reinforcement learning and related training and evaluation methods that support safer, more reliable AI interactions.

Role overview

This Researcher in Alignment Science position (which may be titled Research Engineer or Research Scientist) centers on designing and running experiments to improve how models follow user intent. Responsibilities include developing training protocols, building evaluation frameworks, and strengthening research infrastructure to support effective alignment in new models.

The role is based in San Francisco, CA, with a hybrid schedule requiring three days per week in the office. OpenAI provides relocation support for new hires. Exceptional remote candidates who can work independently and collaborate closely with the team will also be considered.

Main responsibilities

  • Design and conduct experiments on alignment techniques, including intent following, honesty, calibration, and robustness.
  • Train and assess models using reinforcement learning and other empirical machine learning approaches.
  • Develop evaluation metrics for failure modes such as hallucination, compliance gaps, reward exploitation, and covert actions.
  • Investigate methods to encourage models to self-verify and report limitations honestly, including confession-style training objectives.
  • Create monitoring tools and interventions at inference time to help models act as intended.

About OpenAI

OpenAI is at the forefront of artificial intelligence research and development, aiming to ensure that AI benefits all of humanity. Our work involves creating cutting-edge technologies that are safe, interpretable, and aligned with human values. By prioritizing transparency and ethical considerations, OpenAI strives to lead in the responsible advancement of AI.
