About the job
About Our Team
The Codex team at OpenAI builds cutting-edge AI systems that empower users by writing code, understanding software logic, and acting as intelligent agents for developers and non-developers alike. Our mission is to redefine code generation and intelligent reasoning, deploying these innovations into real-world applications such as ChatGPT and our API, as well as future tools tailored for intelligent coding. We work across research, engineering, product development, and infrastructure, owning the full lifecycle of experimentation, deployment, and iterative improvement of advanced coding capabilities.
About the Position
As a key member of the Codex team, you will improve the capability, performance, and reliability of AI coding models through rigorous research, innovative experimentation, and systematic optimization. Working alongside top-tier researchers and engineers, you will build and deploy robust systems that help millions of people code more efficiently and effectively, ensuring these systems are not only powerful but also cost-efficient and production-ready.
We seek individuals who possess a blend of deep curiosity, strong technical skills, and a commitment to impactful work. Whether your expertise lies in machine learning research, systems engineering, or performance optimization, you will be instrumental in advancing the state-of-the-art and translating these breakthroughs into user-friendly applications.
This position is located in San Francisco, CA, and follows a hybrid work model requiring three days per week in the office. We also provide relocation assistance for new hires.
In This Role, You Will:
Design and conduct experiments aimed at enhancing code generation, reasoning, and agentic behaviors in Codex models.
Generate research insights to improve model training, alignment, and evaluation processes.
Identify and rectify inefficiencies throughout the Codex system stack—from agent behavior to large language model inference to container orchestration—paving the way for significant performance enhancements.
Develop tools to measure, profile, and optimize system performance at scale.
Collaborate across the technical stack to prototype new features, troubleshoot complex issues, and deliver improvements to production environments.
You Will Excel in This Role If You:
Are enthusiastic about exploring and advancing the capabilities of large language models, particularly in software reasoning and code generation.
Possess robust software engineering skills and enjoy rapidly transforming concepts into functional prototypes.
Take a holistic view of performance, effectively balancing speed, efficiency, and quality.