Lila Sciences
Cambridge, MA USA; San Francisco, CA USA
Hybrid Full-time $176K/yr - $304K/yr
Qualifications
Essential Qualifications
Robust experience in the training and deployment of LLMs.
Hands-on research experience in scalable computational techniques.
Publications or contributions to open-source projects are highly valued.

Preferred Qualifications
Experience using LLMs for scientific or technical datasets.
Collaboration in cross-functional machine learning teams.
About the job
Your Contribution at Lila
As a Machine Learning Research Scientist I/II specializing in LLM Inference, you will spearhead research initiatives focused on the training and deployment of large language models for scientific applications.
Your Responsibilities
Develop and refine post-training strategies for LLMs, including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Reinforcement Learning with verifiers.
Design efficient inference mechanisms and compute strategies for complex tool utilization in various environments.
Create scalable evaluation metrics to assess LLM performance in scientific reasoning tasks.
Investigate the boundaries of cutting-edge LLM methodologies for scientific challenges and analyze their limitations.
About Lila Sciences
Lila Sciences is pioneering the first scientific superintelligence platform and autonomous lab dedicated to life sciences, chemistry, and materials science. We are at the forefront of a transformative era of limitless discovery, harnessing AI to revolutionize every facet of the scientific method. Our mission is to introduce scientific superintelligence to address humanity's most pressing challenges, empowering researchers to deliver solutions in human health, climate change, and sustainability at an unprecedented pace. Discover more about our mission at www.lila.ai.
Role overview
OpenAI is looking for a Researcher focused on Agentic Post-Training, based in San Francisco. This role centers on analyzing and improving how AI systems behave after their initial training. The goal is to broaden the capabilities of AI and refine how models respond in complex situations.

What you will do
Study and assess agentic behaviors in trained AI models.
Create new approaches to strengthen these behaviors after training.
Collaborate with a talented team on projects that shape the future of artificial intelligence research.

Collaboration and impact
This position involves hands-on research with other specialists at OpenAI. The work directly supports the advancement of AI capabilities and helps define new benchmarks for agentic performance in artificial intelligence.
Full-time|$350K/yr - $475K/yr|On-site|San Francisco
At Thinking Machines Lab, our mission is to empower humanity by advancing collaborative general intelligence. We strive to build a future where everyone has access to the knowledge and tools essential for making AI work effectively for their unique objectives. Our team comprises scientists, engineers, and innovators who have contributed to some of the most widely adopted AI products, including ChatGPT and Character.ai, as well as notable open-weight models like Mistral and popular open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role
The Post-Training Researcher position is pivotal to our roadmap. It serves as a crucial connection between raw model intelligence and a system that is genuinely beneficial, safe, and collaborative for human users.
This role uniquely combines fundamental research with practical engineering, as we do not differentiate between these functions internally. Candidates will be expected to produce high-performance code and analyze technical reports. This position is ideal for individuals who relish both deep theoretical inquiry and hands-on experimentation, aiming to influence the foundational aspects of AI learning.

Note: This position is classified as an 'evergreen role', meaning we continuously accept applications in this research domain. Given the high volume of applications, an immediate match for your skills and experience may not always be available. However, we encourage you to apply; we regularly review submissions and reach out as new opportunities arise. You are welcome to apply again after gaining more experience, but we ask that you refrain from applying more than once every six months. Additionally, specific postings for singular roles may be available for distinct projects or team needs, in which case you are welcome to apply directly in conjunction with this evergreen role.

What You’ll Do
Develop and Optimize Recipes: Refine post-training recipes, encompassing various datasets, training stages, and hyperparameters, while assessing their impact on multiple performance metrics.
Iterate on Evaluations: Engage in a continuous process of defining evaluation metrics, optimizing them, and recognizing their limitations. You will be accountable for enhancing performance metrics and ensuring they are meaningful.
Debug and Analyze: During the fine-tuning of training configurations, you may encounter results that appear inconsistent. You will be responsible for troubleshooting and cultivating a deeper understanding to apply to subsequent challenges.
Scale and Investigate: Assess and expand the capabilities of our models while exploring potential improvements.
Full-time|$350K/yr - $475K/yr|On-site|San Francisco
At Thinking Machines Lab, our mission is to empower humanity by advancing collaborative general intelligence. We envision a future where everyone can harness the knowledge and tools necessary for AI to serve their unique needs and aspirations. Our team comprises scientists, engineers, and builders who have developed some of the most widely utilized AI products, such as ChatGPT and Character.ai, as well as open-weight models like Mistral and popular open-source projects including PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

About the Role
The role of a Post-Training Researcher is pivotal to our strategic vision. This position serves as the essential link between raw model intelligence and a practical, safe, and collaborative system for human users.
Our research in post-training data sits at the intersection of human insights and machine learning. By integrating human and synthetic data techniques alongside innovative methodologies, we capture the subtleties of human behavior to inform and guide our models. We investigate and model the mechanisms that derive value for individuals, enabling us to articulate, predict, and enhance human preferences, behaviors, and satisfaction. Our objective is to translate research concepts into actionable data through meticulously planned data labeling and collection initiatives, while also understanding the science behind high-quality data that effectively trains our models. Additionally, we develop and assess quantitative metrics to evaluate the success and impact of our data and training strategies.
Beyond execution, we explore new paradigms for human-AI interaction and scalable oversight, experimenting with optimal ways for humans to supervise, guide, and collaborate with models. This interdisciplinary role merges research, data operations, and technical implementation, pushing the boundaries of aligned, human-centered AI systems.
This position combines foundational research and practical engineering, as we do not differentiate between these roles internally. You will be expected to write high-performance code and comprehend technical reports. This role is perfect for individuals who thrive on deep theoretical exploration and hands-on experimentation, eager to shape the foundational aspects of AI learning.

Note: This is an evergreen role that we maintain continuously to express interest in this research area. We receive a high volume of applications, and while there may not always be an immediate fit for your skills and experience, we encourage you to apply. We regularly review applications and reach out to candidates as new opportunities arise. You are welcome to reapply after gaining more experience, but please limit applications to once every six months. You may also notice separate postings for specific, targeted roles; you are welcome to apply to those directly as well.
OpenAI is hiring a Software Engineer for Post-Training Research in San Francisco. This position centers on improving the performance and capabilities of advanced machine learning models after their initial training phase.

Role overview
Work closely with a skilled team to explore new ways of strengthening AI systems. The focus is on researching and developing methods that push the boundaries of what these models can achieve once training is complete.

Collaboration
Expect to contribute to ongoing research efforts and share insights with colleagues who are passionate about advancing AI. Teamwork and knowledge exchange are key parts of this role.

Location
This position is based in San Francisco.
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)
About the Role
In the realm of machine learning, pretraining lays the foundation for a general model, while post-training refines that model, enhancing its utility, controllability, safety, and performance in real-world applications. As a Post-Training Research Scientist, you will transform large pretrained robot models into production-ready systems through methodologies such as fine-tuning, reinforcement learning, steering, human feedback, task specialization, evaluation, and on-robot validation at scale. This position offers a unique opportunity for individuals from diverse backgrounds to evolve into full-stack ML roboticists, adept at swiftly identifying challenges across machine learning and control domains. This is where innovative research converges with practical implementation.

Your Responsibilities Include:
Crafting fine-tuning and adaptation strategies tailored for specific robotic tasks and embodiments.
Developing methodologies to enhance reliability, robustness, and controllability of robotic systems.
Establishing evaluation frameworks to assess real-world robot performance beyond just offline metrics.
Collaborating with ML infrastructure teams to optimize inference-time performance, including latency, stability, and memory usage.
Utilizing advanced techniques such as imitation learning, reinforcement learning, distillation, synthetic data, and curriculum learning.
Bridging the gap between model outputs and tangible outcomes in the physical world.

You Might Excel in This Role If You:
Possess experience in fine-tuning large models for downstream applications, including RLHF, imitation learning, reinforcement learning, distillation, and domain adaptation.
Have a background in embodied AI, robotics, or real-world machine learning systems.
Demonstrate a strong commitment to evaluation, benchmarking, and failure analysis.
Are comfortable troubleshooting and debugging across the entire ML stack, from analyzing loss curves to understanding robot behavior.
Enjoy rapid iteration and thrive on real-world feedback loops.
Aspire to connect foundational models with practical deployment scenarios.

About Generalist
At Generalist, we are dedicated to realizing the vision of general-purpose robots. We envision a future where industries and homes benefit from collaborative interactions between humans and machines, enabling us to achieve more than ever before. Our focus is on building embodied foundation models, starting with dexterity, and advancing the frontiers of data, models, and hardware to empower robots to intelligently engage with their environments.
Join Baseten as a Post-Training Research Engineer and contribute to groundbreaking advancements in machine learning and AI. In this role, you will leverage your engineering skills to analyze and enhance models post-training, ensuring optimal performance and efficiency.
Join Baseten as a Post-Training Research Scientist, where you will play a vital role in advancing our machine learning capabilities. In this position, you will have the opportunity to conduct innovative research, analyze data, and contribute to the development of cutting-edge technologies. Your work will directly impact our projects and enhance the performance of our models.
Advancing Self-Improving Superintelligence
At Letta, we are on a mission to revolutionize artificial intelligence by creating self-improving agents that learn and adapt like humans. Unlike current AI systems that are often rigid and brittle, our innovative approach aims to build adaptable AI that continually evolves through experience.
Founded by the visionaries behind MemGPT at UC Berkeley's Sky Computing Lab, the birthplace of Spark and Ray, we are backed by notable figures in AI infrastructure, including Jeff Dean and Clem Delangue. Our agents are already enhancing production systems for industry leaders such as 11x and Bilt Rewards, continually learning and improving in real-time.
Join our elite team of researchers and engineers dedicated to tackling AI's most significant challenges: creating machines that can reason, remember, and learn as humans do.
This position requires in-person attendance (no hybrid options) at our downtown San Francisco office, five days a week.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we collaborate with leading AI laboratories to supply high-quality data and foster advancements in Generative AI research. We seek innovative Research Scientists and Research Engineers with a strong focus on post-training techniques for Large Language Models (LLMs), including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and reward modeling. This position emphasizes optimizing data curation and evaluation processes to boost LLM performance across text and multimodal formats. In this pivotal role, you will pioneer new methods to enhance the alignment and generalization of extensive generative models. You will work closely with fellow researchers and engineers to establish best practices in data-driven AI development. Additionally, you will collaborate with top foundation model labs, providing critical technical and strategic insights for the evolution of next-generation generative AI models.
Join Cartesia: Pioneering AI Innovation
At Cartesia, we are on a mission to redefine the landscape of artificial intelligence. Our goal is to create the next generation of AI that is interactive, ubiquitous, and capable of continuous reasoning across vast streams of audio, video, and text data. With an impressive foundation built on our pioneering work in State Space Models (SSMs) at the Stanford AI Lab, our team is uniquely positioned to advance model architectures that will make on-device reasoning a reality.
Backed by prominent investors like Index Ventures and Lightspeed Venture Partners, along with a network of 90+ advisors, including top experts in AI, we are committed to pushing the boundaries of model innovation and systems engineering.

About the Role
We believe that the next significant advancement in model intelligence will stem from enhanced post-training methods and alignment strategies. As a Post-Training Researcher, you will be at the forefront of developing systems and methodologies that ensure our multimodal models are not just adaptive, but also aligned with human intentions.
In this role, you will collaborate across machine learning research, alignment, and infrastructure, crafting innovative techniques for preference optimization, model evaluation, and feedback-driven learning. You will investigate how feedback signals can enhance reasoning capabilities across various modalities while establishing the necessary infrastructure to scale and improve these processes.
Your contributions will be pivotal in shaping the learning and improvement trajectory of Cartesia’s foundational models, ultimately enhancing their connection with users.

Your Impact
Lead research initiatives aimed at enhancing the capabilities and alignment of multimodal models.
Create cutting-edge post-training methods and evaluation frameworks to assess model advancements.
Collaborate closely with research, product, and platform teams to establish best practices for specialized model development.
Design, debug, and scale experimental systems to ensure reliability and reproducibility throughout training cycles.
Convert research insights into production-ready systems that enhance model reasoning, consistency, and alignment with human values.
Full-time|$218.4K/yr - $273K/yr|On-site|San Francisco, CA; New York, NY
Artificial Intelligence is increasingly becoming a pivotal element across all sectors of society. At Scale AI, we are committed to accelerating the evolution of AI applications. For nearly a decade, we have been the premier AI data foundry, propelling groundbreaking advancements in areas such as generative AI, defense applications, and autonomous vehicles. Following our recent investment from Meta, we are intensifying our efforts to develop advanced post-training algorithms that are essential for sophisticated agents in enterprises worldwide.
The Enterprise ML Research Lab is at the forefront of this AI revolution, leveraging a suite of proprietary research, tools, and resources to support our enterprise clients. As a Staff Machine Learning Research Engineer focusing on Agent Post-training, you will be instrumental in creating our next-generation Agent Reinforcement Learning training platform. Your work will enable the training of top-tier Agents that deliver state-of-the-art results in real-world enterprise applications.
You will incorporate cutting-edge research into our training framework, empowering ML Research Engineers on the Enterprise AI team to deploy use cases ranging from next-generation AI cybersecurity firewalls to training foundational healthtech search models. If you are passionate about shaping the future of the GenAI movement, we welcome your application!
Full-time|$250K/yr - $450K/yr|On-site|San Francisco
About AfterQuery
AfterQuery builds training data and evaluation frameworks used by leading AI labs around the world. The team partners with advanced research groups to create high-quality datasets and run detailed evaluations that go beyond standard benchmarks. As a small, post-Series A company based in San Francisco, every team member plays a key role in shaping how future AI models learn and improve.

Role Overview
The Post-Training Research Scientist focuses on proving the impact of AfterQuery's datasets. This work involves designing and running training experiments to isolate how specific data influences model performance. Projects span Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) post-training, with an emphasis on measuring effects on capability, generalization, and alignment. Working closely with partner labs, the scientist turns data into clear, verifiable results: showing exactly how a dataset leads to measurable improvements under defined conditions. The work is experimental and directly shapes the value of AfterQuery's products.

What You Will Do
Run controlled SFT and RL experiments to measure how datasets affect model outcomes.
Quantify gains in areas like reasoning, tool use, long-horizon tasks, and specialized workflows.
Share findings with partner labs to support sales and demonstrate value.
Work with internal subject matter experts to improve data quality based on experimental results.

What We Look For
Strong background in LLM training and evaluation methods.
Curiosity about how data structure, selection, and quality shape model behavior.
Skill in designing experiments, executing quickly, and drawing practical insights from complex results.
Comfort working across fields such as finance, software engineering, and policy.
Focus on real-world implementation, not just theory.
Research experience at the undergraduate or master's level is preferred; a PhD is not required.

Compensation
$250,000 - $450,000 total compensation plus equity
Full-time|$116K/yr - $170K/yr|Hybrid|Cambridge, MA USA; San Francisco, CA USA
Your Role at Lila Sciences
We are in search of a talented Machine Learning Research Engineer with a focus on LLM post-training. In this pivotal role, you will architect and oversee large-scale training systems, enhance the performance of extensive models, and incorporate state-of-the-art methodologies to boost efficiency and throughput.

Key Responsibilities
Develop Ray-based distributed training infrastructure for LLMs and multi-modal models.
Implement performance optimizations for large-scale model training, including training and optimization workflows such as SFT, MoE, and long-context scaling.
Manage the orchestration of leading-edge and open-source LLMs alongside intricate compute-intensive tools.
Create scalable pipelines for data preprocessing and experiment orchestration, utilizing tools for efficient data loading, pipeline parallelism, and optimizer tuning.
Establish system-level performance benchmarks and debugging utilities.
Join Baseten as a Post-Training Applied Researcher, where you will be at the forefront of innovative research applications. Your expertise will help bridge the gap between training and real-world applications, making a tangible impact in the industry.
About the Team
Join the innovative Post-Training team at OpenAI, where we focus on refining and elevating pre-trained models for deployment in ChatGPT, our API, and future products. Collaborating closely with various research and product teams, we conduct crucial research that prepares our models for real-world deployment to millions of users, ensuring they are safe, efficient, and reliable.

About the Role
As a Research Engineer / Scientist, you will spearhead the research and development of enhancements to our models. Our work intersects reinforcement learning and product development, aiming to create cutting-edge solutions.
We seek passionate individuals with robust machine learning engineering skills and research experience, particularly with innovative and powerful models. The ideal candidate will be driven by a commitment to product-oriented research.
This position is located in San Francisco, CA, and follows a hybrid work model requiring three days in the office each week. Relocation assistance is available for new employees.

In this role, you will:
Lead and execute a research agenda aimed at enhancing model capabilities and performance.
Work collaboratively with research and product teams to empower customers to optimize their models.
Develop robust evaluation frameworks to monitor and assess modeling advancements.
Design, implement, test, and debug code across our research stack.

You may excel in this role if you:
Possess a deep understanding of machine learning and its applications.
Have experience with relevant models and methodologies for evaluating model improvements.
Are adept at navigating large ML codebases for debugging purposes.
Thrive in a fast-paced and technically intricate environment.

About OpenAI
OpenAI is a pioneering AI research and deployment organization dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We are committed to pushing the boundaries of AI capabilities while prioritizing safety and human-centric values in our products. Our mission is to embrace diverse perspectives, voices, and experiences that represent the full spectrum of humanity, as we strive for a future where AI is a powerful ally for everyone.
Join Our Innovative Team
At OpenAI, our Training team is at the forefront of developing advanced language models that drive our research and products, getting us closer to achieving Artificial General Intelligence (AGI). This mission demands a blend of cutting-edge research to enhance our architecture, datasets, and optimization methods, alongside strategic long-term initiatives that boost the efficiency and capabilities of future models. We ensure that our models, including recent breakthroughs like GPT-4-Turbo and GPT-4o, adhere to the highest standards of excellence.

Your Role
As an integral member of our architecture team, you will spearhead architectural advancements for OpenAI’s leading models, enhancing their intelligence and efficiency while introducing novel capabilities. Your expertise in large language model (LLM) architectures and model inference will be crucial as you adopt a hands-on, empirical approach to problem-solving. Whether brainstorming creative breakthroughs, refining foundational systems, designing evaluations, or diagnosing performance issues, your diverse skill set will be invaluable.
This position is located in San Francisco, where we embrace a hybrid work environment of three days in the office each week, and we provide relocation support for new hires.

Your Key Responsibilities:
Innovate, prototype, and upscale new architectures to elevate model intelligence.
Conduct and evaluate experiments both independently and collaboratively.
Analyze, debug, and enhance both model performance and computational efficiency.
Contribute to the development of training and inference infrastructure.

Who You Are:
You possess experience with significant contributions to major LLM training projects.
You excel at independently evaluating and enhancing deep learning architectures.
You are driven to responsibly implement LLMs in real-world applications.
You are knowledgeable about state-of-the-art transformer modifications aimed at improving efficiency.

About OpenAI
OpenAI is a pioneering AI research and deployment organization committed to ensuring that artificial general intelligence benefits humanity. We focus on developing safe and effective AI technologies that empower individuals and organizations across the globe.
Join Cognition as a Post-Training Data Researcher and play a vital role in enhancing our data-driven insights. You will engage in post-training analysis, contributing to the development of our innovative solutions. Your analytical skills will help in interpreting data trends and improving our methodologies. This position is ideal for detail-oriented individuals who are excited about making an impact through data research.
Full-time|$130K/yr - $250K/yr|On-site|San Francisco
About Distyl AI
Distyl AI is at the forefront of developing production-grade AI systems that enhance core operational workflows for Fortune 500 companies. Our innovative solutions, powered by a strategic alliance with OpenAI and bolstered by in-house software accelerators, deliver AI systems with rapid time-to-value, often within just a quarter.
Our cutting-edge products have successfully transformed the operations of Fortune 500 clients across a multitude of sectors, including insurance, consumer packaged goods, and non-profit organizations. As a member of our team, you will be instrumental in assisting companies to identify, construct, and unlock the potential of their GenAI investments, frequently for the first time. We pride ourselves on being customer-centric, addressing client challenges directly, and holding ourselves accountable for generating financial impact while enhancing the experiences of end-users.
Led by distinguished leaders from prestigious organizations such as Palantir and Apple, Distyl is also supported by renowned investors including Lightspeed, Khosla, Coatue, Dell Technologies Capital, Nat Friedman (former CEO of GitHub), and Brad Gerstner (Founder and CEO of Altimeter), alongside board members from over a dozen Fortune 500 companies.
Genmo is a pioneering research laboratory dedicated to advancing cutting-edge models for video generation, with the mission of unlocking the creative potential of Artificial General Intelligence (AGI). We invite you to be a part of our innovative team, where you can contribute to shaping the future of AI and expanding the horizons of video generation technology.

Role Overview:
We are on the lookout for a talented Research Scientist to join our dynamic team, specializing in alignment and post-training methodologies for large-scale video generation models. In this pivotal role, you will be instrumental in ensuring our diffusion-based video models consistently deliver high-quality, physically accurate, and safe outputs that align with human values and preferences.

Key Responsibilities:
Lead groundbreaking research initiatives in alignment and post-training strategies for video generation models, prioritizing enhanced quality, reliability, and alignment with human intent.
Design and implement supervised fine-tuning and reinforcement learning from human feedback (RLHF) pipelines for video generation models.
Establish robust evaluation frameworks to assess model alignment, safety, and output quality.
Create and optimize data collection pipelines for capturing human feedback and preferences.
Conduct experiments to validate alignment techniques and their scalability.
Collaborate with cross-functional teams to incorporate alignment enhancements into our production workflow.
Stay abreast of the latest developments by reviewing academic literature in generative AI and alignment.
Mentor junior researchers and promote a culture of responsible AI development.
Partner closely with product teams to ensure that alignment methods enhance model capabilities.

Qualifications:
Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, or a closely related field.
Demonstrated excellence with a strong publication record in top-tier conferences (e.g., NeurIPS, ICML, ICLR) focusing on reinforcement learning, alignment, or generative models.
Extensive experience in implementing and optimizing large-scale training pipelines utilizing PyTorch.
In-depth understanding of reinforcement learning techniques, especially RLHF.
Proficient in distributed training systems and conducting large-scale experiments.
Proven ability to design and implement robust evaluation strategies for models.