About the job
Why Join Nebius?
Nebius is reshaping cloud computing to power the global AI economy. We build the tools and resources our clients need to tackle real-world challenges and transform industries, while minimizing infrastructure costs and reducing the need for large in-house AI/ML teams. Our people work at the forefront of AI cloud infrastructure alongside some of the most experienced and innovative leaders and engineers in the industry.
Our Work Environment
Based in Amsterdam and publicly listed on Nasdaq, Nebius has a global presence with R&D centers across Europe, North America, and Israel. Our diverse team of over 1,400 professionals includes more than 400 highly skilled engineers working in both hardware and software development, complemented by an in-house AI R&D team.
The Role
This opportunity is part of Nebius AI R&D, a team dedicated to applied research in AI. Recent applied research published by our team includes:
- Applying reinforcement learning for agent training in long-context multi-turn scenarios
- Dramatically scaling task data collection to enhance reinforcement learning for software engineering (SWE) agents
- Developing a regularly updated, decontaminated evaluation system for SWE agents
- Investigating test-time guided search methods to create more powerful agents
Our research results frequently lead to collaboration with adjacent teams, where the findings are put into practice.
We are currently seeking experienced ML engineers at the senior and staff levels to engage in research projects that include:
- Guided search and reinforcement learning for agentic systems
- Reinforcement learning applications for reasoning models
- Web-scale data collection for agent training
- Efficient model distillation techniques
Key Responsibilities:
- Conducting experiments to identify effective methods for training large language models on interaction traces from various environments
- Exploring guided generation and search methodologies within trajectory spaces
- Devising innovative strategies for mining relevant data at web scale

