
Senior Software Engineer, AI Platform - Evaluation & Annotation

Datadog
Paris, France; Sophia Antipolis, France
Hybrid, Full-time


Experience Level

Senior

Qualifications

What You’ll Do:

- Design and scale robust evaluation systems to assess the performance and reliability of LLMs and AI agents within Datadog’s product ecosystem.
- Lead initiatives to develop human-in-the-loop and automated annotation pipelines for model evaluation, ensuring high-quality training and feedback data.
- Define and implement continuous evaluation workflows in CI/CD and production settings to monitor model behavior in real time.
- Analyze model outputs for correctness, bias, safety, and reliability, translating insights into actionable enhancements.
- Collaborate cross-functionally with applied scientists, researchers, product managers, and platform engineers to establish best practices for responsible AI.
- Mentor team members and contribute to a long-term technical strategy focused on AI quality, trust, and safety.

Who You Are:

- You have 6+ years of experience developing large-scale distributed systems or machine learning systems in production.
- You have experience designing infrastructure to support AI/ML model evaluation, annotation, or benchmarking workflows.
- Your engineering approach emphasizes system design, long-term maintainability, and reliability.
- You have a solid understanding of AI/ML concepts, including evaluation metrics, prompt analysis, and trust and safety challenges.
- You excel in cross-functional team environments and communicate effectively with both technical and non-technical stakeholders.

About the job

The AI Platform team at Datadog is at the forefront of innovation, developing the infrastructure that fuels next-generation generative AI features across our diverse product range.

As a Senior Software Engineer on the Evaluation and Annotation team, you will play a pivotal role in designing and enhancing the systems that define and evaluate AI quality at scale. Your responsibilities will include creating evaluation pipelines, monitoring model performance, and establishing annotation workflows that assess correctness, safety, bias, and reliability in real-world applications.

Your contributions will directly influence how Datadog develops and sustains reliable AI capabilities. You will collaborate closely with product, machine learning, and infrastructure teams to set quality standards, integrate evaluation systems with our observability platform, and build human-in-the-loop feedback mechanisms that ensure continuous improvement of model behavior.

At Datadog, we cherish our office culture, valuing the relationships we build, the creativity we foster, and the collaboration that arises from working together. Our hybrid workplace model enables employees to cultivate a work-life balance that suits their individual needs.

About Datadog

Datadog is a leading monitoring and security platform for cloud applications, providing observability into your entire infrastructure and application stack.
