About Mistral

At Mistral AI, we harness the transformative power of artificial intelligence to streamline workflows, enhance creativity, and promote efficient learning. Our innovative technology seamlessly integrates into everyday operations, delivering tangible benefits.

We are committed to democratizing AI through high-performance, optimized, and open-source models and solutions. Our comprehensive AI platform caters to enterprise requirements, whether in on-premises or cloud environments. Our flagship product, le Chat, serves as an AI assistant tailored for both personal and professional use.

As a dynamic and collaborative team, we are passionate about AI's potential to revolutionize society. Our diverse workforce excels in competitive environments and is dedicated to fostering innovation. We have teams located across France, the USA, the UK, Germany, and Singapore, sharing a culture that values creativity, humility, and teamwork.

Join us in shaping the future of AI at a pioneering company. Together, we can make a significant difference. Discover more about our culture at https://mistral.ai/careers.

About The Role

As an Evaluation Engineer within the Applied AI team, you will be a key member of Mistral's customer-facing technical organization. Our mission is to collaborate directly with enterprise clients, guiding them from pre-sales through implementation to deploy advanced AI solutions that yield measurable business outcomes. Your role will blend deep machine learning expertise with robust customer engagement, functioning like a startup CTO who oversees complete project execution.

In the realm of AI, many brilliant ideas remain unmeasured or unrealized.
As our inaugural Evaluation Engineer, you will establish the methodology, develop the necessary infrastructure, and clarify what constitutes 'production-ready' across various sectors and applications.

Your responsibilities will include designing and implementing evaluation systems that empower our clients to assess model performance tailored to their unique use cases. You'll be instrumental in building a resilient evaluation infrastructure and collaborating closely with both research and client-facing teams.

While our research teams develop evaluations for cutting-edge capabilities, our clients seek practical, domain-specific, risk-aware evaluations that are production-ready. You will create frameworks that assess whether a medical summarization model is reliable or whether a legal assistant can accurately interpret complex legal texts.
Jan 21, 2026