Freelance AI Evaluation Engineer

Mindrift, Remote (Stuttgart, Baden-Württemberg, Germany)

Remote contract, $50/hr

Experience Level

Mid to Senior

Qualifications

This role is well suited to experienced developers, software engineers, or test automation specialists seeking part-time, non-permanent project engagements. Ideal candidates will possess:

A degree in Computer Science, Software Engineering, or a related discipline.
Over 5 years of experience in software development, predominantly in Python (FastAPI, pytest, async/await, subprocess, file operations).
A background in full-stack development, including building React-based interfaces (JavaScript/TypeScript) and robust back-end systems.
Proficiency in writing tests (functional and integration), not merely executing them.
Experience with Docker containers and familiarity with infrastructure tools (Postgres, Kafka, Redis).
An understanding of CI/CD processes, specifically GitHub Actions (triggers, labels, and result interpretation).
English proficiency at a professional level.

About the job

Please submit your CV in English and indicate your English proficiency level.

Mindrift connects experienced specialists with project-based AI work for technology companies. Assignments focus on testing, evaluating, and improving AI systems. This freelance, project-based position does not offer permanent employment.

Role overview

As a Freelance AI Evaluation Engineer, the primary focus is building a dataset to assess AI coding agents using real-world developer tasks. The work involves designing detailed tasks and evaluation methods in realistic simulated environments.

Main responsibilities

Create virtual companies from high-level plans, including codebases, infrastructure, and realistic context such as conversations, documentation, and tickets that reflect authentic development history.
Develop and refine tasks for different stages of the virtual company. This includes writing prompts, setting evaluation criteria, and ensuring tasks are solvable and assessments are fair.
Design assignments for isolated environments that mimic a developer's workstation, using a Linux machine with development tools (terminal, CLI), MCP servers (repository, task tracker, messenger, documentation), and a real web application codebase.
Build tests that accept all valid solutions and reject incorrect ones, aiming for balanced strictness.
Work with an AI agent to confirm that tests detect real issues, do not overlook errors, and validate correct solutions.
Review code generated by agents, analyze why solutions succeed or fail, and invent edge cases and adversarial scenarios.
Incorporate feedback from expert QA reviewers to improve your work and meet quality standards.

Scope clarifications

This position does not include data labeling.
This position does not cover prompt engineering.
Writing code from scratch is not required. The AI agent handles most coding; your focus is on guidance and evaluation.

Much of the work involves collaborating directly with AI systems, as designing challenges for advanced models requires hands-on interaction with those models.

About Mindrift

Mindrift specializes in connecting skilled professionals with cutting-edge AI projects, focusing on enhancing and evaluating AI systems for prominent technology firms. We prioritize collaboration and innovation, making us a valuable partner in the evolving landscape of artificial intelligence.
