1 - 20 of 57 Jobs

Search for Freelance AI Trainer Specializing in Mathematics and Python


Contract|Remote|Remote — Uruguay

Role Overview: Toloka AI is looking for a Freelance AI Trainer with a strong background in Mathematics and Python. This remote contract role is open to candidates based in Uruguay.
What You Will Do:
- Develop clear, practical training materials focused on Mathematics and Python for AI applications
- Lead virtual workshops for learners at different skill levels
- Help students work through complex AI topics and answer their questions
Impact: Your work will support learners worldwide and help shape the next generation of AI education.

Apr 20, 2026
Part-time|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and indicate your English proficiency level.
About Mindrift: Mindrift connects skilled professionals with project-based AI work for top technology companies. Projects focus on testing, evaluating, and improving AI systems. Engagements are project-specific and do not represent permanent employment.
Role Overview: This freelance role involves designing and validating computational biology challenges for AI training. All work is remote and project-based.
Key Responsibilities:
- Create original computational biology problems that reflect real-world research workflows
- Develop challenges that require Python programming to solve, using tools such as NumPy, SciPy, and Biopython
- Ensure problems are computationally intensive and cannot be solved manually in a short timeframe
- Design tasks involving complex reasoning in areas like bioinformatics, systems biology, and molecular modeling
- Base challenges on real research questions or practical scenarios from biological practice
- Validate solutions in Python with standard computational biology libraries
- Document each problem clearly and provide verified correct answers
Who Should Apply: This project suits biology professionals with strong Python skills who are interested in flexible, part-time work.
Preferred qualifications:
- Degree in Biology or a related field
- Proficiency in Python for numerical validation; experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, or other relevant libraries is also welcome
- Minimum 2 years of experience in applied, research, or teaching roles
- Background in biological data analysis and algorithm development
- Familiarity with bioinformatics tools and computational methods
- Strong written English skills at C1 level or above
How the Process Works: Apply → Complete qualifications → Join a project → Carry out assigned tasks → Receive compensation
Project Time Commitment: During active project phases, tasks typically require 10–20 hours per week. Actual workload depends on project needs and is not guaranteed outside of active periods.
Compensation: Contributors can earn up to $17 per hour based on contribution level and pace. Pay may vary by project, depending on complexity and expertise required. Other projects on the platform may offer different rates according to their specific needs.

Apr 22, 2026
Part-time|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and indicate your English proficiency level.
About Mindrift: Mindrift connects skilled specialists with project-based roles in artificial intelligence for leading technology companies. The focus is on testing, evaluating, and improving AI systems. These are project-based assignments, not permanent positions.
Role Overview: This freelance opportunity is for physicists with Python experience who can contribute on a part-time, non-permanent basis. Projects vary, and contributors may:
- Create original computational physics problems that reflect real research processes
- Design problems requiring Python programming to solve, using libraries such as NumPy, SciPy, or SymPy
- Ensure problems are computationally demanding and not solvable by hand within a reasonable timeframe
- Develop challenges involving advanced reasoning in mechanics, electromagnetism, thermodynamics, and quantum mechanics
- Base tasks on actual research questions or practical physics applications
- Verify solutions in Python, using established physics simulation libraries
- Document problem statements clearly and provide accurate, validated answers
What We Look For:
- Degree in Physics (theoretical, experimental, or computational) or a closely related field
- Proficiency in Python for numerical problem-solving; experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, Stata, or other relevant programming languages is also welcome
- At least 2 years of professional experience in research, teaching, or applied roles
- Background in numerical simulation techniques
- Ability to design problems that mirror real-world physics research
- Creativity in developing problems across multiple physics disciplines
- Familiarity with physics modeling and approximation methods
- Strong written English skills at C1 level or higher
How Projects Work: Apply → Pass qualifications → Join a project → Complete assigned tasks → Receive compensation
Time Commitment: During active phases, tasks typically require 10–20 hours per week. Actual workload may vary depending on project needs and is not guaranteed outside active project periods.
Compensation: Contributors can earn up to $17 per hour, depending on experience and task completion rate. Compensation varies by project based on scope, complexity, and required expertise. Other projects on the platform may offer different pay rates according to their specific requirements.
Location: This is a remote, project-based role open to candidates based in Uruguay.

Apr 22, 2026
Part-time|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and specify your English proficiency level.
About Mindrift: Mindrift connects specialists with project-based AI work for technology companies. Projects focus on testing, evaluating, and improving AI systems. This is freelance, project-based work and does not constitute permanent employment.
Role Overview: Freelance AI Trainer - Statistics & Python Specialist. Work remotely from Uruguay on a variety of unique project tasks. Assignments center on computational statistics and Python programming, with a focus on real-world mathematical research and problem solving.
Key Responsibilities:
- Create original computational statistics problems that reflect authentic mathematical research workflows.
- Design problems requiring Python programming to solve, using libraries such as NumPy, SciPy, and SymPy.
- Develop computationally intensive problems that are not feasible to solve manually within practical timeframes.
- Formulate challenges involving complex reasoning in areas like number theory, combinatorics, graph theory, and numerical analysis.
- Base problems on real research obstacles or practical mathematical applications.
- Validate solutions using Python and standard mathematical libraries.
- Document problem statements clearly and provide verified, correct answers.
Desired Qualifications:
- Degree in Statistics or a related field.
- Proficiency in Python for numerical validation. Experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, or similar libraries is also valued.
- At least 2 years of professional experience in applied, research, or teaching roles.
- Strong written English skills (C1 level or higher).
- Professional certifications (such as CMME, SAS Certifications, CAP) and experience with international or applied projects are a plus.
Application Process: Apply → Pass qualification(s) → Join a project → Complete assigned tasks → Receive payment
Time Commitment: During active project phases, tasks typically require 10–20 hours per week. Actual workload depends on project needs and is not guaranteed outside of active periods.
Compensation: Earn up to $17 per hour, depending on contribution level and pace. Rates may vary by project based on complexity, scope, and required expertise. Other projects on the platform may offer different compensation levels based on their requirements.

Apr 22, 2026
Contract|Remote|Remote — Uruguay

Role Overview: Toloka AI is looking for a Freelance AI Trainer with a background in Mechanical Engineering and Python programming. This remote contract is open to candidates based in Uruguay.
What You Will Do:
- Design and deliver training programs focused on AI applications in mechanical engineering.
- Develop course materials and hands-on exercises that highlight real-world use of Python in mechanical systems.
- Guide participants through practical projects and problem-solving sessions.
- Oversee the implementation of training sessions and support learners as they build new skills.
What You Bring:
- Professional experience in mechanical engineering.
- Strong Python programming skills.
- Ability to explain technical concepts clearly and create engaging educational content.
- Interest in AI technologies and their applications within engineering.
Location: This freelance role is remote and open to candidates located in Uruguay.

Apr 23, 2026
Contract|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and state your English proficiency level.
About Mindrift: Mindrift connects skilled professionals with project-based AI work for leading technology companies. The focus is on testing, evaluating, and improving AI systems. This is a freelance, project-based role, not a permanent position.
Role Overview: This freelance opportunity is designed for optical engineers with strong Python skills who want flexible, part-time work. Projects involve creating and validating computational physics problems that mirror real research and practical applications.
What You Will Do:
- Create original computational physics problems that reflect actual research workflows.
- Develop problems requiring Python programming for solutions, using libraries such as NumPy, SciPy, and SymPy.
- Ensure problems are computationally intensive and not solvable manually within a few days or weeks.
- Design challenges involving complex reasoning in mechanics, electromagnetism, thermodynamics, and quantum mechanics.
- Base problems on real research scenarios or practical physics applications.
- Validate solutions with Python and established physics simulation libraries.
- Document problem statements clearly and provide accurate solutions.
What We Look For:
- Degree in Physics (theoretical, experimental, or computational) or a related discipline.
- Proficiency in Python for numerical validation. Experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, domain-specific libraries, Stata, or other programming languages is also valued.
- At least 2 years of professional experience in applied work, research, or teaching.
- Background in numerical simulation techniques.
- Ability to design problems that represent authentic research workflows in physics.
- Creative problem-solving across multiple areas of physics.
- Understanding of physics modeling and approximation methods.
- Excellent written English skills (C1 level or above).
How the Process Works: Apply → Pass qualifications → Join a project → Complete tasks → Get compensated
Time Commitment: During active projects, tasks typically require 10–20 hours per week. Actual workload depends on project needs and may vary.
Compensation: Earn up to $17 per hour, depending on expertise and task completion rate. Pay may differ by project scope, complexity, and required skills. Other projects on the platform may offer different rates.
Location: This is a remote position based in Uruguay.

Apr 22, 2026
Part-time|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and specify your English proficiency level.
About Mindrift: Mindrift connects skilled professionals with project-based AI work for leading tech companies. Projects focus on testing, evaluating, and improving AI systems. This is a contract-based role, not a permanent position.
Role Overview: As a Freelance Electrical Engineer & Python Specialist - AI Trainer, you will design and validate engineering challenges to help train AI models. This work is remote and open to candidates based in Uruguay.
Main Responsibilities:
- Create computational engineering problems that mirror real-world engineering tasks
- Develop Python programming assignments for engineering calculations and simulations
- Ensure each problem requires rigorous computational thinking, such as numerical methods or iterative solutions
- Design tasks centered on system design, optimization, and analysis
- Base challenges on authentic research questions or practical engineering scenarios
- Validate solutions using Python and standard engineering libraries
- Document problem statements clearly and provide verified solutions
Requirements:
- Degree in Electrical Engineering or a related discipline
- Strong Python skills for numerical validation; experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, or other relevant libraries is also valued
- At least 2 years of relevant professional experience (applied, research, or teaching)
- Understanding of practical engineering constraints and approximations
- Excellent written English (C1+ level)
- Professional certifications (such as CMME, SAS Certifications, CAP) and experience with international or applied projects are a plus
How Projects Work: Apply → Complete qualification(s) successfully → Join a project → Complete assigned tasks → Receive payment
Time Commitment: During active projects, expect to spend about 10–20 hours per week, depending on project needs. This is an estimate, not a guaranteed workload.
Compensation: Earn up to $17 per hour, depending on expertise and pace. Actual rates may vary by project based on scope, complexity, and required experience. Other projects on the platform may offer different pay structures.

Apr 22, 2026
Contract|Remote|Remote — Uruguay

Toloka AI is looking for a freelance AI trainer who combines civil engineering experience with strong Python programming skills. This contract role is fully remote and open to professionals based in Uruguay.
Role overview: This position centers on supporting learners as they apply artificial intelligence to civil engineering projects. The focus is on practical instruction and guidance, helping others use Python effectively in real-world engineering scenarios.
What you will do:
- Train and mentor individuals or groups in the use of AI for civil engineering applications
- Share hands-on Python programming knowledge relevant to engineering tasks
- Contribute to the advancement of AI technologies by guiding learners and providing feedback
Location: This freelance role is remote and available to candidates residing in Uruguay.

Apr 28, 2026
Contract|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and specify your English proficiency level.
About Mindrift: Mindrift connects skilled specialists with project-focused AI initiatives for leading technology companies. The team works on testing, evaluating, and improving AI systems. This is a project-based, freelance role, not a permanent position.
Role Overview: Freelance Quantum Research Scientists & AI Trainers contribute to a range of projects, each with distinct challenges. Typical tasks include:
- Creating original computational physics problems modeled on authentic research methods
- Developing problems that require Python programming to solve, often using libraries like NumPy, SciPy, and SymPy
- Designing computationally intensive tasks that cannot be solved manually within a reasonable timeframe
- Formulating problems demanding advanced reasoning in mechanics, electromagnetism, thermodynamics, and quantum mechanics
- Basing tasks on real-world research scenarios or practical physics applications
- Validating solutions with Python and established physics simulation libraries
- Documenting problem statements clearly and providing verified solutions
Who We’re Looking For: This project suits quantum researchers with strong Python skills, interested in part-time, temporary assignments.
Preferred qualifications:
- Degree in Physics (theoretical, experimental, or computational) or a closely related field
- Proficiency in Python for numerical validation; experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, or similar languages/tools is also acceptable
- At least 2 years of professional experience, including research or teaching
- Background in numerical simulation techniques
- Ability to design problems that mirror real-world physics research workflows
- Creative approach to problem formulation across different physics domains
- Understanding of physics modeling and approximation methods
- Strong written English skills (C1 level or above)
How to Apply: Apply → Complete qualification(s) → Join a project → Fulfill assigned tasks → Receive compensation
Time Commitment: During active project periods, expect to spend about 10–20 hours per week, depending on project requirements. Actual workload may vary.
Compensation: Earn up to $17 per hour, depending on contribution and pace. Rates vary by project based on scope, complexity, and expertise required. Other projects on the platform may offer different compensation structures.
Location: Remote, Uruguay

Apr 22, 2026
Part-time|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and specify your English proficiency level.
About Mindrift: Mindrift connects experienced specialists with project-based AI roles at leading technology companies. Work focuses on testing, evaluating, and improving AI systems. All roles are project-based and do not represent permanent employment.
Role Overview: As a freelance Materials Scientist with Python expertise, contribute to projects that challenge your engineering and programming skills. Each assignment brings new technical problems to solve and opportunities to shape AI tools for engineering applications.
Key Responsibilities:
- Create original materials engineering problems modeled after real-world workflows
- Design problems that require Python programming for engineering calculations and simulations
- Ensure problems involve computationally intensive tasks, numerical methods, or iterative solutions
- Formulate questions focused on system design, optimization, and analysis
- Base problems on actual research challenges or practical engineering scenarios
- Validate solutions using Python and standard engineering libraries
- Document problem statements and provide verified correct answers
Who Should Apply: This freelance, part-time role suits materials scientists and engineers who are comfortable with Python and interested in flexible, project-based work.
- Degree in Materials Science or a related field
- Strong Python skills for numerical validation; experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, or similar tools is also valued
- At least 2 years of professional experience in applied, research, or teaching positions
- Solid grasp of practical engineering constraints and real-world approximations
- Excellent written English (C1 level or higher)
Project Commitment: Expect to spend approximately 10–20 hours per week during active project phases. Actual workload may vary depending on project requirements and timing.
Compensation: Earn up to $17 per hour, depending on experience and contribution speed. Compensation may differ for other projects on the platform, based on their complexity and requirements.
How to Apply: Apply with your English CV and proficiency level → Complete qualification steps → Join a project when selected → Finish assigned tasks → Receive payment
Location: Remote, Uruguay

Apr 24, 2026
Contract|Remote|Remote — Uruguay

Role Overview: Toloka AI is seeking a Chemistry and Python Specialist to help train AI models focused on chemistry. This freelance role is fully remote within Uruguay and offers flexible hours.

Apr 20, 2026
Part-time|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and include your English proficiency level.
About Mindrift: Mindrift connects skilled professionals with project-based AI work for leading technology companies. Projects focus on testing, evaluating, and improving AI systems. This is a project-based, non-permanent role.
Role Overview: As a Freelance Research Physicist with Python expertise, each project brings new challenges. Contributors may work on tasks such as:
- Designing original computational physics problems that reflect real research workflows
- Developing Python-based problems using libraries like NumPy, SciPy, and SymPy
- Ensuring problems are computationally intensive and not easily solved by hand
- Creating scenarios requiring advanced reasoning in mechanics, electromagnetism, thermodynamics, and quantum mechanics
- Drawing inspiration from real research questions or practical physics applications
- Validating solutions in Python with established physics simulation libraries
- Documenting problem statements clearly and providing verified solutions
Who Should Apply: This contract role suits physicists with Python experience who are interested in flexible, part-time projects. Typical qualifications include:
- Degree in Physics (theoretical, experimental, or computational) or a related field
- Proficiency in Python for numerical validation; experience with MATLAB, R, C, SQL, NumPy, Pandas, SciPy, or similar tools is also valued
- At least 2 years of professional experience in applied research or teaching
- Background in numerical simulation techniques
- Ability to design problems that mirror real-world research workflows
- Creative approach to problem design across multiple physics areas
- Familiarity with physics modeling and approximation methods
- Excellent written English (C1 level or higher)
How to Get Started: Apply, complete the required qualifications, join a project, and finish assigned tasks to receive payment.
Project Details: Tasks are estimated to require 10–20 hours per week during active phases, depending on project needs. This is an approximate range and not a guaranteed workload.
Compensation: Contributors may earn up to $17 per hour, depending on expertise and pace. Actual pay can vary by project scope, complexity, and required skills. Other projects on the platform may offer different rates based on their needs.

Apr 22, 2026
Part-time|$17/hr - $17/hr|Remote|Remote — Uruguay

Please submit your CV in English and specify your English proficiency level.
About the Project: Mindrift, in partnership with Toloka AI, connects skilled specialists to project-based AI work for leading technology companies. This role focuses on testing, evaluating, and improving AI systems. It is a freelance, project-based position, not a permanent job.
What You Will Do:
- Design original optical problems that mirror real-world physics research workflows
- Develop computationally intensive problems that cannot be solved manually within days or weeks
- Create scenarios requiring advanced reasoning in mechanics, electromagnetism, thermodynamics, and quantum mechanics
- Base problems on current research challenges or practical applications in optics and physics
- Document problem statements clearly and provide verified, correct answers
Who We’re Looking For: This freelance opportunity suits optical engineers interested in part-time, non-permanent projects. Ideal candidates will have:
- A degree in Physics (Theoretical, Experimental, or Computational) or a related field
- At least 2 years of experience in applied, research, or teaching roles
- Hands-on experience with numerical simulation techniques
- Skill in designing problems that reflect real research workflows in physics
- Creative thinking for problem design across different physics domains
- Familiarity with physics modeling and approximation methods
- Strong written English skills (C1 level or higher)
How the Process Works: Apply → Pass qualifications → Engage in a project → Complete assigned tasks → Receive compensation
Time Commitment: During active project phases, tasks typically require about 10–20 hours per week. This is an estimate and not a guaranteed workload. Hours apply only while the project is active.
Compensation: Contributors can earn up to $17 per hour, depending on experience and pace. Pay may vary across projects based on their scope, complexity, and required expertise. Other projects on the platform may offer different compensation levels depending on requirements.
Location: This is a remote role based in Uruguay.

Apr 24, 2026
Contract|Remote|Remote — Uruguay

Role overview: Toloka AI seeks a Freelance Junior Journalist - AI Trainer to join projects that help advance artificial intelligence. This remote role is available to candidates living anywhere in Uruguay. The position involves producing and improving content that supports AI development and helps inform the wider community.
What you will do:
- Research topics connected to artificial intelligence and data training
- Write and edit articles aimed at educating and engaging Toloka AI’s audience
- Provide feedback and insights on AI training processes
Requirements:
- Interest in both journalism and technology
- Strong skills in writing and editing
- Ability to work independently from within Uruguay
- Curiosity about AI and its effects

Apr 27, 2026
Contract|Remote|Remote — Uruguay

Toloka AI seeks a Freelance English Writer - AI Trainer to join its remote team in Uruguay. This contract position focuses on producing and refining English-language content that supports the development of AI training programs. Writers in this role help shape the quality and clarity of AI models by working closely with a collaborative team.
Key responsibilities:
- Create and edit English-language materials designed for AI training purposes
- Work with fellow writers and subject matter experts to ensure content is accurate and clear
- Offer feedback and suggestions to help improve the performance of AI models
Location: This is a remote role for candidates based in Uruguay.

Apr 27, 2026
Part-time|$21/hr - $21/hr|Remote|Remote — Uruguay

Company: Mindrift
Location: Remote, Uruguay
Role Overview: Mindrift is looking for a Freelance Python Data Scraping Engineer to support the Tendem project. This part-time remote role focuses on building and refining data scraping workflows for a hybrid AI and human system. As an AI Pilot, you will collaborate with Tendem Agents who handle repetitive tasks, while you bring technical expertise and critical thinking to ensure data quality and actionable results.
What You Will Do:
- Manage complex data extraction from challenging websites, delivering accurate and structured datasets.
- Use internal tools such as Apify and OpenRouter, as well as custom workflows, to collect, validate, and process data according to project requirements.
- Adjust scraping strategies to handle dynamic or interactive sites, including those with JavaScript-rendered content and frequent changes.
- Apply rigorous data validation and consistency checks across multiple sources before delivering datasets.
- Scale scraping operations for large volumes using batching or parallel processing, monitor for failures, and maintain stability when site structures change.
What Mindrift Offers: Mindrift connects technical experts with AI initiatives from leading technology companies. The mission is to advance Generative AI by tapping into real-world expertise worldwide.
Compensation: Earn up to $21 per hour, depending on experience, project complexity, and contribution speed. Compensation may vary for other projects based on their scope and required skills.
Requirements: At least 3 years of experience in data engineering, web scraping, or automation.
How to Apply: Submit an application to be considered for projects that fit your background and schedule. Whether your strengths are in coding, automation, or refining AI outputs, your contributions will help strengthen AI capabilities for real-world use.

Apr 24, 2026
Contract|Remote|Remote — Uruguay

Role Overview: Toloka AI is looking for a Freelance Legal Consultant with deep knowledge of US law. This remote position is based in Uruguay. The consultant will use legal expertise to support the training of AI systems, helping improve their understanding of legal concepts and frameworks relevant to the United States.
What You Will Do:
- Apply US legal knowledge to guide the development and training of AI models
- Collaborate with teams from different disciplines to ensure accuracy and relevance in legal data
- Offer insights on legal nuances and frameworks to help shape AI-driven legal technology

Apr 23, 2026
Contract|$50/hr - $50/hr|Remote|Remote — Uruguay

Mindrift connects experienced professionals with project-based AI roles for leading technology companies. The team focuses on testing, evaluating, and improving AI systems to support innovation and efficiency.
Role Overview: The Freelance Supply Chain Consultant - AI Trainer will use hands-on procurement and supply chain expertise to help shape and assess AI-driven solutions. This includes designing realistic disruption scenarios, outlining expected outcomes, and evaluating AI recommendations for accuracy and relevance.
What You Will Do:
- Develop supply chain disruption scenarios based on real-world manufacturing and procurement challenges, such as supplier delays, order changes, logistics issues, and quality failures.
- Define expected results and practical mitigation strategies for each scenario.
- Review and assess AI-generated recommendations, checking alignment with established business logic.
- Evaluate outputs for accuracy and relevance within ERP systems, especially Microsoft Dynamics 365, Coupa, Jaggaer, and Ariba (SAP).
- Contribute to structured data creation and validation, following established guidelines.
What We Look For:
- At least 4 years of hands-on experience in procurement, supply chain, or purchasing, ideally in manufacturing.
- Strong understanding of procurement workflows: purchase orders, vendor management, inventory control, and production planning.
- Practical experience with ERP systems such as SAP, Oracle, or Microsoft Dynamics 365.
- Ability to design and analyze supply chain disruption scenarios and create mitigation strategies.
- Knowledge of disruption types, including delays, shortages, quality issues, and logistics challenges.
- Familiarity with Incoterms and transportation or logistics management.
- Understanding of Bill of Materials (BOM) structures and production planning.
- Experience tracking supplier performance metrics like OTIF, lead times, and quality scores.
- Analytical mindset for evaluating AI outputs against business logic.
- Background in data validation, structured data entry, or annotation tasks.
- Excellent written English communication skills.
How the Process Works: Apply → Pass qualifications → Join a project → Complete tasks → Get compensated
Time Commitment: Projects typically require about 10–20 hours per week during active phases. Actual workload may vary and is not guaranteed outside of these periods.
Compensation: Contributors can earn up to $50 per hour, depending on experience and project details.
Location: This is a remote role based in Uruguay.

Apr 24, 2026
Part-time|$10/hr - $10/hr|Remote|Remote — Uruguay

Submit your resume in English and include your English proficiency level.
Toloka offers freelance online tasks that contribute to the development of artificial intelligence. The company operates remotely and collaborates with major technology firms, engaging skilled individuals from around the world in Generative AI projects. Toloka Annotators participate in projects that require human input and careful attention to detail. The team supports AI progress by working directly with leading tech companies on a variety of assignments.
Role overview: The Freelance AI Trainer - English Annotator helps AI systems improve their understanding of language and context. This remote role is based in Uruguay and involves taking part in online projects. Typical tasks include:
- Evaluating AI-generated content
- Checking for factual accuracy
- Comparing different AI responses when required
Key responsibilities:
- Review data sets, which may include text, images, or videos
- Label and categorize content according to detailed project guidelines
- Identify and flag content that is factually incorrect, sensitive, inappropriate, or unclear
Important: This is project-based freelance work. Tasks are available only when projects are active, and invitations to participate depend on your profile and current needs.
Compensation: Rates vary by project complexity and required skills. AI trainers can earn up to $10 per hour. For more information, visit https://toloka.ai/.

Apr 28, 2026
Contract|$21/hr - $21/hr|Remote|Remote — Uruguay

Please submit your CV in English and include your English proficiency level.
Mindrift connects skilled professionals with project-based AI roles at leading tech companies, focusing on the assessment, testing, and improvement of AI systems. This is a project-based contract, not a permanent position.
Role Overview: As a Freelance AI Agent Evaluation Engineer, you will help build a dataset that measures how well AI coding agents perform real-world software development tasks. The work centers on designing complex tasks and evaluation criteria within detailed simulated environments.
What You Will Do:
- Create virtual companies from high-level blueprints, including codebases, infrastructure, and realistic context (conversations, documentation, tickets) to simulate authentic development environments with history.
- Curate and adjust tasks at various stages of the virtual company. This includes developing prompts, defining evaluation criteria, and ensuring tasks are solvable and fairly assessed.
- Design challenges in isolated settings that mimic a developer's workstation: a Linux environment with development tools (terminal, CLI), MCP servers (repository, task tracker, messenger, documentation), and a real web application codebase.
- Develop tests that reliably accept all correct solutions and reject incorrect ones, aiming for a balance between strictness and fairness.
- Work alongside an AI agent on these tests, making sure the agent catches real issues, does not accept poor solutions, and passes valid ones.
- Review code generated by AI agents, analyze both successes and failures, and design edge cases and adversarial scenarios to further challenge the models.
- Iterate on your approach based on feedback from expert QA reviewers who assess your work for quality.
What This Role Does Not Include:
- Data labeling
- Prompt engineering
- Writing code from scratch (the AI agent will handle most coding; your focus is on guiding and evaluating)
Much of the work involves close collaboration with advanced AI models, crafting tasks that push their capabilities.

Apr 24, 2026
