Freelance Agent Evaluation Engineer at Toloka AI | Remote | RoboApply Jobs

Freelance Agent Evaluation Engineer

Toloka AI | Remote, United States

Remote | Contract | Up to $80/hr

Qualifications

To thrive in this role, candidates should have a degree in Computer Science or a related field, complemented by at least 5 years of software development experience, primarily in Python. A solid understanding of both front-end and back-end development is essential, along with the ability to write effective tests and manage Docker containers. Familiarity with CI/CD processes, especially GitHub Actions, is also required. Proficiency in English, at least at the B2 level, is necessary for effective communication.

About the job

Please submit your CV in English and indicate your level of English proficiency.

At Mindrift, we specialize in connecting talented professionals with innovative, project-based AI opportunities from leading tech companies, focusing on the evaluation, testing, and enhancement of AI systems. This role is project-based and operates under a freelance agreement, which does not establish an employment relationship with Toloka or our clients.

Role Overview

As a Freelance Agent Evaluation Engineer, you will design intricate coding test cases that challenge AI coding systems to their maximum potential:

Critically assess and enhance realistic coding tasks grounded in actual production codebases, considering realistic parameters and information sources.
Develop comprehensive functional tests that validate true end-to-end functionality and edge cases, moving beyond basic checks.
Design “fair but challenging” problems where the AI has all the necessary context but must carefully piece together information from multiple files and external sources, which requires complex reasoning.
Evaluate AI failures to discern the model's strengths and weaknesses.
Iterate on your work based on evaluations from expert QA reviewers who assess your contributions against seven quality criteria.

Qualifications

This role suits experienced developers, software engineers, and test automation specialists who are open to part-time, non-permanent projects. The ideal candidates will possess:

A degree in Computer Science, Software Engineering, or a related discipline.
5+ years of experience in software development, predominantly in Python (including pytest, async/await, subprocess, and file operations).
A strong background in full-stack development, with equal expertise in building React-based interfaces and robust back-end systems.
Proficiency in writing tests (functional and integration), not just executing them.
Experience with Docker containers (for running evaluations locally).
Understanding of CI/CD processes, particularly with GitHub Actions (triggers, labels, and result analysis).
English proficiency at B2 level or higher.

Application Process

To apply, submit your application → complete qualification assessments → select a project → manage tasks at your convenience within project deadlines → receive payment for your contributions.

Project Time Expectations

Project tasks are estimated to take around 20 hours to complete, depending on complexity. This is only an estimate; you set your own schedule. All tasks must be submitted by their deadlines and meet the acceptance criteria to be approved.

Compensation

Freelance contributions are compensated at rates of up to $80/hour* (project- and task-based).
Compensation may be fixed per project or vary by task.
Some projects may offer incentive payments.

*Note: Rates may vary based on expertise, skills assessment, location, and experience.

About Toloka AI

Mindrift connects experts with cutting-edge, project-based AI opportunities within top-tier technology firms, dedicated to the testing, evaluation, and enhancement of AI systems.
