Staff Machine Learning Research Scientist - LLM Evaluations

Scale AI - San Francisco, CA; Seattle, WA; New York, NY

On-site, Full-time, $280K/yr - $380K/yr

Experience Level

Mid to Senior

Qualifications

You will:
- Lead investigations into the effectiveness and limitations of current LLM evaluation techniques.
- Design and implement innovative evaluation benchmarks for large language models, focusing on instruction adherence, factual accuracy, robustness, and fairness.
- Build and maintain strong relationships with clients and cross-functional teams to drive collaborative projects.
- Work alongside internal teams and external partners to refine evaluation metrics and develop standardized protocols.
- Create scalable and reproducible evaluation pipelines utilizing modern machine learning frameworks.
- Publish findings in prestigious AI conferences and contribute to open-source benchmarking efforts.
- Mentor and lead research scientists and engineers, providing technical guidance across various projects.
- Engage actively with the ML research community to stay updated on emerging developments and contribute to the advancement of LLM evaluation science.
- Excel in a dynamic, fast-paced startup environment and commit to achieving impactful results.

About the job

At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance.

Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and define what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.

About Scale AI

Scale AI is recognized as a leader in providing data and evaluation solutions for next-generation AI technologies. Our mission is to enhance the evaluation and benchmarking of large language models, ensuring fairness, scalability, and rigor in assessment methodologies.
