Qualifications

Strong foundation in machine learning and software engineering principles
Experience with evaluation frameworks and performance metrics
Proficiency in programming languages such as Python or Java
Familiarity with cloud-based platforms and tools
Excellent problem-solving skills and attention to detail
Ability to work collaboratively in a team-oriented environment
Desire to learn and grow within a fast-paced startup atmosphere

About the job

Fieldguide develops software that streamlines work for assurance and audit professionals, with a focus on cybersecurity, privacy, and financial auditing. The team aims to automate and improve workflows, helping experts deliver more value while reducing complexity in their daily tasks.

Based in San Francisco, Fieldguide operates as a remote-first company, with employees contributing from locations across the United States. The company is supported by investors such as Goldman Sachs Alternatives, Bessemer Venture Partners, 8VC, Floodgate, Y Combinator, DNX Ventures, Global Founders Capital, and several notable individuals.

Diversity and inclusion are central values at Fieldguide. Team members bring a wide range of backgrounds and experiences, united by a shared commitment to improving audit and advisory work. The company values ambition, humility, and mutual support, seeking people who care about both high standards and each other’s growth.

At this early stage, Fieldguide offers the chance to contribute to the future of business trust. The company’s solutions can cut audit workloads significantly, supporting better work-life balance and a sense of purpose for those who share its values.

Role overview

The AI Engineer in Quality Assurance (Evals) will help build and maintain the evaluation infrastructure for Fieldguide’s AI agents. These agents support complex audit and advisory tasks and are used by over 50 of the top 100 accounting and consulting firms. Fieldguide serves a large and evolving market and is backed by leading investors.

This role centers on creating a unified platform for evaluating AI models, automating pipelines, and setting up systems to collect production feedback. The goal is to enable rapid testing of new models against important workflows, supporting fast iteration and enterprise-level reliability.

Work in this position sits at the intersection of machine learning engineering, observability, and quality assurance. Collaboration with other engineering teams is key to maintaining the quality that clients expect from Fieldguide’s products.

Requirements

Interest in building and maintaining evaluation infrastructure for AI agents
Ability to collaborate closely with engineering teams
Candidates at all experience levels are welcome; role seniority will be matched to background and goals
Location: San Francisco, CA (in-person encouraged) or remote within the USA
