About the job
At LILT, we are building a worldwide network of domain specialists dedicated to delivering high-quality AI evaluations for training, benchmarking, red-teaming, and continuous model monitoring. We invite automotive experts to contribute their expertise to the human-in-the-loop AI evaluation workflows used by leading enterprises and hyperscalers.
This position is ideal for individuals with in-depth knowledge of automotive systems, vehicle operations, manufacturing processes, and mobility technologies in real-world environments. Your expertise will be instrumental in evaluating and refining multilingual AI systems.
Your contributions will directly impact the quality, safety, and deployment readiness of multilingual AI models.
This role has two distinct tracks based on experience and responsibility.
Track A: Automotive AI Rater
Raters are responsible for executing structured evaluation tasks guided by well-defined rubrics and instructions.
Responsibilities
Evaluate AI outputs pertaining to automotive engineering, vehicles, and mobility subjects.
Engage in structured scoring, comparison, classification, and judgment tasks.
Assess technical accuracy, safety considerations, clarity, and adherence to automotive standards.
Identify inaccuracies, unsafe guidance, incorrect specifications, or misleading automotive data.
Consistently apply domain-specific automotive guidelines across evaluations.
Ideal Background
Professionals such as automotive engineers, manufacturing specialists, vehicle systems experts, or mobility practitioners.
Familiarity with vehicle systems, automotive production, maintenance, or mobility technologies.
Meticulous attention to detail and the ability to work with structured evaluation criteria.
Track B: Automotive AI Evaluator (Senior Track)
Evaluators provide advanced domain oversight and influence evaluation methodologies.
Responsibilities
Validate and enhance evaluation rubrics and handle edge cases.
Adjudicate disagreements among raters.
Perform error analysis and qualitative assessments of model behavior.
