AI Software Engineer - Model Evaluation (f/m/d)

Aleph Alpha, Heidelberg
On-site, Full-time

Experience Level

Senior

Qualifications

Your Profile

To excel in this position, you should have a strong foundation in AI model evaluation, programming proficiency, and a deep understanding of machine learning principles. A PhD or equivalent experience in a related field is preferred. Familiarity with statistical analysis, evaluation metrics, and benchmarking methodologies is essential. Excellent communication skills and the ability to work collaboratively within a dynamic team are critical.

About the job

Our Mission

Aleph Alpha stands at the forefront of foundation model pre-training in Europe. Our clients across finance, manufacturing, and public administration require models that not only understand the German language but also comply with European regulations and operate effectively in high-stakes environments. We are based in Heidelberg.

As we expand our pre-training team, we are looking for a dedicated professional to take charge of model evaluation: shaping the metrics that define success, developing the systems that measure them, and giving our training team the insights it needs to iterate with confidence.

The Role

As a Senior AI Engineer focused on Pre-training Evaluation, you will work across the full evaluation process, from methodology design through implementation and analysis. Your work will include curating benchmarks, assessing how effective different metrics are and how well they predict downstream performance, and optimizing evaluation pipelines and dashboards.

We are looking for a candidate who combines extensive research experience with exceptional engineering skills. Your evaluations will significantly influence our training direction, data priorities, and resource allocation, giving you a direct impact on the models we deliver.

This role is part of Aleph Alpha Research.

Your Responsibilities

  • Own benchmarks end-to-end: Select, implement, and maintain the evaluation suite used during pre-training, encompassing dataset curation, scoring infrastructure, and result analysis.
  • Build evaluation infrastructure: Develop and optimize pipelines that run evaluations against training checkpoints, ensuring speed, reliability, and reproducibility.
  • Design aggregation and reporting: Define how benchmark results inform training decisions, building tools for result interpretation.
  • Close capability gaps: Collaborate with product and post-training teams to pinpoint areas for improvement and create benchmarks to measure progress.
  • Own German evaluation: Ensure thorough assessment of German-language capabilities, which are integral to our value proposition.
  • Correlate signals: Identify which pre-training metrics predict downstream and system-level performance.

About Aleph Alpha

Aleph Alpha is a pioneering company in Europe dedicated to advanced foundation model pre-training, focusing on delivering powerful solutions that meet the unique linguistic and regulatory demands of various industries.
