About the job
About Pathway
Pathway is at the forefront of AI innovation, developing the first post-transformer frontier model that addresses the fundamental memory challenges in artificial intelligence. Unlike traditional transformers, which reset to the same state at the start of every new context, our architecture enables genuine continuous learning, infinite-context reasoning, and real-time adaptation. We are not merely optimizing existing technology; we are pioneering the future beyond transformers.
Our state-of-the-art architecture surpasses transformers and provides enterprises with unparalleled insight into model functionality. By integrating our foundational model with the fastest data processing engine available, Pathway empowers organizations to transcend incremental improvements and embrace genuinely contextualized, experience-driven intelligence. We have earned the trust of prestigious organizations such as NATO, La Poste, and Formula 1 racing teams.
Pathway is led by co-founder & CEO Zuzanna Stamirowska, a complexity scientist who has assembled a team of AI trailblazers: CTO Jan Chorowski, who was among the first to apply attention mechanisms to speech and who collaborated with Nobel laureate Geoffrey Hinton at Google Brain, and CSO Adrian Kosowski, an esteemed computer scientist and quantum physicist who earned his PhD at the age of 20.
Pathway is supported by top investors and advisors, including Lukasz Kaiser, co-author of the Transformer paper and a leading researcher behind OpenAI’s reasoning models. Our headquarters are in Palo Alto, California.
The Opportunity
As an AI Benchmark & Datasets Engineer/Researcher, you will be instrumental in designing and implementing rigorous benchmarks while establishing dataset standards. Collaborating closely with our R&D team, you will create the evaluation infrastructure that informs the evolution of Pathway’s post-transformer models.
Your Responsibilities
- Identify, prioritize, and curate relevant benchmarks driven by public and client needs across various target markets.
- Assess potential benchmarks for clarity, data quality, evaluation methodology, and alignment with our model roadmap.
- Execute benchmarks against baseline models to validate setups, reveal edge cases, and mitigate risks during R&D phases.
- Prepare and deliver “benchmark-ready” packages to R&D, including specifications, data, evaluation scripts, expected metrics, and constraints.
- Maintain a unified vocabulary and documentation for benchmarks, datasets, and evaluation formats accessible to both GTM and R&D teams.
- Track and organize benchmark results, model leaderboards, and best practices for various customers and scenarios.
- Contribute to demonstrations and public-facing proof points based on benchmark results.
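To make the “benchmark-ready package” deliverable concrete, here is a minimal illustrative sketch of how a specification, its data, and an evaluation script can travel together as one unit. All names (`BenchmarkSpec`, `evaluate`, the exact-match metric) are hypothetical examples, not Pathway’s actual tooling:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BenchmarkSpec:
    """One 'benchmark-ready' package: spec, data, and constraints in one place."""
    name: str
    examples: List[Dict[str, str]]            # each item: {"input": ..., "expected": ...}
    constraints: Dict[str, str] = field(default_factory=dict)  # e.g. {"max_tokens": "512"}


def evaluate(spec: BenchmarkSpec, model: Callable[[str], str]) -> Dict[str, float]:
    """Run a model over every example and report exact-match accuracy."""
    correct = sum(model(ex["input"]) == ex["expected"] for ex in spec.examples)
    return {"accuracy": correct / len(spec.examples)}
```

Running such a package against a baseline model, as the responsibilities above describe, is then a single call: `evaluate(spec, baseline_model)`, with the returned metrics logged to a shared leaderboard.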
Your contributions will be crucial in defining and steering the benchmarking process for AI model evaluation, directly influencing our product development and market communication strategies.
