About the job
About Pathway
At Pathway, we are pioneers in AI technology, developing the first post-transformer frontier model to address the fundamental memory limitations of traditional AI systems. Unlike conventional transformers, which reset their memory with every session, our architecture supports genuine continuous learning, extensive contextual reasoning, and real-time adaptability. We are not just enhancing existing technology; we are creating the future of AI.
Our groundbreaking architecture, detailed in our research paper, significantly outperforms standard Transformer models while giving enterprises full visibility into how the model operates. By integrating our foundational model with the fastest data processing engine available, Pathway empowers organizations to move beyond incremental improvements to truly contextualized, experience-driven intelligence. We proudly serve clients such as NATO, La Poste, and Formula 1 racing teams.
Pathway was founded by Zuzanna Stamirowska, a leading complexity scientist. Our team includes AI visionaries such as CTO Jan Chorowski, a key figure in applying attention mechanisms to speech processing who collaborated with Nobel laureate Geoffrey Hinton at Google Brain, and CSO Adrian Kosowski, an accomplished computer scientist and quantum physicist who earned his PhD at the age of 20.
Supported by prominent investors and advisors, including Lukasz Kaiser, co-author of the Transformer architecture, we are headquartered in Palo Alto, California.
The Opportunity
We are inviting applications for the role of AI Benchmark & Dataset Engineering Intern, assisting in the design and implementation of benchmarking processes critical to model evaluation.
Your Responsibilities
- Proactively identify, prioritize, and curate relevant public and client-driven benchmarks tailored to our target use cases and markets.
- Assess candidate benchmarks for clarity, data quality, evaluation methodology, and alignment with our model roadmap.
- Run benchmarks against baseline models to validate setups, surface edge cases, and de-risk R&D initiatives.
- Prepare and deliver “benchmark-ready” packages (specifications, data, evaluation scripts, expected metrics, constraints) to R&D; a sketch of what such a package's evaluation script might look like follows this list.
- Maintain consistent terminology and documentation regarding benchmarks, datasets, and evaluation formats for use by GTM and R&D teams.
- Organize and track benchmark results and model leaderboards, and define standards of excellence for various customers and scenarios.
- Contribute to demonstrations and public-facing materials based on benchmark results.
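To make the "benchmark-ready package" idea concrete, here is a minimal sketch of the kind of evaluation script such a package might bundle. Everything in it is an illustrative assumption rather than Pathway's actual tooling: the JSONL field names (input, target), the predict() placeholder, and the exact-match metric are hypothetical stand-ins.

```python
# A minimal sketch of the evaluation script a "benchmark-ready" package
# might bundle. The JSONL layout ('input'/'target' fields), the predict()
# placeholder, and the exact-match metric are illustrative assumptions,
# not Pathway's actual tooling.
import json
from pathlib import Path

def load_examples(path: str) -> list[dict]:
    """Read one benchmark example per line from a JSONL file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def predict(prompt: str) -> str:
    """Placeholder baseline; swap in a real model client here."""
    return prompt.split()[-1] if prompt.split() else ""

def exact_match(prediction: str, target: str) -> bool:
    """Score a prediction by case- and whitespace-insensitive match."""
    return prediction.strip().lower() == target.strip().lower()

def evaluate(path: str) -> float:
    """Return baseline accuracy over the dataset."""
    examples = load_examples(path)
    hits = sum(exact_match(predict(ex["input"]), ex["target"]) for ex in examples)
    return hits / len(examples)

if __name__ == "__main__":
    # Write a tiny demo dataset so the script runs end to end.
    demo = [
        {"input": "Capital of France? Answer: Paris", "target": "Paris"},
        {"input": "2 + 2 = ?", "target": "4"},
    ]
    Path("benchmark.jsonl").write_text(
        "\n".join(json.dumps(ex) for ex in demo), encoding="utf-8"
    )
    print(f"accuracy = {evaluate('benchmark.jsonl'):.3f}")
```

A real package would also pin the dataset version, document expected metric ranges for baseline models, and note any constraints (context length, latency budgets) alongside the script.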
This role is essential in shaping and advancing the benchmarking process for AI model evaluation, directly impacting our development, communication, and customer engagement strategies.
