About Pathway
Pathway is pioneering the next generation of AI with our innovative post-transformer frontier model that addresses critical memory challenges in artificial intelligence. Unlike traditional transformers that reset to the same state repeatedly, our architecture facilitates genuine continuous learning, infinite context reasoning, and real-time adaptability. We are not merely enhancing existing technology; we are forging the future of AI.
Our groundbreaking architecture surpasses traditional Transformer models while giving enterprises complete transparency into how it works. By pairing our foundational model with the fastest data processing engine available, Pathway empowers organizations to move beyond incremental improvements into a new realm of contextualized, experience-driven intelligence. Esteemed institutions such as NATO, La Poste, and Formula 1 racing teams trust Pathway.
Founded by Zuzanna Stamirowska, a complexity scientist, Pathway’s team includes AI trailblazers like CTO Jan Chorowski, known for applying Attention mechanisms to speech and collaborating with Nobel laureate Geoff Hinton at Google Brain, and CSO Adrian Kosowski, a distinguished computer scientist and quantum physicist who earned his PhD at just 20 years old.
Supported by prominent investors and advisors, including Lukasz Kaiser, co-author of the Transformer model and a pivotal researcher for OpenAI’s reasoning frameworks, Pathway is headquartered in Palo Alto, California.
The Opportunity
We are looking for enthusiastic AI Benchmark & Dataset Engineering interns to assist in defining and executing benchmarking processes for model evaluation.
Your Responsibilities
- Identify, prioritize, and curate pertinent public and client-driven benchmarks across our targeted use cases and markets.
- Assess potential benchmarks for clarity, data integrity, evaluation methods, and alignment with our model development roadmap.
- Conduct benchmarks with baseline models to validate configurations, uncover edge cases, and mitigate risks in R&D initiatives.
- Deliver “benchmark-ready” packages to R&D, including specifications, data, evaluation scripts, expected metrics, and constraints.
- Maintain a shared glossary and documentation related to benchmarks, datasets, and evaluation formats for both GTM and R&D teams.
- Organize and track benchmark results, model leaderboards, and performance expectations for various customers and scenarios.
- Contribute to demonstrations and public-facing evidence based on benchmark results.
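To make the responsibilities above concrete, here is a minimal sketch of the kind of "benchmark-ready" package and runner an intern might assemble: a spec bundling data, an evaluation metric, and constraints, validated against a trivial baseline model. All names here (`BenchmarkSpec`, `exact_match`, `run_benchmark`) are hypothetical illustrations, not Pathway internals.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class BenchmarkSpec:
    """A 'benchmark-ready' bundle: specification, data, metric, constraints."""
    name: str
    examples: List[Dict[str, str]]        # each: {"input": ..., "expected": ...}
    metric: Callable[[str, str], float]   # scores one (prediction, expected) pair
    constraints: Dict[str, str] = field(default_factory=dict)

def exact_match(prediction: str, expected: str) -> float:
    """Simple metric: 1.0 if the normalized strings agree, else 0.0."""
    return float(prediction.strip().lower() == expected.strip().lower())

def run_benchmark(spec: BenchmarkSpec, model: Callable[[str], str]) -> Dict[str, float]:
    """Run a baseline model over every example and aggregate the metric."""
    scores = [spec.metric(model(ex["input"]), ex["expected"]) for ex in spec.examples]
    return {"n_examples": len(scores), "mean_score": sum(scores) / max(len(scores), 1)}

# Validate the configuration with a trivial echo-style baseline before
# handing the package to R&D.
spec = BenchmarkSpec(
    name="toy-qa",
    examples=[
        {"input": "2+2?", "expected": "4"},
        {"input": "capital of France?", "expected": "Paris"},
    ],
    metric=exact_match,
)
baseline = lambda prompt: "4" if "2+2" in prompt else "unknown"
results = run_benchmark(spec, baseline)
print(results)  # {'n_examples': 2, 'mean_score': 0.5}
```

Running a cheap baseline first, as in this sketch, is what surfaces edge cases (ambiguous expected answers, metric quirks) before a frontier model is ever evaluated.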
Your contributions will be vital in shaping and driving the benchmarking process for AI model evaluation, directly influencing our development, communication, and customer engagement.
