About the job
Cerebras Systems is at the forefront of AI technology, having developed the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the AI computing power equivalent to dozens of GPUs on a single chip, simplifying programming to a single device. This revolutionary design enables Cerebras to provide unmatched training and inference speeds, empowering machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.
Our clientele includes elite model labs, global corporations, and pioneering AI-native startups. Notably, OpenAI recently entered into a multi-year partnership with Cerebras to deploy 750 megawatts of compute capacity, significantly accelerating key workloads with ultra-high-speed inference.
Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest generative AI inference solution available, achieving speeds more than 10 times faster than GPU-based hyperscale cloud inference services. This speed advantage is transforming user experiences in AI applications by enabling real-time iteration and unlocking greater intelligence through additional agentic computation.
- Provide hands-on technical leadership, owning the technical vision and roadmap for kernel-centric reliability across both internal and customer-facing systems.