About the job
Cerebras Systems is at the forefront of AI technology, developing the world’s largest AI chip, 56 times the size of the largest conventional GPU. Our innovative wafer-scale architecture integrates AI compute power equivalent to dozens of GPUs onto a single chip. This unparalleled design achieves industry-leading training and inference speeds, enabling machine learning practitioners to run large-scale ML applications seamlessly, without the complexity of managing clusters of GPUs or TPUs.
We proudly serve a diverse clientele, including leading model laboratories, global corporations, and pioneering AI startups. Notably, OpenAI has entered into a multi-year partnership with us to deploy 750 megawatts of our technology, accelerating key workloads with ultra-high-speed inference.
Our wafer-scale architecture powers the fastest generative AI inference solution available, with speeds more than ten times those of traditional GPU-based hyperscale cloud inference services. This speed advantage transforms the user experience of AI applications, enabling real-time interaction and greater intelligence through advanced agentic computation.

