Compute Server Platform Architect

Cerebras Systems
Sunnyvale, CA or Toronto, Canada
On-site, Full-time

Experience Level

Mid to Senior

Qualifications

Responsibilities

Lead the architectural design for all server roles within Cerebras clusters, including the definition of server types, configurations, and lifecycle strategies. Establish and maintain server formulas (counts and ratios) for optimal performance.

About the job

Cerebras Systems is at the forefront of AI technology, creating the largest AI chip in the world, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture provides AI compute power equivalent to dozens of GPUs on a single chip, while ensuring the programming simplicity of a single device. This unique approach enables Cerebras to achieve industry-leading training and inference speeds, empowering machine learning practitioners to run extensive ML applications without the complexities of managing multiple GPUs or TPUs.

Our clientele includes leading model labs, global corporations, and pioneering AI-native startups. Recently, OpenAI announced a multi-year collaboration with Cerebras to deploy 750 megawatts of compute capacity, accelerating important workloads with ultra-high-speed inference.

With our groundbreaking wafer-scale architecture, Cerebras Inference delivers the fastest Generative AI inference solution globally, outperforming GPU-based hyperscale cloud inference services by over tenfold. This remarkable speed enhancement is reshaping the user experience of AI applications, facilitating real-time iterations and amplifying intelligence through enhanced agentic computation.

About The Role

As a Compute / Server Platform Architect within the Cluster Architecture Team, you will be responsible for the server-side platform architecture that powers Cerebras CS-3-based AI clusters (for both training and inference), ensuring predictable performance, scalability, and reliability. Our accelerators are network-attached, making the x86 server fleet an integral component of the end-to-end system. The fleet supports critical runtime functions such as orchestration, prompt caching, and IO/control services, which require co-design with software to optimize token-level latency, throughput, and cost efficiency. You will translate workload behaviors into requirements for CPU, memory, IO, PCIe, and host networking, lead platform evaluations with vendors, and provide technical direction through qualification and production adoption in close collaboration with other leaders and technical project managers.

About Cerebras Systems

Cerebras Systems is revolutionizing AI technology with the world's largest AI chip, designed to streamline machine learning processes and enhance performance significantly. Committed to innovation, Cerebras collaborates with top organizations to push the boundaries of AI capabilities.
