About the job
At Cerebras Systems, we are pioneering the future of artificial intelligence with the development of the world's largest AI chip, which is an astonishing 56 times larger than traditional GPUs. Our innovative wafer-scale architecture combines the computational power of numerous GPUs into a single chip, simplifying programming and enhancing efficiency. This unique approach enables us to achieve unparalleled training and inference speeds, empowering machine learning practitioners to run extensive ML applications seamlessly, without the complexities of juggling multiple GPUs or TPUs.
Our clientele includes leading model labs, global corporations, and groundbreaking AI-focused startups. Notably, OpenAI has recently partnered with Cerebras to harness 750 megawatts of scale, revolutionizing critical workloads with ultra-fast inference capabilities.
Thanks to our cutting-edge wafer-scale technology, Cerebras Inference delivers the fastest Generative AI inference solutions available, running more than ten times faster than GPU-based hyperscale cloud services. This leap in speed is changing how users interact with AI applications, facilitating real-time adjustments and enhancing intelligence through advanced computational capabilities.
About the role
As the security lead for Cerebras's AI cluster product, you will be at the forefront of securing our large-scale AI clusters, which consist of hundreds of wafer-scale accelerator systems, thousands of high-performance servers, and numerous network ports and switches. The scope also includes the network-attached storage within a vast data center.
Your primary responsibility will be to implement security measures based on established best practices and first principles, ensuring the protection of Cerebras's extensive AI clusters. These clusters comprise intricate hardware components, networking systems, and a fully integrated cluster management software stack that ranges from bare-metal deployments to sophisticated management systems that enable multi-tenant training and inference services across these expansive clusters.
You will focus on guaranteeing end-to-end security and privacy for various cluster applications, and you will develop security engineering solutions that incorporate robust network access controls, user access management, and an exceptional multi-tenancy framework.

