About the job
At Thinking Machines Lab, our mission is to empower humanity by advancing collaborative general intelligence. We are dedicated to building a future where everyone can access the knowledge and tools necessary to harness AI for their unique needs and objectives.
We are a team of scientists, engineers, and builders who have developed some of the most widely used AI products, including ChatGPT and Character.ai, and contributed to open-weight models like Mistral, along with popular open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.
About the Role
We are seeking an Infrastructure Engineer to lead the evolution of the security infrastructure that supports our foundation models. In this role, you will work across compute, storage, networking, and data platforms to keep our systems secure, reliable, and scalable. You will design controls, architecture, and tooling that build security into the core of the platform. Working closely with research and product teams, you will enable them to move quickly while safeguarding our models, data, and environments.
Note: This is an "evergreen role" that we keep open for ongoing expressions of interest. We receive many applications, and there may not always be an immediate opening that matches your skills and experience. We still encourage you to apply: we review applications continuously and reach out to candidates as new opportunities arise. You are welcome to reapply as you gain experience, but please apply no more than once every six months. We also occasionally post openings for specific project or team needs, and you are welcome to apply to those directly in addition to this evergreen role.
What You’ll Do
- Design security patterns for platforms and services, including network segmentation, service-to-service authentication, RBAC, and policy enforcement in Kubernetes and cloud environments.
- Oversee identity and access for users and services: workload and cross-cloud identity, least-privilege IAM, and secrets management.
- Create secure platforms for data ingestion, processing, and curation, encompassing classification, encryption, access controls, and safe sharing practices across teams.
- Develop threat models and review designs with researchers and engineers to facilitate safe and scalable feature launches.
- Automate security checks and implement guardrails: policy-as-code, secure infrastructure baselines, CI/CD validation, and tools that streamline secure operations.

