About the job
About Sygaldry Technologies
Sygaldry Technologies, based in San Francisco, develops quantum-accelerated AI servers for faster AI training and inference. By combining quantum technology with artificial intelligence, the team addresses challenges in computing cost and energy efficiency. Its AI servers integrate multiple qubit types within a fault-tolerant system, aiming for a balance of cost, scalability, and speed. The company values optimism, rigor, and a drive to solve complex problems in physics, engineering, and AI.
Role Overview: ML Infrastructure Engineer
The ML Infrastructure Engineer joins the AI & Algorithms team, which includes research scientists, applied mathematicians, and quantum algorithm specialists. This role centers on building and maintaining the compute infrastructure that powers advanced research. The systems you build will support reliable GPU access, reproducible experiments, and scalable workloads, so researchers can focus on their core work without needing deep cloud expertise.
Expect to design and manage compute platforms for a range of tasks, including quantum circuit simulation, large-scale numerical optimization, model training, tensor network contractions, and high-throughput data generation. These workloads span multiple cloud providers and on-premises GPU servers.
Key Responsibilities
- Develop compute abstractions for diverse workloads, such as GPU-accelerated simulations, distributed training, high-throughput CPU jobs, and interactive analyses using frameworks like PyTorch and JAX.
- Set up infrastructure to support experiment tracking and reproducibility.
- Create developer tools that make cloud computing feel local, streamlining environment setup, job submission, monitoring, and artifact management.
- Scale experiments from single-GPU prototypes to large, multi-node production runs.
Multi-Cloud GPU Orchestration
- Design orchestration strategies for workloads across multiple cloud providers, optimizing job routing for cost, availability, and capability.
- Monitor and optimize cloud spending, tracking credit balances, burn rates, and expiration dates.

