
Senior Cloud Infrastructure Engineer

Hatch, United States
On-site, Full-time, $158K/yr - $216K/yr


Experience Level

Senior

Qualifications

• 5+ years of hands-on experience in DevOps, SRE, or platform engineering roles.
• Proficiency with cloud platforms such as AWS and GCP.
• Expertise in infrastructure-as-code tools, especially Terraform and Ansible.
• Experience in managing ML workflows and data processing pipelines.
• Strong background in observability, logging, and incident response.
• Ability to work collaboratively in a fast-paced, team-oriented environment.

About the job

About the Role

Join Hatch’s dynamic engineering team as a Senior Cloud Infrastructure Engineer, where you will play a pivotal role in architecting resilient, secure, and scalable cloud infrastructure that supports our primary platform and cutting-edge AI products. Collaborating with engineers, machine learning experts, and product leaders, you will ensure that our systems can grow rapidly and effectively to meet our ambitious goals.

About Hatch
Hatch is an innovative team dedicated to solving real-world challenges through artificial intelligence. We embrace speed, accountability, and a strong commitment to delivering impactful results. Our engineering culture emphasizes operational excellence, clean architectural practices, and rapid execution while maintaining reliability. If you thrive on scaling infrastructure that drives AI workflows end to end, this opportunity is tailored for you.

What You’ll Do
Infrastructure at Scale
• Enhance our cloud infrastructure (AWS & GCP) using infrastructure-as-code tools such as Terraform or Ansible.
• Create systems that cater to the compute and storage demands of machine learning and data processing workflows.
• Oversee scalable, secure, and cost-effective environments across development, staging, and production.
• Participate in a rotational on-call schedule.


ML Platform Support
• Collaborate with ML engineers to operationalize models and manage workflows throughout training, testing, and deployment.
• Establish infrastructure for versioning, orchestrating, and monitoring ML models in production using tools like Kubeflow, SageMaker, Vertex AI, or custom pipelines.
• Optimize data pipelines and model serving infrastructure to achieve low-latency and high-throughput performance.


Reliability & Observability
• Formulate strategies for observability, logging, and alerting across distributed systems.
• Lead incident response initiatives, root cause analyses, and system enhancements for sustained resiliency.
• Implement infrastructure security best practices, container hardening, and robust network architecture.


Platform Enablement
• Collaborate with engineering teams to integrate DevOps best practices throughout the development lifecycle.
• Develop tools and automation that enhance developer efficiency, release stability, and system visibility.

What We’re Looking For
• 5+ years of experience in DevOps, SRE, or platform engineering roles within fast-paced environments.

About Hatch

Hatch is a forward-thinking team that leverages AI to solve pressing business challenges. We prioritize speed, ownership, and impactful outcomes in our work. Our engineering ethos champions operational rigor and clean architecture, ensuring that we deliver reliable solutions efficiently.
