About the job
- Define and spearhead the infrastructure and reliability strategy across our innovative platform.
- Collaborate with engineering teams to design scalable and resilient systems.
- Streamline build, testing, and deployment processes to enhance speed and stability.
- Establish and maintain best practices for CI/CD, monitoring, and observability.
- Lead incident response and champion continuous improvement after each incident.
- Automate workflows to reduce operational toil and risk.
- Guide and mentor engineers, fostering a culture of operational excellence.
- Make strategic build-vs-buy decisions that balance speed, quality, and sustainability.
