About the job
Why Join Harvey?
At Harvey, we are revolutionizing the landscape of legal and professional services with a holistic approach. By integrating advanced AI technology, a robust enterprise platform, and extensive industry knowledge, we are redefining how essential knowledge work is conducted for years to come.
This is a unique opportunity to contribute to the foundation of a transformative company at a pivotal moment in its journey. With over 1,000 clients across more than 58 countries, a solid product-market fit, and outstanding investor backing, we are rapidly expanding and creating a new category in real time. The challenges are significant, expectations are high, and the potential for personal, professional, and financial development is unparalleled.
Our team comprises driven, intelligent individuals who are deeply passionate about our mission. We prioritize speed, intensity, and accountability in addressing challenges, from initial ideation to long-term solutions. By maintaining close relationships with our clients, from executives to engineers, we collaboratively address pressing issues with urgency and care. If you excel in uncertain environments, strive for excellence, and wish to shape the future of work with a team that raises the bar, we invite you to build alongside us.
At Harvey, we are currently writing the future of professional services — and we are just getting started.
Your Role
As a Senior Software Engineer on the Site Reliability team at Harvey, your mission will be to uphold the reliability, scalability, and performance of our innovative legal AI platform. You will become part of a high-impact team that operates at the crossroads of infrastructure and product, taking ownership of the systems that ensure our platform remains fast, secure, and continuously available. From scaling operations across 50+ regions to automating critical processes, your efforts will fortify Harvey's resilience as we expand. If you are enthusiastic about constructing robust systems and simplifying complexity through automation, we would love to collaborate with you.
This position is based in San Francisco, CA. We follow an in-person work model and provide relocation assistance to new employees.
Your Responsibilities
Design, implement, and oversee monitoring, alerting, and infrastructure resources (compute, storage, networking) across 50+ global regions.
Lead incident management processes, including postmortems and root cause analyses, and drive actionable improvements.
Automate operational tasks and workflows by developing tools and processes for capacity planning, seamless rollouts, and secure data access to maintain high reliability and minimize manual intervention.
Collaborate across teams to drive solutions that enhance system performance and reliability.

