About the job
Title: Site Reliability Engineer - I, Product Area Focus
Location: Noida (Hybrid)
Role Overview
As a Site Reliability Engineer, you will take ownership of product availability, our most critical feature, by consistently aiming for operational excellence in Sumo Logic’s comprehensive observability and security products. You will collaborate with a worldwide SRE team to implement projects aligned with the reliability roadmap for your specific product area. Your efforts will focus on optimizing operations, making more efficient use of cloud resources and developer time, fortifying security measures, and accelerating feature development for our engineering teams.
Key Responsibilities
- Support engineering teams within your product area by developing and executing a reliability roadmap that identifies opportunities for improvement in reliability, maintainability, security, efficiency, and velocity.
- Collaborate with developer infrastructure, Global SRE, and product area engineering teams to establish and continually refine your reliability roadmap.
- Define, evolve, and manage Service Level Objectives (SLOs) across multiple teams within your product area.
- Participate in on-call rotations to gain insights into operational workloads, enabling you to enhance the on-call experience and reduce operational burdens associated with microservices and related components.
- Lead projects aimed at optimizing the on-call experience for engineering teams.
- Develop code and automation solutions to alleviate operational workloads, boost efficiency, enhance security, eliminate toil, and empower Sumo’s developers to expedite feature delivery.
- Collaborate closely with developer infrastructure teams to accelerate the adoption of tools that support your reliability roadmap by identifying needs within supported engineering teams and contributing features and bug fixes as required.
- Scale systems sustainably through automation and advocate for changes that enhance reliability and development velocity.
- Drive root cause analysis and issue resolution collaboratively with teams.
- Thrive in a fast-paced iterative environment.

