Staff Software Engineer - Reliability

Veeam Software, Pune, India
On-site, Full-time

Experience Level

Senior

Qualifications

To excel in this role, candidates should possess a strong background in software engineering and site reliability principles. Experience in mentoring and guiding teams is essential, along with a proven ability to influence product development and operational practices. Familiarity with SLIs, SLOs, and incident management is highly desirable. Candidates should demonstrate a commitment to building resilient systems and a passion for fostering a culture of reliability and continuous improvement.

About the job

Veeam Software is recognized as a leader in data management, helping organizations harness the full potential of their data and AI solutions while maintaining security and resilience. With a presence in 30 countries and over 550,000 customers protected, Veeam empowers businesses to navigate the complexities of data security and AI risk management. Our mission is to drive innovation and impact for some of the world's top brands as we advance together.

About the Role

We are looking for a Staff Site Reliability Engineer to take a pivotal role in our SRE team. In this position, you will be a hands-on technical leader: mentoring senior engineers, influencing product development, and ensuring our systems are designed for reliability, scalability, and observability from the ground up.

Your leadership will be crucial in driving strategic initiatives, mentoring others in SRE practices, and establishing architectural best practices across our platform. This role is essential for aligning teams, enforcing high standards, and scaling SRE principles throughout Veeam.

What You’ll Do

Reliability Engineering & Resilience:

  • Serve as a technical authority in your field, mentoring senior engineers and guiding design choices that enhance service reliability and resilience.
  • Lead the definition and enforcement of Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets; ensure adherence across engineering teams.
  • Collaborate with Staff peers across teams to align strategies and advocate for shared reliability standards and objectives.
  • Work closely with development and product teams to proactively design for failure, build resilient architectures, and operationalize reliability from the outset.

Observability & Operational Excellence:

  • Champion the company-wide adoption of observability best practices and tools.
  • Ensure that metrics, logs, and traces deliver deep, actionable insights across systems.
  • Lead complex incident responses, conduct postmortems, and drive systemic reliability improvements.
  • Promote a culture of learning and continuous improvement through a blameless approach.

About Veeam Software

Veeam Software is a global leader in data management and protection, known for its innovative solutions that ensure data resilience and security for organizations worldwide. With a commitment to driving safe AI at scale, Veeam empowers businesses to operate with confidence in an increasingly complex digital landscape.
