Staff Software Engineer, Reliability

Veeam Software, Bangalore, India
On-site, Full-time

Experience Level

Senior

Qualifications

  • Proven experience in reliability engineering with a strong understanding of SRE principles.
  • Expertise in designing and implementing scalable systems and architectures.
  • Strong programming skills in languages such as Python, Go, or similar.
  • Familiarity with monitoring and observability tools (e.g., Prometheus, Grafana).
  • Excellent communication skills and the ability to collaborate effectively across teams.

About the job

Veeam is a leading provider of data and AI solutions, dedicated to helping organizations protect and manage their data effectively. Recognized as a pioneer in data resilience and security posture management, we empower businesses to navigate the complexities of identity, data, security, and AI risk. With our headquarters in Seattle and operations in over 30 countries, Veeam proudly safeguards the operations of more than 550,000 customers globally. Join our dynamic team and be part of a transformative journey as we advance together, fostering growth, learning, and making a significant impact for renowned brands around the world.

About the Role

As a Staff Site Reliability Engineer, you will take on a pivotal role as a hands-on technical leader within our Site Reliability Engineering (SRE) team. Your expertise will guide senior engineers, influence product development efforts, and ensure our systems are constructed to be reliable, scalable, and observable from the ground up.

You will spearhead strategic initiatives, mentor peers in SRE practices, and help define architectural best practices across our platform. This role is crucial for aligning teams, enforcing high standards, and scaling SRE principles globally at Veeam.

What You’ll Do

Reliability Engineering & Resilience:

  • Serve as a technical authority, mentoring senior engineers and guiding design decisions to enhance service reliability and resilience.
  • Lead the establishment and enforcement of Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets; ensure adherence across engineering teams.
  • Collaborate with fellow staff members across teams to unify strategy and promote shared reliability standards and objectives.
  • Engage with development and product teams to proactively design for failure, construct resilient architectures, and operationalize reliability from inception.

Observability & Operational Excellence:

  • Promote the organization-wide adoption of observability best practices and tools.
  • Ensure that metrics, logs, and traces yield deep, actionable insights throughout systems.
  • Lead complex incident responses, conduct postmortems, and drive systemic reliability enhancements.
  • Encourage and uphold a blameless culture of learning and continuous improvement.

About Veeam Software

Veeam Software is a premier provider in the field of data and AI solutions, helping organizations protect and manage their critical information. With a commitment to innovation and excellence, Veeam stands at the forefront of data resilience and security, serving a global clientele and making a substantial impact across industries.
