About the job

Job Reference: P-1485

At Databricks, we are dedicated to empowering data teams to address the world’s most complex challenges — from transforming future transportation to spearheading medical advancements. We accomplish this by developing and managing the premier data and AI infrastructure platform, which enables our clients to extract profound insights from data and elevate their business strategies. Founded by engineers with a strong customer focus, we eagerly embrace every opportunity to tackle technical hurdles, whether it's designing cutting-edge UI/UX for data interaction or scaling our services across millions of virtual machines. Our journey has just begun.

In the role of Incident Manager, you will spearhead Databricks’ most pivotal production incidents, ensuring clear, precise, and timely communication with customers, executives, and engineers. Serving as both the incident commander and reliability engineer, you will orchestrate cross-team responses, provide real-time status updates, and collaborate with engineering to analyze and avert future failures. Your contributions will be instrumental in maintaining Databricks' technical resilience and building customer and stakeholder trust during critical events.

This position merges operational leadership, technical systems expertise, and outstanding communication skills. You will be positioned at the nexus of engineering acumen and operational transparency, guaranteeing that every significant incident is managed with accuracy, openness, and a commitment to ongoing enhancement.

Your Impact:

Lead Critical Incidents: Coordinate cross-disciplinary response efforts across Databricks’ cloud services to swiftly mitigate impacts and restore normal operations.
Drive Technical Root Cause Analysis and Reliability Improvements:
- Collaborate with engineering teams to trace and document underlying causes across distributed systems, services, and data stores.
- Summarize key learnings, communicate action items clearly, and ensure the implementation of technical and procedural enhancements.
Own Incident Communications: Provide regular, high-quality updates to internal stakeholders (executives, engineering leadership, support) and craft customer-facing notifications that are accurate, timely, and empathetic.
Mentor and Train Peers: Enhance the overall quality of Databricks’ incident response by mentoring peers in incident communication and technical response disciplines.

Senior Incident Manager


