About the job
DigitalOcean is growing its Seattle team and seeking a Hardware Sustaining Engineer to help maintain and improve our large-scale data center hardware. This role supports the infrastructure that powers our cloud services, allowing customers to focus on what matters most.
Role Overview
This position reports to the Manager of Infra::Machines::Design. As part of the Sustaining Engineering team, the Hardware Sustaining Engineer helps ensure reliable operations for server hardware, cabling, and networking equipment. The work involves hands-on troubleshooting, process improvement, and close collaboration with other teams as we expand our data center capabilities and adopt new technologies.
Main Responsibilities
- Work as a member of the Sustaining Engineering team within the Infra::Machines::Design organization.
- Support server hardware, cabling, and networking hardware throughout their operational lifecycle.
- Monitor the #machines channel and MACHINES JIRA project for issues, and drive them to resolution.
- Participate in a 24/7 on-call rotation with other team members.
- Serve as Tier 2 escalation for Datacenter Operations (DCOPS) and Cloud Operations (CloudOps) on hardware and firmware matters.
- Develop and maintain standards and practices for hardware operations at DigitalOcean.
- Collaborate with other teams to resolve issues related to tooling, firmware packages, hardware components, and operational concerns.
- Assist with the development of tooling and runbooks to improve operational capabilities around hardware and firmware.
- Coordinate with Operations teams on monitoring thresholds, failure modes, and alerting strategies.
- Diagnose the root causes of failures and implement measures to prevent them from recurring.
- Identify and integrate industry best practices to enhance the quality of our cloud infrastructure.
Location
This position is based in Seattle.
