About the job
Unifonic is a remote-first company in the Communications Platform as a Service (CPaaS) sector, providing communication solutions to over 5,000 businesses. Its team of 500 helps clients build stronger customer connections.
The Engineering team at Unifonic is responsible for designing, building, and maintaining the systems that power the company’s products. Team members collaborate closely with other departments to ensure technology aligns with customer needs. Creativity and new ideas are encouraged across the group.
Role overview
The Senior Site Reliability Engineer joins the Production Operations (Live) team. This role centers on ensuring the reliability, scalability, and resilience of Unifonic’s cloud infrastructure and distributed messaging platforms. The SRE team works to keep systems running smoothly at all times and continually seeks ways to improve performance and stability.
What you will do
- Maintain the reliability, uptime, and scalability of key production services around the clock.
- Participate in the on-call rotation, respond to incidents, troubleshoot live production issues, and lead post-incident reviews.
- Create and update operational playbooks and escalation paths to help reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR).
- Monitor service-level objectives (SLOs), conduct chaos testing, plan capacity, and address reliability risks as they arise.
