Senior Site Reliability Engineer - Observability (m/f/x)

Doctolib
Berlin, Germany; Paris, France
On-site, Full-time

Experience Level

Senior

Qualifications

The ideal candidate will have:

  • A robust foundation in site reliability engineering principles.
  • Exceptional problem-solving skills and a proactive approach to challenges.
  • The ability to work collaboratively in a fast-paced, agile environment.
  • Excellent communication abilities to effectively convey complex technical concepts.

About the job

At Doctolib, we pride ourselves on fostering a dynamic engineering environment where innovation thrives. Our mission is to enhance the lives of healthcare professionals and patients alike. We are seeking a Senior Site Reliability Engineer to ensure our production systems operate seamlessly, playing a crucial role in supporting the rapid expansion of Doctolib's services.

Your Responsibilities

As a Senior Site Reliability Engineer within the Core Reliability & Observability team, you will be instrumental in defining the company's observability strategy and maintaining the reliability, debuggability, and scalability of our platform. This position bridges infrastructure, developer experience, and product engineering, focusing on developing and enhancing the core elements of logging, metrics, tracing, and alerting across our organization.

  • Lead the implementation of an observability strategy across the platform, emphasizing scalable, developer-friendly logging and tracing solutions.
  • Identify and spearhead cross-functional reliability initiatives to enhance incident detection, response, and postmortem analysis capabilities.
  • Participate in the on-call rotation and actively work on improving our on-call experience by optimizing alerting, minimizing noise, and providing actionable telemetry.

Who You Are

You could be our next teammate if you possess:

  • A minimum of 3 years of hands-on experience with large-scale production platforms.
  • Demonstrated proficiency with cloud platforms such as AWS, Azure, or Google Cloud.
  • A strong understanding of containerization and orchestration technologies (Docker and Kubernetes).
  • Deep knowledge of Helm for managing Kubernetes manifests and Argo CD for GitOps workflows.
  • Extensive expertise in observability tooling and architecture, including:
    • Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector.
    • Tracing: OpenTelemetry or proprietary APMs.
    • Metrics: Prometheus, Thanos, Datadog, or equivalent.
  • Proficiency in at least one programming language (e.g., Ruby, Python, Go, Java) and a strong grasp of infrastructure as code principles.
  • Experience with monitoring and observability tools.

About Doctolib

Doctolib is a leading digital health platform dedicated to improving healthcare for patients and professionals. By combining cutting-edge technology with a commitment to excellence, we are revolutionizing the way healthcare is delivered across Europe.
