Senior Site Reliability Engineer - Observability (m/f/x)

Doctolib

Berlin, Berlin, Germany; Paris, Paris, France

On-site|Full-time

Experience Level

Senior

Qualifications

The ideal candidate will have:

A robust foundation in site reliability engineering principles.
Exceptional problem-solving skills and a proactive approach to challenges.
The ability to work collaboratively in a fast-paced, agile environment.
Excellent communication abilities to effectively convey complex technical concepts.

About the job

Your Responsibilities

As a Senior Site Reliability Engineer within the Core Reliability & Observability team, you will be instrumental in defining the company's observability strategy and maintaining the reliability, debuggability, and scalability of our platform. This position bridges infrastructure, developer experience, and product engineering, focusing on developing and enhancing the core elements of logging, metrics, tracing, and alerting across our organization.

Lead the implementation of an observability strategy across the platform, emphasizing scalable, developer-friendly logging and tracing solutions.
Identify and spearhead cross-functional reliability initiatives to enhance incident detection, response, and postmortem analysis capabilities.
Participate in the on-call rotation and actively work on improving our on-call experience by optimizing alerting, minimizing noise, and providing actionable telemetry.

Who You Are

You could be our next teammate if you possess:

A minimum of 3 years of hands-on experience with large-scale production platforms.
Demonstrated proficiency with cloud platforms such as AWS, Azure, or Google Cloud.
A strong understanding of containerization and orchestration technologies (Docker and Kubernetes).
A deep knowledge of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows.
Extensive expertise in observability tooling and architecture, including:

Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector.
Tracing: OpenTelemetry or proprietary APMs.
Metrics: Prometheus, Thanos, Datadog, or equivalent.

Proficiency in at least one programming language (e.g., Ruby, Python, Go, Java) and a strong grasp of infrastructure as code principles.
Experience with monitoring and observability tools.

About Doctolib

Doctolib is a leading digital health platform dedicated to improving healthcare for patients and professionals. By combining cutting-edge technology with a commitment to excellence, we are revolutionizing the way healthcare is delivered across Europe.

Similar jobs

Senior Cloud Site Reliability Engineer (Network) at scalablegmbh | Berlin

scalablegmbh

Full-time|On-site|Berlin

Role Overview scalablegmbh is looking for a Senior Cloud Site Reliability Engineer with a focus on network systems. This position is based in Berlin. What You Will Do Maintain and improve the reliability, performance, and scalability of cloud infrastructure. Work closely with engineering teams to optimize network services and resolve technical challenges. Contribute to developing solutions that strengthen network systems. Support a culture of ongoing improvement across the organization. About You Bring expertise in cloud technologies and network systems. Enjoy solving complex problems and collaborating with others. Ready to make an impact in a growing company.

Apr 14, 2026

Senior Analytics Engineer at scalablegmbh | Berlin

scalablegmbh

Full-time|On-site|Berlin

scalablegmbh is looking for a Senior Analytics Engineer to join the team in Berlin. This position centers on building and refining data models that help unlock valuable insights for the business. Role overview The Senior Analytics Engineer works closely with colleagues from different departments to shape data-driven solutions. The job involves developing scalable data models, improving query performance, and ensuring that analytics efforts support key business goals. What you will do Collaborate with cross-functional teams to understand data needs Design and maintain scalable data models Optimize queries for efficiency and reliability Translate data into clear, actionable insights for decision-making Location This role is based in Berlin.

Apr 29, 2026

Senior Site Reliability Engineer at Scout24 | Berlin

Scout24 AG

Full-time|Hybrid|Berlin

Why Join Scout24? Scout24 is the proud home of ImmoScout24, Germany's premier platform for real estate. For over 25 years, we have been at the forefront of transforming the real estate market in Germany and Austria. Our mission is to create a digital ecosystem that unites homeowners, seekers, and agents, making the journey to find the perfect home a seamless experience. Your career is as vital as finding the right property; hence, #WorkingatScout24 means you will be part of a vibrant, diverse team of around 1,100 colleagues from 58 nationalities. We celebrate individuality and foster a culture of open-mindedness and authenticity, enabling true learning and personal growth. Mistakes are viewed as opportunities for growth and innovation. Together, we proactively strive for improvement and take responsibility, discussing both successes and challenges with mutual respect because we are #oneteam. If this resonates with you, we would love to welcome you on board! Even if you don't meet every requirement, we encourage you to share how you can contribute to our team. Grow with us! Welcome home! Beyond our outstanding company culture, we offer exceptional benefits that make Scout24 a fantastic workplace!

Dec 10, 2025

Android Engineer (m/f/x) at scalablegmbh | Berlin

scalablegmbh

Full-time|On-site|Berlin

scalablegmbh is looking for an Android Engineer to help build and improve mobile applications in Berlin. This position centers on developing software that aims to deliver a strong user experience. Role overview As an Android Engineer, the focus will be on designing and implementing features for mobile products. Collaboration with a skilled team is part of the day-to-day work, with a shared goal of producing reliable and effective applications. What you will do Develop and maintain Android applications Work closely with other engineers and professionals to deliver high-quality software Contribute to improving user experiences through thoughtful design and implementation Location This role is based in Berlin.

Apr 29, 2026

AI Software Engineer – Python (m/f/x) at scalablegmbh | Berlin

scalablegmbh

Full-time|On-site|Berlin

scalablegmbh is seeking an AI Software Engineer with a strong background in Python to join the team in Berlin. This position centers on building AI solutions that support a range of projects and drive progress within the company. Role overview The AI Software Engineer will use Python to design and implement AI-driven features. Collaboration with colleagues is essential, as projects often require joint problem-solving and creative thinking. The work involves addressing technical challenges and contributing to the ongoing development of AI technologies at scalablegmbh. What you will do Develop AI solutions using Python Work closely with team members to deliver project goals Contribute ideas and technical expertise to shape future AI initiatives Requirements Experience with Python in AI-focused projects Enjoy working in a collaborative setting Motivation to solve complex problems and advance AI technology

Apr 29, 2026

Senior Android Engineer at scalablegmbh | Berlin

scalablegmbh

Full-time|On-site|Berlin

scalablegmbh is seeking a Senior Android Engineer to join the team in Berlin. This position centers on designing and building Android applications that improve user experience and offer advanced functionality. What you will do Design and develop Android applications with a focus on usability and performance Work closely with colleagues from different disciplines to deliver cohesive mobile solutions Contribute to architectural choices and help shape the direction of mobile development Support the creation of scalable products that can grow with user needs Who we are looking for Experience building Android apps and a strong interest in mobile technology Comfort working on complex technical challenges Enjoys collaborating in a team setting and sharing ideas Motivated to have a meaningful impact on product quality and user satisfaction This role is based in Berlin and offers the chance to help shape the future of mobile solutions at scalablegmbh.

Apr 29, 2026

Compliance Expert (m/w/d) at scalablegmbh | Berlin

scalablegmbh

Full-time|On-site|Berlin

scalablegmbh is looking for a Compliance Expert (m/w/d) to join the team in Berlin. This position plays a key part in keeping our operations aligned with relevant regulations and industry standards. Role overview This role focuses on monitoring company activities to ensure compliance with legal and regulatory requirements. The Compliance Expert helps maintain client trust by supporting processes that uphold our commitment to quality and integrity. What you will do Oversee adherence to applicable laws and industry guidelines Support efforts to deliver reliable service to clients Contribute to a culture of compliance across the organization Location This position is based in Berlin.

Apr 30, 2026

Senior Site Reliability Engineer at redcare-pharmacy | Berlin

redcare-pharmacy

Full-time|On-site|Berlin

Join redcare-pharmacy as a Senior Site Reliability Engineer in Berlin. We are seeking a talented and experienced individual who can enhance our infrastructure and ensure the reliability and performance of our systems. This role will involve collaboration with development teams to build scalable systems and improve our operational practices.

Jan 29, 2026

Site Reliability Engineer at Sony Interactive Entertainment | Berlin, Germany

Sony Interactive Entertainment

Full-time|On-site|Germany, Berlin

About PlayStation and Sony Interactive Entertainment PlayStation, part of Sony Interactive Entertainment and a subsidiary of Sony Group Corporation, is known worldwide for delivering leading entertainment experiences. Our portfolio includes PlayStation®5, PlayStation®4, PlayStation®VR, PlayStation®Plus, and acclaimed titles from PlayStation Studios. We value diversity and inclusion, working to create an environment where employees feel empowered and supported. Our teams bring together people who are curious about technology and eager to shape the future of gaming. Role Overview: Site Reliability Engineer Based in Berlin, this Site Reliability Engineer role sits within the Gaming Developer & Future Technology Group (GDFT). The group drives cloud gaming innovation, delivering console-quality experiences to players across TVs, mobile devices, and more. The SRE team plays a central part in maintaining and improving the stability of our cloud gaming services. This position involves shaping both design and operational strategies, owning production systems, ensuring code quality, and managing deployments. SREs here contribute to decisions at multiple levels and work closely with teams throughout the software development lifecycle to support operational readiness and service stability. Main Responsibilities Lead and participate in technical discussions to improve reliability and scalability within the team. Contribute to High-Level Design (HLD) documents for new products and platforms. Mentor junior SREs, providing guidance and support for their growth. Take charge of incident response and post-mortem analysis within the assigned service area. Work with cross-functional groups to drive operational efficiency.

Apr 20, 2026

Incident Manager (m/f/x) at scalablegmbh | Berlin

scalablegmbh

Full-time|On-site|Berlin

scalablegmbh is looking for an Incident Manager to help keep services reliable and operations running smoothly. This position is based in Berlin and centers on coordinating responses to incidents as they arise. Role overview The Incident Manager leads efforts to resolve service disruptions quickly and efficiently. The job involves working with teams across the company to address immediate issues, minimize downtime, and restore normal operations. Key responsibilities Coordinate incident response and manage the resolution process Work with cross-functional teams to find root causes of incidents Support the implementation of preventive measures to avoid future disruptions Promote ongoing improvements in incident management and service quality Collaboration This role involves regular interaction with different departments, encouraging teamwork and a proactive approach to problem solving.

Apr 29, 2026

Site Reliability Engineer / DevOps at Almedia | Berlin

Almedia

Full-time|Remote|Berlin

Join Almedia, a pioneering company on a mission to revolutionize marketing by rewarding a community of over 60 million users for their engagement with global brands. Here, you can accelerate your career in an exciting environment aiming to become Germany's next bootstrapped unicorn, recognized as Europe's #3 fastest-growing company in 2025 (FT1000). We are seeking a passionate and skilled Site Reliability Engineer / DevOps to help us maintain the performance and reliability of our high-traffic platform.

Feb 3, 2026

Site Reliability Engineer, Infrastructure at Superhuman | Berlin

Superhuman Platform Inc.

Full-time|Hybrid|Hub - Berlin

Superhuman embraces a dynamic hybrid working model for this position, offering team members the ideal balance of focused work and in-person collaboration that nurtures trust, innovation, and a vibrant team culture. About Superhuman Superhuman is at the forefront of AI productivity, empowering individuals to reach their superhuman potential. As the proud home of Grammarly, our suite of applications integrates seamlessly with over 1 million platforms, enhancing productivity through intelligent features. Our offerings include Grammarly's writing assistance, Coda's collaborative spaces, and Go, an AI assistant that proactively provides contextual support. Since our inception in 2009, we have transformed the workflows of more than 40 million users, 50,000 organizations, and 3,000 educational institutions globally. Discover more at superhuman.com. The Opportunity In pursuit of our ambitious goals, we seek a Site Reliability Engineer (SRE) to strengthen our infrastructure team. This pivotal role involves developing software to enhance the reliability of our backend systems, collaborating closely with engineers, and strategizing for future scalability. You will engage with our existing production engineering teams in the EU as we transition away from the “you build it, you own it” approach. The engineers and researchers at Superhuman are given the freedom to innovate and drive breakthroughs, subsequently influencing our product roadmap. As we expand our interfaces, algorithms, and infrastructure, the complexity of our technical challenges continues to grow. Learn more about our technical endeavors on our technical blog. As an SRE, your responsibilities will include: Scaling our Kubernetes-based control plane that processes billions of events daily. Enhancing our automation systems that respond to workload demands. Deploying machine learning systems company-wide.

Feb 19, 2026

Founding DevOps Engineer (Site Reliability Engineer) at TechBiz Global | Berlin

TechBiz Global

Full-time|Hybrid|Berlin, Berlin, Germany

Join TechBiz Global as we empower our prestigious clients by providing exceptional recruitment services. We are currently on the lookout for a Founding DevOps Engineer (SRE) to become an integral part of our client's team. If you are eager to advance your career in a cutting-edge environment, this opportunity could be perfect for you. Berlin • Cybersecurity & AI Startup • Recently Funded Our client, an innovative cybersecurity startup based in Berlin, is seeking a DevOps Engineer to join as a founding member and contribute to the development of the core security, identity, and enforcement frameworks of a pioneering AI-driven risk management platform. Founded by seasoned cybersecurity professionals with experience in Israeli intelligence, our client is looking for a proactive Founding DevOps Engineer for a hybrid role located in central Berlin. If you have a passion for cybersecurity and AI, excel in dynamic startup settings, and relish the challenge of building sophisticated platforms from the ground up, this is a chance to make a significant impact. This startup is creating a state-of-the-art cyber risk platform designed to help enterprises effectively comprehend, measure, and mitigate identity risks on a large scale. Their mission is to transform intricate identity and security data into clear, actionable insights that Chief Information Security Officers (CISOs) and Chief Technology Officers (CTOs) can rely on.
From day one, you will be instrumental in shaping core platform components, influencing how modern enterprises manage risk using cloud-native technologies, AI-driven analytics, and automated enforcement through AI agents. Key Responsibilities Design, build, and operate the foundational cloud infrastructure for a secure, scalable, production-ready SaaS platform from the outset. Manage AWS environments comprehensively, encompassing networking, IAM, compute, storage, and security parameters. Develop and sustain Infrastructure as Code practices to ensure efficient deployment and management.

Mar 13, 2026

Site Reliability Engineer

Helsing

Full-time|On-site|Berlin; London; Munich

Who We Are Helsing is a pioneering defense AI company dedicated to safeguarding democracies. Our mission is to attain technological leadership, enabling open societies to make sovereign decisions and uphold their ethical standards. As a company, we recognize the profound responsibility that comes with developing and deploying powerful technologies like AI, and we are committed to addressing this responsibility with integrity. Our team consists of driven engineers, AI specialists, and customer-facing program managers who are passionate about solving the most complex and impactful challenges. We embrace a culture of openness and transparency, encouraging healthy debates about the role of technology in defense, its benefits, and its ethical implications. The Role We operate primarily in high-security, on-premise environments, and we are seeking a Site Reliability Engineer to support these critical infrastructures. In this role, you will be responsible for the design, implementation, and management of our on-premise Kubernetes infrastructure. We value engineers who exhibit a strong work ethic, prioritize effectively, and excel in teamwork. Clear communication, knowledge sharing, and collaboration are essential to advancing both our team and our mission. The Day-to-Day As a Site Reliability Engineer, you will design and build cloud-native infrastructure platforms on-premises, focusing on Kubernetes-based solutions that empower our development teams to operate services at scale. You will create robust observability frameworks using tools like Grafana, Prometheus, and distributed tracing to ensure system reliability and performance. You will architect and implement secure, multi-tenant Kubernetes clusters to support our high-security environments.

Feb 18, 2026

Site Reliability Engineer at Air Apps | Berlin Metropolitan Area

Air Apps

Full-time|On-site|Berlin Metropolitan Area

About Air Apps Air Apps began as a family-founded company in Lisbon, Portugal in 2018. The team focuses on building AI-powered tools for personal and entrepreneurial planning, including the Personal & Entrepreneurial Resource Planner (PRP). Over 100 million downloads worldwide mark a significant milestone for the self-funded company, which now has offices in Lisbon and San Francisco. Air Apps pursues long-term goals, working to challenge standard approaches and develop AI-driven solutions that make a real difference. The company values innovation and aims to empower people globally through its products. Site Reliability Engineer Role The Site Reliability Engineer (SRE) will help maintain and improve the reliability, availability, and scalability of Air Apps’ systems. This role bridges software development and operations, focusing on automation, monitoring, and performance tuning to reduce downtime and strengthen system resilience. Work Location This position is fully onsite at the Lisbon office. Collaboration with cross-functional teams is central to the role. Relocation support is available for the right candidate.

Apr 17, 2026

Site Reliability Engineer (m/f/d) - Join the Flip Team

flipapp1

Full-time|Remote|Berlin, Berlin, Germany; Remote (Europe); Stuttgart, Baden-Württemberg, Germany

Flip is building an AI-powered employee experience platform designed for frontline workers. The mission centers on giving every employee, no matter their location, access to essential company information. Flip’s goal is to become the most widely used platform for frontline teams, changing how these teams connect and collaborate. Role overview As a Site Reliability Engineer in the Platform Squad, the focus is on keeping Flip’s infrastructure reliable, fast, and scalable. The role involves shaping reliability practices, developing internal tools, and supporting engineering teams as they deploy at scale. This position is well suited for someone who enjoys building high-throughput, highly available systems and wants to have a direct impact on the operations of a SaaS platform in production. What you will do Scale infrastructure: Improve and optimize Azure cloud environments and Kubernetes clusters to support global growth and high availability. Ensure resilience and safety: Build and maintain zero-downtime deployments, rollback strategies, and disaster recovery plans to keep the platform running around the clock. Advance observability: Enhance the LGTM stack (Loki, Grafana, Tempo, Mimir) to provide teams with visibility and use these tools to define and refine Service Level Objectives (SLOs). Automate infrastructure: Create and improve infrastructure as code using Pulumi in Go, reducing manual work and making the platform more self-service for engineers. Promote reliability practices: Support CI/CD best practices, incident response, post-mortems, and improvements to developer experience across engineering. Shape the platform’s future: Collaborate with your squad and engineering leadership to influence the roadmap, including scaling, cost management, security, and compliance. Location This role is open to candidates based in Berlin or Stuttgart, Germany, as well as remote applicants located within Europe.

Apr 23, 2026

Site Reliability Engineer (m/w/d)

flipapp

Full-time|Hybrid|Berlin, Berlin, Germany; Remote (Europe); Stuttgart, Baden-Württemberg, Germany

Flip develops an AI-powered employee experience platform designed for frontline workers. The company’s mission is to make internal information easily accessible for every employee, wherever they work. Flip is expanding quickly and aims to change how millions of frontline employees stay connected with their organizations. Role overview The Site Reliability Engineer (m/w/d) joins the Platform Squad to keep Flip’s infrastructure fast, resilient, and ready for growth. This role focuses on shaping reliability practices, building internal tools, and fostering a culture where engineering teams can deploy confidently at scale while maintaining high uptime. The position is well-suited for those who enjoy designing high-throughput, highly available systems and want to influence the production operations of a growing SaaS platform. Key responsibilities Enable scaling: Expand and optimize Azure cloud infrastructure and Kubernetes clusters to support Flip’s global growth, prioritizing high throughput and availability. Ensure resilience & security: Design and implement zero-downtime deployments, effective rollback mechanisms, and disaster recovery strategies to keep the platform available at all times. Create observability: Improve the LGTM stack (Loki, Grafana, Tempo, Mimir) so teams have clear insight into system health and performance. Location This position can be based in Berlin or Stuttgart, Germany, or performed remotely from anywhere in Europe.

Apr 23, 2026

Senior Site Reliability Engineer - Observability (m/f/x)

Doctolib

Full-time|On-site|Berlin, Berlin, Germany; Paris, Paris, France

At Doctolib, we pride ourselves on fostering a dynamic engineering environment where innovation thrives. Our mission is to enhance the lives of healthcare professionals and patients alike. We are seeking a Senior Site Reliability Engineer to ensure our production systems operate seamlessly, playing a crucial role in supporting the rapid expansion of Doctolib's services. Your Responsibilities As a Senior Site Reliability Engineer within the Core Reliability & Observability team, you will be instrumental in defining the company's observability strategy and maintaining the reliability, debuggability, and scalability of our platform. This position bridges infrastructure, developer experience, and product engineering, focusing on developing and enhancing the core elements of logging, metrics, tracing, and alerting across our organization. Lead the implementation of an observability strategy across the platform, emphasizing scalable, developer-friendly logging and tracing solutions. Identify and spearhead cross-functional reliability initiatives to enhance incident detection, response, and postmortem analysis capabilities. Participate in the on-call rotation and actively work on improving our on-call experience by optimizing alerting, minimizing noise, and providing actionable telemetry. Who You Are You could be our next teammate if you possess: A minimum of 3 years of hands-on experience with large-scale production platforms. Demonstrated proficiency with cloud platforms such as AWS, Azure, or Google Cloud. A strong understanding of containerization and orchestration technologies (Docker and Kubernetes). A deep knowledge of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows. Extensive expertise in observability tooling and architecture, including: Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector. Tracing: OpenTelemetry or proprietary APMs. Metrics: Prometheus, Thanos, Datadog, or equivalent. 
Proficiency in at least one programming language (e.g., Ruby, Python, Go, Java) and a strong grasp of infrastructure as code principles. Experience with monitoring and observability tools.

Mar 19, 2026

Site Reliability Engineer

Orcrist Technologies

Full-time|Remote|Remote / Berlin

Site Reliability Engineer Company Overview At Orcrist Technologies, we are pioneering a next-generation data intelligence platform designed to manage petabyte-scale data with lightning-fast query responses. Our innovative solution is based on Kubernetes and is offered as both a B2B SaaS and an on-premise self-hosted option, including air-gapped deployments. We empower clients in defense, law enforcement, and enterprise sectors to translate mission-critical data into actionable insights. Your Role As a Site Reliability Engineer, you will be integral in deploying and managing our data intelligence platform within agency-controlled environments. You will construct and operate secure, highly available Kubernetes clusters, both on-premises and in hybrid architectures. In this role, you will also respond as a forward-deployed SRE during incidents and upgrades, ensuring our systems adhere to strict privacy, audit, and legal evidence standards tailored for law enforcement applications. Key Responsibilities Deploy, install, and manage Kubernetes clusters for our platform in on-prem and hybrid settings. Configure and maintain GitOps workflows, Helm/Kustomize, and artifact registries within restricted networks. Design and lead incident response initiatives for the observability stack (Prometheus, Grafana) and enforce disaster recovery protocols. Enhance system security through network segmentation, mTLS, IAM, and vulnerability remediation. Create compliance documentation, operational runbooks, and train both agency and Orcrist teams on best practices. About You 5+ years of experience in SRE/DevOps, with a focus on on-call ownership and managing production systems. Extensive hands-on experience with Kubernetes (on-prem/hybrid), GitOps (Argo CD/Flux), and infrastructure automation tools (Ansible, Terraform). Strong expertise in observability tools (Prometheus, Grafana, Loki) and complex incident response methodologies. 
Fluency in both German and English (C1+), authorized to work in Germany, with a willingness to travel (20–30%). Preferred Qualifications In-depth understanding of IT and governance frameworks within law enforcement or the public sector. Relevant certifications such as CKA/CKAD, ISO 27001 Lead Implementer, CISSP, or GDPR Practitioner. Demonstrated experience integrating with essential enterprise systems, including Identity and Access Management (SAML, LDAP), and Security Information and Event Management (SIEM) platforms. Familiarity with digital evidence workflows and contributions to judicial processes. Previous exposure to managing sensitive environments, including air-gapped systems and investigative tools for public safety.

Jan 9, 2026

Site Reliability Engineering Lead (f/m/d)

Upvest

Full-time|Hybrid|Berlin

Join Upvest, where we aim to revolutionize investment accessibility, making it as seamless as everyday spending. Our innovative Investment API allows businesses to offer a diverse array of investment products while enhancing capital market investment and retirement planning experiences. As one of Europe's leading fintechs, Upvest provides a comprehensive suite of investment opportunities for our B2B clients, spanning principal broking, proprietary trading, and secure custody for traditional securities. Founded in 2017 by Martin Kassing, we have expanded to over 240 employees across Europe, supported by a recent €100 million Series C funding round led by Hedosophia and Sapphire Ventures, along with esteemed existing investors such as Bessemer Venture Partners and BlackRock. With our headquarters in Berlin and additional hubs in Tallinn and London, we embrace a hybrid work model, allowing flexibility with regular travel to Berlin. The Opportunity At Upvest, reliability is not just a metric; it's the cornerstone of our growth. As we rapidly scale, we are committed to establishing a dedicated Site Reliability Engineering (SRE) function aimed at continuously enhancing our reliability standards. This is your opportunity to redefine what exceptional reliability entails for a high-growth fintech leader. You will have the autonomy to create a reliability culture, establish standards, and implement practices that will guide us through our next phase of expansion. If you've ever envisioned building an SRE practice from the ground up, now is your moment. The Role Your mission as the SRE Lead will focus on prevention rather than reaction. You will be a blend of technical visionary and organizational innovator, integrating reliability into our development processes. Collaborating closely with engineering teams, you will enhance observability and resilience while creating frameworks that enable us to operate swiftly without sacrificing stability.
Rather than owning services, your role will be to elevate those who do. Your influence will extend to shaping engineering leaders' perspectives on reliability, guiding product managers in balancing features with stability, and defining what it means to be 'production-ready' across the organization. You will lead and mentor a talented team of 2 to 4 SREs, fostering a culture of excellence that amplifies our impact.

Dec 11, 2025
