Experience Level
Senior
Qualifications
We are looking for skilled individuals with a passion for site reliability engineering. If you have experience in cloud infrastructure, automation, and monitoring, we want to hear from you! A background in software engineering and a proactive approach to problem-solving will be essential.
About the job
Why Join Scout24? Scout24 is the proud home of ImmoScout24, Germany's premier platform for real estate. For over 25 years, we have been at the forefront of transforming the real estate market in Germany and Austria. Our mission is to create a digital ecosystem that unites homeowners, seekers, and agents, making the journey to find the perfect home a seamless experience. Your career is as vital as finding the right property; hence, #WorkingatScout24 means you will be part of a vibrant, diverse team of around 1,100 colleagues from 58 nationalities. We celebrate individuality and foster a culture of open-mindedness and authenticity, enabling true learning and personal growth. Mistakes are viewed as opportunities for growth and innovation. Together, we proactively strive for improvement and take responsibility, discussing both successes and challenges with mutual respect because we are #oneteam.
If this resonates with you, we would love to welcome you on board! Even if you don't meet every requirement, we encourage you to share how you can contribute to our team. Grow with us! Welcome home!
Beyond our outstanding company culture, we offer exceptional benefits that make Scout24 a fantastic workplace!
About Scout24 AG
Scout24 AG is a leading digital marketplace in the real estate sector, dedicated to connecting homeowners, seekers, and agents through innovative technology. With a commitment to diversity and inclusion, we strive to create a supportive environment for all our employees.
We are seeking a dynamic and results-oriented Strategic Account Manager to join our team at Scout24. In this role, you will be responsible for managing and nurturing key client relationships, driving strategic initiatives, and ensuring customer satisfaction. You will leverage your expertise to identify growth opportunities and collaborate with internal teams to deliver exceptional value to our clients.
Join redcare-pharmacy as a Senior Site Reliability Engineer in Berlin. We are seeking a talented and experienced individual who can enhance our infrastructure and ensure the reliability and performance of our systems. This role will involve collaboration with development teams to build scalable systems and improve our operational practices.
Role Overview
scalablegmbh is looking for a Senior Cloud Site Reliability Engineer with a focus on network systems. This position is based in Berlin.
What You Will Do
Maintain and improve the reliability, performance, and scalability of cloud infrastructure.
Work closely with engineering teams to optimize network services and resolve technical challenges.
Contribute to developing solutions that strengthen network systems.
Support a culture of ongoing improvement across the organization.
About You
Bring expertise in cloud technologies and network systems.
Enjoy solving complex problems and collaborating with others.
Ready to make an impact in a growing company.
Join Almedia, a pioneering company on a mission to revolutionize marketing by rewarding a community of over 60 million users for their engagement with global brands. Here, you can accelerate your career in an exciting environment aiming to become Germany's next bootstrapped unicorn, recognized as Europe's #3 fastest-growing company in 2025 (FT1000).
We are seeking a passionate and skilled Site Reliability Engineer / DevOps to help us maintain the performance and reliability of our high-traffic platform.
Superhuman embraces a dynamic hybrid working model for this position, offering team members the ideal balance of focused work and in-person collaboration that nurtures trust, innovation, and a vibrant team culture.
About Superhuman
Superhuman is at the forefront of AI productivity, empowering individuals to reach their superhuman potential. As the proud home of Grammarly, our suite of applications integrates seamlessly with over 1 million platforms, enhancing productivity through intelligent features. Our offerings include Grammarly's writing assistance, Coda's collaborative spaces, and Go, an AI assistant that proactively provides contextual support. Since our inception in 2009, we have transformed the workflows of more than 40 million users, 50,000 organizations, and 3,000 educational institutions globally. Discover more at superhuman.com.
The Opportunity
In pursuit of our ambitious goals, we seek a Site Reliability Engineer (SRE) to strengthen our infrastructure team. This pivotal role involves developing software to enhance the reliability of our backend systems, collaborating closely with engineers, and strategizing for future scalability. You will engage with our existing production engineering teams in the EU as we transition away from the “you build it, you own it” approach.
The engineers and researchers at Superhuman are given the freedom to innovate and drive breakthroughs, subsequently influencing our product roadmap. As we expand our interfaces, algorithms, and infrastructure, the complexity of our technical challenges continues to grow. Learn more about our technical endeavors on our technical blog.
As an SRE, your responsibilities will include:
Scaling our Kubernetes-based control plane that processes billions of events daily.
Enhancing our automation systems that respond to workload demands.
Deploying machine learning systems company-wide.
Join TechBiz Global as we empower our prestigious clients by providing exceptional recruitment services. We are currently on the lookout for a Founding DevOps Engineer (SRE) to become an integral part of our client's team. If you are eager to advance your career in a cutting-edge environment, this opportunity could be perfect for you.
Berlin • Cybersecurity & AI Startup • Recently Funded
Our client, an innovative cybersecurity startup based in Berlin, is seeking a DevOps Engineer to join as a founding member and contribute to the development of the core security, identity, and enforcement frameworks of a pioneering AI-driven risk management platform.
Founded by seasoned cybersecurity professionals with experience in Israeli intelligence, our client is looking for a proactive Founding DevOps Engineer for a hybrid role located in central Berlin. If you have a passion for cybersecurity and AI, excel in dynamic startup settings, and relish the challenge of building sophisticated platforms from the ground up, this is a chance to make a significant impact.
This startup is creating a state-of-the-art cyber risk platform designed to help enterprises effectively comprehend, measure, and mitigate identity risks on a large scale. Their mission is to transform intricate identity and security data into clear, actionable insights that Chief Information Security Officers (CISOs) and Chief Technology Officers (CTOs) can rely on.
From day one, you will be instrumental in shaping core platform components, influencing how modern enterprises manage risk using cloud-native technologies, AI-driven analytics, and automated enforcement through AI agents.
Key Responsibilities
Design, build, and operate the foundational cloud infrastructure for a secure, scalable, production-ready SaaS platform from the outset.
Manage AWS environments comprehensively, encompassing networking, IAM, compute, storage, and security parameters.
Develop and sustain Infrastructure as Code practices to ensure efficient deployment and management.
Who We Are
Helsing is a pioneering defense AI company dedicated to safeguarding democracies. Our mission is to attain technological leadership, enabling open societies to make sovereign decisions and uphold their ethical standards. As a company, we recognize the profound responsibility that comes with developing and deploying powerful technologies like AI, and we are committed to addressing this responsibility with integrity.
Our team consists of driven engineers, AI specialists, and customer-facing program managers who are passionate about solving the most complex and impactful challenges. We embrace a culture of openness and transparency, encouraging healthy debates about the role of technology in defense, its benefits, and its ethical implications.
The Role
We operate primarily in high-security, on-premise environments, and we are seeking a Site Reliability Engineer to support these critical infrastructures. In this role, you will be responsible for the design, implementation, and management of our on-premise Kubernetes infrastructure.
We value engineers who exhibit a strong work ethic, prioritize effectively, and excel in teamwork. Clear communication, knowledge sharing, and collaboration are essential to advancing both our team and our mission.
The Day-to-Day
As a Site Reliability Engineer, you will design and build cloud-native infrastructure platforms on-premises, focusing on Kubernetes-based solutions that empower our development teams to operate services at scale.
You will create robust observability frameworks using tools like Grafana, Prometheus, and distributed tracing to ensure system reliability and performance.
You will architect and implement secure, multi-tenant Kubernetes clusters to support our high-security environments.
About PlayStation and Sony Interactive Entertainment PlayStation, part of Sony Interactive Entertainment and a subsidiary of Sony Group Corporation, is known worldwide for delivering leading entertainment experiences. Our portfolio includes PlayStation®5, PlayStation®4, PlayStation®VR, PlayStation®Plus, and acclaimed titles from PlayStation Studios. We value diversity and inclusion, working to create an environment where employees feel empowered and supported. Our teams bring together people who are curious about technology and eager to shape the future of gaming. Role Overview: Site Reliability Engineer Based in Berlin, this Site Reliability Engineer role sits within the Gaming Developer & Future Technology Group (GDFT). The group drives cloud gaming innovation, delivering console-quality experiences to players across TVs, mobile devices, and more. The SRE team plays a central part in maintaining and improving the stability of our cloud gaming services. This position involves shaping both design and operational strategies, owning production systems, ensuring code quality, and managing deployments. SREs here contribute to decisions at multiple levels and work closely with teams throughout the software development lifecycle to support operational readiness and service stability. Main Responsibilities Lead and participate in technical discussions to improve reliability and scalability within the team. Contribute to High-Level Design (HLD) documents for new products and platforms. Mentor junior SREs, providing guidance and support for their growth. Take charge of incident response and post-mortem analysis within the assigned service area. Work with cross-functional groups to drive operational efficiency.
About Air Apps Air Apps began as a family-founded company in Lisbon, Portugal in 2018. The team focuses on building AI-powered tools for personal and entrepreneurial planning, including the Personal & Entrepreneurial Resource Planner (PRP). Over 100 million downloads worldwide mark a significant milestone for the self-funded company, which now has offices in Lisbon and San Francisco. Air Apps pursues long-term goals, working to challenge standard approaches and develop AI-driven solutions that make a real difference. The company values innovation and aims to empower people globally through its products. Site Reliability Engineer Role The Site Reliability Engineer (SRE) will help maintain and improve the reliability, availability, and scalability of Air Apps’ systems. This role bridges software development and operations, focusing on automation, monitoring, and performance tuning to reduce downtime and strengthen system resilience. Work Location This position is fully onsite at the Lisbon office. Collaboration with cross-functional teams is central to the role. Relocation support is available for the right candidate.
Full-time|Hybrid|Berlin, Berlin, Germany; Remote (Europe); Stuttgart, Baden-Württemberg, Germany
Flip develops an AI-powered employee experience platform designed for frontline workers. The company’s mission is to make internal information easily accessible for every employee, wherever they work. Flip is expanding quickly and aims to change how millions of frontline employees stay connected with their organizations. Role overview The Site Reliability Engineer (m/w/d) joins the Platform Squad to keep Flip’s infrastructure fast, resilient, and ready for growth. This role focuses on shaping reliability practices, building internal tools, and fostering a culture where engineering teams can deploy confidently at scale while maintaining high uptime. The position is well-suited for those who enjoy designing high-throughput, highly available systems and want to influence the production operations of a growing SaaS platform. Key responsibilities Enable scaling: Expand and optimize Azure cloud infrastructure and Kubernetes clusters to support Flip’s global growth, prioritizing high throughput and availability. Ensure resilience & security: Design and implement zero-downtime deployments, effective rollback mechanisms, and disaster recovery strategies to keep the platform available at all times. Create observability: Improve the LGTM stack (Loki, Grafana, Tempo, Mimir) so teams have clear insight into system health and performance. Location This position can be based in Berlin or Stuttgart, Germany, or performed remotely from anywhere in Europe.
Why Join Scout24?
Scout24 is the proud home of ImmoScout24, Germany's leading real estate platform. For over 25 years, we have been transforming the real estate landscape in Germany and Austria. Our mission is to create a digital ecosystem that connects homeowners, seekers, and real estate agents, making the process of finding the perfect home seamless and efficient. Your career is as significant as finding the right property! At #WorkingatScout24, you will join a vibrant, inclusive team of around 1,100 colleagues from 58 different nationalities. We celebrate diversity and individuality, fostering a culture of openness and authenticity that promotes personal growth. We view mistakes as opportunities for innovation and improvement, and together we strive for success while treating one another with respect as #oneteam.
If this resonates with you, we would be thrilled to welcome you aboard! Even if you believe you don't meet all the job requirements, we are eager to learn how you can contribute to our team. Join us in our journey of growth! Welcome home!
In addition to our exceptional team and culture, we offer fantastic benefits that make Scout24 an outstanding workplace!
Join the Scout24 Team!
Scout24, the proud home of ImmoScout24, has been at the forefront of transforming the real estate market in Germany and Austria for over 25 years. Our mission is to create a digital ecosystem that connects homeowners, seekers, and agents seamlessly. We understand that finding the perfect home is one of life's pivotal decisions, just like choosing your career! #WorkingatScout24 means being part of a vibrant and inclusive team composed of approximately 1,100 professionals from 58 different nationalities. We celebrate diversity and individuality while fostering a culture of openness and authenticity that enables personal growth and learning. We believe that mistakes are opportunities for innovation and growth. Together, we take proactive steps towards improvement, embracing responsibility, and engaging in respectful discussions about our successes and challenges because we are #oneteam.
If this resonates with you, we would be thrilled to have you on board! Even if you don’t meet every requirement, we welcome the unique value you can bring to our team. Grow with us!
Beyond our supportive culture, we offer an array of fantastic benefits that make Scout24 a remarkable workplace!
Full-time|On-site|Berlin, Berlin, Germany; Paris, Paris, France
At Doctolib, we pride ourselves on fostering a dynamic engineering environment where innovation thrives. Our mission is to enhance the lives of healthcare professionals and patients alike. We are seeking a Senior Site Reliability Engineer to ensure our production systems operate seamlessly, playing a crucial role in supporting the rapid expansion of Doctolib's services. Your Responsibilities As a Senior Site Reliability Engineer within the Core Reliability & Observability team, you will be instrumental in defining the company's observability strategy and maintaining the reliability, debuggability, and scalability of our platform. This position bridges infrastructure, developer experience, and product engineering, focusing on developing and enhancing the core elements of logging, metrics, tracing, and alerting across our organization. Lead the implementation of an observability strategy across the platform, emphasizing scalable, developer-friendly logging and tracing solutions. Identify and spearhead cross-functional reliability initiatives to enhance incident detection, response, and postmortem analysis capabilities. Participate in the on-call rotation and actively work on improving our on-call experience by optimizing alerting, minimizing noise, and providing actionable telemetry. Who You Are You could be our next teammate if you possess: A minimum of 3 years of hands-on experience with large-scale production platforms. Demonstrated proficiency with cloud platforms such as AWS, Azure, or Google Cloud. A strong understanding of containerization and orchestration technologies (Docker and Kubernetes). A deep knowledge of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows. Extensive expertise in observability tooling and architecture, including: Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector. Tracing: OpenTelemetry or proprietary APMs. Metrics: Prometheus, Thanos, Datadog, or equivalent. 
Proficiency in at least one programming language (e.g., Ruby, Python, Go, Java) and a strong grasp of infrastructure as code principles. Experience with monitoring and observability tools.
About the Role
As a Senior Product Manager for our Private Seeker Vertical, you will be responsible for the growth and scaling of our Wohnen+ membership, focusing clearly on tenants and property owners. You will develop a scalable subscription product that creates genuine value throughout the entire housing and real estate journey, from searching and applying to financing, concluding contracts, and managing and optimizing properties or rental agreements.
Your core objectives:
You will create a product that continuously supports tenants and property owners, guiding them through all aspects related to their properties.
Your focus lies on growth, product-market fit, and data-driven product development. You will work iteratively, testing hypotheses with clear success criteria and consistently transforming valid insights into scalable solutions. In close collaboration with Tech, Data, Marketing, and UX, you will manage an interdisciplinary product setup with a clear impact focus.
Your Responsibilities
Growth & Scaling: End-to-end responsibility for the growth of Wohnen+, scaling the user base, conversion, retention, and revenue, and continually enhancing the value proposition for tenants.
Site Reliability Engineer Company Overview At Orcrist Technologies, we are pioneering a next-generation data intelligence platform designed to manage petabyte-scale data with lightning-fast query responses. Our innovative solution is based on Kubernetes and is offered as both a B2B SaaS and an on-premise self-hosted option, including air-gapped deployments. We empower clients in defense, law enforcement, and enterprise sectors to translate mission-critical data into actionable insights. Your Role As a Site Reliability Engineer, you will be integral in deploying and managing our data intelligence platform within agency-controlled environments. You will construct and operate secure, highly available Kubernetes clusters, both on-premises and in hybrid architectures. In this role, you will also respond as a forward-deployed SRE during incidents and upgrades, ensuring our systems adhere to strict privacy, audit, and legal evidence standards tailored for law enforcement applications. Key Responsibilities Deploy, install, and manage Kubernetes clusters for our platform in on-prem and hybrid settings. Configure and maintain GitOps workflows, Helm/Kustomize, and artifact registries within restricted networks. Design and lead incident response initiatives for the observability stack (Prometheus, Grafana) and enforce disaster recovery protocols. Enhance system security through network segmentation, mTLS, IAM, and vulnerability remediation. Create compliance documentation, operational runbooks, and train both agency and Orcrist teams on best practices. About You 5+ years of experience in SRE/DevOps, with a focus on on-call ownership and managing production systems. Extensive hands-on experience with Kubernetes (on-prem/hybrid), GitOps (Argo CD/Flux), and infrastructure automation tools (Ansible, Terraform). Strong expertise in observability tools (Prometheus, Grafana, Loki) and complex incident response methodologies. 
Fluency in both German and English (C1+), authorized to work in Germany, with a willingness to travel (20–30%). Preferred Qualifications In-depth understanding of IT and governance frameworks within law enforcement or the public sector. Relevant certifications such as CKA/CKAD, ISO 27001 Lead Implementer, CISSP, or GDPR Practitioner. Demonstrated experience integrating with essential enterprise systems, including Identity and Access Management (SAML, LDAP), and Security Information and Event Management (SIEM) platforms. Familiarity with digital evidence workflows and contributions to judicial processes. Previous exposure to managing sensitive environments, including air-gapped systems and investigative tools for public safety.
Join Upvest, where we aim to revolutionize investment accessibility, making it as seamless as everyday spending. Our innovative Investment API allows businesses to offer a diverse array of investment products while enhancing capital market investment and retirement planning experiences.
As one of Europe's leading fintechs, Upvest provides a comprehensive suite of investment opportunities for our B2B clients, spanning principal broking, proprietary trading, and secure custody for traditional securities. Founded in 2017 by Martin Kassing, we have expanded to over 240 employees across Europe, supported by a recent €100 million Series C funding round led by Hedosophia and Sapphire Ventures, along with esteemed existing investors such as Bessemer Venture Partners and BlackRock.
With our headquarters in Berlin and additional hubs in Tallinn and London, we embrace a hybrid work model, allowing flexibility with regular travel to Berlin.
The Opportunity
At Upvest, reliability is not just a metric; it's the cornerstone of our growth. As we rapidly scale, we are committed to establishing a dedicated Site Reliability Engineering (SRE) function aimed at continuously enhancing our reliability standards. This is your opportunity to redefine what exceptional reliability entails for a high-growth fintech leader.
You will have the autonomy to create a reliability culture, establish standards, and implement practices that will guide us through our next phase of expansion. If you've ever envisioned building an SRE practice from the ground up, now is your moment.
The Role
Your mission as the SRE Lead will focus on prevention rather than reaction. You will be a blend of technical visionary and organizational innovator, integrating reliability into our development processes. Collaborating closely with engineering teams, you will enhance observability and resilience while creating frameworks that enable us to operate swiftly without sacrificing stability.
Rather than owning services, your role will be to elevate those who do.
Your influence will extend to shaping engineering leaders' perspectives on reliability, guiding product managers in balancing features with stability, and defining what it means to be 'production-ready' across the organization. You will lead and mentor a talented team of 2 to 4 SREs, fostering a culture of excellence that amplifies our impact.
N26 is looking for a Site Reliability Engineer to join the Platform Engineering team in Berlin. This role centers on maintaining and improving the reliability, performance, and scalability of core systems.
Role overview
Work closely with cross-functional teams to support and enhance the platform. The focus is on building solutions that keep systems stable and responsive as the company grows.
What you will do
Monitor and improve system reliability and uptime
Collaborate with other teams to address performance and scalability challenges
Contribute to solutions that strengthen the platform’s technical foundation
Location
This position is based in Berlin.
As a Principal Product Manager in Site Reliability Engineering at Delivery Hero, you will take the lead in enhancing our site reliability practices to ensure optimal performance and availability of our platforms. You will collaborate with cross-functional teams to define product strategies, drive initiatives, and implement solutions that enhance user experience and operational efficiency. Your expertise will guide our engineering teams in adopting best practices and innovative technologies to maintain our position as a leader in the online food delivery market.
GetYourGuide connects travelers with memorable experiences in over 12,000 cities. Since 2009, the company has helped millions discover new destinations. The Berlin headquarters leads a global team, with offices in cities such as New York and Bangkok. More than 850 employees collaborate to reshape how people find and book travel adventures. The Staff Site Reliability Engineer joins the Operational Excellence team, which works to minimize disruptions, boost productivity, and build user trust. As GetYourGuide expands its AI-powered travel solutions, this role ensures engineering speed and reliability remain strong so customers enjoy seamless experiences. What you will do Collaborate with product teams to improve system reliability, performance, and trust across the platform. Incident management and reliability Reduce the number of incidents, as well as Mean Time to Detect (MTTD) and Mean Time to Recovery (MTTR). Lead post-incident reviews and turn findings into lasting improvements. Create tools and runbooks that speed up diagnosis and resolution of production issues. Foster a culture that treats incidents as learning opportunities, not blame assignments. Take part in the infrastructure on-call rotation. Observability and production confidence Advance the Datadog-based observability stack, including metrics, logs, traces, dashboards, and alerts. Help teams define meaningful Service Level Objectives (SLOs) and prevent alert fatigue. Strengthen production debugging tools so engineers can solve issues independently. Change confidence and release quality Lower change failure rates by guiding teams on effective testing and deployment practices. Learn more about GetYourGuide’s team and mission at getyourguide.careers.
Why Join Scout24?Scout24 is the proud home of ImmoScout24, Germany's premier platform for real estate. For over 25 years, we have been at the forefront of transforming the real estate market in Germany and Austria. Our mission is to create a digital ecosystem that unites homeowners, seekers, and agents, making the journey to find the perfect home a seamless experience. Your career is as vital as finding the right property; hence, #WorkingatScout24 means you will be part of a vibrant, diverse team of around 1,100 colleagues from 58 nationalities. We celebrate individuality and foster a culture of open-mindedness and authenticity, enabling true learning and personal growth. Mistakes are viewed as opportunities for growth and innovation. Together, we proactively strive for improvement and take responsibility, discussing both successes and challenges with mutual respect because we are #oneteam.If this resonates with you, we would love to welcome you on board! Even if you don't meet every requirement, we encourage you to share how you can contribute to our team. Grow with us! Welcome home!Beyond our outstanding company culture, we offer exceptional benefits that make Scout24 a fantastic workplace!
We are seeking a dynamic and results-oriented Strategic Account Manager to join our team at Scout24. In this role, you will be responsible for managing and nurturing key client relationships, driving strategic initiatives, and ensuring customer satisfaction. You will leverage your expertise to identify growth opportunities and collaborate with internal teams to deliver exceptional value to our clients.
Join redcare-pharmacy as a Senior Site Reliability Engineer in Berlin. We are seeking a talented and experienced individual who can enhance our infrastructure and ensure the reliability and performance of our systems. This role will involve collaboration with development teams to build scalable systems and improve our operational practices.
Role Overview scalablegmbh is looking for a Senior Cloud Site Reliability Engineer with a focus on network systems. This position is based in Berlin. What You Will Do Maintain and improve the reliability, performance, and scalability of cloud infrastructure. Work closely with engineering teams to optimize network services and resolve technical challenges. Contribute to developing solutions that strengthen network systems. Support a culture of ongoing improvement across the organization. About You Bring expertise in cloud technologies and network systems. Enjoy solving complex problems and collaborating with others. Ready to make an impact in a growing company.
Join Almedia, a pioneering company on a mission to revolutionize marketing by rewarding a community of over 60 million users for their engagement with global brands. Here, you can accelerate your career in an exciting environment aiming to become Germany's next bootstrapped unicorn, recognized as Europe's #3 fastest-growing company in 2025 (FT1000).We are seeking a passionate and skilled Site Reliability Engineer / DevOps to help us maintain the performance and reliability of our high-traffic platform.
Superhuman embraces a dynamic hybrid working model for this position, offering team members the ideal balance of focused work and in-person collaboration that nurtures trust, innovation, and a vibrant team culture. About Superhuman Superhuman is at the forefront of AI productivity, empowering individuals to reach their superhuman potential. As the proud home of Grammarly, our suite of applications integrates seamlessly with over 1 million platforms, enhancing productivity through intelligent features. Our offerings include Grammarly's writing assistance, Coda's collaborative spaces, and Go, an AI assistant that proactively provides contextual support. Since our inception in 2009, we have transformed the workflows of more than 40 million users, 50,000 organizations, and 3,000 educational institutions globally. Discover more at superhuman.com. The Opportunity In pursuit of our ambitious goals, we seek a Site Reliability Engineer (SRE) to strengthen our infrastructure team. This pivotal role involves developing software to enhance the reliability of our backend systems, collaborating closely with engineers, and strategizing for future scalability. You will engage with our existing production engineering teams in the EU as we transition away from the “you build it, you own it” approach. The engineers and researchers at Superhuman are given the freedom to innovate and drive breakthroughs, subsequently influencing our product roadmap. As we expand our interfaces, algorithms, and infrastructure, the complexity of our technical challenges continues to grow. Learn more about our technical endeavors on our technical blog. As an SRE, your responsibilities will include: Scaling our Kubernetes-based control plane that processes billions of events daily. Enhancing our automation systems that respond to workload demands. Deploying machine learning systems company-wide.
Join TechBiz Global as we empower our prestigious clients by providing exceptional recruitment services. We are currently on the lookout for a Founding DevOps Engineer (SRE) to become an integral part of our client's team. If you are eager to advance your career in a cutting-edge environment, this opportunity could be perfect for you. Berlin • Cybersecurity & AI Startup • Recently Funded Our client, an innovative cybersecurity startup based in Berlin, is seeking a DevOps Engineer to join as a founding member and contribute to the development of the core security, identity, and enforcement frameworks of a pioneering AI-driven risk management platform. Founded by seasoned cybersecurity professionals with experience in Israeli intelligence, our client is looking for a proactive Founding DevOps Engineer for a hybrid role located in central Berlin. If you have a passion for cybersecurity and AI, excel in dynamic startup settings, and relish the challenge of building sophisticated platforms from the ground up, this is a chance to make a significant impact. This startup is creating a state-of-the-art cyber risk platform designed to help enterprises effectively comprehend, measure, and mitigate identity risks on a large scale. Their mission is to transform intricate identity and security data into clear, actionable insights that Chief Information Security Officers (CISOs) and Chief Technology Officers (CTOs) can rely on.
From day one, you will be instrumental in shaping core platform components, influencing how modern enterprises manage risk using cloud-native technologies, AI-driven analytics, and automated enforcement through AI agents. Key Responsibilities Design, build, and operate the foundational cloud infrastructure for a secure, scalable, production-ready SaaS platform from the outset. Manage AWS environments comprehensively, encompassing networking, IAM, compute, storage, and security parameters. Develop and sustain Infrastructure as Code practices to ensure efficient deployment and management.
Who We Are Helsing is a pioneering defense AI company dedicated to safeguarding democracies. Our mission is to attain technological leadership, enabling open societies to make sovereign decisions and uphold their ethical standards. As a company, we recognize the profound responsibility that comes with developing and deploying powerful technologies like AI, and we are committed to addressing this responsibility with integrity. Our team consists of driven engineers, AI specialists, and customer-facing program managers who are passionate about solving the most complex and impactful challenges. We embrace a culture of openness and transparency, encouraging healthy debates about the role of technology in defense, its benefits, and its ethical implications. The Role We operate primarily in high-security, on-premise environments, and we are seeking a Site Reliability Engineer to support these critical infrastructures. In this role, you will be responsible for the design, implementation, and management of our on-premise Kubernetes infrastructure. We value engineers who exhibit a strong work ethic, prioritize effectively, and excel in teamwork. Clear communication, knowledge sharing, and collaboration are essential to advancing both our team and our mission. The Day-to-Day As a Site Reliability Engineer, you will design and build cloud-native infrastructure platforms on-premises, focusing on Kubernetes-based solutions that empower our development teams to operate services at scale. You will create robust observability frameworks using tools like Grafana, Prometheus, and distributed tracing to ensure system reliability and performance. You will architect and implement secure, multi-tenant Kubernetes clusters to support our high-security environments.
About PlayStation and Sony Interactive Entertainment PlayStation, part of Sony Interactive Entertainment and a subsidiary of Sony Group Corporation, is known worldwide for delivering leading entertainment experiences. Our portfolio includes PlayStation®5, PlayStation®4, PlayStation®VR, PlayStation®Plus, and acclaimed titles from PlayStation Studios. We value diversity and inclusion, working to create an environment where employees feel empowered and supported. Our teams bring together people who are curious about technology and eager to shape the future of gaming. Role Overview: Site Reliability Engineer Based in Berlin, this Site Reliability Engineer role sits within the Gaming Developer & Future Technology Group (GDFT). The group drives cloud gaming innovation, delivering console-quality experiences to players across TVs, mobile devices, and more. The SRE team plays a central part in maintaining and improving the stability of our cloud gaming services. This position involves shaping both design and operational strategies, owning production systems, ensuring code quality, and managing deployments. SREs here contribute to decisions at multiple levels and work closely with teams throughout the software development lifecycle to support operational readiness and service stability. Main Responsibilities Lead and participate in technical discussions to improve reliability and scalability within the team. Contribute to High-Level Design (HLD) documents for new products and platforms. Mentor junior SREs, providing guidance and support for their growth. Take charge of incident response and post-mortem analysis within the assigned service area. Work with cross-functional groups to drive operational efficiency.
About Air Apps Air Apps began as a family-founded company in Lisbon, Portugal in 2018. The team focuses on building AI-powered tools for personal and entrepreneurial planning, including the Personal & Entrepreneurial Resource Planner (PRP). Over 100 million downloads worldwide mark a significant milestone for the self-funded company, which now has offices in Lisbon and San Francisco. Air Apps pursues long-term goals, working to challenge standard approaches and develop AI-driven solutions that make a real difference. The company values innovation and aims to empower people globally through its products. Site Reliability Engineer Role The Site Reliability Engineer (SRE) will help maintain and improve the reliability, availability, and scalability of Air Apps’ systems. This role bridges software development and operations, focusing on automation, monitoring, and performance tuning to reduce downtime and strengthen system resilience. Work Location This position is fully onsite at the Lisbon office. Collaboration with cross-functional teams is central to the role. Relocation support is available for the right candidate.
Full-time|Hybrid|Berlin, Berlin, Germany; Remote (Europe); Stuttgart, Baden-Württemberg, Germany
Flip develops an AI-powered employee experience platform designed for frontline workers. The company’s mission is to make internal information easily accessible for every employee, wherever they work. Flip is expanding quickly and aims to change how millions of frontline employees stay connected with their organizations. Role overview The Site Reliability Engineer (m/w/d) joins the Platform Squad to keep Flip’s infrastructure fast, resilient, and ready for growth. This role focuses on shaping reliability practices, building internal tools, and fostering a culture where engineering teams can deploy confidently at scale while maintaining high uptime. The position is well-suited for those who enjoy designing high-throughput, highly available systems and want to influence the production operations of a growing SaaS platform. Key responsibilities Enable scaling: Expand and optimize Azure cloud infrastructure and Kubernetes clusters to support Flip’s global growth, prioritizing high throughput and availability. Ensure resilience & security: Design and implement zero-downtime deployments, effective rollback mechanisms, and disaster recovery strategies to keep the platform available at all times. Create observability: Improve the LGTM stack (Loki, Grafana, Tempo, Mimir) so teams have clear insight into system health and performance. Location This position can be based in Berlin or Stuttgart, Germany, or performed remotely from anywhere in Europe.
Why Join Scout24? Scout24 is the proud home of ImmoScout24, Germany's leading real estate platform. For over 25 years, we have been transforming the real estate landscape in Germany and Austria. Our mission is to create a digital ecosystem that connects homeowners, seekers, and real estate agents, making the process of finding the perfect home seamless and efficient. Your career is as significant as finding the right property! At #WorkingatScout24, you will join a vibrant, inclusive team of around 1,100 colleagues from 58 different nationalities. We celebrate diversity and individuality, fostering a culture of openness and authenticity that promotes personal growth. We view mistakes as opportunities for innovation and improvement, and together we strive for success while treating one another with respect as #oneteam. If this resonates with you, we would be thrilled to welcome you aboard! Even if you believe you don't meet all the job requirements, we are eager to learn how you can contribute to our team. Join us in our journey of growth! Welcome home! In addition to our exceptional team and culture, we offer fantastic benefits that make Scout24 an outstanding workplace!
Join the Scout24 Team! Scout24, the proud home of ImmoScout24, has been at the forefront of transforming the real estate market in Germany and Austria for over 25 years. Our mission is to create a digital ecosystem that connects homeowners, seekers, and agents seamlessly. We understand that finding the perfect home is one of life's pivotal decisions, just like choosing your career! #WorkingatScout24 means being part of a vibrant and inclusive team composed of approximately 1,100 professionals from 58 different nationalities. We celebrate diversity and individuality while fostering a culture of openness and authenticity that enables personal growth and learning. We believe that mistakes are opportunities for innovation and growth. Together, we take proactive steps towards improvement, embracing responsibility, and engaging in respectful discussions about our successes and challenges because we are #oneteam. If this resonates with you, we would be thrilled to have you on board! Even if you don’t meet every requirement, we welcome the unique value you can bring to our team. Grow with us! Beyond our supportive culture, we offer an array of fantastic benefits that make Scout24 a remarkable workplace!
Full-time|On-site|Berlin, Berlin, Germany; Paris, Paris, France
At Doctolib, we pride ourselves on fostering a dynamic engineering environment where innovation thrives. Our mission is to enhance the lives of healthcare professionals and patients alike. We are seeking a Senior Site Reliability Engineer to ensure our production systems operate seamlessly, playing a crucial role in supporting the rapid expansion of Doctolib's services. Your Responsibilities As a Senior Site Reliability Engineer within the Core Reliability & Observability team, you will be instrumental in defining the company's observability strategy and maintaining the reliability, debuggability, and scalability of our platform. This position bridges infrastructure, developer experience, and product engineering, focusing on developing and enhancing the core elements of logging, metrics, tracing, and alerting across our organization. Lead the implementation of an observability strategy across the platform, emphasizing scalable, developer-friendly logging and tracing solutions. Identify and spearhead cross-functional reliability initiatives to enhance incident detection, response, and postmortem analysis capabilities. Participate in the on-call rotation and actively work on improving our on-call experience by optimizing alerting, minimizing noise, and providing actionable telemetry. Who You Are You could be our next teammate if you possess: A minimum of 3 years of hands-on experience with large-scale production platforms. Demonstrated proficiency with cloud platforms such as AWS, Azure, or Google Cloud. A strong understanding of containerization and orchestration technologies (Docker and Kubernetes). A deep knowledge of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows. Extensive expertise in observability tooling and architecture, including: Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector. Tracing: OpenTelemetry or proprietary APMs. Metrics: Prometheus, Thanos, Datadog, or equivalent. 
Proficiency in at least one programming language (e.g., Ruby, Python, Go, Java) and a strong grasp of infrastructure as code principles. Experience with monitoring and observability tools.
About the Role As a Senior Product Manager for our Private Seeker Vertical, you will be responsible for the growth and scaling of our Wohnen+ membership, focusing clearly on tenants and property owners. You will develop a scalable subscription product that creates genuine value throughout the entire housing and real estate journey, from searching and applying to financing, concluding contracts, and managing and optimizing properties or rental agreements. Your core objectives: You will create a product that continuously supports tenants and property owners, guiding them through all aspects related to their properties. Your focus lies on growth, product-market fit, and data-driven product development. You will work iteratively, testing hypotheses with clear success criteria and consistently transforming valid insights into scalable solutions. In close collaboration with Tech, Data, Marketing, and UX, you will manage an interdisciplinary product setup with a clear impact focus. Your Responsibilities Growth & Scaling: End-to-end responsibility for the growth of Wohnen+, scaling the user base, conversion, retention, and revenue, and continually enhancing the value proposition for tenants.
Site Reliability Engineer Company Overview At Orcrist Technologies, we are pioneering a next-generation data intelligence platform designed to manage petabyte-scale data with lightning-fast query responses. Our innovative solution is based on Kubernetes and is offered as both a B2B SaaS and an on-premise self-hosted option, including air-gapped deployments. We empower clients in defense, law enforcement, and enterprise sectors to translate mission-critical data into actionable insights. Your Role As a Site Reliability Engineer, you will be integral in deploying and managing our data intelligence platform within agency-controlled environments. You will construct and operate secure, highly available Kubernetes clusters, both on-premises and in hybrid architectures. In this role, you will also respond as a forward-deployed SRE during incidents and upgrades, ensuring our systems adhere to strict privacy, audit, and legal evidence standards tailored for law enforcement applications. Key Responsibilities Deploy, install, and manage Kubernetes clusters for our platform in on-prem and hybrid settings. Configure and maintain GitOps workflows, Helm/Kustomize, and artifact registries within restricted networks. Design and lead incident response initiatives for the observability stack (Prometheus, Grafana) and enforce disaster recovery protocols. Enhance system security through network segmentation, mTLS, IAM, and vulnerability remediation. Create compliance documentation, operational runbooks, and train both agency and Orcrist teams on best practices. About You 5+ years of experience in SRE/DevOps, with a focus on on-call ownership and managing production systems. Extensive hands-on experience with Kubernetes (on-prem/hybrid), GitOps (Argo CD/Flux), and infrastructure automation tools (Ansible, Terraform). Strong expertise in observability tools (Prometheus, Grafana, Loki) and complex incident response methodologies. 
Fluency in both German and English (C1+), authorized to work in Germany, with a willingness to travel (20–30%). Preferred Qualifications In-depth understanding of IT and governance frameworks within law enforcement or the public sector. Relevant certifications such as CKA/CKAD, ISO 27001 Lead Implementer, CISSP, or GDPR Practitioner. Demonstrated experience integrating with essential enterprise systems, including Identity and Access Management (SAML, LDAP), and Security Information and Event Management (SIEM) platforms. Familiarity with digital evidence workflows and contributions to judicial processes. Previous exposure to managing sensitive environments, including air-gapped systems and investigative tools for public safety.
Join Upvest, where we aim to revolutionize investment accessibility, making it as seamless as everyday spending. Our innovative Investment API allows businesses to offer a diverse array of investment products while enhancing capital market investment and retirement planning experiences. As one of Europe's leading fintechs, Upvest provides a comprehensive suite of investment opportunities for our B2B clients, spanning principal broking, proprietary trading, and secure custody for traditional securities. Founded in 2017 by Martin Kassing, we have expanded to over 240 employees across Europe, supported by a recent €100 million Series C funding round led by Hedosophia and Sapphire Ventures, along with esteemed existing investors such as Bessemer Venture Partners and BlackRock. With our headquarters in Berlin and additional hubs in Tallinn and London, we embrace a hybrid work model, allowing flexibility with regular travel to Berlin. The Opportunity At Upvest, reliability is not just a metric; it's the cornerstone of our growth. As we rapidly scale, we are committed to establishing a dedicated Site Reliability Engineering (SRE) function aimed at continuously enhancing our reliability standards. This is your opportunity to redefine what exceptional reliability entails for a high-growth fintech leader. You will have the autonomy to create a reliability culture, establish standards, and implement practices that will guide us through our next phase of expansion. If you've ever envisioned building an SRE practice from the ground up, now is your moment. The Role Your mission as the SRE Lead will focus on prevention rather than reaction. You will be a blend of technical visionary and organizational innovator, integrating reliability into our development processes. Collaborating closely with engineering teams, you will enhance observability and resilience while creating frameworks that enable us to operate swiftly without sacrificing stability.
Rather than owning services, your role will be to elevate those who do. Your influence will extend to shaping engineering leaders' perspectives on reliability, guiding product managers in balancing features with stability, and defining what it means to be 'production-ready' across the organization. You will lead and mentor a talented team of 2 to 4 SREs, fostering a culture of excellence that amplifies our impact.
N26 is looking for a Site Reliability Engineer to join the Platform Engineering team in Berlin. This role centers on maintaining and improving the reliability, performance, and scalability of core systems. Role overview Work closely with cross-functional teams to support and enhance the platform. The focus is on building solutions that keep systems stable and responsive as the company grows. What you will do Monitor and improve system reliability and uptime Collaborate with other teams to address performance and scalability challenges Contribute to solutions that strengthen the platform’s technical foundation Location This position is based in Berlin.
As a Principal Product Manager in Site Reliability Engineering at Delivery Hero, you will take the lead in enhancing our site reliability practices to ensure optimal performance and availability of our platforms. You will collaborate with cross-functional teams to define product strategies, drive initiatives, and implement solutions that enhance user experience and operational efficiency. Your expertise will guide our engineering teams in adopting best practices and innovative technologies to maintain our position as a leader in the online food delivery market.
GetYourGuide connects travelers with memorable experiences in over 12,000 cities. Since 2009, the company has helped millions discover new destinations. The Berlin headquarters leads a global team, with offices in cities such as New York and Bangkok. More than 850 employees collaborate to reshape how people find and book travel adventures. The Staff Site Reliability Engineer joins the Operational Excellence team, which works to minimize disruptions, boost productivity, and build user trust. As GetYourGuide expands its AI-powered travel solutions, this role ensures engineering speed and reliability remain strong so customers enjoy seamless experiences. What you will do Collaborate with product teams to improve system reliability, performance, and trust across the platform. Incident management and reliability Reduce the number of incidents, as well as Mean Time to Detect (MTTD) and Mean Time to Recovery (MTTR). Lead post-incident reviews and turn findings into lasting improvements. Create tools and runbooks that speed up diagnosis and resolution of production issues. Foster a culture that treats incidents as learning opportunities, not blame assignments. Take part in the infrastructure on-call rotation. Observability and production confidence Advance the Datadog-based observability stack, including metrics, logs, traces, dashboards, and alerts. Help teams define meaningful Service Level Objectives (SLOs) and prevent alert fatigue. Strengthen production debugging tools so engineers can solve issues independently. Change confidence and release quality Lower change failure rates by guiding teams on effective testing and deployment practices. Learn more about GetYourGuide’s team and mission at getyourguide.careers.
Apr 27, 2026