Team Lead Site Reliability Engineering Storage Layer Service jobs in Dublin – Browse 772 openings on RoboApply Jobs

Team Lead Site Reliability Engineering Storage Layer Service jobs in Dublin

Open roles matching “Team Lead Site Reliability Engineering Storage Layer Service” with location signals for Dublin. 772 active listings on RoboApply Jobs.

772 jobs found

1 - 20 of 772 Jobs
MongoDB, Inc.
Full-time|On-site|Dublin

Role Overview: MongoDB is hiring a Team Lead for Site Reliability Engineering, with a focus on the Storage Layer Service. This position is based in Dublin. What You Will Do: Lead efforts to improve the reliability and performance of the Storage Layer Service. Work closely with teams across the company to deliver solutions that support both user experience and operational goals. Guide and support engineers as they address technical challenges in the storage layer. Collaboration: This role involves regular collaboration with other engineering groups and stakeholders to identify opportunities for improvement and implement changes that make a measurable impact.

Apr 15, 2026
MongoDB, Inc.
Full-time|Hybrid|Dublin

The Team The Storage Layer Services (SLS) team at MongoDB is pioneering the re-architecture of our cloud storage layer, fundamentally enhancing the core of our next-generation cloud storage architecture. This innovative team is dedicated to developing high-performance, multi-tenant distributed storage services that elevate the current Atlas storage stack and facilitate the efficient execution of diverse customer workloads. As a member of this team, you will collaborate closely with engineers responsible for building these storage services. Your role will involve defining Service Level Objectives (SLOs), shaping capacity plans, and ensuring the reliability, durability, and operational safety of the storage layer that supports Atlas. You will be part of a select group of senior Site Reliability Engineers (SREs), playing a vital role in the execution of a strategic multi-year roadmap for MongoDB's cloud storage architecture. We are particularly eager to connect with candidates located in Dublin, as this role follows a hybrid working model.

Apr 10, 2026
MongoDB, Inc.
Full-time|Hybrid|Dublin

MongoDB, Inc. supports organizations as they build and operate modern applications. The company’s flagship product, MongoDB Atlas, is a multi-cloud database platform available across AWS, Google Cloud, and Microsoft Azure in more than 115 regions. Atlas enables customers to run applications both on-premises and in the cloud. Each month, over 175,000 new developers join the MongoDB community. Companies such as Samsung and Toyota rely on MongoDB for next-generation, AI-driven applications. Role overview: The Site Reliability Engineer III joins a team responsible for designing and maintaining the infrastructure that powers MongoDB services, with a particular focus on the Atlas platform. As customer requirements and regulations change, the SRE team works to deliver low-latency responses and address data sovereignty needs. The goal is to build complex systems that are reliable, straightforward to operate, and easy to monitor. Infrastructure-as-code and self-healing systems are core values for the team. Collaboration with other engineering groups is a regular part of the role, ensuring shared knowledge and responsibility for system health. Location: This position is based in Dublin and follows a hybrid work model.

Apr 21, 2026
Klaviyo
On-site|Dublin, IE

Join Klaviyo as a Site Reliability Engineer II in Dublin, where you'll play a pivotal role in ensuring the reliability, scalability, and sustainability of our critical platforms. Our approach treats reliability as a core product feature, leveraging your engineering skills to tackle complex operational challenges. You'll collaborate with a dynamic team to enhance our infrastructure, security, and software engineering practices, ensuring our systems perform optimally at scale. Your contributions will directly influence how our engineering teams build software and how our customers engage with our platform daily.

Jan 31, 2026
InterSystems
Full-time|Remote|Dublin (Remote)

Overview Join our dynamic Managed Services team as a Major Incident Lead specializing in Site Reliability. In this critical role, you will spearhead the response to significant, customer-impacting incidents across InterSystems’ managed services platforms. As the Incident Commander, you will ensure swift service restoration, maintain clear and confident communication with stakeholders, and coordinate effectively across SRE, engineering, support, cloud, and service delivery teams. Operating within a service model aligned with SRE principles, you will prioritize service reliability by leveraging service level indicators and objectives, focusing on reducing customer impact during live incidents over root cause analysis. Beyond immediate incident management, you will lead post-incident reviews to transform operational failures into actionable reliability enhancements and minimize repeat incidents. This position is vital for preserving customer trust, ensuring platform resilience, and achieving operational excellence in a 24x7, mission-critical, and highly regulated environment.

Mar 26, 2026
StepStone
Full-time|On-site|Dublin

Join StepStone as a Site Reliability Engineer and play a critical role in ensuring the stability and performance of our innovative platforms. In this position, you will collaborate with cross-functional teams to enhance system reliability, improve the scalability of our applications, and automate operations processes. Your expertise in monitoring, incident response, and cloud technologies will be invaluable as you work on enhancing our infrastructure and delivering top-notch solutions.

Apr 10, 2026
airapps
Full-time|On-site|Dublin

airapps is looking for a Site Reliability Engineer (SRE) in Dublin. This role centers on keeping services reliable, available, and performing well. Working side by side with software development teams, the SRE will help strengthen system architecture and support ongoing improvements. Role overview: The Site Reliability Engineer focuses on supporting the stability and efficiency of airapps’ systems. The position involves regular collaboration with developers to address system challenges and refine processes. Key responsibilities: Monitor and maintain the reliability and uptime of core services. Work with development teams to improve system design and architecture. Apply new technologies and methods to boost operational efficiency. Location: This position is based in Dublin.

Apr 28, 2026
Arista Networks
Full-time|On-site|Dublin

Join Arista Networks as a Senior Site Reliability Engineer, where you will play a crucial role in ensuring the reliability, performance, and scalability of our systems. You will collaborate with cross-functional teams to implement best practices in software development and operational excellence.

Apr 1, 2026
Arista Networks
Full-time|On-site|Dublin

Collaboration and Innovation Await You: Join Arista Networks as a talented Site Reliability Engineer within our Engineering Productivity (EngProd) team, where you will play a crucial role in maintaining and enhancing our rapidly expanding infrastructure. We seek a versatile and adaptable professional who is eager to explore new technologies. As part of our software engineering team, you will collaborate with peers to design, build, and manage secure, scalable, and fault-tolerant tools and infrastructure in a hybrid cloud environment.
In the EngProd group, you will engage with fellow engineers to architect, scale, and operate the systems that support Arista’s product development teams. Our technology stack includes industry standards such as Ansible, Artifactory, Gerrit, Jenkins, Kubernetes, Grafana, Spinnaker, MySQL, ElasticSearch, Google Cloud, Varnish, and Perforce, alongside custom-built internal systems designed to automate CI/CD, testing, analysis, and visualization.
Your Responsibilities: Safely and incrementally build, deploy, and manage critical production systems with an emphasis on scalability, reliability, observability, performance, and security. Enhance and monitor the developer experience across various services. Automate processes to eliminate toil and enhance operational efficiency of production systems. Proactively monitor and respond to alerts while setting up automated alert handling mechanisms. Develop and maintain incident response runbooks. Triage platform and infrastructural issues, assisting Arista software engineers and collaborating with third-party vendor support. Document postmortems and create solutions to prevent recurring incidents. Communicate and plan maintenance windows for production systems. Work closely with Arista’s product development teams to identify and resolve infrastructural bottlenecks affecting their workflows. Research and implement best practices around infrastructure and platforms to ensure secure, scalable, and fault-tolerant systems. Analyze and understand the design and implementation details of open-source systems to improve triage and resolution processes.

Mar 12, 2026
Crusoe
Full-time|On-site|Dublin - IE

Crusoe is on a mission to revolutionize the way we access and utilize energy and intelligence. We are building the infrastructure that empowers a future where ambitious AI-driven projects can thrive without compromising on scale, speed, or sustainability.
Join us at Crusoe and be part of the AI revolution through sustainable technology. Here, you will spearhead significant innovations, create a lasting impact, and collaborate with a team committed to delivering responsible and transformative cloud infrastructure.
About This Role: As a Site Reliability Engineer (SRE) at Crusoe, you will be integral in maintaining the reliability and performance of our cutting-edge infrastructure. Our SRE team focuses on identifying, analyzing, and mitigating issues to uphold high Service Level Agreements (SLAs) through effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs). By automating processes and proactively addressing potential problems, you will help ensure that our systems run seamlessly, advising engineering teams on best practices for resilient coding. Your role will involve anticipating issues before they affect our customers, conducting comprehensive post-mortems, and promoting continuous improvement to uphold the highest reliability standards for Crusoe's AI platform. The ideal candidate possesses a solid foundation in SRE practices, distributed systems, networking, and Linux, along with a passion for automation and problem-solving. This is a full-time position.
What You’ll Be Working On:
Automation and Tool Development: Streamline routine processes and enhance Crusoe’s internal infrastructure platform, allowing software teams to operate effectively without needing in-depth knowledge of the operating system, hardware, or network.
Collaboration and Planning: Engage in daily stand-up meetings with the team to review projects, recent incidents, and daily priorities. Collaborate on strategies for launching new data centers or upgrading existing ones. Work closely with software engineers to ensure the adoption of resilient coding practices and review modifications prior to deployment.
System Monitoring and Alerting: Analyze overnight alerts and performance metrics to guarantee optimal system operation. Evaluate system logs and develop innovative tools to enhance our monitoring capabilities.
Incident Response and Problem Solving: Participate in incident response simulations, post-mortems, and root cause analysis sessions to extract valuable lessons from past issues.

Jan 14, 2026
Tenable, Inc.
Full-time|On-site|Ireland - Office - Dublin

About Tenable: Tenable is a global leader in Exposure Management, trusted by over 44,000 organizations to help understand and reduce cyber risk. The company supports 65% of the Fortune 500, 45% of the Global 2000, and many government agencies. Team and Culture: Tenable’s people are at the heart of its success. Teams work together to build cybersecurity solutions and maintain a culture rooted in respect and excellence. Employees collaborate with industry experts and have the tools and support to make a measurable difference. Role Overview (Senior Site Reliability Engineer): This Dublin-based role sits within the SRE Infrastructure Management team. The team’s mission is to keep Tenable’s cloud-centric exposure management platform reliable, scalable, and secure. The focus is on reducing manual operational work by building advanced automation, especially using AI. What You Will Do: Design and build AI-powered agentic workflows to automate complex SRE tasks, including incident investigation and deployment reliability. Develop evaluation frameworks, prompt engineering methods, retrieval strategies, and structured output validation to improve the accuracy and observability of agent pipelines. Write production code, create agentic workflows, and integrate observability and infrastructure platforms. Analyze the impact of automation efforts using real toil data. What Sets This Role Apart: This position is not limited to operations with minor automation. Most of the work involves hands-on development: designing, coding, and deploying intelligent systems that replace manual SRE workflows. The team uses large language models, agentic architectures, and deep SRE knowledge to drive results. Location: Office-based in Dublin, Ireland.

Apr 20, 2026
Crusoe
Full-time|On-site|Dublin - IE

At Crusoe, we are on a mission to drive the future of energy and intelligence. Our innovative platform empowers individuals to harness the full potential of artificial intelligence without compromising on scalability, speed, or sustainability.
Join the forefront of the AI revolution with Crusoe's sustainable technology. Here, you'll be instrumental in pioneering transformative innovations, making a significant impact, and collaborating with a team that is redefining responsible cloud infrastructure.
About the Role: As a Software Engineering Intern, you will be part of a dedicated team shaping the future of distributed systems technology. This 12-week, full-time internship in our Dublin office offers a unique opportunity to contribute to the development of a robust cloud infrastructure that supports groundbreaking advancements in fields such as artificial intelligence, graphics rendering, and computational biology. You won't just observe; you'll take on real responsibilities, tackle production-level challenges, and play a key role in Crusoe's vision for sustainable and ethical high-performance computing.
Throughout your internship, you will engage in impactful projects that extend beyond traditional classroom learning. Benefit from one-on-one mentorship from industry veterans and collaborate with a diverse group of engineers to construct fault-tolerant systems utilized by customers across the globe. We are looking for motivated, inquisitive, and proactive students ready to forge valuable connections and launch their careers by addressing today's most challenging computational problems.
Your Responsibilities:
System Development: Design, implement, and maintain scalable, highly available, and fault-tolerant distributed systems to support demanding computational workloads.
Product Development: Innovate and create cutting-edge products and tools from inception that will be leveraged by a global user base.
Production Support: Identify, troubleshoot, and resolve complex issues in production environments to maintain platform reliability.
Feature Development: Collaborate with product owners and stakeholders to design, test, and iterate on new features that enhance platform capabilities.
Team Collaboration: Work closely with senior engineers and peers to ensure technical tasks align with broader organizational objectives.
Mentorship Opportunities: Engage in dedicated mentorship sessions to accelerate your growth and deepen your technical expertise.

Jan 29, 2026
Veeva Systems Inc.
Full-time|Hybrid|Ireland - Dublin

Veeva Systems is a purpose-driven leader in cloud solutions for the life sciences industry, dedicated to accelerating the delivery of therapies to patients. As one of the fastest-growing SaaS companies globally, we achieved over $2 billion in revenue last year and are poised for continued growth. Our core values (Do the Right Thing, Customer Success, Employee Success, and Speed) guide our operations. We made history in 2021 by becoming a public benefit corporation (PBC), committed to balancing the interests of our customers, employees, society, and investors. At Veeva, we embrace flexibility through our Work Anywhere philosophy, enabling you to thrive in your preferred work environment, whether from home or in the office. Be a part of our mission to transform the life sciences sector, making a meaningful impact on our customers, employees, and communities. The Role: We are looking for a Senior Site Reliability Engineer to join our Vault Platform team. In this role, you will be responsible for maintaining the scalability and reliability of our enterprise applications, addressing complex challenges on a global scale. Your expertise in Java and modern open-source technologies will be critical in enhancing our production systems. The ideal candidate will possess a wealth of experience with Java applications and the latest open-source technologies, ideally gained from enterprise software development or a rapidly growing tech environment. As a Senior SRE, you should be innately curious and proficient in problem-solving. You will also offer a unique engineering perspective, understanding how systems integrate to function effectively for hundreds of customers across North America, Europe, and Asia.

Aug 10, 2021
Anthropic
On-site|Dublin, IE

About Anthropic: At Anthropic, we are on a mission to develop AI systems that are not only reliable and interpretable but also steerable. Our primary goal is to ensure that AI technology is safe and advantageous for all users and society at large. Our rapidly expanding team consists of dedicated researchers, engineers, policy experts, and business leaders, all working collaboratively to create beneficial AI solutions. Role Overview: At Anthropic, we believe in the strength of collaboration. Our AI Reliability Engineering (AIRE) team plays a crucial role in maintaining the robustness of Claude, our flagship AI, ensuring it remains reliable for everyone who relies on it. We work closely with various teams within Anthropic to enhance reliability across our essential service paths: from the SDK, through our network, API layers, serving infrastructure, and accelerators, and back again. Our hands-on approach allows us to make impactful improvements during incidents and in collaborative projects. Reliability is an emergent quality that extends beyond individual teams. Our role involves taking a comprehensive view of the systems, offering a unique opportunity for dynamic, cross-functional engagement with the most critical aspects of our operations.

Feb 9, 2026
TOMRA Food
Full-time|On-site|Dublin

Join TOMRA Food as a Project Engineering Team Lead! In this pivotal role, you will oversee and mentor a talented project engineering team, ensuring the timely and budget-conscious delivery of high-quality customer projects. You will spearhead technical execution and collaborate with cross-functional teams to ensure that engineering deliverables align with our exceptional company standards. As a leader and technical expert, you will drive innovation and continuous improvement within our Customer Solutions Delivery department.
Key Responsibilities:
1. Leadership and Team Management: Lead, mentor, and develop a multi-disciplinary team of engineers. Assign tasks based on individual skills and project needs. Promote a collaborative environment with effective communication and professional growth.
2. Technical Project Management: Oversee customer engineering projects from inception to completion. Create comprehensive project plans detailing timelines, resource allocation, and budgets. Monitor project performance to ensure compliance with goals, schedules, and quality standards. Identify risks and develop strategies to mitigate them, ensuring project success. Provide updates on project progress and challenges to senior leadership and stakeholders.
3. Technical Guidance: Offer technical direction and support for project teams. Ensure engineering solutions adhere to industry standards, safety regulations, and client specifications. Conduct design reviews, technical analyses, and feasibility studies to foster innovative solutions. Stay updated on industry trends and emerging technologies, integrating best practices into project execution.
4. Stakeholder Management: Collaborate with departments such as R&D, production, procurement, and quality for seamless project execution. Liaise with external clients, vendors, and contractors to ensure deliverables meet contractual obligations. Address any engineering issues that arise throughout the project lifecycle.

Feb 10, 2026
Full-time|On-site|Dublin

Join our dynamic team at Catalyx as a Supply Chain Customer Service Team Lead, located at our customer site in Dublin. This pivotal role will involve leading our dedicated Customer Service Team, ensuring top-notch service delivery while engaging closely with our clients.
About Catalyx: Catalyx is a leading machine vision and automation company committed to integrating cutting-edge technology with skilled personnel to assist global manufacturers and logistics firms in achieving superior quality and efficiency. With over thirty years of experience, our team has been transforming operational processes across highly regulated industries through innovative technology applications and expert support. Our presence spans 9 global offices, comprising over 550 professionals and successfully completed 3,000 projects to date. Discover how we continue to redefine possibilities at www.catalyx.ai.
We pride ourselves on being a trusted partner in offering world-class lifecycle services tailored for regulated and high-risk sectors. Our relentless pursuit of innovation and excellence empowers life science and other highly regulated organizations to enhance their operational efficiency and achieve success. We are dedicated to developing our on-site teams to further advance customer operations.
Key Responsibilities: Lead and manage the Catalyx Customer Service Team on-site, fostering a performance-oriented environment. Engage in one-on-one interactions to monitor and enhance employee performance. Oversee daily operations and coordination of Customer Service Coordinators. Act as the primary escalation point for any issues arising within the Customer Service Team. Coordinate and maintain the sales order processing functions, including orders, acknowledgments, invoicing, and credits. Respond to customer and sales team inquiries promptly and effectively. Support the commercial team with timely responses to their inquiries. Develop familiarity with products and processes to facilitate swift responses to customer inquiries. Participate in regular business reviews and meetings, both virtual and in-person, related to defined customer portfolios. Collaborate with distribution to ensure adherence to schedules. Work with manufacturing teams to ensure timely schedule adherence. Coordinate with commercial, manufacturing finance, and financial shared services to ensure accuracy in material and customer master setups.

Mar 11, 2026
Public Storage
Full-time|On-site|Dublin

Join our dynamic team as a Self Storage Customer Service Manager, where you will oversee daily operations while providing top-notch customer service. In this entry-level role, you will be responsible for managing customer inquiries, ensuring facility cleanliness, and maintaining high standards of security.

Apr 6, 2026
Catalyx
Full-time|On-site|Dublin

Catalyx is on the lookout for a dynamic and experienced Supply Chain Customer Service Team Lead to be a part of our dedicated team at our Dublin customer site. About Us: At Catalyx, we are a pioneering machine vision and automation company committed to integrating innovative technology and skilled personnel to assist global manufacturers and logistics firms in achieving unparalleled quality and efficiency. With over thirty years of experience, our team has excelled in optimizing operational processes across various highly regulated sectors. Our global presence encompasses 9 offices with over 550 professionals and 3,000 projects successfully executed. We are dedicated to solving complex process challenges and consistently push the boundaries of what is possible. For more details, visit www.catalyx.ai. As a trusted partner for high-risk and regulated markets, we are devoted to delivering exceptional lifecycle services. Our commitment to continuous innovation enables us to empower life science and other regulated organizations to enhance their operational efficiency and drive success. To support this mission, we are focused on enhancing our on-site teams to advance customer operations effectively.

Mar 11, 2026
Nory
Full-time|On-site|Dublin

Join Us in Transforming Hospitality for the Better!
The hospitality industry presents numerous challenges: tight margins, excessive waste, and overworked teams. But there’s a solution, and that’s where Nory comes in.
Our CEO, Conor, has experienced these struggles firsthand. After founding and successfully scaling Mad Egg in Ireland, he realized the need for a streamlined solution. Frustrated with outdated systems, cumbersome spreadsheets, and endless paperwork, he set out to create the tool he wished he had from the beginning.
Nory is an innovative restaurant management system that integrates real-time data with AI-driven predictive analytics, empowering operators to take control of their margins. Covering everything from food preparation to forecasting, it provides operational intelligence that enables restaurants to operate with consistency, reliability, and profitability. The outcome? Flourishing restaurants, improved job satisfaction, reduced waste, and healthier profit margins. Discover more about our mission here.
We’re just getting started. Following a successful Series B funding round led by Kinnevik, our team has expanded to over 80 talented individuals across Ireland, the UK, Spain, and New York, with demand exceeding all expectations.
The Opportunity: We are seeking a dedicated Engineering Team Lead to join our dynamic Inventory team. This is a crucial component of the Nory operating software, essential for managing COGS, gross profit, financial reporting, and facilitating automated ordering and reconciliation at scale. Your team will ensure the reliability of Nory’s inventory data daily, making it robust enough for enterprise finance teams and aligned with our 2026 AI vision.
In this hands-on leadership role, you will balance technical ownership with team leadership, delivering a reliable and impactful product. Learn more about the role of an Engineering Team Lead here.
You will be comfortable working across both backend and frontend systems. We are stack-agnostic in our hiring, valuing strong engineering fundamentals and the ability to learn rapidly. Our current technology stack includes Python, FastAPI, React, TypeScript, Node.js, AWS (Fargate, Aurora, SQS), Postgres, MongoDB, and Docker.

Feb 6, 2026
Site Engineer

XYZ Reality

Full-time|On-site|Dublin, Ireland

About XYZ Reality: XYZ Reality is at the forefront of innovation, offering the world's first engineering-grade Augmented Reality solution specifically designed for the construction industry. Our groundbreaking technology integrates seamlessly into The Atom, a smart, site-safe headset/hardhat, enabling us to implement AR solutions that enhance project delivery while adhering to timelines and budget constraints. With a rapidly expanding team of over 100 professionals across the UK, US, and Europe, we partner with critical organizations and construction firms to realize major projects successfully. Role Overview: As a Site Engineer at XYZ Reality, you will play a pivotal role in executing our core services on construction projects. Your responsibilities will include monitoring construction progress in relation to BIM models, conducting quality inspections on-site, and delivering findings to clients through our innovative platform. This position is ideal for individuals with hands-on construction experience who are eager to embrace XYZ Reality’s advanced technology and methodologies.

Mar 31, 2026
