Site Reliability Engineer at Crusoe, Dublin, IE: jobs in Dublin – Browse 1,860 openings on RoboApply Jobs

Site Reliability Engineer at Crusoe | Dublin, IE

Crusoe | Dublin - IE

On-site Full-time

About the job

Crusoe is on a mission to revolutionize the way we access and utilize energy and intelligence. We are building the infrastructure that empowers a future where ambitious AI-driven projects can thrive without compromising on scale, speed, or sustainability.

Join us at Crusoe and be part of the AI revolution through sustainable technology. Here, you will spearhead significant innovations, create a lasting impact, and collaborate with a team committed to delivering responsible and transformative cloud infrastructure.

About This Role:

As a Site Reliability Engineer (SRE) at Crusoe, you will be integral to maintaining the reliability and performance of our cutting-edge infrastructure. Our SRE team focuses on identifying, analyzing, and mitigating issues so that we meet demanding Service Level Agreements (SLAs), guided by well-defined Service Level Indicators (SLIs) and Service Level Objectives (SLOs). By automating processes and proactively addressing potential problems, you will help ensure that our systems run seamlessly, and you will advise engineering teams on best practices for resilient coding. Your role will involve anticipating issues before they affect our customers, conducting comprehensive post-mortems, and promoting continuous improvement to uphold the highest reliability standards for Crusoe's AI platform. The ideal candidate possesses a solid foundation in SRE practices, distributed systems, networking, and Linux, along with a passion for automation and problem-solving. This is a full-time position.

What You’ll Be Working On:

Automation and Tool Development: Streamline routine processes and enhance Crusoe’s internal infrastructure platform, allowing software teams to operate effectively without needing in-depth knowledge of the operating system, hardware, or network.
Collaboration and Planning: Engage in daily stand-up meetings with the team to review projects, recent incidents, and daily priorities. Collaborate on strategies for launching new data centers or upgrading existing ones. Work closely with software engineers to ensure the adoption of resilient coding practices and review modifications prior to deployment.
System Monitoring and Alerting: Analyze overnight alerts and performance metrics to guarantee optimal system operation. Evaluate system logs and develop innovative tools to enhance our monitoring capabilities.
Incident Response and Problem Solving: Participate in incident response simulations, post-mortems, and root cause analysis sessions to extract valuable lessons from past issues.

Similar jobs

Site Reliability Engineer at Crusoe | Dublin, IE

Crusoe

Full-time|On-site|Dublin - IE

Jan 14, 2026

Infrastructure Engineer at Crusoe | Dublin, IE

Crusoe Technologies

Full-time|On-site|Dublin - IE

Join Crusoe Technologies as an Infrastructure Engineer, where you will be pivotal in designing, implementing, and maintaining robust infrastructure systems that support our innovative solutions. You will collaborate with cross-functional teams to ensure high availability and performance of our services.

Apr 10, 2026

Site Reliability Engineering Internship - Summer 2026 at Crusoe | Dublin, Ireland

Crusoe

Full-time|On-site|Dublin - IE

At Crusoe, we are on a mission to drive the future of energy and intelligence. Our innovative platform empowers individuals to harness the full potential of artificial intelligence without compromising on scalability, speed, or sustainability.
Join the forefront of the AI revolution with Crusoe's sustainable technology. Here, you'll be instrumental in pioneering transformative innovations, making a significant impact, and collaborating with a team that is redefining responsible cloud infrastructure.
About the Role:
As a Software Engineering Intern, you will be part of a dedicated team shaping the future of distributed systems technology. This 12-week, full-time internship in our Dublin office offers a unique opportunity to contribute to the development of a robust cloud infrastructure that supports groundbreaking advancements in fields such as artificial intelligence, graphics rendering, and computational biology. You won't just observe; you'll take on real responsibilities, tackle production-level challenges, and play a key role in Crusoe's vision for sustainable and ethical high-performance computing.
Throughout your internship, you will engage in impactful projects that extend beyond traditional classroom learning. Benefit from one-on-one mentorship from industry veterans and collaborate with a diverse group of engineers to construct fault-tolerant systems utilized by customers across the globe. We are looking for motivated, inquisitive, and proactive students ready to forge valuable connections and launch their careers by addressing today's most challenging computational problems.
Your Responsibilities:
System Development: Design, implement, and maintain scalable, highly available, and fault-tolerant distributed systems to support demanding computational workloads.
Product Development: Innovate and create cutting-edge products and tools from inception that will be leveraged by a global user base.
Production Support: Identify, troubleshoot, and resolve complex issues in production environments to maintain platform reliability.
Feature Development: Collaborate with product owners and stakeholders to design, test, and iterate on new features that enhance platform capabilities.
Team Collaboration: Work closely with senior engineers and peers to ensure technical tasks align with broader organizational objectives.
Mentorship Opportunities: Engage in dedicated mentorship sessions to accelerate your growth and deepen your technical expertise.

Jan 29, 2026

Site Reliability Engineer at StepStone | Dublin

StepStone

Full-time|On-site|Dublin

Join StepStone as a Site Reliability Engineer and play a critical role in ensuring the stability and performance of our innovative platforms. In this position, you will collaborate with cross-functional teams to enhance system reliability, improve the scalability of our applications, and automate operations processes. Your expertise in monitoring, incident response, and cloud technologies will be invaluable as you work on enhancing our infrastructure and delivering top-notch solutions.

Apr 10, 2026

Site Reliability Engineer at airapps | Dublin

airapps

Full-time|On-site|Dublin

airapps is looking for a Site Reliability Engineer (SRE) in Dublin. This role centers on keeping services reliable, available, and performing well. Working side by side with software development teams, the SRE will help strengthen system architecture and support ongoing improvements.
Role overview:
The Site Reliability Engineer focuses on supporting the stability and efficiency of airapps’ systems. The position involves regular collaboration with developers to address system challenges and refine processes.
Key responsibilities:
Monitor and maintain the reliability and uptime of core services
Work with development teams to improve system design and architecture
Apply new technologies and methods to boost operational efficiency
Location:
This position is based in Dublin.

Apr 28, 2026

Senior Site Reliability Engineer at Tenable | Dublin, Ireland

Tenable, Inc.

Full-time|On-site|Ireland - Office - Dublin

About Tenable:
Tenable is a global leader in Exposure Management, trusted by over 44,000 organizations to help understand and reduce cyber risk. The company supports 65% of the Fortune 500, 45% of the Global 2000, and many government agencies.
Team and Culture:
Tenable’s people are at the heart of its success. Teams work together to build cybersecurity solutions and maintain a culture rooted in respect and excellence. Employees collaborate with industry experts and have the tools and support to make a measurable difference.
Role Overview: Senior Site Reliability Engineer
This Dublin-based role sits within the SRE Infrastructure Management team. The team’s mission is to keep Tenable’s cloud-centric exposure management platform reliable, scalable, and secure. The focus is on reducing manual operational work by building advanced automation, especially using AI.
What You Will Do:
Design and build AI-powered agentic workflows to automate complex SRE tasks, including incident investigation and deployment reliability.
Develop evaluation frameworks, prompt engineering methods, retrieval strategies, and structured output validation to improve the accuracy and observability of agent pipelines.
Write production code, create agentic workflows, and integrate observability and infrastructure platforms.
Analyze the impact of automation efforts using real toil data.
What Sets This Role Apart:
This position is not limited to operations with minor automation. Most of the work involves hands-on development: designing, coding, and deploying intelligent systems that replace manual SRE workflows. The team uses large language models, agentic architectures, and deep SRE knowledge to drive results.
Location:
Office-based in Dublin, Ireland.

Apr 20, 2026

SRE, Site Reliability Engineering

Klaviyo

On-site|On-site|Dublin, IE

Join Klaviyo as a Site Reliability Engineer II in Dublin, where you'll play a pivotal role in ensuring the reliability, scalability, and sustainability of our critical platforms. Our approach treats reliability as a core product feature, leveraging your engineering skills to tackle complex operational challenges. You'll collaborate with a dynamic team to enhance our infrastructure, security, and software engineering practices, ensuring our systems perform optimally at scale. Your contributions will directly influence how our engineering teams build software and how our customers engage with our platform daily.

Jan 31, 2026

Senior Site Reliability Engineer at Veeva | Dublin, Ireland

Veeva Systems Inc.

Full-time|Hybrid|Ireland - Dublin

Veeva Systems is a purpose-driven leader in cloud solutions for the life sciences industry, dedicated to accelerating the delivery of therapies to patients. As one of the fastest-growing SaaS companies globally, we achieved over $2 billion in revenue last year and are poised for continued growth.
Our core values (Do the Right Thing, Customer Success, Employee Success, and Speed) guide our operations. We made history in 2021 by becoming a public benefit corporation (PBC), committed to balancing the interests of our customers, employees, society, and investors.
At Veeva, we embrace flexibility through our Work Anywhere philosophy, enabling you to thrive in your preferred work environment, whether from home or in the office.
Be a part of our mission to transform the life sciences sector, making a meaningful impact on our customers, employees, and communities.
The Role:
We are looking for a Senior Site Reliability Engineer to join our Vault Platform team. In this role, you will be responsible for maintaining the scalability and reliability of our enterprise applications, addressing complex challenges on a global scale. Your expertise in Java and modern open-source technologies will be critical in enhancing our production systems.
The ideal candidate will possess a wealth of experience with Java applications and the latest open-source technologies, ideally gained from enterprise software development or a rapidly growing tech environment. As a Senior SRE, you should be innately curious and proficient in problem-solving. You will also offer a unique engineering perspective, understanding how systems integrate to function effectively for hundreds of customers across North America, Europe, and Asia.

Aug 10, 2021

Senior Solutions Engineer at Crusoe | Dublin, IE

Crusoe

Full-time|On-site|Dublin - IE

At Crusoe, we are on a mission to enhance the availability of energy and intelligence, creating the engine for an ambitious AI-driven world where creativity thrives without compromising on scale, speed, or sustainability.
Join Crusoe and become part of the AI revolution with sustainable technology. You will be at the forefront of innovation, driving impactful solutions and collaborating with a team dedicated to shaping responsible and transformative cloud infrastructure.
Role Overview:
Crusoe Cloud is looking for a Senior Solutions Engineer to partner with our key enterprise clients as they deploy AI/ML workloads on our state-of-the-art GPU infrastructure. This role is highly interactive and customer-focused, requiring extensive technical knowledge in Kubernetes, MLOps, and cloud architecture.
You will manage the entire deployment process, from conducting Proofs of Concept (PoCs) to optimizing workloads after the sale, acting as a vital technical liaison between our customers and engineering teams. Ideal candidates will have a passion for AI infrastructure, proficiency in containerized systems, and the ability to translate workloads seamlessly across different cloud environments.
Your Responsibilities:
Customer Enablement: Lead the technical onboarding and deployment of intricate AI/ML workloads, managing the PoC process through to post-sales optimization.
Kubernetes & MLOps Implementation: Design and deploy ML workloads utilizing Kubernetes-based frameworks (e.g., Ray, Kubeflow) to create infrastructure that optimally balances performance, scalability, and efficiency.
Infrastructure-Centric Approach: Move beyond abstract services; deploy and refine AI/ML workloads directly on Crusoe's infrastructure, ensuring optimum performance at both the container and hardware levels.
Cross-Cloud Migration: Assist customers in transitioning and adapting workloads across AWS, Azure, and GCP, articulating the trade-offs between cloud-native and Crusoe-native strategies.
Technical Communication: Conduct workshops, live demonstrations, and solution reviews, while contributing to case studies, solution briefs, and blog content that showcase real-world customer successes.
Customer Advocacy: Provide valuable feedback to internal engineering and product teams, helping to enhance Crusoe’s platform based on hands-on implementation experiences.
Qualifications:
Proven expertise in Kubernetes and MLOps, with hands-on experience in deploying and managing cloud-based infrastructure.
Strong understanding of AI/ML workloads and the ability to optimize them for performance and efficiency.
Excellent communication skills, capable of conveying complex technical concepts to diverse audiences.
Experience with cross-cloud architectures and migrations between AWS, Azure, and GCP.
A passion for innovative technology and a desire to drive meaningful change in the AI landscape.

May 28, 2025

Site Reliability Engineer III

MongoDB, Inc.

Full-time|Hybrid|Dublin

MongoDB, Inc. supports organizations as they build and operate modern applications. The company’s flagship product, MongoDB Atlas, is a multi-cloud database platform available across AWS, Google Cloud, and Microsoft Azure in more than 115 regions. Atlas enables customers to run applications both on-premises and in the cloud. Each month, over 175,000 new developers join the MongoDB community. Companies such as Samsung and Toyota rely on MongoDB for next-generation, AI-driven applications.
Role overview:
The Site Reliability Engineer III joins a team responsible for designing and maintaining the infrastructure that powers MongoDB services, with a particular focus on the Atlas platform. As customer requirements and regulations change, the SRE team works to deliver low-latency responses and address data sovereignty needs. The goal is to build complex systems that are reliable, straightforward to operate, and easy to monitor. Infrastructure-as-code and self-healing systems are core values for the team. Collaboration with other engineering groups is a regular part of the role, ensuring shared knowledge and responsibility for system health.
Location:
This position is based in Dublin and follows a hybrid work model.

Apr 21, 2026

Senior Cloud Support Engineer at Crusoe | Dublin, IE

Crusoe

Full-time|On-site|Dublin - IE

At Crusoe, we are on a mission to accelerate the abundance of energy and intelligence, creating the driving force behind a world where individuals can ambitiously innovate with AI without compromising on scale, speed, or sustainability.
Join us in the AI revolution powered by sustainable technology at Crusoe. Here, you will foster meaningful innovation, make a significant impact, and be part of a team that leads the way in responsible and transformative cloud infrastructure.
Role Overview:
As a Senior Cloud Support Engineer, you will be instrumental in the transformation of high-performance computing through the provision of sustainable and cost-effective GPU compute power. Your role will empower our customers to harness this technology for pioneering developments in areas such as AI/ML, physics simulations, and computational biology. Acting as the primary technical support contact, you will ensure that our customers can effortlessly utilize Crusoe Cloud to reach their objectives. This position is vital to Crusoe's mission, facilitating our customers' research and development efforts and contributing to a sustainable future. You will engage in exciting projects, collaborate with a talented team, and tackle complex challenges using cutting-edge technologies. We are seeking a highly motivated and experienced technical professional with a strong commitment to customer success, a comprehensive understanding of cloud technologies, and alignment with Crusoe's core values. This is a full-time position.
Key Responsibilities:
Customer Support: Deliver outstanding technical support to customers via Zendesk, adhering to SLAs and maintaining a high customer satisfaction score (CSAT of 95% or greater).
On-Call Rotation: Participate in a 24/7 on-call rotation to promptly address critical issues.
Troubleshooting: Diagnose and resolve issues related to VMs, hardware failures, and scaling tests using CLI and internal tools.
Alert Management: Oversee alert triage, prepare for maintenance windows, and conduct node delivery testing.
Collaboration: Collaborate closely with SRE, Networking, and Storage teams from initial triage through root cause analysis (RCA) delivery.
Global Collaboration: Follow established global team collaboration and handoff procedures for ticketing and on-call management.
Knowledge Development: Create onboarding materials, knowledge base documentation, and standard operating procedures (SOPs).

Dec 16, 2025

Data Center Deployment Engineer at Crusoe | Dublin, IE

Crusoe

Full-time|On-site|Dublin - IE

At Crusoe, our mission is to revolutionize how energy and intelligence come together. We are building the foundational technology that empowers a world where creativity with AI flourishes without compromising on scale, speed, or sustainability.
Join us in leading the AI transformation with eco-friendly technology. At Crusoe, you will spearhead meaningful advancements, make a significant difference, and collaborate with a team dedicated to pioneering responsible and transformative cloud infrastructure.
About This Role:
As our Data Center Deployment Engineer, you will play a crucial role in designing and optimizing Crusoe’s operational computing environments. Your expertise will connect heavy infrastructure with high-performance computing, focusing on the critical whitespace where our servers, storage, and network equipment operate. Your contributions will directly support Crusoe’s vision of harmonizing the future of computing with climate needs by enhancing power distribution, cooling solutions, and equipment density, thus enabling the scale and energy efficiency of our sustainable cloud platform.
You will be part of an innovative and collaborative team, leading the planning and execution of high-density layouts and modular infrastructure. We seek a technical problem-solver who excels at the intersection of mechanical, electrical, and network engineering. This full-time role is designed for a forward-thinking engineer eager to create the physical foundations of an ethical and sustainable digital ecosystem.
What You’ll Be Working On:
Whitespace Design & Planning: Create detailed whitespace layouts, including rack placements and aisle configurations, to optimize equipment density while ensuring ideal airflow and accessibility.
Infrastructure Integration: Collaborate with Electrical and Mechanical teams to integrate power delivery and cooling systems seamlessly into the whitespace, supporting high-performance cloud workloads.
Capacity & Growth Modeling: Craft advanced 3D models and capacity planning tools to forecast future utilization, ensuring our data centers are scalable and resilient (N+1, 2N).
Connectivity Strategy: Work alongside Network and Cloud teams to strategize and implement structured cabling and fiber management systems that adhere to stringent performance and low-latency standards.
Cross-Functional Leadership: Engage in design reviews and risk assessments, offering technical insights to internal stakeholders.

Jan 6, 2026

Staff Site Reliability Engineer

MongoDB, Inc.

Full-time|Hybrid|Dublin

The Team:
The Storage Layer Services (SLS) team at MongoDB is pioneering the re-architecture of our cloud storage layer, fundamentally enhancing the core of our next-generation cloud storage architecture. This innovative team is dedicated to developing high-performance, multi-tenant distributed storage services that elevate the current Atlas storage stack and facilitate the efficient execution of diverse customer workloads. As a member of this team, you will collaborate closely with engineers responsible for building these storage services. Your role will involve defining Service Level Objectives (SLOs), shaping capacity plans, and ensuring the reliability, durability, and operational safety of the storage layer that supports Atlas. You will be part of a select group of senior Site Reliability Engineers (SREs), playing a vital role in the execution of a strategic multi-year roadmap for MongoDB's cloud storage architecture. We are particularly eager to connect with candidates located in Dublin, as this role follows a hybrid working model.

Apr 10, 2026

Senior Site Reliability Engineer - Ireland

Arista Networks

Full-time|On-site|Dublin

Join Arista Networks as a Senior Site Reliability Engineer, where you will play a crucial role in ensuring the reliability, performance, and scalability of our systems. You will collaborate with cross-functional teams to implement best practices in software development and operational excellence.

Apr 1, 2026

Site Reliability Engineer (SRE/DevOps) - Engineering Productivity

Arista Networks

Full-time|On-site|Dublin

Collaboration and Innovation Await You:
Join Arista Networks as a talented Site Reliability Engineer within our Engineering Productivity (EngProd) team, where you will play a crucial role in maintaining and enhancing our rapidly expanding infrastructure. We seek a versatile and adaptable professional who is eager to explore new technologies. As part of our software engineering team, you will collaborate with peers to design, build, and manage secure, scalable, and fault-tolerant tools and infrastructure in a hybrid cloud environment.
In the EngProd group, you will engage with fellow engineers to architect, scale, and operate the systems that support Arista’s product development teams. Our technology stack includes industry standards such as Ansible, Artifactory, Gerrit, Jenkins, Kubernetes, Grafana, Spinnaker, MySQL, ElasticSearch, Google Cloud, Varnish, and Perforce, alongside custom-built internal systems designed to automate CI/CD, testing, analysis, and visualization.
Your Responsibilities:
Safely and incrementally build, deploy, and manage critical production systems with an emphasis on scalability, reliability, observability, performance, and security.
Enhance and monitor the developer experience across various services.
Automate processes to eliminate toil and enhance operational efficiency of production systems.
Proactively monitor and respond to alerts while setting up automated alert handling mechanisms.
Develop and maintain incident response runbooks.
Triage platform and infrastructural issues, assisting Arista software engineers and collaborating with third-party vendor support.
Document postmortems and create solutions to prevent recurring incidents.
Communicate and plan maintenance windows for production systems.
Work closely with Arista’s product development teams to identify and resolve infrastructural bottlenecks affecting their workflows.
Research and implement best practices around infrastructure and platforms to ensure secure, scalable, and fault-tolerant systems.
Analyze and understand the design and implementation details of open-source systems to improve triage and resolution processes.

Mar 12, 2026

Senior Network Operations Engineer at Crusoe | Dublin

Crusoe

Full-time|On-site|Dublin - IE

At Crusoe, our mission is to fuel the future with abundant energy and intelligence. We are developing a transformative engine that empowers individuals to innovate ambitiously with AI while prioritizing scale, speed, and sustainability.
Join us at the forefront of the AI revolution, where we utilize sustainable technologies to drive impactful innovation. Become a part of a dynamic team that is redefining responsible cloud infrastructure.
About the Role:
The Crusoe Cloud Network Engineering team is in search of a dedicated and skilled Senior Network Operations Engineer to enhance our Network Operations team. In this critical role, you will oversee and operate a global edge, backbone, and data center network designed for high-performance computing (HPC) clusters that utilize GPUs. We seek an individual who is motivated, hands-on, and passionate about working with advanced environmental technologies. You should possess outstanding analytical abilities, excellent communication skills, and be an effective team player.
You will be instrumental in maintaining the operational integrity, security, and scalability of our global network infrastructure. You will collaborate with network engineers and specialists to implement and manage network solutions that support Crusoe Cloud. This essential role includes providing 24/7 monitoring and management to ensure ongoing operational coverage, swift incident response, and high availability of network services to meet customer demands.
Your Responsibilities:
Monitor network performance, conduct advanced troubleshooting, and perform root cause analysis for incidents while facilitating post-mortem reviews and enhancements.
Implement network changes across data centers, backbone, and edge infrastructure to realize next-generation designs, enhance capacity, and boost network reliability.
Oversee, optimize, and balance immediate needs with long-term objectives for the entire Crusoe network (frontend, backend, backbone, edge, public cloud connectivity).
Collaborate effectively within the Network Engineering team and with cross-functional departments to ensure the network aligns with business requirements.
Lead initiatives for operational excellence by developing monitoring, alerting, and systems to maintain high network availability.
Mentor network engineers and establish best practices for incident response, documentation, and operational readiness.

Feb 20, 2026

Team Lead, Site Reliability Engineering - Storage Layer Service

MongoDB, Inc.

Full-time|On-site|Dublin

Role Overview:
MongoDB is hiring a Team Lead for Site Reliability Engineering, with a focus on the Storage Layer Service. This position is based in Dublin.
What You Will Do:
Lead efforts to improve the reliability and performance of the Storage Layer Service.
Work closely with teams across the company to deliver solutions that support both user experience and operational goals.
Guide and support engineers as they address technical challenges in the storage layer.
Collaboration:
This role involves regular collaboration with other engineering groups and stakeholders to identify opportunities for improvement and implement changes that make a measurable impact.

Apr 15, 2026

Senior Engineering Manager

Crusoe

Full-time|On-site|Dublin - IE

Crusoe seeks a Senior Engineering Manager based in Dublin to guide a team of engineers. This role centers on delivering engineering projects in a collaborative setting. Close work with cross-functional partners is essential to move projects from concept to completion.
Role overview:
The Senior Engineering Manager leads technical teams, shapes project direction, and ensures high standards in execution. This position involves balancing technical priorities with team development and project delivery.
What you will do:
Manage and mentor a group of engineers
Oversee the progress of engineering projects from planning through launch
Coordinate with other departments to align goals and resources
Requirements:
Experience in engineering leadership roles
Strong interest in technology and team development
Ability to collaborate effectively with diverse teams

Apr 28, 2026

Staff Security Engineer

Crusoe

Full-time|On-site|Dublin - IE

Role Overview:
Crusoe is hiring a Staff Security Engineer in Dublin, IE. This role focuses on protecting the company’s technology and infrastructure. The position involves designing, implementing, and maintaining security solutions to safeguard sensitive data and support compliance with industry standards.
What You Will Do:
Design security solutions that address current and emerging threats
Implement and maintain security controls across systems and infrastructure
Monitor for vulnerabilities and respond to incidents as needed
Work with a team of engineers to develop and refine security strategies
Support compliance efforts with relevant industry standards
Team and Culture:
Collaborate with skilled professionals who focus on innovation and security. The team values practical solutions and a proactive approach to protecting company assets.

Apr 14, 2026

Staff Software Engineer, AI Reliability Engineering

Anthropic

On-site|On-site|Dublin, IE

About Anthropic:
At Anthropic, we are on a mission to develop AI systems that are not only reliable and interpretable but also steerable. Our primary goal is to ensure that AI technology is safe and advantageous for all users and society at large. Our rapidly expanding team consists of dedicated researchers, engineers, policy experts, and business leaders, all working collaboratively to create beneficial AI solutions.
Role Overview:
At Anthropic, we believe in the strength of collaboration. Our AI Reliability Engineering (AIRE) team plays a crucial role in maintaining the robustness of Claude, our flagship AI, ensuring it remains reliable for everyone who relies on it. We work closely with various teams within Anthropic to enhance reliability across our essential service paths: from the SDK, through our network, API layers, serving infrastructure, and accelerators, and back again. Our hands-on approach allows us to make impactful improvements during incidents and in collaborative projects.
Reliability is an emergent quality that extends beyond individual teams. Our role involves taking a comprehensive view of the systems, offering a unique opportunity for dynamic, cross-functional engagement with the most critical aspects of our operations.

Feb 9, 2026
