Senior Network Site Reliability Engineer At Nebius Amsterdam Netherlands jobs in Amsterdam – Browse 3,244 openings on RoboApply Jobs


Open roles matching “Senior Network Site Reliability Engineer At Nebius Amsterdam Netherlands” with location signals for Amsterdam. 3,244 active listings on RoboApply Jobs.

Nebius
Full-time|On-site|Amsterdam, Netherlands

Why Join Nebius?
Nebius is at the forefront of a transformative era in cloud computing, designed to empower the global AI economy. We provide innovative tools and resources that enable our clients to tackle real-world challenges and revolutionize industries, all while minimizing infrastructure costs and eliminating the necessity for extensive in-house AI/ML t…

Apr 30, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Remote - Europe

Why Join Nebius
Nebius is pioneering a transformative era in cloud computing, tailored to meet the demands of the global AI economy. We provide the essential tools and resources that empower our clients to address real-world challenges and revolutionize their industries without incurring substantial infrastructure costs or assembling large in-house AI/ML teams. Our workforce is engaged at the forefront of AI cloud infrastructure, collaborating with some of the most talented and innovative leaders and engineers in the industry.
Our Work Environment
Headquartered in Amsterdam and publicly traded on Nasdaq, Nebius boasts a worldwide presence with R&D centers across Europe, North America, and Israel. Our diverse team of over 1400 professionals includes more than 400 highly skilled engineers, well-versed in both hardware and software engineering, complemented by an in-house AI R&D team.
The Role
We are seeking a Network Site Reliability Engineer (NetSRE) to play a critical role in developing and maintaining the foundational infrastructure of Nebius: the Network, which is essential for all other services. This engineering-centric SRE position involves defining clear reliability objectives and implementing the tooling and automation needed to achieve them, while enhancing the operational safety of the network as we scale rapidly.
Your Responsibilities Will Include:
• Establish and oversee reliability benchmarks for network services and critical pathways (including SLIs/SLOs, availability targets, and error budgets as applicable).
• Enhance reliability across the entire network, focusing not just on services, but also on site readiness, inter-site connectivity (DCI), and operational protocols.
• Lead incident response efforts in your areas, directing investigations/postmortems and transforming failures into sustainable solutions rather than recurring issues.
• Develop and refine observability tools including actionable metrics, logs, traces, alerting systems, and expedited debugging processes.

Apr 30, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Remote - Europe

Why Join Nebius?
Nebius is revolutionizing cloud computing to empower the global AI economy. We offer innovative tools and resources that enable our clients to tackle real-world problems and transform their industries without incurring exorbitant infrastructure costs or the necessity for extensive in-house AI/ML teams. Our workforce operates at the forefront of AI cloud infrastructure, collaborating with some of the most seasoned and creative leaders and engineers in the sector.
Our Work Environment
Based in Amsterdam and publicly traded on Nasdaq, Nebius boasts a global presence with R&D centers across Europe, North America, and Israel. Our team of over 1400 includes more than 400 highly skilled engineers with profound expertise in hardware and software engineering, complemented by an in-house AI R&D division.
The Opportunity
We are seeking a Senior Network Engineer to enhance our team, ensuring the seamless operation of our data center infrastructure, points of presence, and proprietary backbone network. The ideal candidate will design and develop expansive data center networks featuring thousands of server ports and InfiniBand-based GPU cluster interconnect networks, alongside automation and management tools for these networks.
You are invited to work from our office in Amsterdam.
Your Key Responsibilities:
• Provide expert technical design and operational support to cross-functional teams, including our internal R&D team, HWaaS, and cloud overlay & underlay network environments.
• Develop monitoring and automation tools.
• Play a crucial role in launching new regions within our cloud and GPU platforms.
What We Expect From You:
• Expertise in MPLS, routing, and switching for service provider and data center networks, Ethernet switching/VXLAN, routing protocols (BGP/IS-IS), SR MPLS+v6, traffic balancing/ECMP, L3 MPLS VPN, and cloud overlay network technologies.
• Proven experience in network design development and documentation.
• Ability to create testing plans for network infrastructure and vendors.
• Strong diagnostic skills with the TCP/IPv4/v6 protocol stack in a CLOS topology data center network.
• In-depth knowledge of modern network equipment design, QoS mechanisms, and operational practices.

Apr 23, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Remote - Europe

Why Join Nebius?
Nebius is at the forefront of a transformative era in cloud computing, dedicated to empowering the global AI economy. We provide the essential tools and resources that enable our clients to tackle real-world challenges and revolutionize industries, all while minimizing infrastructure expenses and reducing the need for extensive in-house AI/ML teams. By becoming part of our team, you will work alongside some of the industry's most experienced and innovative leaders and engineers, pushing the boundaries of AI cloud infrastructure.
Our Work Environment
Headquartered in Amsterdam and publicly traded on Nasdaq, Nebius boasts a global presence with R&D hubs across Europe, North America, and Israel. Our diverse team of over 1,400 employees includes more than 400 highly skilled engineers with profound expertise in hardware and software engineering, complemented by a dedicated in-house AI R&D team.
The Role
We are looking for an Enterprise Applications Engineer with a strong emphasis on the Atlassian Cloud ecosystem to join our Enterprise Technologies team. In this pivotal role, you will be responsible for the administration, configuration, and ongoing enhancement of our Atlassian Cloud suite, which includes tools such as Jira, Confluence, Assets, and Statuspage. You will engage with cross-functional teams to ensure our business applications remain secure, efficient, and aligned with organizational objectives. This position is perfect for candidates with substantial hands-on experience in Atlassian administration who are eager to expand their knowledge in SaaS management, integrations, and automation across the enterprise stack.
You are welcome to work from our office in Amsterdam or remotely.

Apr 23, 2026
Nebius
Internship|On-site|Amsterdam, Netherlands

Why Join Nebius?
Nebius is at the forefront of a transformative wave in cloud computing, dedicated to empowering the global AI economy. We provide essential tools and resources that enable our customers to tackle real-world challenges and revolutionize industries, all while avoiding exorbitant infrastructure expenses and the necessity of large in-house AI/ML teams. Our staff operates at the leading edge of AI cloud infrastructure, collaborating with some of the most innovative leaders and engineers in the field.
Our Work Environment
Based in the vibrant city of Amsterdam and publicly traded on Nasdaq, Nebius boasts a worldwide presence with research and development hubs across Europe, North America, and Israel. Our diverse team of over 1400 professionals includes more than 400 highly skilled engineers, bringing extensive expertise in both hardware and software engineering, complemented by a dedicated in-house AI R&D team.
Position Summary:
• Location: Amsterdam
• Duration: 3 months
• Start Date: June 2026
• Compensation: Paid
• Eligibility: Current university student pursuing a degree in Computer Science or a related field, recent graduates, or early career professionals
• Work Authorization: Authorized to work in the job's location

Apr 23, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Remote - Europe

Why Choose Nebius?
Nebius is at the forefront of revolutionizing cloud computing, catering specifically to the global AI economy. Our mission is to provide our clients with the essential tools and resources needed to tackle real-world challenges and innovate industries, all without incurring hefty infrastructure expenses or the necessity of assembling large in-house AI/ML teams. Join us and collaborate with some of the brightest minds in AI cloud infrastructure, alongside seasoned leaders and engineers.
Where We Operate
Founded in Amsterdam and publicly traded on Nasdaq, Nebius boasts a worldwide presence with R&D centers located throughout Europe, North America, and Israel. Our workforce comprises over 1,400 dedicated professionals, including more than 400 highly skilled engineers proficient in both hardware and software engineering, complemented by a dedicated in-house AI R&D team.
Your Role
As a Senior Site Reliability Engineer (SRE) within the Compute Node team at Nebius AI Cloud, you will play a pivotal role in constructing and managing the cluster scheduler and node-level services that oversee and maintain virtual machines across our cloud regions. The focus of this role is on Linux systems engineering, virtualization, and operational reliability. You will work closely with the operating system and hypervisor, influencing the integration of reliability and observability within the Compute platform.
Your Key Responsibilities:
• Guarantee the reliability, availability, and performance of compute nodes hosting virtual machines.
• Analyze and troubleshoot Linux systems at both user and kernel space, recognizing their capabilities, limitations, and trade-offs.
• Resolve intricate production issues involving CPU, memory, NUMA, cgroups, and scheduling.
• Engage hands-on with virtualization and containerization using QEMU/KVM and Linux-based technologies.
• Develop and enhance observability as a core capability of the node layer, including metrics, logs, traces, alerts, SLIs, and SLOs.
• Lead incident response efforts, conduct root-cause analyses, and perform postmortems, driving long-term enhancements in reliability.
• Work in close partnership with platform, kernel/hypervisor, GPU, and infrastructure teams to refine system design and operability.

Apr 23, 2026
Nebius
Full-time|On-site|Amsterdam, Netherlands

Why Join Nebius?
Nebius is at the forefront of revolutionizing cloud computing, catering to the global AI economy. We provide innovative tools and resources that empower our clients to tackle real-world challenges and transform industries efficiently, without incurring massive infrastructure costs or the necessity of assembling large in-house AI/ML teams. Our team collaborates with some of the most experienced and innovative leaders and engineers in the field, working on the cutting edge of AI cloud infrastructure.
Our Work Environment
Based in Amsterdam and publicly traded on Nasdaq, Nebius boasts a global presence with R&D hubs in Europe, North America, and Israel. Our diverse team of over 1,400 employees includes more than 400 highly skilled engineers specializing in hardware and software engineering, complemented by an in-house AI R&D team.
About Our Hardware Team
The Hardware Infrastructure department oversees the complete lifecycle of our infrastructure: from server design and supply chain management to data center deployment and ongoing operations. With rapid growth, we're building the foundational systems that will support AI computing for years to come.
The Role
We are seeking a Senior Technical Project Manager who can effectively engage with all functions within the Hardware Infrastructure department. This role goes beyond mere coordination; you will possess substantial technical expertise to collaborate with hardware engineers, software teams, and data center operations. You will navigate complex tradeoffs and proactively drive programs forward without needing detailed explanations. One week, you may be spearheading a GPU cluster deployment; the next, you could be working with engineers to outline a monitoring platform or resolving a supply chain issue. You will ensure that the most crucial initiatives within the department are organized, visible, and progressing smoothly.
Scope of Responsibilities
• Hardware Engineering & NPI - Oversee timelines for new hardware development cycles, coordinating between design, validation, and manufacturing partners. Monitor hardware bring-up milestones and assist engineering teams in adhering to schedules through DVT/PVT and into production.
• Infrastructure Automation - Lead the delivery of internal tooling programs: functional and load testing systems for servers, DCIM, and monitoring platforms covering power, cooling, racks, servers, JBODs, JBOGs, power shelves, and network devices.

May 1, 2026
Nebius
Full-time|On-site|Amsterdam, Netherlands

Why Join Nebius
Nebius is at the forefront of the cloud computing revolution, dedicated to fueling the global AI economy. We empower our clients with innovative tools and resources to tackle real-world challenges and revolutionize industries without incurring exorbitant infrastructure costs or the necessity of establishing extensive in-house AI/ML teams. By joining our team, you'll collaborate with some of the most seasoned and imaginative leaders and engineers in AI cloud infrastructure.
Our Work Environment
With our headquarters in Amsterdam and a presence on Nasdaq, Nebius boasts a global reach with R&D centers across Europe, North America, and Israel. Our diverse team of over 1,400 employees includes more than 400 highly skilled engineers with profound expertise in both hardware and software engineering, in addition to an in-house AI R&D team.
Role Summary
The Operations Specialist will oversee and streamline the entire expense, payment, and documentation processes, ensuring that all company purchases are processed accurately, promptly, and in compliance with regulations. This role will serve as a vital operational partner across various teams including finance, procurement, legal, compliance, accounting, treasury, and logistics. A keen attention to detail, the capability to work autonomously within established frameworks, and a proactive attitude towards enhancing operational workflows are essential for success in this role.
Main Responsibilities
• Manage document processes involving both internal teams and external contractors, ensuring precise creation, version control, and secure storage.
• Coordinate agreements, reconciliations, reports, and other operational documents for both internal and external partners.
• Process reports and confirm they meet internal standards for completeness, accuracy, and formatting.
• Collaborate closely with Accounts Payable teams to facilitate transaction processing and resolve discrepancies.
• Engage with cross-functional teams including Legal and Accounting to address operational challenges and ensure efficient workflows.
• Provide guidance to internal stakeholders on transaction processing requirements, documentation standards, and operational procedures.
• Participate in special projects and initiatives aimed at operational improvement as required.

Apr 23, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Israel; Remote - Europe

Why choose Nebius?
Nebius is at the forefront of revolutionizing cloud computing to empower the global AI economy. We develop essential tools and resources that enable our clients to tackle real-world problems and innovate across industries, all without incurring substantial infrastructure costs or the necessity of assembling large in-house AI/ML teams. Our team operates at the cutting edge of AI cloud infrastructure, collaborating with some of the most experienced and innovative leaders and engineers in the industry.
Our Work Environment
With our headquarters in Amsterdam and a presence on Nasdaq, Nebius boasts a global footprint with R&D hubs across Europe, North America, and Israel. Our workforce of over 1400 includes more than 400 expert engineers with extensive experience in hardware and software engineering, alongside a dedicated in-house AI R&D team.
The Role
Your responsibilities will include:
• Ensuring fault tolerance, scalability, and uninterrupted operations for our services.
• Utilizing cutting-edge cloud technology to address various infrastructure challenges.
• Implementing and enhancing CI/CD processes.
We expect you to have:
• Strong experience with programming languages such as Go, Python, or C++.
• A solid understanding of classic algorithms and data structures.
• Commercial experience with and a deep understanding of Unix systems and networking technologies.
• Experience with containerization and configuration management tools like Ansible, Salt, Terraform, Docker, Kubernetes, and Helm.
Bonus points for:
• A keen interest in backend development.
• Experience in designing, developing, and managing high-load distributed systems.
• Commercial experience across various cloud platforms.
Coding interviews are part of our hiring process.
What we offer:
• A competitive salary and a comprehensive benefits package.
• Opportunities for professional advancement within Nebius.
• Flexible working arrangements.
• A dynamic, collaborative work environment that fosters initiative and innovation.

Apr 23, 2026
Nebius
Full-time|On-site|Amsterdam, Netherlands

Why Choose a Career with Nebius?
Nebius is at the forefront of the cloud computing revolution, dedicated to empowering the global AI economy. We equip our clients with innovative tools and resources to tackle real-world challenges and revolutionize industries, all while minimizing infrastructure expenses and the need for extensive in-house AI/ML teams. Our team members work at the cutting edge of AI cloud infrastructure alongside some of the most experienced and creative leaders and engineers in the industry.
Our Work Environment
Headquartered in Amsterdam and publicly listed on Nasdaq, Nebius has a worldwide presence with R&D centers across Europe, North America, and Israel. Our diverse team of over 1,400 professionals includes more than 400 highly skilled engineers with expertise in both hardware and software engineering, supported by an in-house AI R&D division.
The Opportunity
We invite you to join our dynamic legal team and take a key role in shaping the future of AI. We are on the lookout for a Legal Counsel who excels in a collaborative, fast-paced environment, bringing extensive expertise in technology-centric commercial agreements. In this position, you will act as a trusted business partner to various cross-functional teams, including product, sales, and engineering, while independently managing complex negotiations and devising creative, pragmatic solutions to legal and commercial challenges. We appreciate strategic thinkers who take a holistic approach to problems and adeptly balance risk management with driving business success.
Your Key Responsibilities Include:
• Drafting, reviewing, and negotiating a diverse range of complex commercial IT agreements, including SaaS, IaaS, and HWaaS contracts.
• Leading contract negotiations independently with customers, vendors, and strategic partners.
• Providing proactive legal advice and support to cross-functional teams to facilitate business objectives.
• Ensuring compliance with applicable laws and regulations.
• Staying up-to-date with industry trends and developments to inform legal strategies.

Apr 23, 2026
airapps
Full-time|On-site|Amsterdam

airapps is seeking a Site Reliability Engineer (SRE) based in Amsterdam. This position centers on maintaining the reliability, scalability, and performance of core systems.
Role overview
The SRE works alongside both development and operations teams. The main focus is to keep infrastructure running smoothly and to improve service quality for users.
What you will do
• Monitor and support system reliability and uptime
• Collaborate with developers and operations staff to optimize infrastructure
• Contribute to enhancing the overall user experience by ensuring stable services
Location
This role is based in Amsterdam.

Apr 28, 2026
pinely
Full-time|On-site|Amsterdam, North Holland, Netherlands

Join pinely as we expand our innovative team! We are seeking a dedicated Site Reliability Engineer who thrives in a dynamic environment.
Key Responsibilities:
• Deploy, configure, and manage Linux-based servers efficiently.
• Diagnose and resolve hardware and network availability issues while monitoring for failures.
• Oversee numerous nodes across various remote sites and cloud infrastructures.
• Contribute to infrastructure automation initiatives using Python and/or Go.
• Engage with cloud platforms including AWS, Google Cloud, and Alibaba Cloud.
• Enhance monitoring systems for production trading environments utilizing Grafana.
Required Qualifications:
• A minimum of 3 years of experience in managing and troubleshooting high-load systems.
• Strong grasp of the Linux TCP/IP stack.
• Familiarity with essential network components such as DHCP, DNS, and BGP.
• Proficiency in at least one configuration management tool (e.g., Salt, Ansible).
• Extensive knowledge of infrastructure monitoring tools, including Prometheus and Grafana.
• Fluent in English (B2/Upper-Intermediate or above).
• Basic skills in Python/Bash/Go.
• Willingness to travel for work-related tasks.
Preferred Qualifications:
• Familiarity with leading server hardware brands.
• Experience optimizing hardware and OS configurations for peak performance.
What We Offer:
• Competitive salary and comprehensive social benefits.
• Attractive bonus structure with flexibility in salary negotiations.
• Opportunity to work with unique networks such as radio relay, shortwave, FPGA cards, and atomic clocks, including server optimization on overclocked systems.
• Access to cutting-edge technologies and a supportive environment for implementing innovative solutions.
• Flexible working conditions, minimizing bureaucracy and promoting autonomy.
• Tuition reimbursement and sponsorship for conferences and training.

Feb 25, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Remote - Europe

Why Join Nebius?
Nebius is at the forefront of a transformative era in cloud computing, catering to the global AI economy. We empower our clients with the tools and resources necessary to tackle real-world challenges and revolutionize industries without incurring hefty infrastructure costs or the necessity of extensive in-house AI/ML teams. Our team members are engaged in cutting-edge AI cloud infrastructure projects, collaborating with some of the most seasoned and innovative leaders and engineers in the industry.
Work Environment
Headquartered in Amsterdam and publicly traded on Nasdaq, Nebius boasts a worldwide presence with R&D centers across Europe, North America, and Israel. Our diverse workforce of over 1,400 employees includes more than 400 highly skilled engineers, proficient in both hardware and software engineering, complemented by a dedicated in-house AI R&D team.
The Role
We are seeking a Senior Network Software Engineer (NetSWE) who will engineer software to enhance the safety, scalability, and reliability of network operations, particularly as we accelerate our data center launches. This position is more than just scripting for configurations; you will develop the tools and services that connect the core network components (switches/ports/VLANs, traffic processors) with our cloud platform, utilizing open-source solutions where applicable and creating custom solutions where necessary.
Your responsibilities will include:
• Designing and maintaining services and tools that automate the entire network lifecycle, including day-0 provisioning, ongoing changes, drift detection, and operational verification.
• Ensuring network changes are safe and transparent through CI/CD workflows, diff/review tools, staged rollouts/rollbacks, audit trails, and protective measures.
• Creating observability systems that scale seamlessly across various locations, focusing on telemetry pipelines, signal quality, and tools that expedite incident investigations.

Apr 30, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Remote - Europe

Why Join Nebius?
Nebius is at the forefront of a transformative era in cloud computing, empowering the global AI economy. We provide innovative tools and resources that enable our customers to tackle real-world challenges and revolutionize their industries, all while minimizing infrastructure costs and the necessity for extensive in-house AI/ML teams. Our talented employees are engaged in pioneering work in AI cloud infrastructure, collaborating with some of the most experienced and creative leaders and engineers in the industry.
Our Work Environment
With our headquarters in Amsterdam and listed on Nasdaq, Nebius boasts a global presence with R&D hubs across Europe, North America, and Israel. Our diverse team of over 1,400 employees includes more than 400 highly skilled engineers with deep expertise in both hardware and software engineering, complemented by a dedicated in-house AI R&D team.
The Role
We are seeking a Senior Hypervisor Engineer to play a pivotal role in the development of our hyperscaler platform. The Hypervisor team is responsible for advancing the components of our Cloud platform that directly interface with the KVM hypervisor and QEMU device emulator. We focus on the intricate details of hardware virtualization and device emulation, ensuring optimal performance and robust protection against untrusted code. You will collaborate closely with the open-source community to implement significant enhancements to the QEMU/KVM virtualization stack.
In this role, your key responsibilities will include:
• Optimizing I/O for emulated disk and network devices
• Integrating the hypervisor with other platform services and storage solutions for user data
• Efficiently allocating resources among virtual machines
• Enhancing support for guest systems
• Pushing the boundaries of open-source virtualization

Apr 23, 2026
Nebius
Full-time|On-site|Amsterdam, Netherlands

Join Nebius - Pioneers in Cloud Computing
Nebius is at the forefront of a revolutionary shift in cloud computing, designed to support the burgeoning global AI economy. We develop innovative tools and resources that empower our customers to tackle real-world challenges while minimizing infrastructure costs and eliminating the necessity for large in-house AI/ML teams. Our workforce operates at the leading edge of AI cloud infrastructure, collaborating with some of the most seasoned and inventive leaders and engineers in the industry.
Our Work Environment
Located in Amsterdam and publicly traded on Nasdaq, Nebius boasts a worldwide presence with R&D centers throughout Europe, North America, and Israel. Our team consists of over 1,400 professionals, including more than 400 highly skilled engineers with extensive expertise in hardware and software engineering, complemented by a dedicated AI R&D team.
The Role
We are forming a global L3 Support Line from the ground up to manage the highest level of technical escalation for server and rack infrastructure across Europe and the US. Positioned at the crossroads of data center operations, R&D engineering, and ODM partnerships, this team will take full ownership of intricate server and firmware incidents, driving root-cause resolution and transforming recurring issues into scalable architectural advancements.
You will lead a team of approximately 10 L3 engineers based in Europe (Amsterdam HQ and additional data center locations), closely collaborating with the regional L3 Lead to ensure 24/7 global support coverage.
In this capacity, you will serve as the Incident Commander for high-severity production incidents, establish formal problem management practices, and design enterprise-level support frameworks for our contracted bare-metal clients, including two major FAANG companies at launch.
This managerial position entails significant technical responsibility: you will oversee people and processes while retaining the expertise to conduct advanced investigations into Linux, hardware, and firmware when L2 support reaches its limits.
We welcome you to work from our office in Amsterdam, the Netherlands.

Apr 23, 2026
Nebius
Full-time|Remote|Amsterdam, Netherlands; Berlin, Germany; London, United Kingdom; Prague, Czech Republic; Remote - Europe; Remote - United States; United States

Why join Nebius?
Nebius is at the forefront of a revolutionary shift in cloud computing, dedicated to empowering the global AI economy. We provide innovative tools and resources that enable our clients to tackle real-world challenges and revolutionize their industries without incurring substantial infrastructure costs or the necessity of assembling extensive in-house AI/ML teams. Our workforce operates on the cutting edge of AI cloud infrastructure, collaborating with some of the most seasoned and creative leaders and engineers in the industry.
Our Work Environment
Headquartered in Amsterdam and publicly traded on Nasdaq, Nebius boasts a worldwide presence with R&D centers across Europe, North America, and Israel. Our team consists of over 1,400 professionals, including more than 400 highly skilled engineers with profound expertise in both hardware and software engineering, complemented by an in-house AI R&D team.
As part of Nebius Cloud, one of the largest GPU clouds globally, the Token Factory team operates tens of thousands of GPUs. We are developing an inference platform designed to deploy a variety of foundation models (including text, vision, audio, and cutting-edge multimodal architectures) quickly, dependably, and effortlessly at scale. To achieve this goal, we are seeking an engineer capable of ensuring the platform operates flawlessly under heavy loads and can recover seamlessly from unexpected issues.
In this position, you will take ownership of the reliability, performance, and observability of the complete inference stack. Your day may start with designing and refining telemetry pipelines, turning hundreds of terabytes of signals into actionable insights through metrics, logs, and traces. You might also optimize Kubernetes autoscalers for enhanced GPU efficiency, create Terraform modules that incorporate resilience into every new cluster, or strengthen our request-routing and retry logic to ensure that transient failures remain unnoticed by users. When incidents occur, you will utilize the automation and runbooks you’ve developed to swiftly detect, isolate, and address issues, while fostering a post-mortem culture to prevent future occurrences. All these efforts are directed towards a singular objective: achieving smooth platform scaling while meeting rigorous cost and reliability targets.
Success in this role requires a deep understanding of Kubernetes, Prometheus, Grafana, Terraform, and the principles of infrastructure-as-code. You should be comfortable scripting in Python or Bash, grasp the intricacies of alert design and SLOs for high-throughput APIs, and have enough production experience to recognize how distributed back-ends can fail in real-world scenarios. Experience managing GPU-intensive workloads (whether with vLLM, Triton, Ray, or a similar accelerator stack) will be advantageous, as will a background in MLOps or model-hosting platforms.

Apr 23, 2026
Jump Trading
Full-time|On-site|Amsterdam

Join Jump Trading as a Site Reliability Engineer in our Trading Operations team. In this pivotal role, you will ensure the reliability and performance of our trading systems, utilizing your expertise to implement best practices in system design and operations.

Your responsibilities will include monitoring system performance, troubleshooting issues, and collaborating with software engineers to improve system architecture. Your contributions will play a critical role in maintaining our competitive edge in the trading industry.

Mar 30, 2026
Nebius
Full-time|On-site|Amsterdam, Netherlands

Why join Nebius?
Nebius is at the forefront of a transformative era in cloud computing, dedicated to empowering the global AI economy. We develop innovative tools and resources that enable our customers to tackle genuine challenges and revolutionize industries without incurring hefty infrastructure costs or the need for extensive in-house AI/ML teams. Our team operates on the cutting edge of AI cloud infrastructure, working alongside some of the most seasoned and imaginative leaders and engineers in the industry.

Our Work Environment
Based in Amsterdam and publicly traded on Nasdaq, Nebius boasts a global presence with research and development hubs in Europe, North America, and Israel. Our workforce comprises over 1,400 employees, including more than 400 highly skilled engineers specializing in hardware and software engineering, complemented by an in-house AI R&D team.

The Opportunity
We are seeking a talented Senior Frontend Developer to architect and implement internal user interfaces leveraging React and TypeScript. These user interfaces facilitate automation workflows that empower teams to effectively manage our company's hardware infrastructure: tracking assets, orchestrating data center deployments, provisioning and configuring servers, installing operating systems and drivers, and troubleshooting issues across distributed environments. You will play a crucial role throughout the product lifecycle, from concept and UX designs to production deployment in containerized environments (Docker, Kubernetes).

Apr 23, 2026
Nebius
Full-time|On-site|Amsterdam, Netherlands; Berlin, Germany; London, United Kingdom; Prague, Czech Republic

Join Nebius and Shape the Future of AI Cloud Networking
Nebius is at the forefront of cloud computing, pioneering solutions that empower the global AI economy. We provide our customers with tools and resources to tackle real-world challenges and revolutionize industries without incurring hefty infrastructure expenses or the necessity of assembling large internal AI/ML teams. Our workforce operates on the cutting edge of AI cloud infrastructure, collaborating with some of the most seasoned and innovative leaders and engineers in the industry.

Your Workplace Awaits You
With our headquarters situated in Amsterdam and publicly traded on Nasdaq, Nebius boasts a global reach with R&D hubs across Europe, North America, and Israel. Our team of over 1,400 professionals includes more than 400 highly proficient engineers with extensive expertise in both hardware and software engineering, complemented by an in-house AI R&D team.

The Opportunity
Nebius is in search of a Technical Product Manager – AI Cloud Networking to join our dynamic team. In this pivotal role, you will take ownership of the vision, roadmap, and priorities for our networking services, which encompass overlay (VPC) networks, underlay networks (data center fabric and WAN), and DNS.

You will also be instrumental in shaping and managing backlogs for networking service teams, leading critical company-wide initiatives that enhance connectivity. This position demands a robust technical foundation along with the capability to coordinate seamlessly across engineering, development, product, technical support, and go-to-market teams.

Your Key Responsibilities Will Include:
Ownership and management of the product backlog for network service teams.
Leading and coordinating essential cross-company implementations related to networking and connectivity.
Collaborating closely with engineering and architecture teams to define product requirements and deliver innovative networking features.
Partnering with product marketing and technical pre-sales/post-sales teams on technical publications, go-to-market strategies, customer engagement, acquisition, and retention initiatives pertinent to networking features.
Ensuring the provision of networking services that uphold high standards of performance, security, scalability, and reliability.

Apr 23, 2026
Nebius
Full-time|On-site|Amsterdam, Netherlands

Why Choose Nebius?
Nebius is at the forefront of a transformative wave in cloud computing, designed to empower the global AI economy. We provide innovative tools and resources that enable our clients to tackle real-world challenges and revolutionize industries without incurring hefty infrastructure expenses or the necessity of assembling large in-house AI/ML teams. Our team operates at the cutting edge of AI cloud infrastructure, collaborating with some of the most innovative leaders and engineers in the industry.

Our Workspace
Headquartered in Amsterdam and publicly traded on Nasdaq, Nebius boasts a global presence, with R&D hubs across Europe, North America, and Israel. Our workforce of over 1,400 includes more than 400 highly skilled engineers with extensive expertise in both hardware and software engineering, along with a dedicated in-house AI R&D team.

About the Role:
As part of Nebius Cloud, one of the largest GPU clouds globally, Token Factory is developing an inference platform designed to facilitate the rapid, reliable, and effortless deployment of a wide array of foundation models (spanning text, vision, audio, and emerging multimodal architectures) on a massive scale.

Apr 23, 2026
