Qualifications
Key Responsibilities:
• Design, develop, and maintain scalable and resilient infrastructure solutions to enhance our expanding platform and customer base.
• Collaborate closely with software engineers to optimize application performance and reliability.
• Implement robust monitoring, alerting, and logging systems to proactively identify and resolve issues.
• Automate deployment processes and refine infrastructure management using contemporary DevOps tools and methodologies.
• Advance our backend architecture and infrastructure for both cloud and on-premise deployments.
• Work collaboratively with the team to set and prioritize our roadmap to maximize customer impact.
• Lead initiatives aimed at enhancing infrastructure reliability, performance, and cost efficiency.
Required Skills:
• 3+ years of experience in distributed systems with a focus on designing and managing cloud-based environments (e.g., AWS, Azure, GCP).
• Practical experience with containerization technologies (e.g., Docker, Kubernetes) and orchestration platforms.
• A strong passion for constructing and operating developer productivity tools, frameworks, and other facets of platform engineering.
• Familiarity with CI/CD pipelines and version control systems.
• Understanding of automated testing best practices and frameworks to ensure software reliability through integration and performance testing.
About the job
At Sift, we are revolutionizing the way cutting-edge machines are constructed, tested, and managed. Our innovative platform provides engineers with real-time visibility into high-frequency telemetry, effectively removing bottlenecks and facilitating quicker, more dependable development.
Sift originated from our experience at SpaceX, contributing to projects like Dragon, Falcon, Starlink, and Starship, where the demands of scaling telemetry, debugging flight systems, and ensuring mission reliability necessitated a new kind of infrastructure. Founded by a talented team from SpaceX, Google, and Palantir, Sift is tailored for mission-critical systems where precision and scalability are imperative.
As one of the pioneering engineers at Sift, your role will extend beyond just coding: you will play a crucial part in defining the architecture, shaping the product, and influencing the culture of a company dedicated to addressing real engineering challenges. If you're eager to take on intricate technical obstacles and build foundational systems that support complex machines from the ground up, we would love to connect with you.
About Sift
Sift is at the forefront of innovation in machine infrastructure, born from the expertise of engineers who have worked on groundbreaking projects at SpaceX. Our team consists of professionals from industry-leading companies like Google and Palantir, focused on creating scalable solutions for mission-critical applications.
Full-time|$140K/yr - $140K/yr|On-site|San Mateo, California, United States
Skydio builds autonomous drones for a wide range of users, from utility inspectors and first responders to military personnel in the field. Based in San Mateo, California, Skydio combines artificial intelligence expertise with advanced hardware and software development, always focused on customer needs. About the Cloud Infrastructure Team The Cloud infrastru…
GitLab seeks a Staff Infrastructure Security Engineer to focus on safeguarding infrastructure across the Asia-Pacific (APAC) and Europe, Middle East, and Africa (EMEA) regions. This position is fully remote and supports a distributed team working across multiple time zones.
Role focus
This role centers on improving GitLab's infrastructure security. The engineer will help identify and address risks, working to strengthen the company's overall security posture.
Collaboration
Collaboration with team members across APAC and EMEA is essential. The position supports a remote, globally distributed team and contributes to shared security goals.
Full-time|$120K/yr - $185K/yr|Remote|Boston or Remote
Infrastructure Software Engineer
AcuityMD is at the forefront of revolutionizing access to medical technologies through our innovative software and data platform. We empower MedTech companies to gain insights into product usage, customer variability, and opportunities to enhance patient care. With approximately 6,000 new medical devices approved by the FDA each year, our platform accelerates the journey from product development to physician access, ultimately improving patient outcomes. Backed by prominent investors including Benchmark, Redpoint, ICONIQ Growth, and Ajax Health, we are a rapidly scaling SaaS organization.
As an Infrastructure Software Engineer, you will collaborate closely with various teams across Engineering and Production. Your role will involve designing, building, and maintaining core platform services (compute, networking, storage, CI/CD, and developer tooling) that drive our applications from start to finish. You will enhance reliability, security, and efficiency, while fostering an exceptional developer experience. Additionally, you will contribute to our strategic objectives as we advance our infrastructure and practices to maximize both internal and customer impact.
Team Mission
Our Platform Team serves as the backbone for the organization, ensuring that product teams can deliver swiftly and safely. We create direct customer value through our cloud capabilities and security measures while supporting internal success through partnerships with application teams, enabling a superior development experience.
Responsibilities
• Steward core platform services: Implement scalable container orchestration, service mesh, ingress, and secrets management.
• Cross-functional partnership: Collaborate with Product, Engineering, Data, and Security to drive external and internal value.
• Harden reliability: Enhance observability through logging, metrics, and tracing, along with automated remediation to boost availability and reduce latency.
• Automate everything: Utilize infrastructure-as-code and configuration management to ensure systems and processes are repeatable, auditable, and secure.
• Scale cost-effectively: Optimize cluster utilization and autoscaling, balancing performance, reliability, and costs.
• Level-up developer experience: Develop internal tooling, templates, and best practices that minimize cognitive load and expedite time-to-deploy for product teams.
• On-call & incident response: Engage in a sustainable on-call rotation, lead post-mortems, minimize repetitive tasks, and reduce mean time to recovery through automation.
• Enable fast, safe delivery: Enhance CI/CD pipelines to facilitate swift and secure software releases.
About the Role
Join our pioneering team at vooma as a Backend & Infrastructure Software Engineer, where you will play a critical role in shaping the technical infrastructure of a transformative company.
If you are passionate about creating not only resilient systems but also the foundational architecture of a groundbreaking enterprise from the outset, this position is ideal for you.
We are looking for someone who excels at crafting infrastructure that is elegant, dependable, and secure, even under high-demand scenarios. You thrive on the challenge of scaling systems that enable intelligent agents and take pride in establishing reliable foundations that others can rely on.
Your Key Responsibilities Include:
• Design and maintain secure, scalable infrastructure tailored for AI-powered agents in production environments.
• Deploy and optimize AI-driven services to meet high availability and performance standards.
• Manage infrastructure as code, alongside cloud environments and CI/CD pipelines.
• Implement monitoring, observability, and alerting systems to ensure the reliability of our infrastructure.
• Contribute to infrastructure security and adhere to best practices.
You Should Have:
• Experience in deploying and productionizing machine learning or AI-centric workloads.
• Proficiency in developing secure, scalable infrastructures on platforms such as AWS, Azure, or GCP.
• In-depth knowledge of backend systems, networking, and container orchestration technologies (e.g., Kubernetes).
• Understanding of infrastructure security principles and compliance standards (e.g., SOC2).
• A proactive and hands-on mindset, with a strong drive to solve challenges from start to finish.
Role Overview
Roku, Inc. is hiring a Security Software Engineer in Austin, Texas. This position focuses on designing and building secure software solutions to safeguard both users and the Roku platform.
What You Will Do
• Create and implement software with security as a core requirement
• Work with teams across engineering, product, and operations to strengthen security protocols
• Help shape and improve the security architecture for Roku’s streaming products
Full-time|$300K/yr - $300K/yr|On-site|San Francisco
ABOUT BASETEN
Join Baseten, where we drive mission-critical AI inference for leading companies like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer. Our unique blend of applied AI research, robust infrastructure, and intuitive developer tools empowers organizations at the forefront of AI innovation to deploy state-of-the-art models into production. Recently, we secured a $300M Series E funding round, backed by esteemed investors such as BOND, IVP, Spark Capital, Greylock, and Conviction. Be a part of our rapid growth and help shape the platform that engineers trust for launching AI products.
THE ROLE
As an Infrastructure Software Engineer at Baseten, you will play a pivotal role in developing and maintaining our ML inference platform that powers AI applications in production. Your contributions will enhance the core infrastructure, enabling developers to deploy, scale, and monitor machine learning models with exceptional performance.
EXAMPLE INITIATIVES
You will engage in innovative projects within our Infrastructure team, including:
• Multi-cloud capacity management
• Inference on B200 GPUs
• Multi-node inference
• Fractional H100 GPUs for efficient model serving
RESPONSIBILITIES
• Design and develop infrastructure components for our ML inference platform, primarily using Python and Go.
• Implement and maintain Kubernetes deployments for optimal model serving.
• Contribute to the orchestration layer for model deployments.
• Build and enhance monitoring systems to track model performance metrics effectively.
• Develop efficient resource management solutions to optimize performance.
Full-time|$150K/yr - $200K/yr|On-site|San Francisco, CA
At Sift, we are revolutionizing the way cutting-edge machines are constructed, tested, and managed. Our innovative platform provides engineers with real-time visibility into high-frequency telemetry, effectively removing bottlenecks and facilitating quicker, more dependable development.
Sift originated from our experience at SpaceX, contributing to projects like Dragon, Falcon, Starlink, and Starship, where the demands of scaling telemetry, debugging flight systems, and ensuring mission reliability necessitated a new kind of infrastructure. Founded by a talented team from SpaceX, Google, and Palantir, Sift is tailored for mission-critical systems where precision and scalability are imperative.
As one of the pioneering engineers at Sift, your role will extend beyond just coding: you will play a crucial part in defining the architecture, shaping the product, and influencing the culture of a company dedicated to addressing real engineering challenges. If you're eager to take on intricate technical obstacles and build foundational systems that support complex machines from the ground up, we would love to connect with you.
Full-time|$170K/yr - $170K/yr|On-site|San Mateo, California, United States
Skydio, a premier drone manufacturer based in the United States, stands at the forefront of autonomous flight technology, paving the way for the future of drones and aerial mobility. Our diverse team merges profound expertise in artificial intelligence with top-tier hardware and software development, operational excellence, and a relentless focus on customer satisfaction. We empower a wide array of drone users, from utility inspectors to first responders and military personnel, to leverage our cutting-edge technology in various scenarios.
About the Team:
The Skydio Cloud Infrastructure team is dedicated to ensuring the Skydio Cloud platform is consistently available to our users at critical moments, whether conducting routine inspections or supporting rescue missions during emergencies. With a global fleet of thousands of drones, we are committed to continuous improvement, emphasizing robust delivery and testing pipelines as vital components of our operations.
About the Role:
As a Senior Infrastructure Engineer focused on an innovative product, you will play a pivotal role in maintaining our Kubernetes fleet and enhancing the core product software to meet evolving use cases. This position blends software engineering and infrastructure management, allowing you to address product deficiencies directly rather than solely relying on automation. We seek a professional who thrives on the autonomy to influence architecture, security, and functionality across the entire stack.
Your Impact:
• Re-engineer and sustain the expanding requirements of our Kubernetes fleet and its underlying infrastructure.
• Enhance and broaden the continuous delivery processes for our product.
• Collaborate across teams (hardware to cloud) to introduce new capabilities to the platform.
• Engage directly with security teams to refine practices and controls that safeguard our customers' data and drones.
• Lead cost-saving initiatives early in the product lifecycle to ensure scalability.
About XBOW
Join XBOW in shaping the future of offensive security. In a world where attackers leverage AI to outpace defenders, we are developing a pioneering platform that places security at the forefront. Our AI-driven system autonomously identifies, validates, and even exploits vulnerabilities, providing organizations with evidence-backed results in mere hours instead of weeks.
Founded by Oege de Moor, the visionary behind GitHub Copilot, and supported by prestigious investors like Sequoia and Altimeter, XBOW is applying cutting-edge AI technology to tackle one of the globe's most pressing challenges. Within just over a year, our exceptional AI team and renowned security researchers have unveiled thousands of real-world zero-days across critical software, achieving the top spot on HackerOne’s global leaderboard.
We're a collaborative team of builders, hackers, and researchers who thrive on tackling challenges others deem insurmountable. If you're eager to push the limits of AI, transform security practices, and be a part of the movement defining this new era of defense, we invite you to connect with us.
Your Role: Software Engineer - Platform / Core Infrastructure
We are in search of a passionate Software Engineer - Platform / Core Infrastructure dedicated to constructing scalable systems while navigating complex, ambiguous challenges. In this pivotal role, you will design and implement the sophisticated distributed infrastructure that underpins our core AI engine and distributed analysis systems, ensuring XBOW operates seamlessly across various cloud platforms (AWS, Azure, OCI, etc.) and deployment contexts (SaaS, on-premises).
This position is perfect for individuals who view infrastructure as a product, appreciate clean abstractions, and possess the skills to address performance issues at multiple layers. You will be part of a high-trust, high-velocity team where your contributions will have an immediate impact on developer experience and product efficacy. If you enjoy working at the crossroads of advanced technology and real-world impact, you will thrive in our environment.
What You Will Do
• Design and implement reliable and secure infrastructure systems capable of deployment across multiple cloud environments (AWS, Azure, OCI, etc.) and contexts (SaaS, on-premises).
• Optimize cloud services across compute, storage, networking, and observability to enhance performance, reliability, and maintainability of core services.
• Develop core services using TypeScript, Kotlin, and Go (with a willingness to learn quickly if you haven't used these languages before) to meet our unique deployment and infrastructure needs.
• Support large-scale systems with a focus on performance and scalability.
Join Ivo's Engineering Team!
At Ivo, we are pioneers in the tech industry. Our engineers are innovators who have created groundbreaking solutions such as:
• An AI agent that seamlessly integrates with MS Word to enhance document editing [2023]
• Revolutionizing embedding models with agentic RAG technology [2023]
• Advanced LLM-based legal fact extraction capabilities [2024]
• A legal assistant designed to search extensive contract databases without compromising accuracy [2024]
• Clustering legal documents from the same lineage [2025]
• Automatic deviation analysis to uncover hidden risks in vast contract databases [2025]
• Merging contracts with their amendments to create a “composite” contract timeline that has moved our clients to tears [2025]
Role Overview
As an Infrastructure Engineer at Ivo, you will lay the groundwork for our platform's future. Your responsibilities will include:
• Designing and owning the future of our infrastructure, allowing you the freedom to innovate.
• Managing multiple customer deployments, ensuring each receives tailored containers, databases, and VPCs.
• Instrumenting our systems to identify performance bottlenecks and errors.
• Aggregating metrics and logs into visually appealing dashboards and setting up pager alerts.
• Leading infrastructure-related incidents and being on-call as necessary.
• Enhancing our CI/CD system to reduce deployment time from ~12 minutes.
If you're passionate about LLMs, you'll thrive in our engineering team, where you’ll have the opportunity to:
• Develop real-time LLM evaluations to monitor the accuracy of our responses.
• Collaborate with talented engineers to push the boundaries of DevOps.
About Suno
Suno is an innovative music company dedicated to expanding creative possibilities. Utilizing the most advanced AI music technology available, we provide an extraordinary creative platform that includes Suno Studio, a pioneering generative audio workstation. Whether you're a casual singer, an aspiring songwriter, or a seasoned musician, Suno empowers a diverse community to create, share, and explore music, making the joy of musical expression accessible to everyone.
About the Role
We are seeking talented mid-level infrastructure engineers to join our dynamic engineering team. In this role, you will collaborate with seasoned engineers to design, develop, and maintain Suno's robust technical infrastructure and innovative products. This is a unique opportunity to contribute significantly to our platform while honing your skills in a fast-paced, music-centric environment.
What You’ll Do
• Design and develop services capable of managing extensive consumer traffic, data, and usage.
• Create systems that prioritize performance, security, scalability, and observability.
• Exemplify operational and software engineering excellence through your work.
What You’ll Need
• 3-5 years of experience in infrastructure engineering is preferred.
• Proficient with cloud services (AWS/GCP), Kubernetes, Docker, and infrastructure as code (Pulumi/Terraform/CDK).
• Experience in scaling infrastructure from inception to production.
• Solid understanding of Postgres and distributed relational databases; experience with large-scale database hosting is advantageous.
• Strong backend development skills for optimizing application and service code; familiarity with websockets, CDNs, and streaming traffic patterns is a plus.
• In-depth knowledge of security best practices for building and scaling infrastructure.
• Experience in MLOps, large-scale inference, and ML data pipelines is a plus.
• A passion for engineering excellence, rapid iteration, and continuous learning.
• Technical leadership or management experience is a bonus.
• A genuine love for music (listening, exploring, creating) is a significant advantage.
• A Bachelor’s degree or equivalent experience is required.
Additional Notes: Applicants must be ...
Your Contribution
Become an integral part of a dynamic team dedicated to developing the next generation of cybersecurity solutions from the ground up. Work alongside industry experts with a proven history of innovation as you design, construct, and launch groundbreaking products that will make a significant impact in the field. This role offers you the chance to enhance your career and skills as part of a world-class organization from the very outset.
Job Responsibilities
You will play a pivotal role in architecting and implementing the platform layer, from the Bootloader to system software, for a large-scale embedded system. This encompasses image and software lifecycle management, including packaging, upgrades, high availability, and telemetry/debug infrastructure. You will have the chance to design and implement this system from the ground up.
Full-time|$138K/yr - $200K/yr|On-site|Austin, Texas, United States; Fremont, California, United States
About Neuralink:
At Neuralink, we are pioneering groundbreaking devices that establish a bi-directional interface with the brain. Our innovative technology aims to restore movement in paralyzed individuals, provide sight to the visually impaired, and transform the way humans engage with their digital environments.
Team Overview:
The Infrastructure Team is the backbone of our operations, ensuring the company functions efficiently and safely at an unprecedented pace. Our diverse infrastructure encompasses both cloud-based and on-premises systems, catering to a wide range of users from highly skilled engineers to non-technical scientists and medical professionals, all of whom require reliable systems, robust networking, and resilient software to perform their roles effectively. This position demands close collaboration with various teams across the organization, covering all aspects of the work environment stack, from deploying physical hardware on manufacturing lines to developing custom tools for streaming neural recordings from implants.
Full-time|$110K/yr - $160K/yr|On-site|Santa Clara, CA
Join us in revolutionizing the future of energy! At Oklo, we are at the forefront of developing cutting-edge nuclear reactors. By utilizing your software engineering expertise, you will collaborate with a diverse team of engineers to model, simulate, design, and deploy innovative fission power technologies.
This position focuses on refining our software development processes to ensure high-quality deliveries while adhering to compliance standards. You will play a pivotal role in supporting various internal teams by leveraging software and automation, enabling them to perform their tasks efficiently. As a dynamic startup, Oklo is expanding its capabilities, and you will have the chance to influence the engineering infrastructure's evolution. A background in nuclear energy is not a prerequisite; however, a strong desire to learn and explore is essential.
Astranis is seeking a talented and motivated Software Engineer to join our Infrastructure team. In this role, you will be at the forefront of developing and maintaining critical software systems that support our innovative satellite technology. You'll collaborate with cross-functional teams to design, implement, and optimize our infrastructure solutions, ensuring high reliability and performance.
Full-time|$240K/yr - $270K/yr|On-site|New York, United States
At Genius Sports, we combine cutting-edge technology with premier live data to revolutionize the sports experience for fans around the globe. Our mission is to create more immersive, interactive, and personalized experiences than ever before. Discover more about us at geniussports.com.
The Role - Staff Engineer - Infrastructure Platform
We are on the lookout for an exceptional Staff Engineer to spearhead critical projects within our core infrastructure platform. Genius Sports is currently integrating its diverse tech teams and acquisitions under a cohesive technical strategy, and our infrastructure platform is the foundation of this transformation. Our primary objective is to empower engineering teams to efficiently build, deploy, and manage Genius Sports’ extensive product catalog in a consistent manner. In this role, you will collaborate with fellow InfraPlat leaders to define and execute the technical vision and implementation across an array of projects. These initiatives encompass multi-account and region Kubernetes clusters, MLOps, standardized deployment processes, and a centralized authentication platform. You will also engage with stakeholders from product engineering teams to assess requests, identify common challenges, and prioritize initiatives.
RDQ226R609 - This position is open to candidates located anywhere in the United States. At Databricks, we are passionate about empowering data teams to tackle the world’s most challenging issues, from detecting security threats to advancing cancer drug development. We achieve this by building and operating an exceptional data and AI infrastructure platform, allowing our clients to concentrate on the critical challenges that define their missions. As a key member of the Security Continuous Monitoring team, you will be instrumental in developing and scaling Databricks Security systems on our platform. Your responsibilities will include designing, testing, and implementing data pipelines to evaluate the security configurations of Cloud, SaaS, and on-premise tools. You will create and deploy reliable supporting security tools for managing and assessing security posture, integrate with third-party applications, and engage with cloud APIs (AWS, Azure, GCP, Terraform). You will lead and oversee projects from conception to completion, facilitating data collection and integration with our vulnerability and threat detection initiatives. In this role, you will be an individual contributor on the Security Continuous Monitoring team, reporting directly to the Director of Continuous Monitoring.
Our Mission:
At Sunrise Robotics, we are committed to enhancing humanity through cutting-edge robotics technology. Our mission is to revolutionize the manufacturing sector by deploying intelligent, adaptable robots that augment human abilities and improve existing machinery, paving the way for a new era of manufacturing characterized by superior quality, reduced waste, and cost efficiency.
Our Vision:
We envision a future where every aspect of manufacturing, from design to assembly, is optimized through intelligent automation. Our goal is to integrate flexible robotic solutions, leveraging versatile hardware and sophisticated software/AI capabilities, into the manufacturing processes of small and medium-sized enterprises, making automation economically feasible and accessible for manufacturers of all sizes. We are not just creating robots; we are developing the essential components for the autonomous, intelligent agents of tomorrow.
The Role:
As an IT Infrastructure & Security Engineer at Sunrise Robotics, you will play a crucial role in ensuring the reliable deployment, updating, and support of our robotic systems at scale. With the growth of our deployments, consistency in environments, secure access control, and automated workflows are vital to mitigate engineering friction and reduce operational risks.
Your responsibilities will include maintaining the reliability of our internal systems and engineering infrastructure. By employing thorough troubleshooting, investigating incidents, documenting processes clearly, and communicating effectively, you will ensure our infrastructure remains robust. You will collaborate closely with our Engineering team using Linear as the ticketing and triage system to eliminate persistent infrastructure issues.
You will oversee deployment environments, cloud provisioning, endpoint configuration, networking, and security practices comprehensively. Your primary responsibility will be to guarantee our infrastructure is consistent, secure, and dependable, enabling the team to launch and deploy robots smoothly without unnecessary bottlenecks.
What You’ll Do:
• Automate and maintain deployment workflows to optimize robot cell rollouts, minimize manual setups, and resolve recurring operational challenges.
• Containerize and stabilize our robotics software stack to guarantee consistent, reproducible environments across development, cloud, and deployed production systems.
• Standardize and manage both Linux and Windows environments, establishing repeatable configuration and provisioning practices.
• Provision and maintain cloud infrastructure, ensuring secure access, permissions, and identity management (SSO, SSH), with clear accountability for reliability and recovery practices.
• Continuously enhance infrastructure security practices to safeguard our systems and data integrity.
Full-time|$200K/yr - $250K/yr|On-site|New York, NY
Join Fluidstack: Pioneering the Future of Intelligence
At Fluidstack, we're transforming the landscape of artificial intelligence infrastructure. Collaborating with leading AI research labs, government entities, and major corporations (including Mistral, Poolside, Black Forest Labs, and Meta), we are dedicated to delivering computing solutions at unprecedented speeds. Our mission is to expedite the realization of Artificial General Intelligence (AGI), and we are seeking passionate individuals who thrive on purpose and excellence.
We take immense pride in the systems we develop and the trust we build with our clients. If you are ready to roll up your sleeves and contribute to shaping the future of intelligence, we invite you to join our innovative team.
Position Overview
Fluidstack, a prominent player in the cloud services arena, is on the lookout for a Software Engineer specializing in Infrastructure Platform Development. In this role, you will be instrumental in constructing the foundational platforms that support our global infrastructure and data center operations. Your focus will be on developing robust internal tools across various domains, including Configuration Management Database (CMDB), asset management, Data Center Infrastructure Management (DCIM), monitoring, observability, security, and operational automation. Collaborating with cross-functional teams, you will craft scalable and user-friendly solutions that enhance our ability to provide top-tier infrastructure services.
Key Responsibilities
Infrastructure Platform Development
• Design and implement a next-generation CMDB system to serve as the definitive source of truth for infrastructure assets, network architecture, and configuration data.
• Develop DCIM platforms for managing rack operations, server/GPU deployments, operating system installations, quality assurance, and white-screen activities.
• Create comprehensive asset lifecycle management systems encompassing receiving, racking, inventory, break-fix, and decommissioning workflows.
• Build monitoring and observability platforms that integrate telemetry from Building Management Systems (BMS), Environmental Power Monitoring Systems (EPMS), and IT devices, featuring intelligent alerting and incident management capabilities.
• Develop self-service portals and automation tools for new region initialization, post-deployment operations, and fleet-scale management.
Operational Excellence & Automation
• Minimize manual tasks through workflow automation and self-service tools that empower our operations and engineering teams.
• Create workflow orchestration systems to streamline complex multi-step processes that encompass incident, problem, and change management.
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)
About the Role
This position is pivotal in overseeing infrastructure across our entire tech stack. If it exists in the cloud, it falls under your purview. In the world of robotics, data is essential, and we require robust, scalable infrastructure to manage, store, and process vast amounts of this data. The APIs, services, and monitoring systems you will manage are critical to our operations.
Your Responsibilities Include:
• Managing compute resources (both CPU and GPU) to efficiently process petabytes of data at high throughput.
• Overseeing the infrastructure required for data processing and storage.
• Ensuring the security and integrity of our infrastructure and data.
You Will Excel in This Role If You Have:
• A minimum of 5 years of experience in managing large-scale cloud infrastructure using tools such as Kubernetes and Terraform, with a primary focus on Python services.
• Deep understanding of AWS services (or their equivalents) and their permission models.
• Strong perspectives on the effective use of coding agents within an infrastructure context.