Qualifications
Desired Experience
Proven experience in designing and managing large-scale infrastructure, such as GPU clusters, extensive Kubernetes environments, or cloud-based batch job systems. A meticulous approach, consistently focused on reliability, observability, and optimization throughout the entire technology stack.
About the job
At Exa, we are on a mission to create a cutting-edge search engine from the ground up, designed to cater to the diverse needs of AI applications. Our team is building a robust infrastructure that enables us to crawl the internet, train advanced embedding models for indexing, and develop high-performance vector databases using Rust. Additionally, we manage a significant $5M H200 GPU cluster that powers tens of thousands of machines.
The Infrastructure Team at Exa is responsible for developing the essential tools and infrastructure that support our entire system. We are looking for talented infrastructure engineers to help us scale our capabilities rapidly. Your work could involve orchestrating GPU clusters with Kubernetes, implementing map-reduce batch jobs on Ray, or creating top-tier observability tools that set industry standards.
About Exa
Exa is dedicated to innovating the future of AI by developing an unparalleled search engine infrastructure that enhances performance and scalability. Our commitment to building a world-class engineering team is at the forefront of our endeavors.
Full-time|$99.5K/yr - $135K/yr|On-site|San Francisco, California, United States
Join BKF Engineers, a leading multi-service infrastructure consulting firm with over 110 years of dedication to civil engineering and surveying across California and the Pacific Northwest. We pride ourselves on fostering a culture that champions professional autonomy, innovation, and collaborative teamwork. With multiple offices throughout California and the…
Join KPFF Consulting Engineers as a Civil Engineering Project Manager and lead exciting infrastructure projects in the heart of San Francisco! In this role, you will oversee project planning, execution, and delivery while collaborating with a talented team of engineers and architects. Your responsibilities will include managing project schedules, budgets, and resources, ensuring compliance with regulatory requirements, and maintaining strong client relationships. This is a unique opportunity to contribute to innovative projects that shape urban environments.
The City and County of San Francisco seeks Civil Engineers for a range of citywide roles. These positions contribute to the planning, design, and construction of infrastructure projects aimed at improving safety and accessibility throughout San Francisco.
Key responsibilities
- Collaborate with teams from different disciplines on public infrastructure projects
- Conduct site assessments to guide project planning and design decisions
- Prepare engineering reports and maintain thorough documentation
- Verify that projects comply with all applicable regulations and standards
What we value
Candidates who are dedicated to public service and approach complex civic challenges with creative problem-solving skills will thrive in these roles. Experience working across disciplines and a commitment to making a positive impact on the community are important qualities for success.
Mach9’s Machine Learning Infrastructure Engineers create and maintain the backbone for production AI models used in civil engineering and surveying. The team manages a machine learning pipeline that processes over 10,000 miles of labeled survey data, supports image segmentation networks, and runs 3D prediction models. These systems deliver real-time inference capabilities directly to surveyors and engineers working in the field.
Role overview
This position is designed for mid-career engineers with a strong background in both training and inference aspects of machine learning infrastructure. The work involves handling large-scale data and ensuring reliable performance for demanding, real-world applications.
What you will do
- Build and improve training pipelines for deep transformer models using hundreds of terabytes of 3D point cloud and image data.
- Design and implement inference infrastructure to support both offline detection algorithms and responsive, real-time inference integrated with CAD software.
Location
Based in San Francisco.
The City and County of San Francisco seeks a Senior Civil Engineer to join the Citywide Operations team. This role leads a variety of civil engineering projects that support and improve urban infrastructure across San Francisco.
Key responsibilities
- Oversee civil engineering projects from initial planning through to completion
- Guide teams to ensure all work aligns with regulatory standards and city requirements
- Support ongoing efforts to maintain and enhance city infrastructure
- Use technical expertise to encourage sustainable development throughout the city
Role impact
This position helps shape San Francisco’s infrastructure and contributes to the community’s well-being by providing strong engineering leadership and thoughtful project management.
The City and County of San Francisco seeks a Junior Transportation Engineer to join its team. This entry-level role offers the chance to support a variety of transportation projects that shape urban mobility and city infrastructure. Junior engineers collaborate with experienced colleagues, gaining practical skills in transportation engineering while working on meaningful projects.
Key responsibilities
- Assist in developing transportation plans for city projects
- Support traffic studies and help analyze transportation data
- Contribute to ongoing initiatives that improve transportation throughout San Francisco
Who this role is for
This position is designed for recent engineering graduates or those starting out in transportation engineering. It provides hands-on learning and direct exposure to real-world projects in San Francisco.
About Our Team
The Infrastructure Engineering team operates within the IT department, dedicated to the reliable construction, deployment, and management of critical on-premises and hybrid environments that empower our internal services and vital research and development projects.
This newly established team is committed to implementing rigorous Site Reliability Engineering (SRE) practices in environments where uptime, safety, recoverability, and security are paramount. We aim to replace unique, one-off infrastructure with standardized infrastructure-as-code components that enhance reliability and operational efficiency as OpenAI continues to grow.
About This Role
We are in search of an Infrastructure Engineering Lead who will architect, build, and maintain reliable, secure, and scalable infrastructure that supports identity, access, endpoint, and shared platform services throughout the organization.
You will take full ownership of infrastructure and identity systems from conceptual design and provisioning to policy enforcement, upgrades, recovery, and ongoing operations. Your goal will be to develop robust, production-grade platforms that minimize operational hurdles, enforce security by default, and empower teams to work more effectively and confidently.
This position is ideal for a senior engineer who excels in navigating ambiguity, relishes the challenge of overseeing complex systems from start to finish, and enhances reliability and security by transforming fragile implementations into standardized, repeatable infrastructure.
This role is based at our San Francisco headquarters and requires in-office attendance.
Key Responsibilities:
- Define and refine infrastructure patterns for on-prem and hybrid environments, including self-hosted platforms, vendor-supported systems, and lab settings.
- Establish standardized, production-grade deployment and operational models that replace custom-built solutions.
- Collaborate with IT, Security, Identity, and Network teams to ensure infrastructure is designed to meet reliability, security, and access standards.
- Design and enhance the production architecture for Identity and Access Management (IAM) adjacent platforms, such as Microsoft Entra, utilizing SRE principles.
- Develop common management protocols and shared resources within Azure subscriptions to ensure uniformity and policy compliance in operations.
Full-time|On-site|San Francisco, CA | New York City, NY | Seattle, WA
Join Anthropic as a Staff Infrastructure Engineer focused on Cluster Infrastructure. In this role, you will have the opportunity to shape the future of AI by building and maintaining robust infrastructure systems that support our cutting-edge technologies. You will collaborate with a talented team of engineers to design scalable solutions, optimize performance, and ensure reliability across various platforms.
About Us
At Sierra, we are revolutionizing the way businesses engage with their customers by building a cutting-edge platform that harnesses the power of AI. Our headquarters is located in the vibrant city of San Francisco, with additional offices expanding in Atlanta, New York, London, France, Singapore, and Japan.
Our company culture is deeply rooted in our core values: Trust, Customer Obsession, Craftsmanship, Intensity, and Family. These principles guide our actions and foster an environment where innovation thrives.
Sierra was co-founded by visionary leaders Bret Taylor, who currently serves as the Board Chair of OpenAI and has a rich history with Salesforce and Facebook, and Clay Bavor, who previously led Google Labs and spearheaded initiatives like Google Lens and Project Starline.
Your Role
As a Software Engineer focusing on Infrastructure at Sierra, you will play a pivotal role in designing, constructing, and maintaining the foundational systems that empower our AI platform. Your expertise will ensure that our infrastructure is not only secure and reliable but also scalable, allowing product teams to execute their work with agility and confidence.
- Guarantee the reliability, scalability, and performance of our platform and LLM inference serving in response to increasing traffic demands.
- Develop and oversee cloud infrastructure using Terraform to create secure, scalable, and reproducible environments.
- Establish and manage a self-service infrastructure platform to empower engineering teams in deploying and operating services independently.
- Take ownership of and improve CI/CD pipelines and release management processes, facilitating rapid and reliable deployments across Sierra’s platform.
- Design and manage distributed systems utilizing distributed databases, retrieval systems, and machine learning models.
- Develop and sustain core data serving abstractions along with essential authentication and security features (SSO, RBAC, authentication controls).
- Effectively navigate and integrate our technology stack with enterprise customer environments in a scalable and maintainable manner.
Full-time|Remote|San Francisco, CA, New York, NY, Portland, OR, or Remote within Canada or United States
Join Mercury as a Senior Infrastructure Engineer, where you will be pivotal in shaping the infrastructure that supports our innovative financial solutions. You will work closely with cross-functional teams to design, implement, and maintain scalable and reliable infrastructure systems. This role is ideal for individuals who thrive in a fast-paced environment and are passionate about leveraging technology to drive business success.
fusionconsulting is seeking a Senior Project Manager with a focus on IT/OT infrastructure for the pharmaceutical industry. This position centers on leading projects that improve how clients operate while maintaining compliance with sector regulations and security requirements.
Role overview
As Senior Project Manager, you will guide initiatives designed to strengthen IT and OT infrastructure. The work involves close collaboration with teams from different disciplines to address the specific needs and challenges faced by pharmaceutical clients. Ensuring that solutions align with industry standards and security protocols is a key part of the job.
Collaboration and location
This role is based in San Francisco, with a requirement to be on-site at the Vacaville office three days each week. Regular in-person presence supports strong teamwork and effective project delivery.
About Sesame
At Sesame, we envision a world where computers can interact with us in authentic, lifelike ways: seeing, hearing, and collaborating as humans do. Our mission is to create an innovative computer interface that seamlessly integrates voice agents into everyday life. Our diverse team comprises founders from Oculus and Ubiquity6 and seasoned professionals from Meta, Google, and Apple, each bringing extensive expertise in hardware and software. Join us in pioneering a future where technology feels alive.
About the Role
As a Backend Infrastructure Engineer at Sesame, you will play a pivotal role in shaping the foundational aspects of our technology stack. This position focuses on developing high-impact infrastructure, services, and tools that are broad-reaching rather than narrowly defined. You will tackle scalability and architectural challenges across various domains, including agentic workflows, speech recognition and synthesis, IoT, large-scale training, and efficient low-latency inference. If you're driven by the challenge of creating an ultra-efficient, scalable, and reliable engineering ecosystem through a blend of tooling, services, libraries, and infrastructure, this is the perfect opportunity for you.
Responsibilities:
- Design and develop foundational infrastructure to support serving, training, and applications at Sesame.
- Enhance productivity for engineering teams by automating processes and creating exceptional tools.
- Deliver software solutions that empower product and machine learning engineers to build secure, scalable, and dependable systems from the ground up.
Your responsibilities will encompass provider, service, security, and developer infrastructure, as well as the architecture and implementation of core services and libraries.
About Mercor
Mercor operates at the cutting edge of labor markets and artificial intelligence research. Collaborating with top AI laboratories and corporations, we supply the essential human intelligence that drives AI advancement.
Our extensive talent network educates state-of-the-art AI models much like teachers impart knowledge to students: by sharing insights, experiences, and contextual understanding that cannot be encoded. Currently, over 30,000 experts in our network generate more than $2 million daily.
At Mercor, we are pioneering a new realm of work where expertise fuels AI progress. Achieving this ambitious vision demands a dynamic, fast-paced, and deeply dedicated team. Here, you will collaborate with researchers, operators, and AI firms at the forefront of transforming societal systems.
As a profitable Series C company with a valuation of $10 billion, Mercor operates five days a week from our new headquarters in San Francisco.
About the Role
In your role as an Infrastructure Engineer at Mercor, you will be instrumental in constructing and scaling the systems that support our rapid expansion. You will ensure that our infrastructure is highly reliable, cost-efficient, and capable of accommodating surges in traffic and computational demands. Your collaboration with product, research, and operations engineers will be vital in designing scalable architectures, optimizing deployments, and enhancing observability.
We are broadening our search across Infrastructure roles, including Developer Productivity Engineer, Database Engineer, and Platform Engineer. Candidates will be matched to teams after the initial screening, so we encourage applications even if your expertise is predominantly in one area.
What You'll Work On
- Designing and maintaining core infrastructure across cloud environments.
- Creating Infrastructure-as-Code workflows to automate deployments and scaling.
- Enhancing monitoring, logging, and alerting systems to ensure reliability.
- Managing CI/CD pipelines (GitHub, Spacelift) for seamless deployments.
- Assisting in disaster recovery planning and ensuring system availability.
- Collaborating with product and research teams to design architectures that meet workload demands.
- Identifying and resolving performance bottlenecks in compute, storage, and networking.
Join our dynamic team at Bland Inc. as a Senior Infrastructure Engineer, where you will play a critical role in designing and implementing robust infrastructure solutions. You will work alongside a talented group of professionals, using cutting-edge technology to drive innovation and efficiency.
The City and County of San Francisco seeks a full-time Transportation Engineer to help advance citywide transit projects. This position plays a key role in shaping how people move through San Francisco, with an emphasis on making transit safer and more efficient for everyone.
What you will do
- Support transportation projects aimed at improving city transit systems
- Work on initiatives that enhance efficiency and safety for public transit
Requirements
- Full-time availability
- Interest in transportation engineering and urban transit improvements
Location
This role is based in San Francisco.
About the Role
Join our pioneering team at vooma as a Backend & Infrastructure Software Engineer, where you will play a critical role in shaping the technical infrastructure of a transformative company.
If you are passionate about creating not only resilient systems but also the foundational architecture of a groundbreaking enterprise from the outset, this position is ideal for you.
We are looking for someone who excels at crafting infrastructure that is elegant, dependable, and secure, even under high-demand scenarios. You thrive on the challenge of scaling systems that enable intelligent agents and take pride in establishing reliable foundations that others can rely on.
Your Key Responsibilities Include:
- Design and maintain secure, scalable infrastructure tailored for AI-powered agents in production environments.
- Deploy and optimize AI-driven services to meet high availability and performance standards.
- Manage infrastructure as code, alongside cloud environments and CI/CD pipelines.
- Implement monitoring, observability, and alerting systems to ensure the reliability of our infrastructure.
- Contribute to infrastructure security and adhere to best practices.
You Should Have:
- Experience in deploying and productionizing machine learning or AI-centric workloads.
- Proficiency in developing secure, scalable infrastructures on platforms such as AWS, Azure, or GCP.
- In-depth knowledge of backend systems, networking, and container orchestration technologies (e.g., Kubernetes).
- Understanding of infrastructure security principles and compliance standards (e.g., SOC2).
- A proactive and hands-on mindset, with a strong drive to solve challenges from start to finish.
Be part of our mission to redefine AI by shaping the narrative surrounding document understanding.
Role Overview
At LlamaIndex, our Infrastructure team lays the groundwork for our product and provides essential tools that facilitate the development, deployment, and monitoring of our code. We are tasked with designing, constructing, and scaling the core infrastructure that drives a high-capacity data platform for AI applications. We seek individuals who are passionate about creating supportive systems that enhance our engineering capabilities and contribute to our rapidly expanding product suite.
Ideal candidates will have a strong background in cloud infrastructure management, navigating various scalability challenges, and enhancing the productivity of the broader Engineering team. Key traits we value in our culture include a customer-centric mindset, collaboration, diligence, and optimism. We are looking for proactive team players who are eager to help us evolve our culture as we grow.
Key Responsibilities
- Collaborate with engineering teams to develop and maintain foundational systems that empower developers and support our rapid growth.
- Design and execute scalable infrastructure solutions suitable for various deployment models, including SaaS, single-tenant, and private environments.
- Oversee and optimize cloud resources and Kubernetes clusters to ensure cost-effectiveness and high performance.
- Facilitate successful external customer deployments by establishing clear infrastructure guidelines and principles.
- Enhance the release and deployment processes to improve efficiency and reliability.
- Ensure compliance with applicable regulations and implement comprehensive security measures across all deployment environments.
Qualifications
- Minimum of 5 years of engineering experience.
- Experience working on Platform or Infrastructure teams on substantial projects involving infrastructure components like Terraform/CDKTF, Kubernetes, Helm, testing infrastructure, release management, and observability.
- Proficient in optimizing cloud resource utilization.
- Skilled in tuning Kubernetes clusters and cloud resources for optimal performance and cost efficiency.
- Dedicated to cultivating LlamaIndex’s engineering culture as we expand.
- Ability to balance speed and pragmatism in delivering solutions.
Join our dynamic NorCal Civil engineering team in San Francisco as a Civil Design Engineer. This role presents an excellent chance for civil engineers to enhance their expertise while embracing increased responsibilities in a collaborative and innovative atmosphere. You will engage in various infrastructure projects emphasizing design, analysis, and consulting, playing a vital role in the successful execution of complex assignments.
- Design Project Leadership: Take charge of design aspects on civil engineering undertakings, including site planning, grading, drainage, and utility design, under the mentorship of senior engineers.
- Advanced Engineering Calculations and Documentation: Generate intricate calculations, detailed drawings, and specifications using AutoCAD, Civil 3D, and additional tools, while supporting the preparation and review of technical reports.
- Site Assessments and Data Collection: Conduct site visits to gather data, verify project viability, and ensure compliance with design standards and industry best practices.
- Project Coordination and Compliance Management: Assist in project timelines and documentation, ensuring adherence to regulations; help navigate permitting processes and liaise with regulatory agencies as required.
- Collaboration and Mentorship: Engage actively in cross-functional meetings, work alongside contractors and consultants, and guide junior engineers in their development.
- Quality Design and Regulatory Compliance Support: Commit to high-quality design practices, guarantee adherence to regulations, and uphold the highest industry standards.
- Research and Development Initiatives: Stay abreast of industry advancements and new technologies, applying innovative solutions to foster project success and enhance operational processes.
About HappyRobot
HappyRobot is pioneering the AI-native operating system for the real economy, bridging the gap between intelligence and action. By harnessing real-time truths, specialized AI workers, and orchestrating intelligence, we empower enterprises to manage complex, mission-critical operations with unprecedented autonomy.
Our AI OS accumulates knowledge, optimizes processes at every level, and evolves continually. Our initial focus is on supply chain and industrial-scale operations, where resilience, speed, and ongoing improvement are paramount, liberating humans to engage in strategy, creativity, and other high-value endeavors.
To explore our vision further, check out our Manifesto. To date, HappyRobot has successfully raised $62 million, including a recent $44 million in Series B funding in September 2025, with support from esteemed investors like Y Combinator (YC), Andreessen Horowitz (a16z), and Base10, partners dedicated to our mission of redefining enterprise operations. We are using this investment to build a world-class team of individuals with relentless drive, exceptional problem-solving skills, and a passion for pushing boundaries in a dynamic, high-intensity environment. If this resonates with you, we invite you to join us at HappyRobot.
About the Role
We are in search of an Infrastructure Engineer to spearhead the enhancement of our operational resilience as we scale. You will be responsible for the stability, observability, and debugging processes that ensure our systems operate seamlessly. As the primary troubleshooter for complex failures in real-time, you will design tools that transform chaos into clarity and assist in transitioning our operations from reactive to proactive.
This role carries significant impact and trust, as you will influence how we approach reliability: reducing incident frequency, creating internal tools, and directly enhancing developer focus and system uptime. If you thrive on uncovering the root causes of challenging issues and fortifying systems (and teams), this is your opportunity.