Engineering Manager Online Data Systems jobs in San Francisco – Browse 8,770 openings on RoboApply Jobs

Open roles matching “Engineering Manager Online Data Systems” with location signals for San Francisco. 8,770 active listings on RoboApply Jobs.

1 - 20 of 8,770 Jobs

OpenAI
Full-time|On-site|San Francisco

About the Team
Join our Online Data team, where we design and maintain the foundational online database and indexing services for OpenAI’s cutting-edge AI applications. We support the phenomenal growth of ChatGPT, the leading AI application globally, as well as Codex, the fastest-growing development toolset.
Our mission is to uphold the reliability, accuracy, and scalability of our extensive online data infrastructure, enabling product and research teams to innovate swiftly without getting entangled in complex multi-region, multi-cloud, exabyte-scale data systems.

About the Role
We are on the lookout for an Engineering Manager to spearhead our Online Data Systems team. This role entails guiding a talented group of engineers dedicated to developing and managing hyperscale data storage and retrieval technologies.
You will oversee the execution of highly complex engineering tasks in areas such as distributed query execution, multi-region data federation, self-orchestrating services, performance optimization, and more.
This is a unique opportunity to shape cutting-edge technology at an unprecedented scale, where you won’t just be another cog in the machine but will have a significant impact on our trajectory.

In this role, you will:
Build, lead, and develop high-performing infrastructure engineering teams.
Drive the advancement of OpenAI’s proprietary online data technologies, including our core database systems, indexing technologies, and vector search capabilities.
Establish delivery metrics based on measurable reliability goals (SLOs, etc.) to ensure superior system performance and resilience.
Advocate for the efficient use of agent technology to enhance execution speed.
Minimize operational burdens and incident occurrences through improved abstractions and self-healing systems.

You might thrive in this role if you:
Possess a relentless pursuit of operational excellence, with hands-on experience in building...

Mar 13, 2026

Pinterest, Inc.
Full-time|On-site|San Francisco, CA, US; Palo Alto, CA, US

Role overview
Pinterest is seeking a Principal Engineer to join the Online Systems team, based in either San Francisco or Palo Alto. This position centers on designing and building scalable systems that serve millions of users. The focus is on enhancing user engagement and optimizing the performance of Pinterest's online platforms.

What you will do
Architect and develop large-scale applications built for high-traffic environments
Collaborate with engineering, product, and design teams to deliver reliable solutions
Lead technical decisions that influence the direction of Pinterest's online systems

Apr 27, 2026

Cloudflare, Inc.
Full-time|Hybrid|Hybrid

Join Cloudflare as a Principal Systems Engineer, Data, where you will lead innovative projects that enhance our data processing capabilities. You will work collaboratively with cross-functional teams to design, implement, and optimize systems that efficiently handle large-scale data. This role requires a deep understanding of systems engineering principles, strong analytical skills, and a passion for leveraging data to drive decisions. Your contributions will be pivotal in shaping the future of our data infrastructure.

Feb 26, 2026

Crusoe
Full-time|$237.6K/yr - $288K/yr|On-site|San Francisco, CA - US

At Crusoe, our mission is to accelerate the proliferation of energy and intelligence. We're developing the technology that enables ambitious AI-driven creations without compromising on scale, speed, or sustainability.
Join us at the forefront of the AI revolution with sustainable technology. Here, you will lead innovative initiatives, make a significant impact, and work with a team that is pioneering responsible and transformative cloud infrastructure.

Role Overview:
We are on the lookout for a Senior Engineering Manager for Data Plane Systems, who will spearhead the team accountable for high-performance Software Defined Networking (SDN) data planes across hosts and Data Processing Units (DPUs). In this hands-on senior leadership capacity, you will be responsible for the architecture, implementation, and operational management of a hardware-accelerated networking stack tailored for large-scale GPU workloads, with a focus on rapid feature deployment from commit to production.

Key Responsibilities:
Technical Leadership: Define the strategic roadmap for SDN data plane systems and guide the integration of DPUs (like NVIDIA BlueField) and hardware accelerators.
Architecture & Optimization: Oversee the development of Linux kernel networking components, XDP/eBPF data paths, and DPDK-based fast paths while driving the transition of networking functions to hardware offload architectures.
Operational Excellence: Lead performance benchmarking, regression prevention, and incident response, ensuring we meet our operational goals within 3-6 month cycles.
Team Development: Mentor and cultivate a high-performing team of senior and staff-level systems engineers, setting technical standards and nurturing a culture of accountability.
Collaboration: Work closely with control-plane teams (OVN/OVS) to enhance throughput and latency for multi-tenant GPU clusters.

Feb 20, 2026

Zyphra
Full-time|On-site|San Francisco

Zyphra is a cutting-edge artificial intelligence firm located in the heart of San Francisco, California, dedicated to advancing technology across various modalities.

About the Position:
We are seeking a Data Engineer - Multimodal Systems to play a pivotal role in the enhancement and expansion of Zyphra's datasets and data pipelines. This position offers a unique opportunity to collaborate with diverse teams and contribute to innovative data solutions. You will engage in the collection of extensive datasets and the development and optimization of high-performance parallel data pipelines.

Your Responsibilities Will Include:
Executing large-scale data collection across multiple modalities, including text, audio, and image.
Designing and implementing highly efficient, parallelized data processing pipelines that integrate various modalities.
Conducting rigorous experimental ablations to evaluate the effectiveness of new data enhancements.

Candidate Requirements:
Proven ability in implementation and prototyping.
Capability to transform ideas into experimental frameworks swiftly.
Strong collaborative skills, thriving in a dynamic research environment.
Eagerness to learn and apply new concepts effectively.
Exceptional communication and teamwork skills, capable of contributing to both research and large-scale engineering projects.

Preferred Qualifications:
Experience in the collection, management, and processing of large datasets.
Familiarity with parallel programming frameworks in Python, such as Dask.
In-depth understanding of state-of-the-art dataset curation practices.
A detail-oriented mindset with a passion for data integrity and verification.
Strong foundation in experimental methodologies for conducting thorough ablation studies and hypothesis testing.
Knowledge and interest in large-scale, highly parallel data processing systems.
Proficiency in PyTorch and Python.
Experience with large, complex codebases and the ability to quickly become productive within them.
Published research in respected machine learning venues.
Postgraduate degree in a relevant field is a plus.

Jul 1, 2025

Stitch Fix, Inc.
Full-time|$138K/yr - $230K/yr|Remote|Remote, USA

About Stitch Fix, Inc.
Stitch Fix (NASDAQ: SFIX) is a premier online personal styling service that empowers individuals to discover and embrace their unique styles. By expertly blending skilled stylists with cutting-edge AI and recommendation algorithms, we curate an exceptional selection of both exclusive and national brands, tailored to meet each client's distinct tastes and preferences. Founded in 2011 and headquartered in San Francisco, Stitch Fix revolutionizes the way people shop, making it effortless for clients to express their personal style without the hassle of navigating through endless options in stores or online.

About the Team
The Business Systems team serves as the strategic technology and data partner for our core operations. We design and maintain the technological framework that supports our Finance, Procurement, Merchandising, and HR/People and Culture functions. By collaborating directly with business leaders, we create, implement, and enhance scalable systems while transforming our business data into a strategic asset. Our team is responsible for building and managing data engineering pipelines, analytics dashboards, and next-generation automation and Gen AI solutions that provide critical insights and empower leaders to make informed, data-driven decisions.

About the Role
We are on the lookout for a strategic Senior Engineering Manager to lead our Business Systems Data & Insights team, focusing on pivotal domains including Finance (Accounting, FP&A), Merchandising, Procurement, and HR/People & Culture. This high-impact role will allow you to shape how Stitch Fix utilizes data and AI to drive essential business decisions and influence company strategy. You will spearhead our data and AI transformation by constructing scalable data infrastructure, enhancing analytics capabilities, implementing intelligent automation, and accelerating the adoption of Gen AI across these critical business functions.

Feb 12, 2026

Databricks
Full-time|$166K/yr - $225K/yr|On-site|San Francisco, California

At Databricks, we are driven by a passion for empowering data teams to tackle the world’s most challenging problems, from transforming transportation to accelerating medical innovations. We achieve this by creating and maintaining the leading data and AI infrastructure platform, enabling our clients to leverage profound data insights for business enhancement. Founded by engineers with a customer-first mentality, we eagerly embrace every opportunity to tackle complex technical challenges, ranging from the design of next-generation UI/UX for data interactions to scaling our services across millions of virtual machines. Our journey has just begun.

As a member of the Runtime team at Databricks, you will be instrumental in developing the next generation of distributed data storage and processing systems. These systems will surpass specialized SQL query engines in relational query performance while offering the programming abstractions necessary to support a variety of workloads, from ETL to data science.

Example projects include:
Apache Spark™: Contribute to the de facto open-source standard framework for big data.
Data Plane Storage: Develop reliable and high-performance services and client libraries for managing vast amounts of data within cloud storage backends like AWS S3 and Azure Blob Store.
Delta Lake: Design a storage management system that merges the scalability and cost-effectiveness of data lakes with the performance and reliability of data warehouses, providing features like ACID transactions and time travel.
Delta Pipelines: Simplify the orchestration and operation of numerous data pipelines, enabling clients to deploy, test, and upgrade pipelines effortlessly.
Performance Engineering: Create the next-generation query optimizer and execution engine that is fast, scalable, and robust.

Jan 30, 2026

Cloudflare, Inc.
Full-time|Hybrid|Hybrid

Join Cloudflare as a Systems Engineer specializing in Data, where you will play a critical role in enhancing our infrastructure and ensuring the reliability of our services. You will collaborate with cross-functional teams to design, implement, and maintain systems that handle vast amounts of data efficiently and securely. Your contributions will be pivotal in optimizing performance and delivering exceptional user experiences.

Feb 6, 2026

Exa
Full-time|On-site|San Francisco, California

At Exa, we are on a mission to create a cutting-edge search engine from the ground up, tailored specifically for AI applications. Our team is dedicated to developing large-scale infrastructure that efficiently crawls the internet, trains advanced embedding models for indexing, and constructs high-performance vector databases in Rust for optimized searching. We also manage a state-of-the-art $5M H200 GPU cluster that activates thousands of machines simultaneously.

As a Software Engineer specializing in Distributed Data Systems, you will be responsible for designing and implementing the data infrastructure that drives our operations, from crawling billions of web pages to training sophisticated embedding models and delivering real-time search functionalities. You will enjoy significant autonomy in creating systems capable of scaling to hundreds of petabytes. This is your opportunity to work on data pipelines at an unprecedented scale.

Dec 19, 2025

Cloudflare, Inc.
Full-time|Hybrid|Hybrid

Join Cloudflare as a Senior Systems Engineer specializing in our Data Platform. In this pivotal role, you'll be responsible for designing and implementing scalable systems that process vast amounts of data, ensuring performance and reliability across our services. Your expertise will drive innovative solutions that enhance our platform capabilities.

Feb 6, 2026

Databricks
Full-time|$192K/yr - $260K/yr|On-site|San Francisco, California

P-186

At Databricks, we are passionate about empowering data teams to tackle some of the world’s most challenging problems, from security threat detection to cancer drug development. Our mission is to build and operate the leading data and AI infrastructure platform, enabling our customers to concentrate on the high-value challenges that are integral to their own objectives. Founded in 2013 by the original creators of Apache Spark™, Databricks has rapidly evolved from a small office in Berkeley, California, to a global powerhouse with over 1000 employees. Trusted by thousands of organizations, from startups to Fortune 100 companies, we are recognized as one of the fastest-growing SaaS companies worldwide.

Our engineering teams create highly sophisticated products that address significant needs in the industry. We continuously push the limits of data and AI technology while maintaining the resilience, security, and scalability essential for our customers' success on our platform. We manage one of the largest-scale software platforms, consisting of millions of virtual machines that generate terabytes of logs and process exabytes of data daily. At this scale, we frequently encounter cloud hardware, network, and operating system faults, and our software must effectively shield our customers from these challenges.

Modern data analysis leverages advanced techniques, such as machine learning, that far exceed the capabilities of traditional SQL query engines. As a Software Engineer on the Runtime team at Databricks, you will be instrumental in developing the next generation of distributed data storage and processing systems that outshine specialized SQL query engines in relational query performance, while providing the flexibility and programming abstractions to support a variety of workloads, from ETL to data science.

Examples of projects you may work on include:
Apache Spark™: Contributing to the de facto open-source framework for big data.
Data Plane Storage: Developing reliable, high-performance services and client libraries for storing and accessing vast amounts of data on cloud storage backends like AWS S3 and Azure Blob Store.
Delta Lake: A storage management system that merges the scalability and cost-effectiveness of data lakes with the performance and reliability of data warehouses, featuring low latency streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, significantly reduce the complexity of real-world data engineering architectures.
Delta Pipelines: Aiming to simplify the management of data engineering pipelines.

Jan 30, 2026

Cloudflare, Inc.
Full-time|Hybrid|Hybrid

Join Cloudflare as a Distributed Systems Engineer within our dynamic Data Platform team, focusing on Analytics and Alerts. In this position, you will play a pivotal role in building and optimizing distributed systems that power our data analytics capabilities, providing real-time insights and alerts to enhance our customer experience.

Mar 4, 2026

Granica
Full-time|On-site|Bay Area Office

About Granica
Granica is an innovative AI research and infrastructure firm dedicated to creating reliable and steerable representations of enterprise data.
We build trust through our product Crunch, a policy-driven health layer that ensures large tabular datasets remain efficient, reliable, and reversible. On this solid foundation, we are developing Large Tabular Models, systems designed to learn cross-column and relational structures in order to provide trustworthy answers and automation with inherent provenance and governance.

Our Mission
AI is currently hampered not only by the design of models but also by the inefficiencies of the data that supports them. Every redundant byte, poorly organized dataset, and inefficient data pathway contributes to significant costs, latency, and energy waste as we scale.
Granica aims to eliminate these inefficiencies. We merge cutting-edge research in information theory, probabilistic modeling, and distributed systems to craft self-optimizing data infrastructures: systems that consistently enhance the representation and utilization of information by AI.
Our engineering team collaborates closely with the Granica Research group led by Prof. Andrea Montanari of Stanford University, bridging advancements in information theory and learning efficiency with large-scale distributed systems. Together, we firmly believe that the next major advancement in AI will stem from breakthroughs in efficient systems rather than merely larger models.

Your Contributions
Global Metadata Substrate: Design a transactional and metadata substrate that facilitates time-travel, schema evolution, and atomic consistency across massive petabyte-scale tabular datasets.
Adaptive Engines: Develop systems that autonomously reorganize data, learning from access patterns and workloads to maintain peak efficiency without the need for manual tuning.
Intelligent Data Layouts: Optimize bit-level organization (including encoding, compression, and layout) to maximize signal extraction per byte read.
Autonomous Compute Pipelines: Create distributed compute systems that scale predictably, adapt to dynamic loads, and ensure reliability under failure conditions.
Research to Production: Apply new algorithms in compression, representation, and optimization that emerge from ongoing research. We encourage opportunities to publish and open-source your work.
Latency as Intelligence: Design systems that inherently minimize latency as a measure of intelligence.

Nov 7, 2025

Cloudflare, Inc.
Full-time|Hybrid|Hybrid

Join Cloudflare as a Distributed Systems Engineer focusing on our Data Platform, where you will play a pivotal role in developing analytics and alert systems that enhance our services. You will collaborate with a talented team to design scalable and efficient systems to manage and analyze vast amounts of data. Your work will directly impact the performance and reliability of our offerings, ensuring our customers have the best possible experience.

Apr 2, 2026

OpenAI
Full-time|On-site|San Francisco

About Our Team
At OpenAI, our mission is to develop safe artificial general intelligence (AGI) that benefits all of humanity. This ambitious goal unites some of the brightest scientists, engineers, and business professionals in a collaborative environment aimed at achieving groundbreaking advancements.
The Enterprise Platform team is pivotal in supporting our Sales teams, empowering them to effectively deploy our advanced AI products to clients across various sectors. Comprising experts in Sales, Solutions, Customer Success, Support, Marketing, and Partnerships, our team strives to create impactful solutions that democratize access to AI.

About the Role
As part of our Go-To-Market (GTM) team, we are dedicated to illustrating the transformative capabilities of OpenAI’s models. We are seeking a Salesforce Engineer who will collaborate with our Growth, Marketing, and Sales teams to develop systems that enhance customer engagement, activate lifecycles, and create sales pipelines. This role emphasizes strong data enrichment and orchestration across our GTM stack.
Your main responsibilities will involve lead management, developing sales and marketing engagement workflows, and orchestrating customer data across multiple sources. You will ensure that Salesforce and related GTM systems operate with precise, actionable data to support scalable pipeline development.

Key Responsibilities:
Develop Lead-to-Opportunity Solutions in Salesforce: Create user-friendly experiences, automation processes, and data enrichment workflows to efficiently capture, qualify, and route inbound and outbound demand, ensuring leads are enriched with accurate firmographic, technographic, and behavioral insights as they progress through the early stages of the pipeline.
Integrate Top-of-Funnel Systems: Connect Salesforce with various marketing automation platforms, enrichment providers, sales engagement tools, routing engines, and real-time data sources to facilitate comprehensive customer and account data orchestration, optimizing lead management and ensuring timely Marketing-to-Sales transitions.
Test, Troubleshoot, and Document: Address and resolve production issues affecting lead capture, routing, enrichment, and sales engagement workflows, while maintaining thorough documentation that aligns with scalable intake, change management, and deployment best practices for data movement across GTM systems.

Feb 24, 2026

Unify
Full-time|$58K/yr - $285K/yr|On-site|San Francisco Office

Join Unify as a Senior Data Engineer, Enrichment

At Unify, we are pioneering the first AI-driven action system for revenue teams, enabling organizations to transform their outbound efforts into a powerful growth engine. Our technology makes go-to-market strategies observable, repeatable, and scalable. Established in 2023 by visionaries from Ramp and Scale AI, our team boasts experience from industry giants such as Airbnb, Meta, Waymo, and Perplexity.
In 2024, we achieved an impressive 8x revenue increase and cater to esteemed clients like Perplexity, Cursor, SoFi, and Justworks. We are a dynamic and high-energy team that has successfully raised $58M from prominent investors including Thrive, Emergence, and OpenAI. Help us shape the future of go-to-market strategies - come aboard!

Role Overview
As a Senior Data Engineer focusing on Enrichment, you will take ownership of the systems that create and uphold the most extensive and high-quality first-party contact dataset available. Your responsibilities will include designing data pipelines that assimilate information from various vendors, developing intelligence to evaluate data source reliability, and establishing a quality framework that secures Unify's data as a competitive advantage.

Jan 29, 2026

OpenAI
Full-time|On-site|San Francisco

About Our Team
At OpenAI, we are dedicated to ensuring our innovative products are effectively monetized to meet the diverse needs of our customers. Our Financial Engineering team works closely with the Go-To-Market (GTM) and Finance departments to continuously adapt our billing architecture to align with our dynamic internal requirements.

About the Opportunity
We are seeking a talented Engineering Manager to oversee and enhance the workflows that drive quoting, tracking, and fulfillment for all OpenAI sales. This role is pivotal in developing essential billing and invoicing capabilities while collaborating on customer-facing billing experiences to uphold financial integrity, ensure auditability, and provide a seamless onboarding and billing journey for enterprise clients.

Key Responsibilities:
Lead and mentor a team of engineers focused on automating order management, prioritizing reliability, accuracy, and a positive customer onboarding experience.
Own the design and roadmap for order data flows into various downstream systems, including internal provisioning, billing, invoicing, and revenue management.
Create and maintain resilient workflows that automate entitlements, provisioning, usage controls, SKU attribution, invoice generation, and revenue recognition, streamlining processes while maximizing accuracy and traceability.
Enhance the accuracy and timeliness of provisioning, billing, and invoicing through automation, validation, and reconciliation, reducing manual intervention.
Establish robust operational practices (observability, alerting, runbooks, on-call) to ensure system health with minimal human oversight.
Collaborate extensively with Sales Operations, Finance, Accounting, Support, Product, Security, and Compliance teams to translate complex requirements into resilient, auditable workflows.
Navigate ambiguous problem spaces and evolving product offerings, creating scalable frameworks and abstractions as OpenAI's commercial footprint grows.
Uphold high engineering standards through technical direction, design reviews, mentoring, and fostering a culture of ownership and continuous improvement.
Exhibit strong leadership by mentoring engineers, recruiting and retaining top talent, managing stakeholder expectations, and balancing customer needs with deliverable realities.

You Will Excel in This Role If You:
Possess a passion for leading engineering teams and driving process improvements.
Have a proven track record of managing complex engineering projects and fostering collaboration across diverse teams.
Enjoy tackling challenges with innovative solutions while maintaining a customer-centric approach.

Feb 5, 2026

Granica
Full-time|On-site|Bay Area Office

About Granica
Granica is a pioneering AI research and infrastructure company dedicated to creating reliable and steerable representations of enterprise data.
We build trust through Crunch, a policy-driven health layer designed to keep extensive tabular datasets efficient, reliable, and reversible. From this foundation, we are developing Large Tabular Models, systems that learn cross-column and relational structures to provide trustworthy answers and automation, complete with built-in provenance and governance.

Our Mission
The current limitations of AI are not solely due to model design but also to the inefficiencies of the data that supports it. At scale, every redundant byte, poorly organized dataset, and inefficient data path contributes to significant costs, latency, and energy waste.
Granica’s mission is to eliminate these inefficiencies. We leverage cutting-edge research in information theory, probabilistic modeling, and distributed systems to create self-optimizing data infrastructures that continuously enhance how information is represented and utilized by AI.
Our engineering team collaborates closely with the Granica Research group led by Prof. Andrea Montanari from Stanford University, merging advancements in information theory and learning efficiency with large-scale distributed systems. We believe that the next major breakthrough in AI will stem from innovations in efficient systems, rather than simply larger models.

What You Will Create
Global Metadata Substrate. Design and refine the global metadata and transactional substrate that enables atomic consistency and schema evolution across exabyte-scale data systems.
Adaptive Engines. Architect systems that self-optimize, reorganizing and compressing data according to access patterns, achieving unprecedented efficiency improvements.
Intelligent Data Layouts. Innovate new encoding and layout strategies that challenge the theoretical limits of signal per byte read.
Autonomous Compute Pipelines. Spearhead the development of distributed compute platforms that scale predictively and maintain reliability even under extreme load and failure conditions.
Research to Production. Partner with Granica Research to transform advances in compression and probabilistic modeling into production-ready, industry-leading systems.
Latency as Intelligence. Propel systems forward by optimizing for latency as a key aspect of intelligence.

Nov 7, 2025

OpenAI
Full-time|Hybrid|San Francisco

About Our Team
Join the innovative Sora team at OpenAI, where we are at the forefront of developing multimodal capabilities for our foundation models. As a dynamic hybrid of research and product development, we focus on seamlessly integrating advanced multimodal functionalities into our AI offerings, ensuring they are not only reliable and user-friendly but also aligned with our mission to foster broad societal benefits.

About the Position
We are seeking a dedicated Software Engineer specializing in Distributed Data Systems to architect and enhance the infrastructure that supports large-scale multimodal training and evaluation at OpenAI. In this role, you will oversee distributed data pipelines and collaborate closely with our researchers to translate their requirements into robust, high-performance systems. You will play a crucial role in fortifying the pipelines that underpin Sora’s rapid innovation cycles.
We are looking for engineers with a keen eye for detail, substantial experience with distributed systems, and a proven track record of building reliable infrastructures in high-stakes environments.
This position is based in San Francisco, CA, and follows a hybrid work model requiring three days in the office each week. We also provide relocation assistance to new team members.

Key Responsibilities:
Design, build, and maintain data infrastructure systems including distributed computing, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure, ensuring they are scalable, reliable, and secure.
Ensure our data platform can scale dramatically while maintaining high levels of reliability and efficiency.
Collaborate with researchers to deeply understand their needs and translate them into production-ready systems.
Harden, optimize, and maintain vital data infrastructure systems that drive multimodal training and evaluation.

Ideal Candidates Will Have:
Extensive experience with distributed systems and large-scale infrastructure, coupled with a strong passion for data.
A detail-oriented mindset and a commitment to building and maintaining dependable systems.
Solid software engineering fundamentals and exceptional organizational skills.
Comfort with ambiguity and rapid changes in a fast-paced environment.

About OpenAI
OpenAI is a pioneering AI research and deployment organization dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We strive to advance digital intelligence in a way that is safe and beneficial, pushing the boundaries of innovation and technology.

Nov 14, 2025

OpenAI
Full-time|Hybrid|San Francisco

About Our Team
Join the innovative Sora team at OpenAI, where we are at the forefront of developing multimodal capabilities for our foundation models. Our hybrid research and product team is dedicated to seamlessly integrating multimodal functionalities into our AI solutions, ensuring they are dependable, user-centric, and aligned with our vision of benefiting society at large.

Role Overview
As a Machine Learning Engineer specializing in Distributed Data Systems, you will be instrumental in designing and scaling the infrastructure that facilitates large-scale multimodal training and evaluation at OpenAI. Your role will involve managing complex distributed data pipelines, collaborating closely with researchers to convert their requirements into robust, production-ready systems, and enhancing pipelines that are essential for Sora's rapid iteration cycles.
We are seeking detail-oriented engineers with extensive experience in distributed systems who thrive in high-stakes environments and excel in building resilient infrastructure.
This position is located in San Francisco, CA, and follows a hybrid work model, requiring three days in the office each week. We also provide relocation assistance for new team members.

Key Responsibilities:
Design, implement, and maintain data infrastructure systems, including distributed computing, data orchestration, distributed storage, streaming infrastructure, and machine learning systems, with a focus on scalability, reliability, and security.
Ensure our data platform can scale exponentially while maintaining high reliability and efficiency.
Collaborate with researchers to gain a deep understanding of their requirements, translating them into production-ready systems.
Strengthen, optimize, and manage critical data infrastructure systems that support multimodal training and evaluation.

You Will Excel in This Role If You:
Possess strong experience with distributed systems and large-scale infrastructure, coupled with a keen interest in data.
Exhibit meticulous attention to detail and a commitment to building and maintaining reliable systems.
Demonstrate solid software engineering fundamentals and effective organizational skills.
Thrive in environments characterized by ambiguity and rapid change.

About OpenAI
OpenAI is a trailblazing AI research and deployment organization committed to ensuring that general-purpose artificial intelligence serves humanity. We continuously push the boundaries of AI capabilities and strive to create technology that benefits everyone.

Feb 6, 2026