Staff Machine Learning Infrastructure Engineer
Similar jobs
At GRAIL, our mission is to revolutionize cancer detection by identifying it at an early stage when treatment can be most effective. We strive to alter the landscape of cancer mortality by uniting various stakeholders in the adoption of groundbreaking, safe, and efficient technologies that can truly transform cancer care. As a pioneering healthcare company, G…
Periodic Labs
Periodic Labs is an AI and physical sciences company based in Menlo Park. The team focuses on advancing scientific discovery by building advanced models that drive progress in materials, energy, and related fields. The company operates with a strong sense of ownership and a drive to push scientific boundaries, supported by leading investors and a rapidly growing organization.
Role overview
The Machine Learning Systems Engineer will own the systems layer that powers model training and inference. This work is closely tied to the reinforcement learning (RL) feedback loop at the heart of Periodic Labs' research process, where models propose experiments, experiments generate data, and that data improves future models. The role blends deep infrastructure work with research collaboration, focusing on both performance and integration with the scientific workflow.
What you will do
Develop scheduling solutions for GB series GPUs using platforms like Ray, Slurm, and Kubernetes. Aim to minimize latency and maximize resource utilization across different cluster setups.
Create profiling tools, both online and offline, to identify and resolve bottlenecks in the training and inference stack.
Implement direct S3 checkpoint streaming to remove I/O bottlenecks during large-scale training runs.
Benchmark RL training configurations across model sizes, batch strategies, and hardware architectures to find optimal setups.
Write and optimize communication and GPU kernels to increase hardware throughput.
Design and implement zero-copy RDMA weight synchronization between training and inference systems, keeping the RL loop fast and efficient.
Develop sandbox execution environments for rapid algorithm testing and iteration.
Key focus areas
Scheduling, kernels, RDMA, weight synchronization, and communication primitives
Collaboration with researchers to co-design algorithms and infrastructure
Accelerating the RL feedback loop that drives scientific discovery at Periodic Labs
Robinhood Markets, Inc.
About the Role
Robinhood Markets, Inc. is hiring a Senior Data Scientist focused on Machine Learning for the Brokerage team. This role is based in Menlo Park, CA or New York, NY.
What You Will Do
Apply advanced data analysis and machine learning techniques to strengthen the trading platform.
Work closely with colleagues across product, engineering, and operations to develop data-driven tools and features.
Identify patterns and generate insights that support a better user experience and greater operational efficiency.
Collaboration
This position partners with cross-functional teams to design and implement solutions that address real needs within the Brokerage division.
Robinhood Markets, Inc.
Join us in shaping the future of finance.
Our mission is to democratize finance for everyone. An estimated $124 trillion of assets will be transferred to younger generations in the next two decades, marking the largest wealth transfer in history. If you’re eager to be at the forefront of this monumental cultural and financial transition, we invite you to apply.
About the Team and Role
We are assembling a top-tier team, leveraging cutting-edge technologies to tackle the most significant financial challenges globally. We seek innovative thinkers, exceptional problem-solvers, and motivated builders ready to make a difference. Robinhood is not for the complacent; it’s a place where ambitious individuals achieve their career best. Our high-performing, dynamic team prioritizes ethics in all aspects of our work, where high expectations yield equally high rewards.
The Incentives Data Science team operates at the crossroads of Product, Marketing, Finance, and Machine Learning. Our goal is to facilitate sustainable, data-driven growth by developing modeling, measurement, and optimization systems that enhance activation, retention, and revenue. We collaborate closely with various teams at Robinhood to design, assess, and implement incentive programs that effectively acquire, activate, and retain customers.
As a Senior Data Scientist, Machine Learning, you will spearhead the comprehensive design, optimization, and advancement of Robinhood’s incentive systems. You’ll create predictive and causal ML models, establish experimentation frameworks, and build decision-making and allocation algorithms that directly shape how millions of users interact with Robinhood. This is a unique chance to influence impactful ML systems while defining incentive strategies at a company-wide level!
This role is based in our Menlo Park, CA office, with a requirement for in-person attendance of at least 3 days per week.
At Robinhood, we value in-person engagement to foster innovation, accelerate progress, and strengthen community. Our office environment is designed to be purposeful, invigorating, and fully supportive of high-performing teams.
What You Will Do
Develop, implement, and refine predictive and causal models for incentive targeting.
Design and assess experiments to quantify the incremental impact, payback, and ROI of promotional initiatives.
Collaborate cross-functionally with Product, Finance, Marketing, and Engineering to provide insights for scalable solutions.
Robinhood Markets, Inc.
Be Part of the Future of Finance!
At Robinhood, our mission is to democratize finance for everyone. With an estimated $124 trillion expected to be inherited by younger generations in the coming decades, this is the largest wealth transfer in history. If you're ready to be at the forefront of this monumental shift, we want to hear from you!
About Our Team and Your Role
We are assembling a world-class team focused on leveraging advanced technologies to tackle the most pressing challenges in finance. We seek innovative thinkers, adept problem-solvers, and passionate builders who are driven to create impact. Robinhood is a dynamic environment where ambitious talent can excel and achieve the best work of their careers. Our high-performing team is fast-paced, values ethics, and rewards excellence.
The People Systems team specializes in developing and managing robust HR technology that ensures security and scalability for our workforce. Collaborating with People Operations, Compensation, Payroll, Finance, and Corporate Engineering, we enhance employee experiences with reliable systems. Our focus is on optimizing system performance, enabling new capabilities, and supporting organizational growth through strategic design. This team is vital in facilitating acquisitions and enhancing our global workforce operations!
As a Staff Workday Engineer, you will be responsible for designing, configuring, and enhancing Workday solutions in areas such as HCM, Compensation, Benefits, and Payroll. You will work closely with People and business teams to translate their requirements into effective system configurations, reports, and integrations. This role is pivotal in driving system improvements and major initiatives, including acquisitions and the development of new features. You will also play a key role in evaluating and integrating AI capabilities within HR systems to enhance efficiency and decision-making.
This position is based in our Menlo Park, CA office, with an expectation of in-person attendance at least 3 days per week.
At Robinhood, we champion the power of in-person collaboration to drive progress, foster innovation, and build community. Our office environment is designed to be engaging and supportive of high-performing teams.
Periodic Labs
About Periodic Labs
Periodic Labs is an innovative AI and physical sciences laboratory dedicated to developing cutting-edge models that facilitate groundbreaking scientific discoveries. With substantial funding and rapid growth, our team members are empowered as owners, tackling challenges without the constraints of bureaucracy. We embrace continuous learning, adopting new tools and scientific methods to advance our mission.
Role Overview
In this role, you will train advanced frontier models to become highly knowledgeable scientific experts, forming the backbone of reinforcement learning initiatives. You will devise techniques for large-scale synthetic data generation, model distillation, and continual learning. Additionally, you will collaborate closely with reinforcement learning researchers, physicists, and chemists to craft evaluations that enhance scientific data curation. Working alongside supercompute engineers, you will scale compute-efficient large language model (LLM) training across thousands of GPUs. You will also develop high-performance tools to explore the interplay between data and intelligence.
Ideal Candidates Will Possess Experience In:
Training large language models (LLMs) utilizing curated datasets comprising trillions of tokens.
Calculating scaling laws and identifying compute-optimal hyperparameters.
Generating billions of tokens of high-quality synthetic datasets.
Constructing evaluations that correlate with downstream task performance.
Speechify builds tools that turn written content into audio, helping over 50 million people read and learn in new ways. Our products span iOS, Android, Mac, Chrome, and the web. Google named us Chrome Extension of the Year, and Apple recognized our design in 2025. Our fully remote team includes nearly 200 people with backgrounds at Amazon, Microsoft, Google, Stanford, Stripe, Vercel, and other top tech companies and universities.
Role Overview
The Data team within Speechify’s AI division is looking for a Software Engineer focused on data infrastructure and acquisition. This position centers on building and maintaining the systems that gather the large-scale datasets needed for model training. The work blends infrastructure, engineering, and research to deliver high-quality data at petabyte scale while keeping costs in check.
What You Will Do
Find and connect new sources of audio data to our ingestion pipeline.
Maintain and improve our cloud infrastructure for data ingestion, currently running on Google Cloud Platform and managed with Terraform.
Work closely with Scientists to optimize for cost, throughput, and data quality, supporting richer datasets for our models.
Collaborate with the AI team and company leadership to shape the roadmap for datasets that power future consumer and enterprise products.
Location
Remote (company operates without a physical office). Menlo Park, CA, USA listed as company location.
Robinhood Markets, Inc.
Be a Part of the Future of Finance.
At Robinhood, we are on a mission to democratize finance for everyone. With an anticipated $124 trillion in assets set to be transferred to younger generations over the next two decades, we are at the forefront of the largest wealth transfer in history. If you want to play a pivotal role in this transformative era, we invite you to join us.
Team Overview + Position Details
We are assembling a top-tier team dedicated to tackling some of the most significant challenges in finance using cutting-edge technology. We seek innovative thinkers and skilled problem-solvers who are eager to make a meaningful impact. Robinhood is a dynamic workplace where ambitious individuals excel. We foster a high-performance culture where ethics underpin our actions and high expectations correlate with substantial rewards.
As a Senior Web Infrastructure Engineer, you will spearhead the web development lifecycle, bringing products from concept to launch. You will collaborate closely with design, product management, and internal platform teams to implement high-impact features within diverse product teams.
Your expertise will be instrumental in addressing challenges associated with our web-first, high-performance trading platform.
This position is based in our Menlo Park, CA, or Toronto, ON office(s), with a minimum in-person attendance of three days per week.
We believe in the power of face-to-face collaboration to expedite progress, ignite innovation, and strengthen community bonds. Our office environment is intentional, invigorating, and designed to support high-achieving teams.
Your Responsibilities
Lead and execute medium to large-scale infrastructure projects that enhance and expand web development capabilities at Robinhood.
Develop and sustain reliable, high-performance tools for CI/CD, build systems (e.g., Bazel), and local development environments.
Collaborate with product teams to drive feature development and improve user experience.
BillionToOne
Join BillionToOne as a Senior AI Engineer I, where you will leverage your expertise in artificial intelligence and machine learning to drive innovation. In this role, you will be instrumental in developing cutting-edge AI solutions that empower our mission of advancing health technology. Collaborate with a dynamic team of engineers and researchers to create impactful applications that make a difference.
Periodic Labs
About Periodic Labs
Periodic Labs is an innovative AI and physical sciences laboratory that is pioneering advanced models to drive groundbreaking scientific discoveries. With robust funding and a rapid growth trajectory, we foster a culture of ownership where our team members proactively identify and tackle challenges without the constraints of bureaucracy. We thrive on learning new tools and scientific principles to propel our mission forward.
About the Role
We are seeking a product-oriented generalist engineer to transform Periodic's advanced models into practical tools for scientists, both within our labs and in external customer environments. You will prototype comprehensive products that integrate AI models with real-world laboratory systems, collaborating closely with scientists, model researchers, and customers to refine product features and inform model optimization and deployment strategies.
As one of the first product engineers at Periodic, you will play a pivotal role in shaping the product engineering culture within our organization, influencing both internal tools and customer-facing applications.
Responsibilities
Rapidly prototype and iteratively develop sharp, effective products that scientists rely on daily, adapting based on direct feedback from laboratory settings.
Design lightweight applications such as chat interfaces, dashboards, and experiment planners that leverage our cutting-edge models and scientific data.
Work collaboratively with model researchers to establish product parameters, telemetry, and signals that shape model objectives and reinforcement learning reward mechanisms.
Take full ownership of the development stack, from schema design to API integration and minimal frontend deployment, ensuring observability throughout.
Engage directly with materials scientists to observe workflows, identify bottlenecks, and proactively build useful solutions before they are requested.
Robinhood Markets, Inc.
Be a Part of the Financial Revolution.
Our vision is to democratize finance for everyone. An estimated $124 trillion of assets will transition to younger generations over the next two decades, marking the largest wealth transfer in history. If you are eager to be at the forefront of this monumental cultural and financial transformation, we invite you to continue reading.
About Our Team and Your Role
We are assembling an elite team committed to leveraging cutting-edge technologies to tackle the world’s most pressing financial challenges. We seek innovative thinkers, adept problem-solvers, and builders who are driven to make a significant impact. Robinhood is not a place for mediocrity; it is where ambitious individuals achieve the pinnacle of their careers. Our culture is characterized by high-performance, rapid pace, and a commitment to ethics in all that we do. The stakes are high, but so are the rewards.
The Crypto Custody team is dedicated to creating a state-of-the-art custody platform that supports both custodial and non-custodial blockchain products, ensuring exceptional security and reliability. We manage Robinhood’s foundational crypto custody infrastructure, directly contributing to the launch of new products, asset diversification, and global expansion.
As a Staff Software Engineer, you will be responsible for designing and scaling secure systems for digital asset custody, transforming manual processes into automated, policy-driven solutions. You will lead critical custody initiatives, enhance system reliability and compliance, and work collaboratively with cross-functional teams to ensure the secure management of digital assets on a large scale.
This position is primarily based in our New York, NY office, with a requirement for in-person attendance at least three days a week.
At Robinhood, we recognize the value of in-person collaboration in accelerating progress, inspiring innovation, and fostering community. Our office environment is intentional, invigorating, and tailored to support high-performing teams.
BillionToOne
Join BillionToOne as an AI Engineering Intern, where you will have the opportunity to work on cutting-edge artificial intelligence projects that impact the future of healthcare. This role is perfect for innovative thinkers eager to apply their academic knowledge in a real-world setting, contributing to product development and data analysis.
Mainspring Energy
Join Mainspring Energy as a Staff Electrical Engineer specializing in Power Electronics. In this pivotal role, you will leverage your expertise in electrical engineering to drive innovations in power electronics, contributing to the development of clean, reliable energy solutions. Collaborate with a dynamic team of engineers and researchers dedicated to transforming the energy landscape.
Periodic Labs
About Periodic Labs
Periodic Labs is an innovative AI and physical sciences laboratory dedicated to constructing cutting-edge models aimed at facilitating groundbreaking scientific discoveries. With substantial funding and rapid growth, our team members are empowered as owners who proactively identify and solve challenges without the constraints of bureaucracy. We are passionate about embracing new tools and scientific insights to advance our mission.
About the Role
As a Distributed Training Engineer, you will be at the forefront of optimizing, operating, and developing large-scale distributed LLM training systems that drive AI scientific research. Collaborating closely with researchers, you will support mid-training and reinforcement learning workflows, troubleshoot issues, and maintain seamless operations. You will also build tools and directly contribute to pioneering experiments, ensuring that Periodic Labs remains the premier AI and science lab for physicists, computational materials scientists, AI researchers, and engineers. Additionally, you will play a role in advancing open-source large-scale LLM training frameworks.
At GRAIL, our mission is clear: to detect cancer at its earliest stages, offering a chance for successful treatment. We are dedicated to altering the course of cancer mortality and uniting various stakeholders to embrace groundbreaking, safe, and effective technologies that revolutionize cancer care.
As a pioneering healthcare organization, we are at the forefront of innovative technologies aimed at early cancer detection. Our multidisciplinary team consists of scientists, engineers, and physicians who harness the power of next-generation sequencing (NGS), large-scale clinical studies, and cutting-edge data science to tackle one of medicine's most formidable challenges.
Headquartered in the vibrant Bay Area, GRAIL also has locations in Washington, D.C., North Carolina, and the United Kingdom, backed by prominent global investors and leaders in the pharmaceutical, technology, and healthcare sectors.
To learn more about our transformative work, please visit grail.com
We are seeking a driven and impactful Senior Staff Product Security Engineer to be a pivotal technical leader in advancing our product security initiatives across the organization. Reporting to the Director of Product Security, you will play an essential role in ensuring the security and resilience of our products, vital to GRAIL's life-saving mission.
As a senior individual contributor, you will lead the technical execution of our Product Security roadmap, collaborating closely with Engineering and Product teams, and mentoring fellow security engineers. Your influence will extend to architectural and development decisions throughout the product lifecycle, enabling teams to navigate the evolving threat landscape while ensuring efficient delivery in a regulated environment.
Flexible Work Arrangement – Menlo Park (MPK) – 3 Days in Office
This position is located in Menlo Park, California, transitioning to Sunnyvale, California in Fall 2026. GRAIL offers a flexible work environment, allowing for a mix of on-site and remote work. Currently, our policy requires a minimum of 60% (24 hours) of your workweek to be on-site, with specific schedules determined collaboratively with your manager to align with team and business needs.
Robinhood Markets, Inc.
Robinhood Markets, Inc. is advancing its AI capabilities to support the future of finance. The Artificial Intelligence team designs and maintains the machine learning infrastructure that powers both new features and internal improvements across the Robinhood platform.
Role overview
The Senior Engineering Manager, AI, will oversee the team responsible for building and scaling Robinhood’s core AI systems. This includes managing the design and rollout of machine learning models for a variety of products and internal tools. The position plays a key part in helping product and engineering teams deliver and refine AI-driven features that address customer needs.
What you will do
Lead the development of foundational AI and machine learning infrastructure
Manage teams working on the creation and deployment of ML and AI models for both customer products and internal processes
Guide the adoption and use of Large Language Models (LLMs) to create scalable solutions
Influence the engineering organization’s approach to advanced modeling and developer platforms
Requirements
Extensive experience in software engineering
Proven track record applying machine learning at scale
Strong leadership background, particularly with developer-facing platform teams
Strategic vision for driving AI adoption and innovation within a large organization
Location and collaboration
This role is based onsite in either Menlo Park, CA or Bellevue, WA. In-person attendance is required five days per week. Robinhood emphasizes in-person collaboration to accelerate progress, encourage innovation, and strengthen its community. For additional context on the evolving landscape in finance, see this report on the $124 trillion wealth transfer expected by 2048.
Robinhood Markets, Inc.
Be a Pioneer in the Future of Finance.
Join us at Robinhood, where our mission is to democratize finance for everyone. With an anticipated $124 trillion of wealth set to transfer to younger generations over the next two decades, we're at the forefront of this historic shift. If you're ready to play a pivotal role in transforming the financial landscape, we want to hear from you!
About Our Team and Your Role
We are assembling a top-tier team dedicated to tackling the most pressing financial challenges using cutting-edge technologies. We're seeking innovative thinkers and builders eager to make a substantial impact. At Robinhood, you'll take ownership of your work and help enhance financial accessibility for all while upholding high standards, accountability, and a profound commitment to security and ethics in our creations.
The Red Team's objective is to pinpoint and mitigate genuine security threats across Robinhood by emulating adversary tactics and testing our defenses. As a Staff Offensive Security Engineer, you'll lead security evaluations across applications, infrastructure, and physical environments. You will closely collaborate with engineering and security teams to bolster detection and response strategies, prioritize risks, aid in remediation efforts, and create tools and techniques that enhance our security testing processes. Your contributions will be vital in ensuring the safety and reliability of our products, which serve millions of customers.
This role is based in our Menlo Park, CA office, with in-person participation required at least 3 days per week.
Robinhood champions the benefits of in-person collaboration to accelerate innovation, spark creativity, and foster community. Our office environment is designed to be engaging and supportive for high-achieving teams.
Location: Menlo Park, CA
Employment Type: Full-Time
Compensation: Competitive salary + significant milestone-based equity
About Glade.ai
Glade.ai is revolutionizing AI-driven automation to empower service industries to operate with enhanced efficiency and speed. Our comprehensive platform automates workflows, simplifies processes, and facilitates team scalability, allowing sectors such as legal, insurance, and professional services to prioritize delivering exceptional value to their clients.
The Opportunity
We are seeking an experienced Staff Software Engineer to take on a pivotal role as a technical leader and mentor within our expanding product engineering team. This is an extraordinary chance to influence the architecture, engineering culture, and product trajectory of a fast-growing company from its inception.
In this role, you will guide a team of software engineers, fostering innovation and ensuring the delivery of high-impact, scalable, and elegant solutions.
What You’ll Do
Collaborate with Product, Design, and other stakeholders to identify customer requirements and devise effective solutions.
Lead the design and execution of significant product and platform modifications across Glade’s software infrastructure.
Elevate the standards of the engineering organization by offering technical guidance and mentorship, developing tooling, and enhancing engineering workflows.
Navigate seamlessly through the entire product lifecycle, from design and implementation to production.
What We’re Looking For
8+ years of experience in software engineering, with substantial experience in leadership or principal roles.
Strong grasp of engineering principles, industry standards, and emerging technologies, particularly within our tech stack: TypeScript, React, Node.js, PostgreSQL.
Proven experience in constructing and refining production systems to support new product functionalities and accommodate growing demand.
Experience in developing or prototyping LLM-powered workflows using inference providers like OpenAI.
A strong commitment to enhancing the end user experience and a passion for creating products that address customer needs.
A humble disposition, eagerness to learn, and a collaborative spirit.
Robinhood Markets, Inc.
Be a part of shaping the future of finance.
Our mission is to make finance accessible for everyone. With an estimated $124 trillion poised to be inherited by younger generations over the next two decades, we are at the forefront of a monumental wealth transfer. If you're prepared to engage in this pivotal cultural and financial evolution, we invite you to explore this opportunity.
About the Team and Role
We are assembling a premier team dedicated to tackling significant financial challenges through innovative technologies. We seek bold visionaries, adept problem-solvers, and builders motivated to create meaningful change. At Robinhood, complacency has no place; we are where ambitious individuals excel and achieve their career best. Our high-performing, dynamic team operates with ethics at the core of our work. High expectations yield high rewards.
The Learning & Development team at Robinhood enhances performance on a large scale by providing impactful learning experiences. We empower our employees to excel through carefully crafted programs and streamlined, data-driven learning processes that support their journey, from onboarding through professional growth and compliance training. We believe that the most effective learning experiences are practical, integrated into daily workflows, and focused on driving performance excellence. Our team harmonizes creativity with practicality, developing engaging, scalable programs aligned with Robinhood’s culture, brand, and business objectives.
As the Manager of Learning & Development, you will oversee the design, quality, and implementation of learning experiences throughout Robinhood. You will lead a dedicated team and collaborate with various business units to transform complex content into impactful, scalable training programs. Your role will set high standards for the quality of learning design, ensure uniformity across programs, and spearhead innovation in learning methods and tools. This position demands expertise in both strategic oversight and hands-on execution: leading a team while occasionally participating in content design.
This position is based in our Menlo Park, CA office, with an expectation of in-person attendance at least four days a week.
At Robinhood, we recognize the value of in-person collaboration in fostering progress, igniting innovation, and building community. Our office environment is intentionally designed to be invigorating and fully supportive of high-performing teams.
Responsibilities
Lead and nurture the learning design team, fostering a culture of continuous improvement and innovation.
Robinhood Markets, Inc.
Be a Part of the Financial Revolution.
At Robinhood, our objective is to make finance accessible to everyone. With an anticipated $124 trillion in assets set to be passed down to younger generations in the coming two decades, we are positioned at the forefront of a monumental cultural and financial transition. If you are excited about playing a vital role in this change, we want to hear from you.
About Our Team and Your Role
We are assembling a top-tier team dedicated to applying cutting-edge technologies to resolve the world’s most pressing financial challenges. We seek innovative thinkers and adept problem-solvers, individuals who are driven to make a significant impact. Robinhood is a dynamic environment where ambitious professionals thrive and deliver their best work. Our high-performing team is guided by strong ethical principles, with high expectations and equally high rewards.
The Product and Application Security team is responsible for establishing and managing systems that empower engineers to identify and mitigate security risks early in the software development lifecycle. Our focus is on creating actionable safeguards, such as libraries and frameworks, that make secure development the standard practice. You will collaborate closely with engineering teams to minimize risks while facilitating efficient product development.
As a Staff Security Engineer, you will act as a technical leader and expert, guiding the integration of security into development practices throughout Robinhood. You will be responsible for designing and implementing systems that provide clear security guidelines for engineers, advising on architecture and threat models, and influencing long-term security strategy. This position offers a high-impact opportunity to shape the future of application security within our organization!
This role is based in our Menlo Park, CA office, with a requirement for in-person attendance at least three days per week.
At Robinhood, we value in-person collaboration as a means to accelerate progress, ignite innovation, and foster community. Our office environment is intentionally crafted to be stimulating and fully supportive of high-performing teams.
