Senior Engineering Manager AI/ML Serving Platform jobs in San Francisco – Browse 10,254 openings on RoboApply Jobs

Senior Engineering Manager AI/ML Serving Platform jobs in San Francisco

Open roles matching “Senior Engineering Manager AI/ML Serving Platform” with location signals for San Francisco. 10,254 active listings on RoboApply Jobs.


1 - 20 of 10,254 Jobs
Pinterest
Full-time|$208.6K/yr - $429.5K/yr|Remote|San Francisco, CA, US; Remote, US

About Pinterest: At Pinterest, our platform inspires millions of people around the globe to explore creative ideas, envision new possibilities, and create lasting memories. We are dedicated to providing the inspiration needed to build a fulfilling life, starting with the talented individuals who drive our product development. Join us in a career that sparks innovation for millions, transforms passion into opportunities for growth, and celebrates the diverse experiences of our team members, all while enjoying the flexibility to perform at your best. Building a career you love is within reach.

Position Overview: We are looking for a Senior Engineering Manager to spearhead our AI/ML Serving Platform team, which develops the core tools and infrastructure utilized by numerous AI/ML engineers across Pinterest. This includes systems for recommendations, advertisements, visual search, notifications, and trust and safety. Our goal is to enhance the efficiency, quality, and speed of AI/ML systems, ensuring they are production-ready and reliable for iterative model development.

Key Responsibilities:
- Lead the team in driving continuous improvements in advanced model architectures, optimizing resource usage, and boosting AI/ML developer productivity.
- Establish the technical vision for the team, aligned with company and organizational priorities.
- Mentor and cultivate talent within the team.

Qualifications:
- Proven experience managing engineering teams with diverse cross-organizational clients.
- Expertise in developing large-scale distributed serving systems.
- Familiarity with AI/ML inference technologies (e.g., PyTorch, TensorFlow) for web-scale online serving.
- Bachelor's degree in Computer Science or a related field, or equivalent professional experience.

Feb 11, 2026
Databricks
Full-time|$217K/yr - $312.2K/yr|On-site|San Francisco, California

At Databricks, we are dedicated to empowering data teams to tackle the most challenging global issues, whether that means transforming transportation or speeding up medical advancements. We achieve this by constructing and managing the world's leading data and AI infrastructure platform, enabling our clients to leverage deep data insights for business enhancement.

The Model Serving product at Databricks offers enterprises a cohesive, scalable, and governed platform for deploying and managing AI/ML models, from conventional ML to sophisticated, proprietary large language models. It facilitates real-time, low-latency inference while providing governance, monitoring, and lineage capabilities. As AI adoption surges, Model Serving becomes a central component of the Databricks platform, allowing customers to operationalize models efficiently and cost-effectively.

As a Senior Engineering Manager, you will lead a team responsible for both the product experience and the underlying infrastructure of Model Serving. This role involves shaping user-facing features while architecting for scalability, extensibility, and performance across CPU and GPU inference. You will collaborate closely with teams across the platform, product, infrastructure, and research domains.

Feb 1, 2026
Databricks
Full-time|$166K/yr - $225K/yr|On-site|San Francisco, California

At Databricks, we are dedicated to empowering data teams to tackle some of the most challenging issues of our time, from realizing the future of transportation to speeding up medical innovations. We achieve this by developing and maintaining the premier data and AI infrastructure platform, allowing our clients to leverage profound data insights to enhance their operations.

Our Model Serving product equips organizations with a cohesive, scalable, and governed platform for deploying and overseeing AI/ML models, spanning traditional ML to specialized large language models. It provides real-time, low-latency inference, governance, monitoring, and lineage capabilities. With the rapid rise of AI adoption, Model Serving stands as a fundamental component of the Databricks platform, enabling clients to operationalize models efficiently and cost-effectively at scale.

As a Senior Engineer, you will be pivotal in shaping both the product experience and the underlying infrastructure of Model Serving. You will design and build systems enabling high-throughput, low-latency inference across CPU and GPU workloads, influence architectural strategies, and work closely with platform, product, infrastructure, and research teams to deliver an exceptional serving platform.

Jan 30, 2026
Whatnot
Full-Time|On-site|San Francisco, CA

Embrace the Future of Commerce with Whatnot! Whatnot is North America and Europe’s premier live shopping platform, dedicated to transforming the way you buy, sell, and discover your favorite items. We are on a mission to redefine e-commerce by seamlessly merging community engagement, shopping, and entertainment into a unique experience tailored just for you. As part of a remote, co-located team, we thrive on innovation while staying firmly rooted in our core values. With operational hubs across the US, UK, Germany, Ireland, and Poland, we are collaboratively shaping the future of online marketplaces.

Our live auctions span a diverse range of categories, from fashion and beauty to electronics and collectibles, including trading cards, comic books, and even live plants. There’s truly something for everyone! And this is just the beginning: as one of the fastest-growing marketplaces, we are in search of bold, innovative problem solvers across all functional areas. Stay updated with the latest Whatnot news through our news and engineering blogs, and join us in empowering individuals to transform their passions into thriving businesses, fostering connections through commerce.

Your Role
We are seeking hands-on leaders: intellectually curious and technically proficient individuals ready to influence the future of AI and ML at Whatnot. In this pivotal role, you will spearhead the development and scaling of the foundational infrastructure that supports machine learning and self-hosted large language model applications across our organization. Collaborating closely with machine learning scientists, you will drive the implementation of innovative models powered by near-real-time features, enhancing product experiences. This entails building robust systems that make advanced ML both reliable and efficient at scale, from low-latency deep learning model serving and streaming feature ingestion to distributed training and high-throughput GPU inference.

Although this is a managerial role, a strong technical foundation is essential, and candidates should be enthusiastic about diving deep into the details. You will elevate architectural discussions, provide insightful technical feedback, and dedicate at least one day a week to coding.

Your Responsibilities:
- Lead the infrastructure supporting AI and ML models across critical business areas, including growth, recommendations, trust and safety, fraud detection, seller tooling, and more.
- Oversee the prototyping, deployment, and productionization of innovative ML architectures, ensuring they align with our strategic objectives.

Jan 15, 2026
Scale AI
Full-time|$216.2K/yr - $270.3K/yr|On-site|San Francisco, CA; New York, NY

Join our dynamic Machine Learning Infrastructure team as a Senior AI Infrastructure Engineer, where you will play a pivotal role in designing and constructing platforms that ensure the scalable, reliable, and efficient serving of Large Language Models (LLMs). Our platform supports a range of cutting-edge research and production systems, catering to both internal and external applications across diverse environments.

The ideal candidate will possess a solid foundation in machine learning principles coupled with extensive experience in backend system architecture. You will thrive in a collaborative environment that bridges research and engineering, working to provide seamless experiences for our customers and accelerating innovation across the organization.

Mar 26, 2026
Sciforium
Full-time|On-site|San Francisco

At Sciforium, we are at the forefront of AI infrastructure, dedicated to developing advanced multimodal AI models and a high-efficiency serving platform. With substantial funding and direct collaboration with AMD, our team is rapidly expanding to create the complete stack for pioneering AI models and dynamic real-time applications.

Role Overview
This position offers a distinct opportunity to engage with the fundamental systems that drive Sciforium's multimodal AI models. You will play a crucial role in building the model serving platform, working with C++, Python, runtime execution, and distributed infrastructure to design a fast, dependable engine for real-time AI applications.

You will gain practical experience in performance engineering, learn how large AI models are optimized and deployed at scale, and collaborate closely with ML researchers and seasoned systems engineers. If you thrive in low-level programming and are passionate about performance, this role offers both impactful contributions and significant growth opportunities.

Nov 15, 2025
Parafin
Full-time|On-site|San Francisco, CA

About Us: At Parafin, our mission is to empower small businesses to thrive in today's competitive landscape. Small businesses form the backbone of our economy, yet they often face challenges in accessing essential financial resources. Our technology streamlines access to vital financial tools directly on the platforms they already use for sales. Partnering with industry leaders such as DoorDash, Amazon, Worldpay, and Mindbody, we provide small businesses with fast, flexible funding, efficient spend management, and effective savings solutions through simple integrations. Parafin manages the complexities of capital markets, underwriting, servicing, compliance, and customer support to ensure seamless experiences for our partners and their small business clients.

We are a dynamic team of innovators with backgrounds from firms like Stripe, Square, Plaid, Coinbase, Robinhood, and CERN, all driven by a passion for building tools that help small businesses succeed. Backed by venture capitalists including GIC, Notable Capital, Redpoint Ventures, Ribbit Capital, and Thrive Capital, Parafin is a Series C company with over $194M raised in equity and $340M in debt facilities. Join us in shaping a future where every small business has access to the financial tools it needs.

About the Position
We are looking for a skilled Software Engineer to join our Infrastructure team and spearhead the advancement of our Machine Learning (ML) Platform. This role is essential for building reliable, scalable, and developer-centric systems for model experimentation, training, evaluation, inference, and retraining that drive underwriting and other ML-powered products for small businesses. You will design, build, and maintain the core frameworks and platforms that let data scientists deploy high-quality models into production efficiently and safely. You'll work closely with Data Science and Platform Engineering, taking end-to-end ownership of the ML platform, and develop both batch and real-time underwriting infrastructure.

What You'll Do
- Transform notebooks into reliable software: break down data scientist training and inference notebooks into reusable, well-tested components (libraries, pipelines, templates) with clear interfaces and documentation.
- Develop user-friendly ML abstractions: create SDKs, CLIs, and templates that simplify defining features, training and evaluating models, and deploying to batch or real-time targets with minimal boilerplate.
- Build our real-time ML inference platform: establish and scale low-latency model serving capabilities.
- Enhance batch ML inference: optimize scheduling, parallelism, cost controls, and observability to improve efficiency.

Jan 5, 2026
Pinterest, Inc.
Full-time|$145.7K/yr - $300.1K/yr|Remote|San Francisco, CA, US; Remote, US

Join Pinterest: At Pinterest, we empower millions globally to discover creative ideas, envision new possibilities, and curate lasting memories. Our mission is to inspire everyone to create a life they love, driven by the talented individuals behind our platform. Embark on a career where you can fuel innovation for millions, transform your passions into growth opportunities, value diverse experiences, and enjoy the flexibility to thrive. Building a career you love is absolutely achievable.

At Pinterest, AI is not just an enhancement; it is a critical partner that enhances creativity and expands our reach. We seek candidates eager to embrace this journey. Throughout our interview process, we prioritize your ability to articulate your thought processes, showcasing not only your knowledge but also your collaborative skills with AI. You can learn more about our AI interview philosophy and its role in our recruitment process.

The Team: As a Technical Program Manager at Pinterest, you will take ownership with a proactive approach and technical acumen. The Platform Team oversees program governance in Infrastructure, Infra Finance, Data Engineering and Security, Compliance, and cloud budget management.

Your Responsibilities: As a Staff Technical Program Manager focused on cross-engineering projects, you will lead strategic initiatives vital to enhancing Pinterest’s ML/AI Platform and foundational infrastructure.
- Lead Strategic ML/AI Platform Programs: champion and execute high-impact, cross-engineering initiatives essential for advancing Pinterest's ML/AI Platform, GenAI infrastructure, and Agent Platform, driving outcomes from initial concept through measurable execution.
- AI-First Execution Mindset: employ GenAI as the primary mode of program execution, producing AI-assisted drafts of core program documents and modernizing high-effort workflows.

Apr 7, 2026
Air Apps
Full-time|On-site|San Francisco

Join Our Team at Air Apps
At Air Apps, we are on a mission to revolutionize resource management through innovative technology. Founded in 2018 in Lisbon, Portugal, we have expanded our reach with offices in both Lisbon and San Francisco, boasting over 100 million downloads globally. Our vision is to create the world’s first AI-powered Personal & Entrepreneurial Resource Planner (PRP), and we are looking for passionate individuals to help us achieve this goal.

Our commitment to challenging the status quo drives us to push the boundaries of AI-driven solutions that make a real impact. Here, you will have the opportunity to be a creative force, developing products that empower individuals worldwide. Join us as we embark on this journey to redefine how people plan, work, and live.

Feb 25, 2025
Faire
Full-time|$268K/yr - $368.5K/yr|On-site|San Francisco, CA

About Faire
Faire is a transformative online wholesale marketplace, driven by the conviction that local businesses are the future. Independent retailers around the globe generate more revenue than massive corporations like Walmart and Amazon combined, yet individually they remain small. At Faire, we harness technology, data, and machine learning to connect this vibrant community of entrepreneurs. Think of your favorite local boutique: we empower them to discover and sell the best products from around the world. With our tools and insights, we aim to level the playing field, enabling small businesses to thrive against larger competitors. By championing the growth of independent businesses, Faire positively impacts local economies on a global scale. We’re in search of intelligent, resourceful, and passionate individuals to join us in fueling the shop local movement. If you value community, we invite you to be part of ours.

About this Role
As the Senior Staff Machine Learning Platform Engineer, you will spearhead the technical vision and evolution of Faire's ML platform. You will establish standards, influence organization-wide architecture, and lead intricate, cross-functional initiatives that enhance data science velocity at scale. This position is crucial for adapting ML workflows to leverage modern AI productivity tools. You will not only develop models but also design the systems that enable those models to empower tens of thousands of small retailers in competing and growing their local businesses.

Mar 4, 2026
Lyft, Inc.
Full-time|$185K/yr - $222K/yr|On-site|San Francisco, CA

Lyft’s Self-Serve Intelligence team builds the systems that help riders and drivers resolve issues on their own. Part of the Safety & Customer Care organization, this group focuses on backend services, APIs, and AI-powered products that let customers get help without waiting for an agent. The team’s work includes AI Assist (such as AI Agents), automations, and self-service workflows, all designed to make support fast and reliable.

Role overview
As a Senior Software Engineer on this team, your main responsibility is to design, build, deploy, and maintain backend systems and AI-driven tools that handle customer problems automatically. These solutions use Generative AI and automation to deliver scalable, dependable self-service experiences for millions of Lyft riders and drivers.

What you will do
- Design and develop backend services and APIs for AI-powered self-service products
- Build and maintain AI Agents and automation tools that resolve customer issues without agent involvement
- Oversee the full development lifecycle: system design, prototyping, deployment, and ongoing operations
- Work closely with product managers, designers, data scientists, and operations teams to deliver robust solutions
- Focus on reliability, scalability, and operational excellence in all systems

Location
This role is based in San Francisco, CA.

Apr 17, 2026
Cloudflare, Inc.
Full-time|Hybrid

Join our dynamic team at Cloudflare as a Senior/Principal Systems Engineer specializing in Workers AI (AI/ML). In this pivotal role, you will leverage your expertise in artificial intelligence and machine learning to develop cutting-edge solutions that enhance our platform's capabilities. You will collaborate with cross-functional teams to drive innovation and improve our systems, ensuring we remain at the forefront of technology.

Feb 6, 2026
Lemurian Labs
Full-time|On-site|SF Bay Area

About Us
At Lemurian Labs, we are dedicated to democratizing AI technology while prioritizing sustainability. Our mission is to create solutions that minimize environmental impact, ensuring that artificial intelligence serves humanity positively. We are committed to responsible innovation and the sustainable growth of AI.

We are developing a state-of-the-art, portable compiler that empowers developers to 'build once, deploy anywhere.' This technology ensures seamless cross-platform integration, allowing model training in the cloud and deployment at the edge, all while maximizing resource efficiency and scalability. If you are passionate about scaling AI sustainably and eager to make AI development more powerful and accessible, we invite you to join our team at Lemurian Labs. Together, we can build a future that is innovative and responsible.

The Role
We are seeking a Senior ML Performance Engineer to design and lead our Performance Testing Platform from inception. In this pivotal role, you will be the technical expert in measuring, validating, and enhancing the performance of large language models (including Llama 3.2 70B, DeepSeek, and others) before and after compiler optimization on cutting-edge GPU architectures. This is a critical position that will significantly impact our product quality and customer success. You will work at the intersection of machine learning systems, GPU architecture, and performance engineering, building the infrastructure that substantiates the value of our compiler.

Oct 31, 2025
Runway ML
Full-time|Remote|San Francisco

At Runway ML, we are revolutionizing the intersection of art and science through innovative AI technology. Our mission is to build sophisticated world models that transcend traditional artificial intelligence limitations. We believe that to tackle the most pressing challenges, such as robotics, disease, and scientific breakthroughs, we need systems that can learn from experience much like humans do. By simulating these experiences, we can expedite progress in ways that were previously unimaginable. Our diverse and driven team consists of creative thinkers who are passionate about pushing boundaries and achieving the extraordinary. If you share this ambition and are eager to contribute to our groundbreaking work, we invite you to join us.

About the Role
We are open to hiring remotely across North America, and we also have offices in NYC, San Francisco, and Seattle. We are looking for a highly skilled and intellectually inquisitive Technical Accounting Manager to be our go-to authority on intricate accounting issues. This position offers significant visibility and is ideal for a professional adept at interpreting complex accounting guidelines, formulating sound conclusions, and translating technical insights into practical accounting practices.

Mar 17, 2026
Pluralis Research
Full-time|On-site|San Francisco

Overview
Pluralis Research is at the forefront of Protocol Learning, innovating a decentralized approach to training and deploying AI models that democratizes access beyond well-funded corporations. By aggregating computational resources from diverse participants, we incentivize collaboration while safeguarding against centralized control of model weights, paving the way for a truly open and cooperative environment for advanced AI.

We are seeking a talented Machine Learning Training Platform Engineer to design, develop, and scale the core infrastructure that powers our decentralized ML training platform. In this role, you will own essential systems including infrastructure orchestration, distributed computing, and service integration, facilitating ongoing experimentation and large-scale model training.

Responsibilities
- Multi-Cloud Infrastructure: create resource management systems that provision and orchestrate computing resources across AWS, GCP, and Azure using infrastructure-as-code tools like Pulumi or Terraform. Manage dynamic scaling, state synchronization, and concurrent operations across hundreds of diverse nodes.
- Distributed Training Systems: design fault-tolerant infrastructure for distributed machine learning, including GPU clusters, the NVIDIA runtime, S3 checkpointing, large-dataset management and streaming, health monitoring, and resilient retry strategies.
- Real-World Networking: develop systems that simulate and manage real-world network conditions, such as bandwidth shaping, latency injection, and packet loss, while accommodating dynamic node churn and ensuring efficient data flow across workers with varying connectivity, since our training occurs on consumer nodes and non-co-located infrastructure.

Apr 1, 2026
Draup
Full-time|On-site|San Francisco, CA

Draup, based in San Francisco, is a Series A-funded AI company that builds intelligence solutions for large enterprises. The platform analyzes over 1 billion job descriptions and 850 million professional profiles, serving more than 250 enterprise clients, including several Fortune 10 companies. Draup’s data comes from more than 100 labor databases, supporting clients with deep workforce insights.

Role overview
The engineering team in Silicon Valley is expanding. Draup seeks experienced AI/ML Engineers interested in advancing both research and product development in artificial intelligence.

What you will do
- Develop and maintain production-grade large language model (LLM) pipelines and agentic workflows.
- Design and enhance retrieval-augmented generation (RAG) architectures at scale, using vector databases such as Pinecone, FAISS, and Weaviate.
- Implement advanced agentic systems with tools like LangGraph or LlamaIndex, focusing on tool use, multi-agent coordination, and reasoning loops.
- Lead prompt engineering, manage model versioning, oversee evaluation (including RAGAS and DeepEval), and instrument LLMOps.
- Integrate AI features into large-scale data pipelines, ensuring observability in production and compliance with guardrails.

Location
This position is based in San Francisco, CA (Silicon Valley).

Apr 22, 2026
Hinge Health
Full-time|Hybrid|San Francisco-HQ

About the Role
Hinge Health is a leading digital health company dedicated to delivering innovative, evidence-based solutions for musculoskeletal (MSK) pain management. Our approach combines personalized exercise therapy with virtual care, empowering individuals to effectively manage chronic pain and enhance their quality of life, all while reducing healthcare costs. By partnering with employers and health plans, we aim to scale our solutions and improve overall employee health and productivity.

Join our AI platform team at Hinge Health, where we are at the forefront of revolutionizing how businesses harness the power of artificial intelligence. We are searching for a Senior Software Engineer with a robust background in software engineering and a passion for AI to contribute to our mission. As a Senior Software Engineer, you will play a crucial role in developing and maintaining vital components of our AI infrastructure. You will collaborate closely with engineers and data scientists to ensure the platform effectively supports the intricate needs of AI models and machine learning workflows.

Our team thrives in a continuous-deployment DevOps culture and takes pride in maintaining high standards for production code. Our production systems leverage technologies such as React Native, React, Node.js, TypeScript, Python, NestJS, GraphQL, Docker, AWS, Postgres, Redis, Apollo, and Redux. We follow a trunk-based CI/CD workflow and uphold the highest security and compliance standards, including HIPAA, HITRUST, SOC 2, and CCPA.

What You'll Accomplish
- Collaborate with the AI/ML team to integrate models and agents into the platform, ensuring seamless deployment.
- Write clean, maintainable code with a focus on performance, reliability, and scalability.
- Develop and sustain APIs and microservices that facilitate AI model and agent deployment and data processing.
- Troubleshoot and resolve platform performance, integration, and reliability issues.
- Work alongside cross-functional teams to gather technical requirements and deliver efficient solutions.
- Contribute to ongoing improvements in the AI platform’s systems and workflows, providing ideas and feedback during architectural discussions.

Hinge Health Hybrid Model
We recognize that both remote and in-person work offer unique benefits, and we aim to capitalize on the strengths of both approaches. Employees in hybrid roles enjoy the flexibility of working from home while also engaging in person as needed.

Mar 10, 2026
Quizlet, Inc.
Full-time|On-site|San Francisco, CA

Quizlet, Inc. supports millions of learners each month by combining cognitive science with advanced machine learning. The platform serves two-thirds of U.S. high school students and half of college students, powering over 2 billion learning interactions monthly. Quizlet’s mission centers on making education more personal and effective for students, professionals, and lifelong learners.

The AI & Data Platform team underpins Quizlet’s applied AI initiatives. This group develops and maintains the systems behind personalization, recommendations, the AI Coach, content generation, and emerging agentic experiences. The team oversees the full machine learning model lifecycle: data and feature engineering, training, evaluation, deployment, and inference. Reliability, speed, security, and observability guide their work. Their approach blends managed Google Cloud services, top vendor tools, open-source solutions, and custom internal abstractions to achieve efficient, reliable outcomes.

Role overview
The Senior Staff Engineer, AI Platform, is a senior individual contributor who defines the technical direction for Quizlet’s next generation of machine learning and large language model infrastructure. This hands-on role involves architecting core platform systems, steering build-versus-buy decisions, and collaborating with teams across Applied AI, Data Science, Product Engineering, and Infrastructure. The position sets standards for how models and LLM-driven systems are trained, evaluated, deployed, and governed at scale. The role is well suited to an engineer who excels at the senior-staff level in large organizations but values the autonomy and impact of a smaller, cloud-native setting. The technology stack includes Google Cloud, Kubernetes and GKE, distributed training, MLflow workflows, data and feature platforms, online and asynchronous inference, and the evaluation and observability tools needed to operate predictive ML and LLM systems at scale.

Work location and schedule
This is an onsite position based in San Francisco, CA. Team members are expected in the office at least three days a week (Monday, Wednesday, and Thursday) to foster collaboration.

Apr 23, 2026
Sigma Computing
Full-time|$240K/yr - $270K/yr|On-site|San Francisco, CA

Role Overview
Sigma Computing is building the next generation of data interaction. The platform lets users explore and analyze billions of rows of data in seconds, all within a familiar spreadsheet-like interface. Sigma aims to make it simple to analyze, present, and build data-driven applications at scale. AI is central to Sigma's vision for the future: the company is expanding its use of artificial intelligence to help users build in Sigma, surface insights, and make decisions faster.

What You Will Do
As a Senior AI/ML Engineer, you will join a team focused on shaping the AI architecture behind Sigma's platform. This work directly impacts thousands of enterprises that depend on Sigma for their data workflows. The team is responsible for designing and implementing the systems that will power Sigma's AI-driven features for years to come.

Location
This position is based in San Francisco, CA.

Apr 25, 2026
fabrion
Full-time|On-site|San Francisco Bay Area

ML/AI Research Engineer - Founding Team at Agentic AI Lab
Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary + meaningful equity (founding tier)

At fabrion, backed by 8VC, we are assembling a top-tier team dedicated to addressing one of the most pressing infrastructure challenges in the industry.

About the Role
Join us in shaping the future of enterprise AI infrastructure, focusing on agents, retrieval-augmented generation (RAG), knowledge graphs, and multi-tenant governance. As an ML/AI Research Engineer, you will spearhead the design, training, evaluation, and optimization of agent-native AI models. Your work will integrate LLMs, vector search, graph reasoning, and reinforcement learning, establishing the intelligence layer for our enterprise data fabric. This role goes beyond prompt engineering; it encompasses the entire ML lifecycle, from data curation and fine-tuning to thorough evaluation, interpretability, and deployment, all while considering cost-effectiveness, alignment, and agent coordination.

Core Responsibilities
- Fine-tune and evaluate open-source LLMs (e.g., LLaMA 3, Mistral, Falcon, Mixtral) for enterprise applications, leveraging both structured and unstructured data.
- Construct and enhance RAG pipelines utilizing LangChain, LangGraph, LlamaIndex, or Dust, integrating with our vector databases and internal knowledge graphs.
- Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task datasets.
- Develop embedding-based memory and retrieval chains employing token-efficient chunking strategies.
- Create reinforcement learning pipelines to improve agent behaviors (e.g., RLHF, DPO, PPO).
- Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evaluations, trace capture, and explainability tools.
- Contribute to model observability, drift detection, error classification, and alignment efforts.
- Optimize inference latency and GPU resource utilization across both cloud and on-premises environments.

Desired Experience
Model Training: deep understanding of machine learning principles and hands-on experience with model training.

Aug 28, 2025
