GPU Performance Engineer Member of Technical Staff jobs in San Francisco – Browse 5,951 openings on RoboApply Jobs

GPU Performance Engineer Member of Technical Staff jobs in San Francisco

Open roles matching “GPU Performance Engineer Member of Technical Staff” in San Francisco. 5,951 active listings on RoboApply Jobs.

1 - 20 of 5,951 jobs
Gimlet Labs
Full-time|On-site|San Francisco

At Gimlet Labs, we are pioneering the first heterogeneous neocloud tailored for AI workloads. As the demand for AI systems grows, traditional infrastructure faces significant limitations in terms of power, capacity, and cost. Our innovative platform addresses these challenges by decoupling AI workloads from the hardware, intelligently partitioning tasks, and…

Mar 10, 2026
Reka
Full-time|Remote|US, UK, Singapore

Join our dynamic team at Reka as a GPU Performance Engineer, where you will leverage your expertise in Python and large-scale model training to enhance our training infrastructure. You will play a pivotal role in optimizing model performance, contributing to critical technical decisions, and improving our post-training processes, including reinforcement learning and fine-tuning. Your contributions will also focus on enhancing the efficiency and scalability of our model serving infrastructure.

Jan 8, 2026
Liquid AI
Full-time|On-site|San Francisco

Join the Innovative Team at Liquid AI

Founded as a spin-off from MIT’s CSAIL, Liquid AI is at the forefront of developing cutting-edge AI systems that operate seamlessly across various platforms, including data center accelerators and on-device hardware. Our technology is designed to ensure low latency, efficient memory usage, privacy, and reliability. We collaborate with leading enterprises in sectors such as consumer electronics, automotive, life sciences, and financial services as we rapidly scale our operations. We are seeking talented individuals who are passionate about technology and innovation.

Your Role in Our Team

As a GPU Performance Engineer, your expertise will be critical in enhancing our models and workflows beyond the capabilities of standard frameworks. You will be responsible for designing and deploying custom CUDA kernels, conducting hardware-level profiling, and transforming research concepts into production code that yields tangible improvements in our pipelines (training, post-training, and inference). Our dynamic team values initiative and ownership, and we are looking for a candidate who thrives on tackling complex challenges related to memory hierarchies, tensor cores, and profiling outputs.

While San Francisco and Boston are preferred, we welcome applications from other locations.

Jul 29, 2025
Genmo
Full-time|On-site|San Francisco HQ

At Genmo, we are at the forefront of advancing artificial intelligence through innovative research in video generation. Our mission is to construct open, cutting-edge models that will ultimately contribute to the realization of Artificial General Intelligence (AGI). As part of our dynamic team, you will play a pivotal role in redefining the future of AI and expanding the horizons of video creation.

We are looking for a skilled GPU Performance Engineer who can extract maximum performance from our H100 infrastructure and fine-tune our model serving stack to achieve unparalleled efficiency. If you are passionate about optimizing performance, particularly at the microsecond level, and thrive on pushing hardware to its limits, this is the perfect opportunity for you.

Key Responsibilities
- Utilize advanced profiling tools such as Nsight Systems and nvprof to analyze and enhance GPU workloads.
- Develop high-performance CUDA and Triton kernels to optimize essential model functions.
- Reduce cold start latency from seconds to mere milliseconds in our serving infrastructure.
- Optimize memory access patterns, implement kernel fusion, and maximize GPU utilization.
- Collaborate closely with machine learning engineers to optimize model implementations.
- Diagnose and resolve performance issues throughout the application and hardware stack.
- Implement custom memory pooling and allocation strategies to enhance performance.
- Promote performance optimization techniques and foster a culture of excellence across teams.

Jul 17, 2025
Inferact
Full-time|$200K/yr - $400K/yr|Remote|San Francisco

At Inferact, we are on a mission to establish vLLM as the premier AI inference engine, significantly enhancing the speed and reducing the cost of AI inference. Our founders, the visionaries behind vLLM, have spent years bridging the gap between advanced models and cutting-edge hardware.

About the Role

We are seeking a skilled performance engineer dedicated to maximizing the computational efficiency of modern accelerators. In this role, you'll develop kernels and implement low-level optimizations that position vLLM as the fastest inference engine globally. Your contributions will be pivotal as your code will execute across a broad spectrum of hardware accelerators, from NVIDIA GPUs to the latest silicon innovations. You'll collaborate closely with hardware vendors to ensure we fully leverage the capabilities of each new generation of chips.

Jan 22, 2026
Wafer
Full-time|On-site|San Francisco

About the Position

At Wafer, we are on a mission to enhance intelligence per watt by developing AI systems that can self-optimize. Our journey begins with GPU kernels, and we aim to revolutionize every aspect of ML systems and AI infrastructure. We are a compact, dynamic team of four, supported by renowned investors including Fifty Years, Y Combinator, Jeff Dean, and Woj Zaremba, co-founder of OpenAI. We are seeking passionate engineers eager to innovate at the convergence of AI agents and systems programming.

In this role, you will collaborate closely with our founding team to create the systems that power our GPU optimization platform. Your projects will range from the agent framework that refines kernels to the profiling infrastructure that interfaces with NCU and ROCprofiler, as well as the compiler tools that scrutinize PTX and SASS.

Feb 4, 2026
Prime Intellect
Full-time|On-site|San Francisco

Join Our Mission to Build Open Superintelligence Infrastructure

At Prime Intellect, we are pioneering the development of an open superintelligence stack that encompasses cutting-edge agentic models and the infrastructure that empowers anyone to create, train, and deploy these advanced AI systems. Our innovative approach aggregates and orchestrates global computational resources into a cohesive control plane, complemented by a comprehensive reinforcement learning (RL) post-training toolkit that includes environments, secure sandboxes, verifiable evaluations, and our asynchronous RL trainer. We provide researchers, startups, and enterprises with the capabilities to execute end-to-end reinforcement learning at unparalleled scale, adapting models to real-world tools, workflows, and deployment scenarios.

As a Solutions Architect for GPU Infrastructure, you will be the technical authority responsible for translating customer needs into robust, production-ready systems designed to train the world’s most sophisticated AI models.

With a recent funding round raising $15 million (totaling $20 million) led by Founders Fund, alongside contributions from Menlo Ventures and illustrious angels such as Andrej Karpathy (Tesla, OpenAI), Tri Dao (Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), and Emad Mostaque (Stability AI), we are poised for significant growth and innovation.

Key Technical Responsibilities

This role requires a blend of deep technical knowledge and hands-on implementation skills. Your contributions will be crucial in:

Customer Architecture & Design
- Collaborating with clients to comprehend workload specifications and architect optimal GPU cluster solutions.
- Drafting technical proposals and conducting capacity planning for clusters ranging from 100 to over 10,000 GPUs.
- Formulating deployment strategies for large language model (LLM) training, inference, and high-performance computing (HPC) tasks.
- Delivering architectural recommendations to both technical teams and executive stakeholders.

Infrastructure Deployment & Optimization
- Implementing and configuring orchestration frameworks such as SLURM and Kubernetes for distributed workloads.
- Establishing high-performance networking through InfiniBand, RoCE, and NVLink interconnects.
- Enhancing GPU utilization, memory management, and inter-node communication.
- Setting up parallel file systems (Lustre, BeeGFS, GPFS) to maximize I/O efficiency.
- Tuning system performance, from kernel parameters to CUDA configurations.

Production Operations & Support
- Ensuring the reliability and performance of GPU infrastructure through continuous monitoring and support.
- Collaborating with cross-functional teams to troubleshoot and optimize operational workflows.
- Documenting processes and creating training materials for team members and clients.

Aug 30, 2025
Catalog
Full-time|On-site|San Francisco

At Catalog, we are pioneering the commerce infrastructure for AI: creating the essential framework that enables digital agents to not only explore the web but also comprehend, analyze, and engage with products. Our innovations drive the future of AI-driven shopping experiences, fundamentally transforming how consumers discover and purchase items online.

Role Overview

As a Technical Staff Member, you will be instrumental in developing core systems, shaping our engineering culture, and transitioning our vision from prototype to a robust platform. This role requires full-stack expertise and a commitment to owning and resolving challenges from start to finish.

Who You Are
- You have experience creating beloved and trusted products from the ground up.
- You combine technical proficiency with a keen product sense and data-driven intuition.
- You are well-versed in AI technologies.
- You prioritize speed, write clean code, and ensure thorough instrumentation.
- You seek a high level of ownership within a small, talent-rich team based in San Francisco.

Challenges You Will Tackle
- Develop and deploy agentic-search APIs that deliver structured, real-time product data in milliseconds.
- Build checkout systems enabling agents to conduct transactions with any merchant.
- Create an embeddings and retrieval layer that optimizes recall, precision, and cost efficiency.
- Establish a product graph and ranking pipeline that adapts based on actual user outcomes.

Preferred Qualifications
- Proven experience shipping data-centric products in a live environment.
- Experience with recommendation systems or information retrieval methodologies.
- Familiarity with API development, search indexing, and data pipeline construction.

Our Work Culture

We operate with a small, high-trust, and highly motivated team, fostering an environment of in-person collaboration in North Beach, San Francisco. Our process involves debate, decision-making, and execution.

If your profile aligns with our needs, we will contact you to arrange 2-3 brief technical interviews, followed by an onsite meeting in our office where you will collaborate on a small project, exchange ideas, and meet the team.

Oct 15, 2025
magic.dev
Full-time|On-site|San Francisco

At Magic, our mission is to create safe AGI that propels humanity forward in addressing the world’s most critical challenges. We believe that the key to achieving safe AGI lies in automating research and code generation to enhance models and resolve alignment issues more effectively than humans alone. Our unique approach integrates frontier-scale pre-training, domain-specific reinforcement learning, ultra-long context, and inference-time computation to realize this vision.

Role Overview

As a vital member of our Supercomputing Platform & Infrastructure team, you will be instrumental in designing, constructing, and managing the extensive GPU infrastructure that underpins Magic’s model training and inference processes. A key aspect of your role will involve leveraging Terraform-driven infrastructure-as-code methodologies to build and maintain our infrastructure, ensuring reproducibility, reliability, and operational clarity across clusters comprising thousands of GPUs.

Magic’s long-context models exert continuous demands on compute, networking, and storage systems. The infrastructure must support long-running distributed jobs, high-throughput data movement, and stringent availability requirements, necessitating designs that are automated, observable, and resilient. You will take ownership of the systems and IaC foundations that facilitate these capabilities. This position has the potential to expand into broader responsibilities encompassing supercomputing platform architecture, influencing how Magic scales GPU clusters and enhances infrastructure reliability as model workloads expand.

Key Responsibilities
- Design and manage large-scale GPU clusters for model training and inference.
- Construct and sustain infrastructure utilizing Terraform across both cloud and hybrid environments.
- Develop modular, scalable IaC frameworks for provisioning compute, networking, and storage resources.
- Enhance deployment reproducibility, maintain environment consistency, and ensure operational safety.
- Optimize networking and storage architectures for high-throughput AI workloads.
- Automate fault detection and recovery mechanisms across distributed clusters.
- Diagnose complex cross-layer issues involving hardware, drivers, networking, storage, operating systems, and cloud environments.
- Enhance observability, monitoring, and reliability of essential platform systems.

Qualifications
- Strong foundation in systems engineering principles.
- Extensive hands-on experience with Terraform, including module design, state management, environment isolation, and large-scale implementations.

Jan 25, 2024
Databricks
Full-time|$190.9K/yr - $232.8K/yr|On-site|San Francisco, California

P-1285

About This Role

Join our dynamic team at Databricks as a Staff Software Engineer specializing in GenAI Performance and Kernels. In this pivotal role, you will take charge of designing, implementing, and optimizing high-performance GPU kernels that drive our GenAI inference stack. Your expertise will lead the development of finely tuned, low-level compute paths, balancing hardware efficiency with versatility, while mentoring fellow engineers in the intricacies of kernel-level performance engineering. Collaborating closely with machine learning researchers, systems engineers, and product teams, you will advance the state of inference performance at scale.

What You Will Do
- Lead the design, implementation, benchmarking, and maintenance of essential compute kernels (such as attention, MLP, softmax, layernorm, and memory management) tailored for diverse hardware backends (GPUs, accelerators).
- Steer the performance roadmap for kernel-level enhancements, focusing on areas like vectorization, tensorization, tiling, fusion, mixed precision, sparsity, quantization, memory reuse, scheduling, and auto-tuning.
- Integrate kernel optimizations seamlessly with higher-level machine learning systems.
- Develop and uphold profiling, instrumentation, and verification tools to identify correctness and performance regressions, numerical discrepancies, and hardware utilization inefficiencies.
- Conduct performance investigations and root-cause analyses to address inference bottlenecks, such as memory bandwidth, cache contention, kernel launch overhead, and tensor fragmentation.
- Create coding patterns, abstractions, and frameworks to modularize kernels for reuse, cross-backend compatibility, and maintainability.
- Influence architectural decisions to enhance kernel efficiency (including memory layout, dataflow scheduling, and kernel fusion boundaries).
- Guide and mentor fellow engineers focused on lower-level performance, conducting code reviews and establishing best practices.
- Collaborate with infrastructure, tooling, and machine learning teams to implement kernel-level optimizations in production and assess their impact.

Jan 30, 2026
Composio
Full-time|On-site|San Francisco

At Composio, we are developing advanced infrastructure that enables agents to seamlessly interact with essential work tools such as GitHub, Gmail, Notion, Salesforce, and more. Our dedicated team of engineers is committed to tackling challenges ranging from contextual understanding to search functionalities, ensuring we provide an exceptional bridge between your agents and their tools.

Having secured a $25M Series A funding round from Lightspeed, alongside prominent angel investors like Guillermo Rauch (CEO of Vercel), Dharmesh Shah (CTO of HubSpot), and Gokul Rajaram, we have experienced remarkable growth, tripling our ARR at the start of this year. Our clientele includes notable names from Y Combinator cohorts to Wabi, Glean, Zoom, and beyond.

Your Role
- Enhance the experience of teams utilizing our platform by refining our core APIs and SDK.
- Create intuitive interfaces for both frontend and SDK applications.
- Take ownership of product development from concept through to production.
- Collaborate closely with customers to cultivate their loyalty while enhancing the product.
- Craft clear and concise documentation.

Feb 10, 2026
TierZero
Full-time|Hybrid|SF HQ

TierZero builds tools that help engineering teams deliver and manage code efficiently. The platform enables quicker incident response, clearer operational visibility, and shared knowledge among engineers. Backed by $7 million from investors like Accel and SV Angel, TierZero supports clients such as Discord, Drata, and Framer as they strengthen infrastructure for AI-driven work.

This role is based at TierZero's San Francisco headquarters, with a hybrid schedule requiring three days onsite each week. As a founding member of the technical staff, you will work directly with the CEO, CTO, and customers to influence the direction of TierZero’s core products and systems. The position calls for flexibility as priorities shift and for close collaboration across the company.

What You Will Do
- Design and develop AI systems that handle large volumes of unstructured data.
- Build full-stack product features, informed by direct feedback from users.
- Enhance the product so agents are intelligent, reliable, and easy for engineers to use.
- Create systems to automatically evaluate outputs from large language models and improve agentic reasoning through self-play and feedback.
- Construct machine learning pipelines, including data ingestion, feature creation, embedding stores, retrieval-augmented generation (RAG) pipelines, vector search, and graph databases.
- Experiment with open-source and emerging large language models to compare different approaches.
- Develop scalable infrastructure for long-running, multi-step agents, including memory, state management, and asynchronous workflows.

Requirements
- Interest in working with large language models, managed cloud platforms, cloud infrastructure, and observability tools.
- At least 5 years of professional experience or significant open-source contributions.
- Comfort with shifting priorities and tackling new technical problems.
- Strong product focus and commitment to customer outcomes.
- Openness to learning from a team with a track record of delivering over $10 billion in value.
- Ability to work onsite in San Francisco three days per week.
- Bonus: Experience in a startup setting and familiarity with startup dynamics.

Apr 24, 2026
tierzero
Full-time|Hybrid|SF HQ

About TierZero

TierZero helps engineering teams use AI to build and ship code more efficiently. The platform targets the bottleneck of human speed in production, giving teams tools for faster incident response, better operational visibility, and shared knowledge. TierZero is backed by $7M in funding from investors including Accel and SV Angel. Companies like Discord, Drata, and Framer trust TierZero to strengthen their infrastructure for AI-driven engineering.

Role Overview: Founding Member of Technical Staff

This is a hybrid role based at TierZero’s San Francisco headquarters, with three days a week in the office. As a founding member, you will collaborate directly with the CEO, CTO, and early customers to shape the direction of both product and systems. The work spans hands-on development and close engagement with users and leadership.

What You Will Do
- Design and build intelligent AI systems to analyze large volumes of unstructured data.
- Deliver full-stack features based on real user feedback.
- Improve the product experience so AI agents are both reliable and easy for engineers to use.
- Develop systems that automatically evaluate LLM outputs and advance agentic reasoning using self-play and feedback loops.
- Create machine learning pipelines, including data ingestion, feature generation, embedding stores, retrieval-augmented generation (RAG), vector search, and graph databases.
- Prototype with open-source and new LLMs, comparing their strengths and weaknesses.
- Build scalable infrastructure for long-running, multi-step agents, with attention to memory, state, and asynchronous workflows.

What We Look For
- Over five years of relevant professional or open-source experience.
- Comfort working in environments with uncertainty and evolving challenges.
- Strong product focus and a drive for customer satisfaction.
- Interest in large language models (LLMs), the Model Context Protocol (MCP), cloud infrastructure, and observability tools.
- Previous startup experience is a plus.

Location

This position is based in San Francisco, with on-site work three days per week at TierZero’s HQ.

Apr 15, 2026
tierzero
Full-time|Hybrid|SF HQ

Join Us If You:
- Are eager to learn from a group of experienced engineers who have successfully delivered over $10 billion in value.
- Prefer to work in our San Francisco office three days a week.
- Excel in navigating uncertainty.
- Possess a product-oriented mindset with a strong emphasis on customer satisfaction.
- Are passionate about working with large language models (LLMs), the Model Context Protocol (MCP), cloud infrastructure, and observability tools.
- Bring at least five years of professional or open-source experience.
- Bonus: Have previous experience in a startup environment and understand the dynamics involved.

About TierZero

At TierZero, we are redefining how engineering teams leverage AI to enhance the speed and efficiency of code deployment. While AI accelerates the development cycle, the actual process of productionizing code remains a challenge. Our platform empowers agile engineering teams to manage code in production effectively, ensuring quicker incident response times, comprehensive operational visibility, and shared knowledge among all team members. Backed by $7 million in funding from leading investors like Accel and SV Angel, TierZero is trusted by industry leaders such as Discord, Drata, and Framer to operate their high-scale systems and create the foundational layer for AI-driven engineering teams.

The Role

As a founding member of our team, you will play a crucial role in conceptualizing and developing our core product and systems from the ground up. Collaborating closely with the CEO, CTO, and our valued customers, you will be engaged in a variety of dynamic projects, including:
- Designing and implementing intelligent AI systems capable of analyzing extensive unstructured data.
- Delivering full-stack features informed by direct user feedback.
- Enhancing the product experience to ensure agents are not only intelligent but also user-friendly and reliable for engineers.
- Creating systems that autonomously assess LLM outputs, enhancing agent reasoning through iterative self-play and feedback mechanisms.
- Developing machine learning pipelines encompassing data ingestion, feature generation, embedding stores, retrieval-augmented generation (RAG) pipelines, vector search infrastructure, and graph databases.
- Investigating and prototyping with open-source and cutting-edge LLMs to assess their capabilities and trade-offs.
- Establishing scalable infrastructure to support long-running, multi-step agents, addressing aspects like memory management, state handling, and asynchronous workflows.

Apr 30, 2026
Sciforium
Full-time|On-site|San Francisco

At Sciforium, we are at the forefront of AI infrastructure, innovating next-generation multimodal AI models and a proprietary high-efficiency serving platform. With substantial funding and direct collaboration with AMD, supported by their engineers, our team is rapidly expanding to develop the complete stack that powers cutting-edge AI models and real-time applications.

About the Role

We are on the lookout for a talented GPU Kernel Engineer who is eager to explore and maximize performance on modern accelerators. In this role, you will be responsible for designing and optimizing custom GPU kernels that drive our advanced large-scale AI systems. You will navigate the hardware-software stack, engaging in low-level kernel development and integrating optimized operations into high-level machine learning frameworks for large-scale training and inference.

This position is perfect for someone who excels at the intersection of GPU programming, systems engineering, and state-of-the-art AI workloads, and aims to contribute significantly to the efficiency and scalability of our machine learning platform.

Key Responsibilities
- Develop, implement, and enhance custom GPU kernels utilizing C++, PTX, CUDA, ROCm, Triton, and/or JAX Pallas.
- Profile and fine-tune the end-to-end performance of machine learning operations, particularly for large-scale LLM training and inference.
- Integrate low-level GPU kernels into frameworks such as PyTorch, JAX, and our proprietary internal runtimes.
- Create performance models, pinpoint bottlenecks, and deliver kernel-level enhancements that significantly boost AI workloads.
- Collaborate with machine learning researchers, distributed systems engineers, and model-serving teams to optimize computational performance across the entire stack.
- Engage closely with hardware vendors (NVIDIA/AMD) and stay updated on the latest GPU architecture and compiler/toolchain advancements.
- Contribute to the development of tools, documentation, benchmarking suites, and testing frameworks ensuring correctness and performance reproducibility.

Must-Haves
- 5+ years of industry or research experience in GPU kernel development or high-performance computing.
- Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Mathematics, or a related discipline.
- Strong programming proficiency in C++ and Python, and familiarity with machine learning frameworks.

Dec 6, 2025
Baseten
Full-time|On-site|San Francisco

ABOUT BASETEN

At Baseten, we empower the world's leading AI firms, such as Cursor, Notion, and OpenEvidence, by delivering mission-critical inference solutions. Our unique blend of applied AI research, robust infrastructure, and user-friendly developer tools enables AI pioneers to effectively deploy groundbreaking models. With our recent $300M Series E funding round supported by esteemed investors like BOND and IVP, we're on an exciting growth trajectory. Join our dynamic team and contribute to the platform that drives the next generation of AI products.

THE ROLE

We are looking for an experienced Senior GPU Kernel Engineer to join our innovative team at the forefront of AI acceleration. In this role, your programming expertise will directly enhance the performance of cutting-edge machine learning models. You'll be responsible for developing highly efficient GPU kernels that optimize computational processes, allowing for transformative AI applications. You'll thrive in a fast-paced, intellectually challenging environment where your technical skills are pivotal. Your contributions will directly affect production systems that serve millions of users across various platforms. This position offers exceptional opportunities for career advancement for engineers enthusiastic about low-level optimization and impactful systems engineering.

EXAMPLE INITIATIVES

As part of our Model Performance team, you will engage in projects like:
- Baseten Embeddings Inference: the quickest embeddings solution available
- The Baseten Inference Stack
- Model performance optimization

RESPONSIBILITIES

Core Engineering Responsibilities
- Design and develop high-performance GPU kernels for essential machine learning operations, including matrix multiplications and attention mechanisms.
- Collaborate with cross-functional teams to drive performance improvements and implement optimizations.
- Debug and refine kernel code to achieve maximal efficiency and reliability.
- Stay abreast of the latest advancements in GPU technology and machine learning frameworks.

Jul 17, 2025
Adyen
Full-time|On-site|San Francisco

Join our dynamic team at Adyen as a Technical Staff Member in San Francisco! We are seeking innovative minds passionate about technology and problem-solving. In this role, you will collaborate with cross-functional teams to craft solutions that enhance our services and improve customer experiences.

Mar 6, 2026
tierzero
Full-time|On-site|SF HQ

tierzero is looking for a Founding Member of Technical Staff to help shape the direction of its technology from the ground up. This role is based at the company's San Francisco headquarters.

Role Overview

As an early technical hire, you will work closely with engineers and product managers to build new products and features. The work centers on designing, coding, and delivering software solutions that address client needs and support tierzero's growth.

Impact

Contributions in this role will directly influence the company's future. The team values initiative and hands-on problem solving, giving each member a chance to make a visible difference in how the company evolves.

Collaboration

This position involves regular collaboration with a small, focused team. Input and ideas from every member help guide product direction and technical decisions.

Apr 29, 2026
TierZero
Full-time|Hybrid|SF HQ

Are you ready to take a leap into innovation?
- Join a team of expert engineers who have collectively contributed over $10 billion in value.
- Be present in our San Francisco office three days a week, collaborating closely with your peers.
- Flourish in a dynamic environment where adaptability is key.
- Adopt a product-driven and customer-centric mindset.
- Engage with cutting-edge technologies including LLMs, MCPs, cloud infrastructure, and observability tools.
- Bring over five years of professional experience or open-source contributions to the table.
- Bonus points if you've previously thrived in a startup environment.

About TierZero

At TierZero, we are transforming the landscape of software engineering with AI. Our mission is to enhance the speed at which engineering teams build and deploy code, addressing the bottlenecks that slow down production. With $7 million raised from esteemed investors like Accel and SV Angel, our solutions are trusted by leading companies such as Discord, Drata, and Framer to optimize their high-scale systems and infrastructure for the AI-driven future.

Your Role

As a founding member, you will play a pivotal role in creating and developing our core products and systems. Collaborating closely with our CEO, CTO, and our valued customers, you will be engaged in a variety of tasks, including:
- Designing and implementing intelligent AI systems capable of reasoning over vast amounts of unstructured data.
- Deploying full-stack features based on direct user feedback.
- Enhancing the product experience to ensure that our AI agents are not only intelligent but also reliable and user-friendly for engineers.
- Building systems that automatically assess LLM outputs, enhancing reasoning through self-play and feedback loops.
- Developing machine learning pipelines for data ingestion, feature generation, embedding storage, RAG pipelines, vector search infrastructure, and graph databases.
- Experimenting with open-source and frontier LLMs to assess various trade-offs.
- Creating scalable infrastructure to support long-running, multi-step agents, including memory, state management, and asynchronous workflows.

May 1, 2026
tierzero
Full-time|Hybrid|SF HQ

About tierzero

tierzero helps engineering teams build and deploy code with greater speed and operational clarity in an AI-driven world. The company focuses on improving incident response, operational visibility, and knowledge sharing for engineers. Backed by $7 million in funding from investors like Accel and SV Angel, tierzero supports large-scale systems for clients such as Discord, Drata, and Framer.

Role Overview: Founding Member of Technical Staff

This role is based at tierzero's San Francisco headquarters, with in-person work required three days a week. As a founding member of the technical team, you will help design and build core products and systems from the ground up. Collaboration is central: expect to work closely with the CEO, CTO, and customers. Projects span a wide range of technical challenges and product areas.

What You Will Do
- Design and implement intelligent AI systems that process and reason over large volumes of unstructured data.
- Develop full-stack features, incorporating direct feedback from users.
- Improve the product experience so intelligent agents are practical and reliable for engineers.
- Create systems that automatically evaluate LLM outputs and refine agent reasoning using self-play and feedback loops.
- Build machine learning pipelines covering data ingestion, feature generation, embedding stores, RAG pipelines, vector search, and graph databases.
- Prototype and experiment with open-source and advanced LLMs to weigh different approaches.
- Set up scalable infrastructure for long-running, multi-step agents, including memory management, state handling, and asynchronous workflows.

What We Look For
- At least 5 years of professional or open-source experience in a relevant technical field.
- Comfort working in a setting that changes and evolves quickly.
- Strong product focus and an understanding of customer needs.
- Interest in LLMs, MCPs, cloud infrastructure, and observability tools.
- Ability to learn from and collaborate with engineers who have delivered over $10 billion in value.
- Commitment to working onsite in San Francisco three days per week.
- Startup experience is a plus.

Apr 20, 2026
