Anthropic
San Francisco, CA | New York City, NY | Seattle, WA
Remote Full-time
Experience Level
Mid to Senior
Qualifications
We are seeking candidates with a strong background in database design and software engineering. Ideal applicants will have:
Proficiency in SQL and NoSQL databases.
Experience with cloud database solutions.
A solid understanding of data modeling and data architecture.
Strong programming skills in languages such as Python, Java, or Go.
Excellent problem-solving abilities and a passion for technology.
Experience in a collaborative development environment.
About the job
Join Anthropic as a Staff+ Software Engineer focusing on Databases, where you'll be at the forefront of our innovative technology solutions. You'll work closely with a collaborative team to design, implement, and maintain robust database systems that empower our AI models and enhance user experience. Your expertise will contribute significantly to our mission of advancing AI safety and usability.
About Anthropic
Anthropic is a cutting-edge AI safety and research company dedicated to ensuring that artificial intelligence technologies are aligned with human intentions. We foster a culture of collaboration, innovation, and ethical responsibility, striving to create AI systems that are safe and beneficial for all.
Join Cloudflare as a Database Reliability Engineer, where you will play a crucial role in ensuring the reliability and performance of our database systems. You will work collaboratively with our engineering teams to develop, implement, and maintain robust database solutions that support our mission of making the internet safer and faster. Your responsibilities will include monitoring database performance, troubleshooting issues, and optimizing queries to enhance system efficiency. If you are passionate about databases and eager to make an impact in a dynamic environment, we encourage you to apply!
Full-time|$90K/yr - $140K/yr|Remote|San Francisco - Remote
Rithum™ stands as the leading commerce network globally, revolutionizing the collaboration between brands, suppliers, and retailers to create frictionless e-commerce experiences. Our platform empowers brands and retailers to drive exponential growth, streamline operations across various channels, scale product offerings, and improve profit margins. Today, over 40,000 companies trust Rithum to enhance their business across numerous channels, collectively representing over $50 billion in annual GMV. With our comprehensive commerce, marketing, and delivery solutions, our clients craft optimized consumer shopping experiences from start to finish.
Overview
The Database Reliability Engineering (DBRE) team at Rithum is crucial in ensuring the availability, reliability, and observability of our database systems. We leverage automation to minimize manual tasks while constantly seeking enhancements to our processes. Our team manages and optimizes an extensive SQL Server environment that spans hundreds of instances across a hybrid infrastructure (on-prem VMware and AWS), in addition to various relational and NoSQL database platforms such as MongoDB, DynamoDB, Elasticsearch, MySQL, Postgres, and Redis. These systems are integral to all facets of our operations. Our team thrives on a robust culture of curiosity, transparency, collaboration, and continuous learning. As a Senior Database Reliability Engineer, you will embody these values and cultivate them within your team. You will oversee diverse database systems and design and lead your projects with a technical focus.
Welcome to Demandbase:
Demandbase is pioneering the only AI-driven pipeline platform that empowers go-to-market (GTM) teams to automate robust growth at scale. Our platform offers a cohesive view of data, insights, actions, and outcomes, enabling B2B enterprises to effectively align and execute their account-based GTM strategies with assurance. Trusted by thousands of businesses, Demandbase maximizes revenue, minimizes waste, and streamlines data and tech stacks, all within a single platform.
We are dedicated to nurturing careers just as much as we are to developing cutting-edge technology. Our commitment extends to our people, culture, and the community surrounding us. Demandbase has been consistently recognized as one of the Best Places to Work in the San Francisco Bay Area by Fortune and one of the 60 Best Companies to Sell For by Selling Power. Our offices are located in San Francisco, New York, Austin, Seattle, India, and the United Kingdom.
Role Overview
As a Senior Database Platform Engineer, you will architect, develop, and enhance scalable database platforms that support Demandbase’s cloud-native applications. You will be instrumental in shaping the future of database reliability by integrating automation, observability, and compliance into database provisioning, operation, and scalability across AWS environments.
This is a senior individual contributor role requiring strong technical ownership and the ability to influence cross-functional teams. You will lead a global Database Reliability Engineering (DBRE) organization, collaborate closely with product and platform teams, and modernize legacy database systems while facilitating a transition towards service-owned, self-service models.
Note: The base compensation range for this position applies to candidates based in San Francisco, CA. For all other locations, the compensation range is determined by the primary work location of the candidate.
Actual compensation packages are tailored to each candidate and depend on various factors, including skillset, years of experience, and depth of expertise.
Role overview
Scale AI seeks a Database Engineer to strengthen and refine its data infrastructure. The position centers on designing, building, and maintaining database systems that deliver high availability and dependable performance.
What you will do
Design and implement database solutions that align with business requirements.
Maintain and tune database systems to ensure reliability and speed.
Collaborate with engineering, product, and operations teams to improve data processing and management.
Location
This role is based in San Francisco, CA or New York, NY.
Join our dynamic team at 360 IT Professionals as a Senior Database Engineer. In this pivotal role, you will leverage your extensive knowledge of SQL to design, implement, and optimize robust database solutions. You will collaborate with cross-functional teams to ensure data integrity and support the organization’s data-driven decision-making. As a Senior Database Engineer, your responsibilities will include developing complex SQL queries, ensuring database performance, and implementing best practices for data management. You will also play a key role in mentoring junior engineers and sharing your expertise throughout the organization.
Become a vital part of the engineering teams that responsibly bring OpenAI’s transformative technologies to the world!
At OpenAI, our Applied Engineering team collaborates across research, engineering, product management, and design to deliver AI solutions to both consumers and businesses. We are committed to learning from our deployments, maximizing the benefits of AI, and ensuring that this powerful technology is utilized both safely and ethically. Our priority is safety over unchecked growth.
About the Role
As OpenAI continues to expand, we are seeking seasoned engineers who excel in problem-solving to enhance the scalability of our systems. Our achievements hinge on our ability to rapidly iterate on product development while ensuring optimal performance and reliability. You will thrive in a collaborative, fast-paced environment, playing a key role in delivering our technology to millions globally, with a focus on safety and reliability. As a reliability engineer, you will lead efforts to maintain and improve the stability, scalability, and performance of our dynamic infrastructure. You will collaborate closely with cross-functional teams, including software engineers, product managers, and data scientists, to construct and sustain robust systems capable of accommodating our growing user base and workload.
Your Responsibilities Include:
Designing and implementing solutions to scale our infrastructure to meet increasing demands effectively.
Developing and maintaining load, chaos, and synthetic testing software that enhances the reliability of systems designed by development teams.
Creating and managing automation tools to streamline repetitive tasks and bolster system reliability.
Overseeing the lifecycle management platform for CPU/storage, GPU, and network resources to foster efficiency and support dynamic optimization.
Implementing fault-tolerant and resilient design patterns to minimize service interruptions.
Establishing and maintaining service level objectives (SLOs) and service level indicators (SLIs) to ensure system reliability.
Collaborating with researchers, engineers, product managers, and designers to introduce new features and research advancements to the world.
Participating in an on-call rotation to address critical incidents and ensure 24/7 system availability.
Your Impact:
Your contributions will be essential in guaranteeing the reliability and performance of our platforms as we continue to scale our operations.
About Gridware
Gridware is an innovative technology firm headquartered in San Francisco, committed to safeguarding and enhancing the reliability of the electrical grid. We have pioneered a revolutionary approach to grid management known as Active Grid Response (AGR), which meticulously monitors the electrical, physical, and environmental factors influencing grid safety and reliability. Our state-of-the-art AGR platform leverages high-precision sensors to identify potential issues at an early stage, facilitating proactive maintenance and fault resolution. This holistic strategy is designed to bolster safety, minimize outages, and ensure optimal grid performance. We are proud to be supported by prominent climate-tech and Silicon Valley investors. To learn more, visit www.Gridware.io.
About the Role
We are seeking a skilled Senior Hardware Reliability Engineer to lead reliability testing, analysis, and lifetime modeling of various outdoor electronic assemblies. This pivotal role will concentrate on the electronic components of our products, collaborating closely with our mechanical-focused Reliability Engineer and engaging with the broader hardware and cross-functional teams.
About Multiply Labs
Multiply Labs is an innovative startup located in San Francisco, California, backed by renowned investors in technology and life sciences such as Casdin Capital, Lux Capital, and Y Combinator. Our goal is to develop the world's leading robotic systems and utilize them to make groundbreaking life-saving therapies accessible to everyone.
We are transforming the manufacturing process of cell therapies through the creation of advanced robotic systems that automate and scale the production of these crucial treatments. Our cutting-edge robots enable biopharma companies to produce cell therapies efficiently without overhauling their existing processes, thus minimizing regulatory hurdles and risks. Unlike traditional methods that are labor-intensive and costly (often exceeding $1M per patient), our robotic solutions aim to make these vital treatments more affordable and reachable for those who need them.
To discover more and view our robots in action, please visit www.multiplylabs.com and follow us on LinkedIn.
Position Overview
We are looking for a dedicated Hardware Reliability Engineer to become an essential part of Multiply Labs’ Reliability Engineering team. As a founding member, you will collaborate closely with the Hardware Product and Systems Integration teams to enhance our designs throughout the entire development lifecycle, from initial prototypes to fully deployed GMP production systems. Your contributions will directly support the delivery of life-saving therapies by ensuring our robots operate seamlessly within the high-stakes biotech environment.
Full-time|Remote|Denver, Colorado, United States; San Francisco, California, United States
Join Checkr as a Software Engineer focusing on Reliability, where your contributions will enhance our platform's robustness and performance. You will be part of a dynamic team dedicated to building and scaling systems that support our growth and ensure outstanding service delivery to our clients.
Join Our Innovative Team
At OpenAI, our Hardware organization is pioneering cutting-edge silicon and system-level solutions tailored to meet the demands of advanced AI workloads. We pride ourselves on developing next-generation AI-native silicon while collaborating with software and research partners to create hardware that is intricately integrated with AI models. Our mission includes delivering high-performance silicon for OpenAI’s supercomputing infrastructure and designing custom tools and methodologies that accelerate innovations, specifically optimized for AI applications.
Your Role in Our Mission
We are on the lookout for a dynamic and experienced Reliability/DFX Engineer who possesses extensive knowledge in scaling machine learning systems. As an integral member of our hardware team, you will collaborate with chip design, platform design, hardware health, and the wider industry ecosystem to architect, implement, and deploy dependable next-generation AI accelerator systems. You will take a holistic approach to evaluate system and chip architecture, pinpointing high-ROI opportunities that enhance reliability and availability throughout the stack while translating these insights into actionable strategies and silicon features.
Key Responsibilities:
Lead the architecture, implementation, and execution of DFX strategies in silicon from concept to high-volume deployment, proposing impactful features to boost reliability and fault tolerance. Your focus will encompass design for testability, reliability, availability, and serviceability of high-performance AI hardware.
Develop system-level reliability models based on empirical data to guide the organization’s DFX and reliability strategy, necessitating a deep understanding of chip and system architecture, design, implementation, and component-level reliability.
Collaborate with chip and platform architecture/design teams to explore and implement DFX features, including the specification and integration of digital/mixed-signal IP, firmware/system software, and DFX methodologies.
Work alongside hardware health and platform design teams to enhance reliability and fault tolerance in New Product Introduction (NPI) and High-Volume Manufacturing (HVM), driving continuous, data-driven improvements across the stack through optimized operating conditions and data analysis.
Act as the DFX/reliability advocate, aligning the broader industry ecosystem with OpenAI’s strategic objectives and roadmap.
Qualifications:
Bachelor’s degree in Engineering or related field with 15+ years of experience, or a Master’s degree with 10+ years of relevant experience.
Proven expertise in DFX methodologies and reliability engineering for high-performance hardware.
Strong analytical and problem-solving skills, with a track record of improving system reliability and performance.
Excellent collaboration and communication abilities, capable of working effectively in a cross-functional team environment.
Familiarity with AI workloads and associated hardware requirements is highly desirable.
As a Platform Engineer specializing in Databases and Storage at WorldLabs, you will play a critical role in designing, implementing, and maintaining robust database and storage solutions that drive our innovative platform. You will collaborate with cross-functional teams to ensure optimal performance, scalability, and security of our data systems.
Full-time|$130K/yr - $180K/yr|On-site|San Francisco
Astranis is at the forefront of satellite technology, crafting advanced satellites designed for high orbits to broaden humanity's exploration of the solar system. Our satellites deliver dedicated, secure networks to a diverse range of esteemed clients worldwide, including large enterprises, government entities, and the US military. With five satellites currently operational and several more set to launch, we are addressing a robust backlog of over $1 billion in commercial contracts.
We take pride in being the leading choice for satellite communications among clients with demanding standards for uptime, data security, network visibility, and customization. Having secured over $750 million from top-tier investors such as Andreessen Horowitz, Blackrock, and Fidelity, our team of 450 engineers and entrepreneurs operates from our expansive 153,000 sq. ft. headquarters in Northern California, USA.
Senior Reliability Test Engineer
As a Senior Reliability Test Engineer, you will play a pivotal role in collaborating across all engineering disciplines to ensure our hardware achieves exceptional quality and reliability standards. With Astranis ramping up satellite production, your expertise will be essential in establishing a comprehensive reliability test program that supports the development of new product designs, monitors manufacturing processes, and identifies long-term reliability issues. The ideal candidate will possess extensive engineering experience with high-reliability products, demonstrate autonomy, and have the capability to design a reliability test program from the ground up.
Full-time|$135K/yr - $235K/yr|On-site|San Francisco
Astranis is revolutionizing satellite technology by creating advanced spacecraft designed for high orbits, thereby extending humanity's presence in the solar system. Our satellites deliver dedicated and secure networks to an elite clientele, including large corporations, government entities, and the U.S. military. With five satellites successfully launched and a robust pipeline of over $1 billion in commercial contracts, Astranis is set for growth as we prepare for numerous upcoming launches.
We are the go-to satellite communications partner for clients demanding exceptional uptime, data security, network visibility, and tailored solutions. Backed by over $750 million from industry-leading investors such as Andreessen Horowitz, Blackrock, and Fidelity, our team of 450 engineers and entrepreneurs thrives in our 153,000 sq. ft. headquarters in Northern California.
Senior Electrical Reliability Engineer
As a Senior Reliability Engineer at Astranis, you will be pivotal in ensuring that our spacecraft electronics and systems fulfill our reliability and availability requirements. Collaborating with a multidisciplinary engineering team, you will push the boundaries of geo-synchronous spacecraft design and achieve previously unattainable performance in space. Your expertise will ensure that Design for Reliability remains central to our engineering efforts.
About Our Team:
Join the innovative Database Systems team at OpenAI, where we specialize in high-performance distributed databases. We are the architects behind Rockset, a cutting-edge real-time search, analytics, and vector database that powers all vector search and retrieval augmented generation (RAG) at OpenAI. Rockset underpins core functionalities across all OpenAI product lines and supports various critical internal applications.
About the Role:
We are in search of engineers who are passionate about distributed systems, performance optimization at a low level (with our core engine developed in C++), and constructing scalable database infrastructures from scratch. As a member of the Database Systems team, you will play a key role in enhancing the core database engine, making significant contributions to ingestion, query execution, indexing, and storage improvements. You will collaborate with multiple teams across OpenAI to unlock new product capabilities and ensure the reliability and scalability of our online database as usage expands exponentially.
Your Responsibilities Will Include:
Design, develop, and maintain high-performance distributed systems.
Identify and address performance bottlenecks to elevate infrastructure capabilities.
Define and guide the long-term technical vision and evolution of the system.
Collaborate with product, engineering, and research teams to deliver robust and scalable infrastructure.
Investigate complex production issues across the entire technology stack.
Contribute to incident response, retrospective analyses, and establishing best practices for system reliability.
You Will Excel In This Role If You:
Possess substantial experience in building, scaling, and optimizing distributed systems.
Exhibit a keen interest in database internals, storage engines, or low-latency query systems.
Enjoy tackling complex performance challenges in high-throughput systems.
Have experience managing and operating production clusters at scale (e.g., Kubernetes or similar orchestration tools).
Approach scalability, correctness, and reliability with a rigorous mindset.
Thrive in a fast-paced environment where you can make a significant impact.
Qualifications:
4+ years of relevant industry experience with a focus on distributed systems.
Proficiency in C++ or similar low-level programming languages.
Strong problem-solving skills and attention to detail.
Experience with performance monitoring and optimization tools.
Excellent collaboration and communication skills.
Full-time|$166.9K/yr - $225.9K/yr|Hybrid|Hybrid - San Francisco
Drata helps organizations demonstrate their commitment to security and integrity. The platform supports companies as they build and maintain trust with users, customers, partners, and prospects.
Values
Built on Trust: Consistency shapes decisions and actions.
Integrity: Choosing to do what is right, every time.
Customer-Obsessed: Prioritizing customer needs above all else.
Competitive Fire: Striving for higher standards and greater achievements.
Diversity: Welcoming different perspectives to encourage creative solutions.
Automation First: Pursuing efficiency by saving time and resources wherever possible.
How the Team Works
Drata blends high standards with a supportive environment focused on growth. Team members are encouraged to own their work, improve continuously, and deliver meaningful results. The company values quick, informed decisions that drive immediate impact, while always keeping the mission and customer needs at the center. The San Francisco-based team uses a hybrid work model. Colleagues collaborate in the office Tuesday through Thursday, focusing on alignment and innovation. Mondays and Fridays offer flexibility for deep work or personal needs.
Growth and Culture
Drata has expanded to over 600 professionals worldwide, recognized for a culture that values trust, speed, and continuous learning. The environment supports both personal and professional development.
See the Speed: CEO Adam Markowitz discusses Drata’s rapid journey to $100M ARR in four years.
Hear the Voice of the Team: Employee stories highlight collaboration and growth at Drata.
Full-time|On-site|San Francisco, California, United States
We are seeking a talented and motivated Reliability Engineer to join our innovative team at Redwood Materials. In this role, you will be responsible for ensuring the reliability and performance of our cutting-edge energy storage systems. You will collaborate with cross-functional teams to develop and implement reliability engineering strategies that enhance product performance and longevity.
Join unify as a Staff Backend Engineer specializing in Reliability. In this pivotal role, you will be responsible for designing, developing, and maintaining backend systems that ensure the reliability and performance of our services. Collaborate with cross-functional teams to implement robust solutions and drive continuous improvement initiatives.
Join Our Team
At Cognition, we are at the forefront of applied AI innovation, developing cutting-edge software agents that redefine the engineering landscape. Our flagship products, Devin, the pioneering AI software engineer, and Windsurf, an AI-native IDE, embody our commitment to creating AI that collaborates with engineers as a true partner.
Our team is composed of elite talent including competitive programming champions, visionary founders, and researchers from top AI institutions such as Scale AI, Palantir, Cursor, Google DeepMind, and more.
Your Mission
As a Site Reliability Engineer, you will play a crucial role in ensuring the reliability of our user-focused products, which are utilized by hundreds of thousands of developers daily. Your mission is to preemptively address potential issues and swiftly resolve any incidents that may arise, maintaining a seamless experience for our users.
You will be responsible for overseeing production reliability and enhancing our platform engineering practices, encompassing SLOs, incident response, and on-call duties, alongside CI/CD pipelines, deployment infrastructure, and developer tools. At Cognition, we believe in integrating reliability into our systems rather than treating it as an afterthought, and we strive to cultivate a culture that reflects this philosophy.
Your Achievements
Production Reliability: Establish and manage SLOs, SLIs, and error budgets for our products. Develop robust monitoring, alerting, and observability systems to maintain a transparent view of service health.
Incident Management: Spearhead incident response with precision and promptness. Conduct blameless postmortems to derive actionable insights from outages, and create effective runbooks and tools to enhance on-call sustainability.
Platform Engineering: Oversee deployment pipelines and internal developer tools, ensuring rapid, reliable shipping of code while minimizing unnecessary toil for engineers.
Infrastructure as Code: Manage cloud infrastructure via code, creating reproducible, auditable environments that can scale with product demands and mitigate configuration drift.
Capacity Planning: Analyze growth trends, anticipate resource requirements, and ensure our infrastructure is always ahead of user demand, optimizing system performance proactively.
Security and Reliability: Integrate security protocols with reliability practices to create a robust framework that safeguards our infrastructure.
Internship|$166K/yr - $225K/yr|On-site|San Francisco, California
P-97
At Databricks, we are on a mission to fundamentally simplify the entire data lifecycle, from ingestion and ETL to BI and ultimately to ML/AI, through a unified platform. We envision a future where the traditional data warehouse architecture is transformed by an innovative architectural model known as the Lakehouse (CIDR 2021 paper). This open platform merges data warehousing with advanced analytics, effectively addressing critical challenges such as data staleness, reliability, total cost of ownership, data lock-in, and limited use-case support. A key component in realizing this vision is the development of a next-generation decoupled query engine and structured storage system that surpasses the performance of specialized data warehouses while maintaining the flexibility of general-purpose systems like Spark™ to cater to a wide range of workloads, from ETL processes to data science applications. As a vital member of our team, you will engage in the design and implementation of these next-generation systems that aim to leapfrog the current state-of-the-art in the following areas:
Query compilation and optimization
Distributed query execution and scheduling
Vectorized execution engine
Data security
Resource management
Transaction coordination
Efficient storage structures (encodings, indexes)
Automatic physical data optimization
Jan 30, 2026