Staff Site Reliability Engineer Federal jobs in Bellevue – Browse 186 openings on RoboApply Jobs
Staff Site Reliability Engineer Federal jobs in Bellevue
Open roles matching “Staff Site Reliability Engineer Federal” with location signals for Bellevue. 186 active listings on RoboApply Jobs.
Staff Site Reliability Engineer - Federal
Zscaler|Bellevue, Washington, USA; San Jose, California, USA
On-site Full-time
Qualifications
We are looking for candidates with a strong background in software engineering, systems architecture, and cloud services. Ideal applicants should have: proficiency in programming languages such as Go, Python, or Java; experience with containerization technologies like Docker and Kubernetes; a strong understanding of network protocols and security principles; a demonstrated ability to troubleshoot complex systems and implement automation solutions; and experience working in agile development environments.
About the job
Join Zscaler as a Staff Site Reliability Engineer focused on Federal missions. In this role, you will leverage your expertise in reliability engineering to enhance our cloud-based security platform while collaborating with cross-functional teams to optimize performance and scalability. Your contributions will be crucial in ensuring seamless, secure, and high-availability services for our government clients.
About Zscaler
Zscaler is a leading cloud security company that enables organizations to securely connect users to applications, no matter where they are. With a commitment to innovation, Zscaler's platform is designed to protect data and applications in the cloud, ensuring safety and compliance for businesses worldwide.
Full-time|$194K/yr - $267K/yr|On-site|Bellevue, Washington; Chicago, Illinois; New York, New York; Washington, DC
Empower Every Identity, from AI to Human
At Okta, we believe that identity is the cornerstone of unlocking the potential of AI. By building a trusted and neutral infrastructure, we enable organizations to confidently navigate this new era. This mission demands individuals who are relentless problem solvers, tackling complex issues with real-world significance. We seek builders and owners who act with urgency and execute with excellence.
This is your chance to engage in career-defining work. If you share our commitment to this mission, let’s connect.
Join the Workforce Identity Cloud Team
The Okta Workforce Identity Cloud (WIC) facilitates secure, seamless access for your workforce, allowing you to prioritize strategic initiatives like cost reduction and enhanced customer service.
If you thrive on challenges and are passionate about addressing large-scale automation, testing, and tuning issues, we would love to hear from you. The ideal candidate embodies the principle: "If you must do something more than once, automate it" and possesses a strong ability to quickly learn new tools and concepts.
Position Overview:
The Site Reliability Engineer (SRE) will be pivotal in designing and managing Kubernetes platforms that support cloud-native applications and services. This role emphasizes architecting and overseeing reliable, scalable, and secure Kubernetes-based environments on AWS, ensuring optimal performance and high availability while managing costs and automation. The perfect candidate will have hands-on experience with AWS infrastructure, Kubernetes platform development, Helm charts, Karpenter for scaling, and Istio service mesh.
Key Responsibilities:
Kubernetes Platform Development: Design, implement, and maintain Kubernetes platforms that are highly available, scalable, and fault-tolerant, ensuring they are optimized for production workloads.
AWS Infrastructure Management: Build, manage, and optimize AWS cloud infrastructure, including EKS, ECS, S3, VPCs, RDS, IAM, and more, while implementing best practices for cost management and security.
Helm Management: Use Helm to automate and streamline application and service deployment to Kubernetes clusters, creating and maintaining Helm charts for production-ready deployments.
Karpenter Implementation: Implement and manage Karpenter for dynamic scaling of Kubernetes clusters to meet workload demands.
Istio Service Mesh Management: Configure and manage Istio to facilitate service-to-service communication and security.
Are you prepared to transform the advertising landscape? At Cognitiv, we are not merely another AdTech firm; we are pioneers reshaping media buying with our advanced Deep Learning Advertising Platform. Since our inception in 2015, we have been leveraging state-of-the-art deep learning technologies and data science to redefine how brands engage with their audiences. Our mission is clear: to infuse intelligence into advertising, delivering unmatched precision, relevance, and impact at scale. Our innovative platform provides advertisers with unparalleled flexibility, whether activating Dynamic Deals through their preferred DSP, utilizing our managed service DSP, or tapping into our groundbreaking ContextGPT product. Joining Cognitiv means being at the forefront of AI-driven advertising solutions, leading change, and achieving remarkable growth in a fast-paced industry. We are currently expanding!
The Role
We are seeking a Senior Site Reliability Engineer to enhance our global network of datacenters and elevate service management across Cognitiv. Your primary focus will be on rapidly expanding our hybrid cloud infrastructure. As a growing organization, we strive to adhere to industry best practices. This position requires an experienced engineer who is eager to learn our environment quickly and help shape our long-term service management strategy.
This role will be based in our Bellevue, WA office with a hybrid work schedule of 3 days in-office (Monday/Tuesday/Wednesday) and 2 days remote (Thursday/Friday).
Responsibilities
Design, implement, and maintain infrastructure across a widening footprint of co-located deployments.
Assess existing physical and network architectures to ensure long-term scalability and growth.
Collaborate with engineering and product teams to accurately scope projects based on core business requirements.
Lead company-wide initiatives to enhance service management surrounding deployments, monitoring, and disaster recovery.
Oversee and maintain shared infrastructure within our AWS environment.
Requirements
Understanding of contemporary datacenter practices with experience in configuring multi-datacenter deployments.
Extensive knowledge of AWS infrastructure, networking, and management practices.
Demonstrated experience with infrastructure as code and related tools.
Full-time|$147K/yr - $202K/yr|On-site|Bellevue, Washington
About Okta
Okta stands as the leader in identity solutions, empowering individuals to securely engage with any technology, on any device, and through any application. Our versatile products, including the Okta Platform and Auth0 Platform, ensure safe access and authentication, placing identity at the forefront of security and business growth.
At Okta, we embrace diverse perspectives and experiences. We are not searching for someone who checks all the boxes; rather, we value lifelong learners who can enrich our team with their unique backgrounds.
Join us in crafting a future where identity is truly yours.
Position Overview:
We are looking for a highly skilled Senior Observability Site Reliability Engineer with a focus on Splunk to take ownership and enhance our Splunk ecosystem. In this role, you will go beyond traditional monitoring, creating a comprehensive and scalable Observability Platform that empowers our SRE teams and business stakeholders. You will treat infrastructure as code, leveraging Terraform alongside proficient coding skills in Go, Python, or Ruby to automate deployment across complex distributed systems.
Key Responsibilities
Automated Infrastructure: Design, build, and maintain scalable observability infrastructure utilizing tools like Terraform.
Splunk Engineering: Enhance the collection, processing, and storage of log data to ensure our Splunk services are highly reliable and low-latency.
Incident Response: Engage in on-call rotations and lead post-incident reviews to drive systemic improvements and promote 'observability-driven development.'
Automation: Minimize 'toil' by automating the deployment and scaling of observability agents and collectors.
Join CoreWeave as a Senior Site Reliability Engineer specializing in Data Infrastructure. In this pivotal role, you will ensure the reliability and sustainability of our data systems, working closely with our development teams to optimize performance and availability. You will be instrumental in enhancing our infrastructure to support the growing needs of our clients.
Full-time|$147K/yr - $202.4K/yr|On-site|Bellevue, Washington
Discover Okta
At Okta, we are redefining the identity landscape. As the World’s Identity Company, we empower individuals to securely access any technology, from any device or application, anywhere in the world. Our versatile products, including the Okta Platform and Auth0 Platform, focus on providing secure access, authentication, and automation, making identity central to business security and growth.
We value diverse perspectives and experiences and believe that innovation comes from a team of lifelong learners. Join us in our mission to create a world where identity is truly yours.
Senior Site Reliability Engineer (SRE) - Security and Data Systems
We are on the lookout for an experienced Senior Site Reliability Engineer to join our dynamic team. As a leading SaaS company focused on securing extensive systems, this role merges software engineering with systems administration. You will be instrumental in developing and sustaining a highly reliable, scalable, and secure infrastructure. Your expertise will be vital in automating manual processes, proactively addressing complex challenges before they escalate into incidents, and responding to critical incidents, including participating in on-call shifts.
Full-time|On-site|Bellevue, Washington, USA; San Jose, California, USA
Join Zscaler as a Staff DevOps Engineer and play a pivotal role in enhancing our cloud security platform. In this position, you will collaborate with cross-functional teams to streamline our deployment processes, automate workflows, and improve system reliability. We are looking for a passionate professional who thrives in a dynamic environment and is eager to tackle complex challenges.
About the Company
Armada is a pioneering startup focused on edge computing, aiming to provide advanced computing infrastructure to underserved remote areas with limited connectivity. We strive to facilitate local data processing for real-time analytics and AI applications at the edge. Our goal is to bridge the digital divide through cutting-edge technology that can be deployed swiftly in any location.
About the Role
The Director of Federal Engineering will be responsible for leading Armada’s engineering team focused on TS/SCI, FedRAMP, and DoD-accredited programs. This role requires a TS/SCI clearance and involves defining technical strategies, guiding multiple software teams, and ensuring the delivery of secure, high-performance systems in classified, air-gapped, and GovCloud environments. This position demands a unique blend of technical expertise, federal program leadership, and accountability to uphold Armada’s mission standards.
Collaboration with our VP of Federal will be key, as you align engineering outputs with our Federal growth strategies, customer mission objectives, and compliance benchmarks. Together, you will ensure our technology meets the rigorous technical, operational, and regulatory expectations of our federal partners.
Preferred Locations: Seattle, WA; Virginia / Washington, DC metro area; Austin, TX.
What You Will Do (Key Responsibilities)
Security & Compliance
Oversee technical readiness for FedRAMP Moderate/High and DoD Impact Level 4/5 environments.
Partner with Security & Compliance teams to achieve and maintain Authority to Operate (ATO) certifications.
Implement Zero Trust architecture, enclave segmentation, and continuous monitoring in alignment with NIST 800-53, FISMA, and RMF.
Lead vulnerability management, compliance automation, and audit evidence generation.
Act as the primary engineering liaison for 3PAOs, assessors, and federal partners during accreditation processes.
Technical Leadership
Define and execute the ...
Discover Okta
Okta is recognized as The World’s Identity Company, empowering individuals to securely utilize technology, no matter the device or application. Our versatile and neutral solutions, including the Okta Platform and Auth0 Platform, ensure secure access, robust authentication, and streamlined automation, positioning identity at the heart of business security and advancement.
At Okta, we value diverse perspectives and experiences. We don’t seek someone who ticks every box; instead, we welcome lifelong learners who can enhance our team with their unique backgrounds.
Join us in creating a world where identity truly belongs to you.
The Infrastructure Platform and Shared Services Team
Okta manages the authentication, authorization, and provisioning for millions of users every day. Our services are hosted on Amazon Web Services (AWS), spanning multiple availability zones and geographically diverse regions, designed for high throughput and 99.999% availability. We are searching for a technical leader to help us scale our service with exceptional talent and reliable, cost-effective, and efficient infrastructure, processes, and tools.
As the Senior Manager of Infrastructure Platform and Shared Services, you will lead multiple teams focused on Edge networking, Kubernetes (K8s) platforms, Continuous Integration/Continuous Deployment (CI/CD), observability, automation platforms, and tooling.
Your Responsibilities
Direct the Infrastructure platform and shared services organization, driving initiatives across the SRE and Infrastructure teams.
Steer the DevOps transformation, microservices journey, and next-generation infrastructure platform capabilities in collaboration with architects and product engineering.
Create a world-class observability platform with advanced monitoring capabilities that enable self-service.
Enhance SRE and product engineering velocity by developing robust platforms, powerful tools, and user-friendly self-service capabilities.
Oversee the design and operation of scalable, self-service cloud infrastructure platforms (e.g., Kubernetes, service mesh, CI/CD pipelines, Infrastructure as Code (IaC), and Edge Infrastructure).
Lead, mentor, and nurture a high-performing team of engineers and managers across platform, infrastructure, and shared services domains.
Conduct engineering design evaluations and ensure project completion within resource, budget, and scheduling constraints.
Full-time|On-site|Bellevue, Washington; Seattle, Washington
Join Databricks as a Senior Staff Technical Program Manager specializing in Reliability. In this pivotal role, you will lead initiatives that enhance system reliability, ensure seamless operations, and drive innovation within our engineering teams. Your expertise will be critical in shaping our technical roadmap and delivering high-quality solutions that meet our customer needs.
About the Company
Armada is an innovative edge computing startup dedicated to providing cutting-edge computing infrastructure in remote regions where connectivity and cloud services are scarce. Our mission is to bridge the digital divide by deploying advanced technology infrastructure that enables real-time analytics and AI capabilities at the edge. We are seeking exceptionally talented individuals to join us in our journey to transform how data is processed and utilized globally.
About the Role
As a Senior Software Engineer specializing in the open-source ecosystem, you will play a pivotal role in designing, developing, and maintaining applications and services that operate on container runtimes such as Docker. You will collaborate closely with our DevOps and Infrastructure teams to ensure efficient, scalable, and robust deployment processes. Your work will focus on delivering high-performance networking solutions tailored for software-defined networks, telecommunications, and IoT applications.
Location: This position is office-based at our Bellevue, Washington office.
Key Responsibilities
Develop and maintain microservices and applications using Golang.
Create features for dynamic network management, including auto-failover, load balancing, and path selection based on real-time network conditions.
Implement monitoring and alerting systems to guarantee high availability and performance for deployed SD-WAN services.
Design and build scalable APIs and services that facilitate network automation, policy enforcement, and optimized traffic routing.
Work collaboratively with cross-functional teams to define, design, and deliver new features.
Debug and resolve issues within Kubernetes clusters and applications.
Adopt best practices for CI/CD pipelines, monitoring, and logging.
Write comprehensive tests to ensure code reliability and stability.
Stay informed on the latest industry trends and technologies in software-defined networks, Kubernetes, and cloud-native development.
Full-time|$114K/yr - $157.3K/yr|On-site|Bellevue, Washington; Chicago, Illinois; Washington, DC
About Okta Federal
Okta secures identity for both AI and human users, providing trusted infrastructure for organizations navigating complex security challenges. The Federal team supports the U.S. Government’s identity infrastructure, focusing on excellence and real-world impact.
Role Overview
The Senior Technical Support Engineer - Federal (Night Shift) works directly with federal customers, supporting critical Identity and Access Management (IAM) systems in FedRAMP High and Moderate environments. This position is part of the frontline support team, handling technical issues for U.S. Federal Government clients.
Key Responsibilities
Work Monday through Friday, 3 PM to Midnight Pacific Time.
Participate in regular on-call rotations, including weeklong, weekend, and holiday coverage.
Provide end-to-end ownership of customer issues: from first contact, through troubleshooting and root cause analysis, to final resolution.
Act as a customer advocate, ensuring business impacts are understood and problems are resolved promptly.
Consistently meet or exceed KPIs for response quality, timeliness, and customer satisfaction.
Serve as the main point of contact for both internal and external stakeholders to facilitate efficient issue resolution.
Work with Engineering to gather details and document product issues affecting customers.
Requirements
U.S. citizenship and residency on U.S. soil (required due to federal contract requirements).
Extensive experience with Identity and Access Management (IAM) systems.
Background working in FedRAMP High or Moderate environments.
Locations
Bellevue, Washington; Chicago, Illinois; Washington, DC
Full-time|On-site|Bellevue, Washington; Seattle, Washington
Join our dynamic team as a Staff Database Engineer at Databricks, where you'll play a critical role in designing, implementing, and optimizing our database systems. This is an exciting opportunity to work with cutting-edge technology and collaborate with talented professionals in a fast-paced environment.
Discover Okta
At Okta, we are the leading identity company, empowering individuals to securely access any technology, anywhere, on any device or application. Our versatile products, including the Okta Platform and Auth0 Platform, ensure secure access, authentication, and automation while placing identity at the heart of business security and growth.
We value diverse perspectives and experiences at Okta. We seek lifelong learners and individuals who can contribute to our mission with their unique insights. Join us in creating a world where identity is truly yours.
We are currently seeking an experienced Staff Software Engineer to join our Auth0 Security Engineering team. In this role, you will design and implement security guardrails for our multi-cloud environment, translating intricate security and compliance standards into programmatic, code-driven policies.
Full-time|$182.4K/yr - $247K/yr|On-site|Bellevue, Washington
P-940 This position is open to our offices in both Seattle and Bellevue. At Databricks, we are dedicated to empowering data teams to tackle the world's most pressing challenges, from detecting security threats to innovating in cancer drug development. By constructing and managing the globe's premier data and AI infrastructure platform, we enable our clients to concentrate on the critical challenges central to their missions. Founded in 2013 by the original architects of Apache Spark™, Databricks has expanded from a modest office in Berkeley, California to a global powerhouse with over 1,000 employees. We are proud to be one of the fastest-growing SaaS companies, trusted by thousands of organizations ranging from startups to Fortune 100 companies with their most vital workloads. Our engineering teams develop highly technical products that address significant global needs. We continually push the limits of data and AI technology, while ensuring the resilience, security, and scalability essential for our customers' success on our platform. We operate one of the largest-scale software platforms, comprised of millions of virtual machines that generate terabytes of logs and process exabytes of data daily. Given our scale, we routinely encounter cloud hardware, network, and operating system faults, and our software is engineered to seamlessly shield customers from these issues. As a backend-focused software engineer, you will collaborate closely with your team and product management to prioritize, design, implement, test, and operate micro-services for the Databricks platform and products. This role involves writing software in Scala/Java, building data pipelines (Apache Spark™, Apache Kafka), integrating with third-party applications, and engaging with cloud APIs (AWS, Azure, CloudFormation, Terraform). 
Join one of our dynamic teams, such as:
Data Science and Machine Learning Infrastructure: Develop services and infrastructure at the nexus of machine learning and distributed systems. Our technology powers the flagship collaborative workspace, notebooks, IDE integrations, and project management tools. We facilitate machine learning at scale with tools for environment management, distributed training, and managing the machine learning lifecycle through MLflow.
Compute Fabric: Create the resource management infrastructure that supports all big data and machine learning workloads on the Databricks platform in a robust, flexible, and secure manner.
Full-time|On-site|Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA
CoreWeave is seeking a Staff Security Engineer with deep experience in PKI and secrets management. This position is key to strengthening the security of our cloud infrastructure across multiple locations, including Livingston, NJ, New York, NY, Sunnyvale, CA, and Bellevue, WA.
Role overview
This role focuses on designing and implementing secure public key infrastructures (PKI) and managing sensitive secrets across our platforms. The Staff Security Engineer will take the lead on projects that directly impact the integrity and safety of our systems.
What you will do
Lead the creation and deployment of PKI solutions for cloud environments.
Develop and maintain processes for secure secrets management.
Work to ensure compliance and reduce security risks as CoreWeave grows.
Requirements
Expertise in PKI design and secrets management practices.
Experience supporting cloud infrastructure security.
Ability to lead initiatives and collaborate with technical teams.
Full-time|$181.3K/yr - $261K/yr|On-site|Bellevue, Washington
Overview
At Databricks, we are dedicated to empowering data teams to tackle the world’s most challenging issues, ranging from security threat detection to cancer drug development. Our mission is to create and maintain the leading data and AI infrastructure platform, enabling our clients to concentrate on the critical challenges central to their missions.
Our engineering teams develop sophisticated products that meet significant real-world needs. We consistently push the limits of data and AI technology while ensuring the resilience, security, and scalability essential for our customers' success on our platform.
Our clients entrust us with their most sensitive data, and the Trust & Safety team is committed to establishing the most credible data analytics and machine learning platform globally. Security Engineering plays a crucial role within Trust & Safety, working diligently to safeguard customer data against malicious threats. We are searching for senior leaders like you to craft the vision and define the strategy for this vital area.
Full-time|On-site|Bellevue, Washington, United States; Seattle, Washington, United States
As a Staff Systems Engineer specializing in Warfighter Systems at Anduril Industries, you will play a crucial role in developing and integrating cutting-edge systems that support our military operations. You will work collaboratively with a talented team to innovate solutions that enhance our national security.
Be a Part of the Future of Finance!
At Robinhood, we are dedicated to democratizing finance for everyone. With an estimated $124 trillion set to be inherited by younger generations in the next twenty years, you have the opportunity to be at the forefront of this monumental cultural and financial transformation.
Join Our Team
We are assembling an exceptional team that leverages advanced technologies to tackle the most significant challenges in finance. We seek innovative thinkers and adept problem-solvers: builders who are driven to create meaningful change. Robinhood is not a place for mediocrity; it's where motivated individuals achieve the best work of their careers. Our high-performing, fast-paced team operates with ethics at the center of our mission, and we offer high expectations alongside rewarding experiences.
The Storage Platform team is responsible for developing and managing the platform that facilitates database access across Robinhood. We oversee relational databases (Postgres/Aurora), key-value stores (DynamoDB), and caching systems, alongside the SDKs and automation tools that enable safe, reliable access at scale. Our goal is to standardize and enhance how services connect to storage, boost reliability and performance, and minimize operational overhead via automation. We manage thousands of databases and numerous caching clusters that support millions of users and critical brokerage operations. Ensuring availability is paramount: our systems are crafted to meet rigorous uptime requirements, including zero downtime during market hours.
As a Staff Software Engineer, you will architect and refine the core infrastructure that underlies Robinhood’s storage systems. You will lead complex distributed systems projects, such as horizontal sharding, proxy-based query routing, connection pooling, and cross-shard transactions. Your work will focus on enhancing database reliability, performance, and cost-effectiveness across multi-region deployments. Your contributions will directly influence system availability, latency, and resilience for both customer-facing products and internal engineering teams.
Explore a few insights from our team:
Listen to our director discuss the impactful work we are doing!
Read our blog post detailing our innovative ecosystem.
Visa U.S.A. Inc., a part of Visa Inc., is seeking talented Staff Software Engineers (multiple openings) in Bellevue, WA to:
Architect, design, and develop complex software applications and microservices using technologies like Java, Spring Boot, Kafka, and MySQL.
Evaluate and integrate cutting-edge technology tools and processes that foster the development of innovative products and solutions, enhancing operational efficiency and opening new business avenues.
Leverage generative AI within the development lifecycle to improve automation, code generation, problem-solving, and overall process optimization.
Create and implement a unified control center and monitoring tools to assess, monitor, and report on platform health, data availability, and capacity utilization trends.
Conduct thorough business and technical analyses, perform code reviews, and execute unit testing to ensure adherence to quality and development compliance.
Design and implement modifications and fixes to existing software, including debugging routines and resolving codebase issues.
Optimize database queries and write procedures for significant project deployments.
Apply established standards, processes, and tools throughout the secure software development life cycle (SSDLC) to support engineering applications and products.
Engage in cross-functional collaboration with architects, systems analysts, project managers, QA, and fellow developers to fulfill business requirements using state-of-the-art tools and technologies.
Ensure timely project delivery, establish production support plans, and facilitate knowledge transfer for the long-term maintainability of upgrades, enhancements, and deployments.
Mentor junior developers and contribute to fostering a culture of continuous improvement, scalability, and high performance in software solutions.
Generate technical documentation for new developments, system enhancements, and production support.
Monitor platform health, produce performance reports, and promote ongoing improvements.
This position is based in the Bellevue, Washington office, with potential options for partial telecommuting.
Full-time|On-site|Bellevue, Washington, USA; San Jose, California, USA
Join Zscaler as a Staff Site Reliability Engineer focused on Federal missions. In this role, you will leverage your expertise in reliability engineering to enhance our cloud-based security platform while collaborating with cross-functional teams to optimize performance and scalability. Your contributions will be crucial in ensuring seamless, secure, and high-availability services for our government clients.
Full-time|$194K/yr - $267K/yr|On-site|Bellevue, Washington; Chicago, Illinois; New York, New York; Washington, DC
Empower Every Identity, from AI to Human
At Okta, we believe that identity is the cornerstone of unlocking the potential of AI. By building a trusted and neutral infrastructure, we enable organizations to confidently navigate this new era. This mission demands individuals who are relentless problem solvers, tackling complex issues with real-world significance. We seek builders and owners who act with urgency and execute with excellence.
This is your chance to engage in career-defining work. If you share our commitment to this mission, let’s connect.
Join the Workforce Identity Cloud Team
The Okta Workforce Identity Cloud (WIC) facilitates secure, seamless access for your workforce, allowing you to prioritize strategic initiatives like cost reduction and enhanced customer service.
If you thrive on challenges and are passionate about addressing large-scale automation, testing, and tuning issues, we would love to hear from you. The ideal candidate embodies the principle: "If you must do something more than once, automate it" and possesses a strong ability to quickly learn new tools and concepts.
Position Overview:
The Site Reliability Engineer (SRE) will be pivotal in designing and managing Kubernetes platforms that support cloud-native applications and services. This role emphasizes architecting and overseeing reliable, scalable, and secure Kubernetes-based environments on AWS, ensuring optimal performance and high availability while managing costs and automation. The perfect candidate will have hands-on experience with AWS infrastructure, Kubernetes platform development, Helm charts, Karpenter for scaling, and Istio service mesh.
Key Responsibilities:
Kubernetes Platform Development: Design, implement, and maintain Kubernetes platforms that are highly available, scalable, and fault-tolerant, ensuring they are optimized for production workloads.
AWS Infrastructure Management: Build, manage, and optimize AWS cloud infrastructure, including EKS, ECS, S3, VPCs, RDS, IAM, and more, while implementing best practices for cost management and security.
Helm Management: Use Helm to automate and streamline application and service deployment to Kubernetes clusters, creating and maintaining Helm charts for production-ready deployments.
Karpenter Implementation: Implement and manage Karpenter for dynamic scaling of Kubernetes clusters to meet workload demands.
Istio Service Mesh Management: Configure and manage Istio to facilitate service-to-service communication and security.
Are you prepared to transform the advertising landscape? At Cognitiv, we are not merely another AdTech firm; we are pioneers reshaping media buying with our advanced Deep Learning Advertising Platform. Since our inception in 2015, we have been leveraging state-of-the-art deep learning technologies and data science to redefine how brands engage with their audiences. Our mission is clear: to infuse intelligence into advertising, delivering unmatched precision, relevance, and impact at scale. Our innovative platform provides advertisers with unparalleled flexibility, whether activating Dynamic Deals through their preferred DSP, utilizing our managed service DSP, or tapping into our groundbreaking ContextGPT product. Joining Cognitiv means being at the forefront of AI-driven advertising solutions, leading change, and achieving remarkable growth in a fast-paced industry. We are currently expanding!
The Role
We are seeking a Senior Site Reliability Engineer to enhance our global network of datacenters and elevate service management across Cognitiv. Your primary focus will be on rapidly expanding our hybrid cloud infrastructure. As a growing organization, we strive to adhere to industry best practices. This position requires an experienced engineer who is eager to learn our environment quickly and help shape our long-term service management strategy.
This role will be based in our Bellevue, WA office with a hybrid work schedule of 3 days in-office (Monday/Tuesday/Wednesday) and 2 days remote (Thursday/Friday).
Responsibilities
Design, implement, and maintain infrastructure across a widening footprint of co-located deployments.
Assess existing physical and network architectures to ensure long-term scalability and growth.
Collaborate with engineering and product teams to accurately scope projects based on core business requirements.
Lead company-wide initiatives to enhance service management surrounding deployments, monitoring, and disaster recovery.
Oversee and maintain shared infrastructure within our AWS environment.
Requirements
Understanding of contemporary datacenter practices with experience in configuring multi-datacenter deployments.
Extensive knowledge of AWS infrastructure, networking, and management practices.
Demonstrated experience with infrastructure as code and related tools.
Full-time|$147K/yr - $202K/yr|On-site|Bellevue, Washington
About Okta
Okta stands as the leader in identity solutions, empowering individuals to securely engage with any technology, on any device, and through any application. Our versatile products, including the Okta Platform and Auth0 Platform, ensure safe access and authentication, placing identity at the forefront of security and business growth.
At Okta, we embrace diverse perspectives and experiences. We are not searching for someone who checks all the boxes; rather, we value lifelong learners who can enrich our team with their unique backgrounds.
Join us in crafting a future where identity is truly yours.
Position Overview:
We are looking for a highly skilled Senior Observability Site Reliability Engineer with a focus on Splunk to take ownership and enhance our Splunk ecosystem. In this role, you will go beyond traditional monitoring, creating a comprehensive and scalable Observability Platform that empowers our SRE teams and business stakeholders. You will treat infrastructure as code, leveraging Terraform alongside proficient coding skills in Go, Python, or Ruby to automate deployment across complex distributed systems.
Key Responsibilities
Automated Infrastructure: Design, build, and maintain scalable observability infrastructure utilizing tools like Terraform.
Splunk Engineering: Enhance the collection, processing, and storage of log data to ensure our Splunk services are highly reliable and low-latency.
Incident Response: Engage in on-call rotations and lead post-incident reviews to drive systemic improvements and promote 'observability-driven development.'
Automation: Minimize 'toil' by automating the deployment and scaling of observability agents and collectors.
Join CoreWeave as a Senior Site Reliability Engineer specializing in Data Infrastructure. In this pivotal role, you will ensure the reliability and sustainability of our data systems, working closely with our development teams to optimize performance and availability. You will be instrumental in enhancing our infrastructure to support the growing needs of our clients.
Full-time|$147K/yr - $202.4K/yr|On-site|Bellevue, Washington
Discover Okta
At Okta, we are redefining the identity landscape. As the World’s Identity Company, we empower individuals to securely access any technology, from any device or application, anywhere in the world. Our versatile products, including the Okta Platform and Auth0 Platform, focus on providing secure access, authentication, and automation, making identity central to business security and growth.
We value diverse perspectives and experiences and believe that innovation comes from a team of lifelong learners. Join us in our mission to create a world where identity is truly yours.
Senior Site Reliability Engineer (SRE) - Security and Data Systems
We are on the lookout for an experienced Senior Site Reliability Engineer to join our dynamic team. As a leading SaaS company focused on securing extensive systems, this role merges software engineering with systems administration. You will be instrumental in developing and sustaining a highly reliable, scalable, and secure infrastructure. Your expertise will be vital in automating manual processes, proactively addressing complex challenges before they escalate into incidents, and responding to critical incidents, including participating in on-call shifts.
Full-time|On-site|Bellevue, Washington, USA; San Jose, California, USA
Join Zscaler as a Staff DevOps Engineer and play a pivotal role in enhancing our cloud security platform. In this position, you will collaborate with cross-functional teams to streamline our deployment processes, automate workflows, and improve system reliability. We are looking for a passionate professional who thrives in a dynamic environment and is eager to tackle complex challenges.
About the Company
Armada is a pioneering startup focused on edge computing, aiming to provide advanced computing infrastructure to underserved remote areas with limited connectivity. We strive to facilitate local data processing for real-time analytics and AI applications at the edge. Our goal is to bridge the digital divide through cutting-edge technology that can be deployed swiftly in any location.
About the Role
The Director of Federal Engineering will be responsible for leading Armada’s engineering team focused on TS/SCI, FedRAMP, and DoD-accredited programs. This role requires a TS/SCI clearance and involves defining technical strategies, guiding multiple software teams, and ensuring the delivery of secure, high-performance systems in classified, air-gapped, and GovCloud environments. This position demands a unique blend of technical expertise, federal program leadership, and accountability to uphold Armada’s mission standards.
Collaboration with our VP of Federal will be key, as you align engineering outputs with our Federal growth strategies, customer mission objectives, and compliance benchmarks. Together, you will ensure our technology meets the rigorous technical, operational, and regulatory expectations of our federal partners.
Preferred Locations: Seattle, WA; Virginia / Washington, DC metro area; Austin, TX.
What You Will Do (Key Responsibilities)
Security & Compliance
Oversee technical readiness for FedRAMP Moderate/High and DoD Impact Level 4/5 environments.
Partner with Security & Compliance teams to achieve and maintain Authority to Operate (ATO) certifications.
Implement Zero Trust architecture, enclave segmentation, and continuous monitoring in alignment with NIST 800-53, FISMA, and RMF.
Lead vulnerability management, compliance automation, and audit evidence generation.
Act as the primary engineering liaison for 3PAOs, assessors, and federal partners during accreditation processes.
Technical Leadership
Define and execute the ...
Discover Okta
Okta is recognized as The World’s Identity Company, empowering individuals to securely utilize technology, no matter the device or application. Our versatile and neutral solutions, including the Okta Platform and Auth0 Platform, ensure secure access, robust authentication, and streamlined automation, positioning identity at the heart of business security and advancement.
At Okta, we value diverse perspectives and experiences. We don’t seek someone who ticks every box; instead, we welcome lifelong learners who can enhance our team with their unique backgrounds.
Join us in creating a world where identity truly belongs to you.
The Infrastructure Platform and Shared Services Team
Okta manages the authentication, authorization, and provisioning for millions of users every day. Our services are hosted on Amazon Web Services (AWS), spanning multiple availability zones and geographically diverse regions, designed for high throughput and 99.999% availability. We are searching for a technical leader to help us scale our service with exceptional talent and reliable, cost-effective, and efficient infrastructure, processes, and tools.
As the Senior Manager of Infrastructure Platform and Shared Services, you will lead multiple teams focused on Edge networking, Kubernetes (K8s) platforms, Continuous Integration/Continuous Deployment (CI/CD), observability, automation platforms, and tooling.
Your Responsibilities
Direct the Infrastructure platform and shared services organization, driving initiatives across the SRE and Infrastructure teams.
Steer the DevOps transformation, microservices journey, and next-generation infrastructure platform capabilities in collaboration with architects and product engineering.
Create a world-class observability platform with advanced monitoring capabilities that enable self-service.
Enhance SRE and product engineering velocity by developing robust platforms, powerful tools, and user-friendly self-service capabilities.
Oversee the design and operation of scalable, self-service cloud infrastructure platforms (e.g., Kubernetes, service mesh, CI/CD pipelines, Infrastructure as Code (IaC), and Edge Infrastructure).
Lead, mentor, and nurture a high-performing team of engineers and managers across platform, infrastructure, and shared services domains.
Conduct engineering design evaluations and ensure project completion within resource, budget, and scheduling constraints.
Full-time|On-site|Bellevue, Washington; Seattle, Washington
Join Databricks as a Senior Staff Technical Program Manager specializing in Reliability. In this pivotal role, you will lead initiatives that enhance system reliability, ensure seamless operations, and drive innovation within our engineering teams. Your expertise will be critical in shaping our technical roadmap and delivering high-quality solutions that meet our customer needs.
About the Company
Armada is an innovative edge computing startup dedicated to providing cutting-edge computing infrastructure in remote regions where connectivity and cloud services are scarce. Our mission is to bridge the digital divide by deploying advanced technology infrastructure that enables real-time analytics and AI capabilities at the edge. We are seeking exceptionally talented individuals to join us in our journey to transform how data is processed and utilized globally.
About the Role
As a Senior Software Engineer specializing in the open-source ecosystem, you will play a pivotal role in designing, developing, and maintaining applications and services that operate on container runtimes such as Docker. You will collaborate closely with our DevOps and Infrastructure teams to ensure efficient, scalable, and robust deployment processes. Your work will focus on delivering high-performance networking solutions tailored for software-defined networks, telecommunications, and IoT applications.
Location: This position is office-based at our Bellevue, Washington office.
Key Responsibilities
Develop and maintain microservices and applications using Golang.
Create features for dynamic network management, including auto-failover, load balancing, and path selection based on real-time network conditions.
Implement monitoring and alerting systems to guarantee high availability and performance for deployed SD-WAN services.
Design and build scalable APIs and services that facilitate network automation, policy enforcement, and optimized traffic routing.
Work collaboratively with cross-functional teams to define, design, and deliver new features.
Debug and resolve issues within Kubernetes clusters and applications.
Adopt best practices for CI/CD pipelines, monitoring, and logging.
Write comprehensive tests to ensure code reliability and stability.
Stay informed on the latest industry trends and technologies in software-defined networks, Kubernetes, and cloud-native development.
Full-time|$114K/yr - $157.3K/yr|On-site|Bellevue, Washington; Chicago, Illinois; Washington, DC
About Okta Federal
Okta secures identity for both AI and human users, providing trusted infrastructure for organizations navigating complex security challenges. The Federal team supports the U.S. Government’s identity infrastructure, focusing on excellence and real-world impact.
Role Overview
The Senior Technical Support Engineer - Federal (Night Shift) works directly with federal customers, supporting critical Identity and Access Management (IAM) systems in FedRAMP High and Moderate environments. This position is part of the frontline support team, handling technical issues for U.S. Federal Government clients.
Key Responsibilities
Work Monday through Friday, 3 PM to Midnight Pacific Time.
Participate in regular on-call rotations, including weeklong, weekend, and holiday coverage.
Provide end-to-end ownership of customer issues: from first contact, through troubleshooting and root cause analysis, to final resolution.
Act as a customer advocate, ensuring business impacts are understood and problems are resolved promptly.
Consistently meet or exceed KPIs for response quality, timeliness, and customer satisfaction.
Serve as the main point of contact for both internal and external stakeholders to facilitate efficient issue resolution.
Work with Engineering to gather details and document product issues affecting customers.
Requirements
U.S. citizenship and residency on U.S. soil (required due to federal contract requirements).
Extensive experience with Identity and Access Management (IAM) systems.
Background working in FedRAMP High or Moderate environments.
Locations
Bellevue, Washington; Chicago, Illinois; Washington, DC
Full-time|On-site|Bellevue, Washington; Seattle, Washington
Join our dynamic team as a Staff Database Engineer at Databricks, where you'll play a critical role in designing, implementing, and optimizing our database systems. This is an exciting opportunity to work with cutting-edge technology and collaborate with talented professionals in a fast-paced environment.
Discover Okta
At Okta, we are the leading identity company, empowering individuals to securely access any technology, anywhere, on any device or application. Our versatile products, including the Okta Platform and Auth0 Platform, ensure secure access, authentication, and automation while placing identity at the heart of business security and growth.
We value diverse perspectives and experiences at Okta. We seek lifelong learners and individuals who can contribute to our mission with their unique insights. Join us in creating a world where identity is truly yours.
We are currently seeking an experienced Staff Software Engineer to join our Auth0 Security Engineering team. In this role, you will design and implement security guardrails for our multi-cloud environment, translating intricate security and compliance standards into programmatic, code-driven policies.
Full-time|$182.4K/yr - $247K/yr|On-site|Bellevue, Washington
P-940 This position is open to our offices in both Seattle and Bellevue. At Databricks, we are dedicated to empowering data teams to tackle the world's most pressing challenges, from detecting security threats to innovating in cancer drug development. By constructing and managing the globe's premier data and AI infrastructure platform, we enable our clients to concentrate on the critical challenges central to their missions. Founded in 2013 by the original architects of Apache Spark™, Databricks has expanded from a modest office in Berkeley, California to a global powerhouse with over 1,000 employees. We are proud to be one of the fastest-growing SaaS companies, trusted by thousands of organizations ranging from startups to Fortune 100 companies with their most vital workloads. Our engineering teams develop highly technical products that address significant global needs. We continually push the limits of data and AI technology, while ensuring the resilience, security, and scalability essential for our customers' success on our platform. We operate one of the largest-scale software platforms, comprised of millions of virtual machines that generate terabytes of logs and process exabytes of data daily. Given our scale, we routinely encounter cloud hardware, network, and operating system faults, and our software is engineered to seamlessly shield customers from these issues. As a backend-focused software engineer, you will collaborate closely with your team and product management to prioritize, design, implement, test, and operate micro-services for the Databricks platform and products. This role involves writing software in Scala/Java, building data pipelines (Apache Spark™, Apache Kafka), integrating with third-party applications, and engaging with cloud APIs (AWS, Azure, CloudFormation, Terraform). 
Join one of our dynamic teams, such as:
Data Science and Machine Learning Infrastructure: Develop services and infrastructure at the nexus of machine learning and distributed systems. Our technology powers the flagship collaborative workspace, notebooks, IDE integrations, and project management tools. We facilitate machine learning at scale with tools for environment management, distributed training, and managing the machine learning lifecycle through MLflow.
Compute Fabric: Create the resource management infrastructure that supports all big data and machine learning workloads on the Databricks platform in a robust, flexible, and secure manner.
Full-time|On-site|Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA
CoreWeave is seeking a Staff Security Engineer with deep experience in PKI and secrets management. This position is key to strengthening the security of our cloud infrastructure across multiple locations, including Livingston, NJ, New York, NY, Sunnyvale, CA, and Bellevue, WA.
Role overview
This role focuses on designing and implementing secure public key infrastructures (PKI) and managing sensitive secrets across our platforms. The Staff Security Engineer will take the lead on projects that directly impact the integrity and safety of our systems.
What you will do
Lead the creation and deployment of PKI solutions for cloud environments.
Develop and maintain processes for secure secrets management.
Work to ensure compliance and reduce security risks as CoreWeave grows.
Requirements
Expertise in PKI design and secrets management practices.
Experience supporting cloud infrastructure security.
Ability to lead initiatives and collaborate with technical teams.
Full-time|$181.3K/yr - $261K/yr|On-site|Bellevue, Washington
Overview
At Databricks, we are dedicated to empowering data teams to tackle the world’s most challenging issues, ranging from security threat detection to cancer drug development. Our mission is to create and maintain the leading data and AI infrastructure platform, enabling our clients to concentrate on the critical challenges central to their missions.
Our engineering teams develop sophisticated products that meet significant real-world needs. We consistently push the limits of data and AI technology while ensuring the resilience, security, and scalability essential for our customers' success on our platform.
Our clients entrust us with their most sensitive data, and the Trust & Safety team is committed to establishing the most credible data analytics and machine learning platform globally. Security Engineering plays a crucial role within Trust & Safety, working diligently to safeguard customer data against malicious threats. We are searching for senior leaders like you to craft the vision and define the strategy for this vital area.
Full-time|On-site|Bellevue, Washington, United States; Seattle, Washington, United States
As a Staff Systems Engineer specializing in Warfighter Systems at Anduril Industries, you will play a crucial role in developing and integrating cutting-edge systems that support our military operations. You will work collaboratively with a talented team to innovate solutions that enhance our national security.
Be a Part of the Future of Finance!
At Robinhood, we are dedicated to democratizing finance for everyone. With an estimated $124 trillion set to be inherited by younger generations in the next twenty years, you have the opportunity to be at the forefront of this monumental cultural and financial transformation.
Join Our Team
We are assembling an exceptional team that leverages advanced technologies to tackle the most significant challenges in finance. We seek innovative thinkers and adept problem-solvers: builders who are driven to create meaningful change. Robinhood is not a place for mediocrity; it's where motivated individuals achieve the best work of their careers. Our high-performing, fast-paced team operates with ethics at the center of our mission, and we offer high expectations alongside rewarding experiences.
The Storage Platform team is responsible for developing and managing the platform that facilitates database access across Robinhood. We oversee relational databases (Postgres/Aurora), key-value stores (DynamoDB), and caching systems, alongside the SDKs and automation tools that enable safe, reliable access at scale. Our goal is to standardize and enhance how services connect to storage, boost reliability and performance, and minimize operational overhead via automation. We manage thousands of databases and numerous caching clusters that support millions of users and critical brokerage operations. Ensuring availability is paramount: our systems are crafted to meet rigorous uptime requirements, including zero downtime during market hours.
As a Staff Software Engineer, you will architect and refine the core infrastructure that underlies Robinhood’s storage systems. You will lead complex distributed systems projects, such as horizontal sharding, proxy-based query routing, connection pooling, and cross-shard transactions. Your work will focus on enhancing database reliability, performance, and cost-effectiveness across multi-region deployments. Your contributions will directly influence system availability, latency, and resilience for both customer-facing products and internal engineering teams.
Explore a few insights from our team:
Listen to our director discuss the impactful work we are doing!
Read our blog post detailing our innovative ecosystem.
Visa U.S.A. Inc., a part of Visa Inc., is seeking talented Staff Software Engineers (multiple openings) in Bellevue, WA to:
Architect, design, and develop complex software applications and microservices using technologies like Java, Spring Boot, Kafka, and MySQL.
Evaluate and integrate cutting-edge technology tools and processes that foster the development of innovative products and solutions, enhancing operational efficiency and opening new business avenues.
Leverage generative AI within the development lifecycle to improve automation, code generation, problem-solving, and overall process optimization.
Create and implement a unified control center and monitoring tools to assess, monitor, and report on platform health, data availability, and capacity utilization trends.
Conduct thorough business and technical analyses, perform code reviews, and execute unit testing to ensure adherence to quality and development compliance.
Design and implement modifications and fixes to existing software, including debugging routines and resolving codebase issues.
Optimize database queries and write procedures for significant project deployments.
Apply established standards, processes, and tools throughout the secure software development life cycle (SSDLC) to support engineering applications and products.
Engage in cross-functional collaboration with architects, systems analysts, project managers, QA, and fellow developers to fulfill business requirements using state-of-the-art tools and technologies.
Ensure timely project delivery, establish production support plans, and facilitate knowledge transfer for the long-term maintainability of upgrades, enhancements, and deployments.
Mentor junior developers and contribute to fostering a culture of continuous improvement, scalability, and high performance in software solutions.
Generate technical documentation for new developments, system enhancements, and production support.
Monitor platform health, produce performance reports, and promote ongoing improvements.
This position is based in the Bellevue, Washington office, with potential options for partial telecommuting.
Feb 20, 2026