1 - 20 of 5,662 Jobs

Search for Senior Cloud Infrastructure Software Engineer

ClickHouse
Full-time|Remote|Canada(Remote)

About ClickHouse
Ranked among the top innovative cloud companies in the 2025 Forbes Cloud 100 list, ClickHouse is rapidly expanding its reach with over 3,000 clients and an annual recurring revenue (ARR) growth exceeding 250% year-over-year. Our expertise lies in real-time analytics, data warehousing, observability, and AI workloads. The company's remarkable progress was recently affirmed with a $400M Series D funding round. In just three months, we've welcomed prominent clients such as Capital One, Lovable, Decagon, Polymarket, and Airwallex, who are either adopting our platform or enhancing their existing implementations. These esteemed clients join a robust lineup of AI trailblazers and global brands like Meta, Cursor, Sony, and Tesla. We are dedicated to revolutionizing how organizations leverage data. Join us on this exciting journey!
About the Team
The Cloud Infrastructure Engineering team is responsible for creating and managing the essential components of the ClickHouse Cloud data plane from end to end. This encompasses compute, networking, security, and a multi-cloud, multi-region architecture to deliver a reliable and scalable managed ClickHouse experience for our customers. We are seeking highly skilled and experienced cloud infrastructure software engineers to design, deploy, and maintain our infrastructure.
What will you do?
Design and construct a resilient, scalable, and highly available distributed infrastructure.
Develop an advanced cloud-native platform leveraging public cloud resources, along with automating our cloud resource management.
Collaborate closely with our ClickHouse core database development and security teams to enhance our Software as a Service (SaaS) offerings.
Enhance routing and traffic components to boost the reliability and scalability of our cloud services.
Consistently improve system availability using industry best practices and distributed systems principles.
Design and build security components to protect our infrastructure.

Mar 5, 2026
Veeva Systems Inc.
Full-time|On-site|Canada - Toronto

Role overview
Veeva Systems is hiring a Senior Software Engineer focused on infrastructure in Toronto, Canada. This role centers on designing and building software that supports and improves our cloud-based platforms. The work directly impacts scalability and performance across our systems.
What you will do
Design and implement software solutions for infrastructure needs.
Work closely with teams from different disciplines to strengthen our cloud platforms.
Contribute to projects that improve system scalability and performance.

Apr 14, 2026
Veeva Systems Inc.
Full-time|On-site|Canada - Ottawa

Role Overview
Veeva Systems is looking for a Senior Software Engineer focused on Infrastructure in Ottawa, Canada. This role centers on designing and building infrastructure solutions that underpin Veeva’s software products. Collaboration with cross-functional teams is a key part of the job, with an emphasis on improving system performance and scalability.

Apr 14, 2026
Huawei Canada
Full-time|CA$127K/yr - CA$225K/yr|On-site|Markham, Ontario, Canada

Huawei Canada's Distributed Scheduling and Data Engine Lab in Markham has contributed to Huawei Cloud's technical growth since 2014. The team specializes in cloud-native databases, intelligent SQL engines, AI and agent infrastructure, and evaluating large language models. Close collaboration with industry experts shapes both new product development and the continuous improvement of cloud platforms.
Role overview
The Senior Engineer - Cloud AI Infrastructure role focuses on building and refining infrastructure to support AI and agentic workloads. This position blends research, systems engineering, and product delivery to advance cloud AI capabilities.
What you will do
Develop infrastructure for AI and agent workloads, combining technical research with hands-on engineering.
Track trends in large language models, agentic AI, and multi-step agent workflows to inform infrastructure decisions.
Identify and address performance bottlenecks related to GPU/NPU usage, data transfer, memory management, and distributed execution.
Design and implement system-level architectures for agent execution, multi-model orchestration, and large-scale inference.
Evaluate and optimize AI workload requirements on cloud and hybrid environments, balancing cost, performance, and scalability.
Analyze the infrastructure stack, including distributed schedulers, inference pipelines, caching, and data access patterns.
Work with engineering and product teams to prototype and deliver solutions based on research findings.
Translate emerging AI trends and workload patterns into scalable infrastructure designs.
Location
Markham, Ontario, Canada

Apr 24, 2026
Huawei Canada
Full-time|On-site|Markham, Ontario, Canada

Join Huawei Canada as a Principal Software Engineer and be a part of our innovative team!
About Us:
Founded in 2014, the Distributed Scheduling and Data Engine Lab serves as Huawei Cloud’s technology innovation hub in Canada. This lab is dedicated to pioneering advanced cloud technologies, facilitating the productization and ongoing refinement of our technological breakthroughs. Our research spans various domains, including cloud-native databases, resource scheduling and prediction, middleware solutions, media engines, and user experience enhancements. We cultivate a dynamic technical environment that encourages collaboration with industry specialists to develop a competitive cloud platform. We are currently seeking a Principal Software Engineer to join our team.
Job Responsibilities:
Integrate AI frameworks with cloud infrastructure, optimizing the end-to-end architecture for AI inference and fine-tuning scenarios, with a focus on enhancing observability, reliability, and performance of AI services.
Collaborate with team members to design and build concept prototypes, validating optimization strategies to ensure their effectiveness.
Work closely with the product team to support prototype development, ensuring alignment with product constraints and requirements.

Apr 16, 2025
Afresh Technologies
Full-time|Remote|Remote - Ontario, Canada

As a Senior Infrastructure Software Engineer at Afresh Technologies, you will play a crucial role in enhancing our infrastructure to support cutting-edge software solutions. You will collaborate with cross-functional teams to design, implement, and maintain scalable systems that improve our operational efficiency and reliability.

Mar 28, 2026
Veeva Systems Inc.
Full-time|On-site|Canada - Toronto

Role Overview
Veeva Systems Inc. is looking for a Senior Software Engineer focused on Infrastructure in Toronto, Canada. This role centers on designing, building, and improving infrastructure that supports our software products.
What You Will Do
Create and refine infrastructure solutions to support application development and deployment.
Work with teams across engineering, operations, and product to strengthen system reliability and performance.
Address scalability and security needs as our technology evolves.
Impact
Your work will help shape Veeva’s technology foundation and support the growth of our software applications.

Apr 14, 2026
Vanta
Full-time|Remote|Remote - Canada

At Vanta, we are dedicated to helping businesses cultivate and demonstrate trust by prioritizing security that is continuously monitored and verified. Our mission empowers companies to enhance their security practices and showcase their commitment effortlessly. Join our supportive and skilled team, where individuals from diverse backgrounds thrive; many have succeeded at Vanta without prior security experience.
About the Role:
As a key member of Vanta’s Infrastructure & Security team, you will play a vital role in building a robust platform that ensures the scalability, performance, and reliability of our core systems. With our rapidly expanding customer base, we require infrastructure that evolves alongside us, without sacrificing speed or developer efficiency. You'll engage in projects involving distributed systems, infrastructure components, and security upgrades, enabling our engineers to deliver dependable software at scale.
Your contributions will significantly impact product engineering across the board. Our initiatives simplify the process for engineers to identify bugs, develop features, and provide value to our clients swiftly and securely.
At Vanta, we harness modern frameworks and tools such as TypeScript, React, Node.js, MongoDB, GitHub Actions, and a variety of AWS services like Fargate and ECS to design and develop new product functionalities and infrastructure.
If you're excited about the intersection of infrastructure and security, we would love to connect with you.
To learn more about our team's work, visit our Vanta Engineering Blog.
Your Responsibilities Will Include:
Designing and constructing scalable infrastructure to facilitate rapid increases in data volume, service utilization, and engineering productivity.
Leading projects concerning our cloud infrastructure, encompassing container orchestration (e.g., AWS Fargate, ECS), monitoring and alerting systems, networking, and database upkeep.
Implementing and maintaining essential security infrastructures and controls, including service-to-service authentication, secrets management, application security features (e.g., rate limiting, encryption libraries), and infrastructure hardening.
Identifying and resolving intricate security challenges to bolster our systems.

Dec 29, 2025
Cohere
Full-time|On-site|Toronto

Join Cohere as a Senior Software Engineer, specializing in Agent Infrastructure. In this role, you will lead the design and implementation of robust software solutions that enhance our agent infrastructure capabilities. Collaborate with cross-functional teams to drive innovation and optimize our systems for performance and scalability.

Mar 12, 2026
Robinhood Markets, Inc.
Full-time|$166K/yr - $195K/yr|On-site|Toronto, Canada

Be a Part of Shaping the Future of Finance.
At Robinhood, our mission is to make finance accessible to everyone. With an estimated $124 trillion in assets poised to be passed down to younger generations over the next 20 years, we are at the forefront of this monumental wealth transfer. If you're passionate about being part of this transformative financial movement, we invite you to continue reading.
About the Team and Role
Our elite Infrastructure team is dedicated to applying cutting-edge technologies to tackle some of the largest challenges in finance. We are seeking innovative thinkers and adept problem-solvers: individuals who are driven to make significant contributions. At Robinhood, complacency has no place; we strive for excellence and reward ambition. Our high-performing team operates with ethics as our guiding principle, ensuring that high expectations yield equally high rewards.
The Infrastructure organization is responsible for developing and managing the foundational systems that power all Robinhood products and services. This team emphasizes reliability, scalability, and developer efficiency by providing platforms, tools, and systems that enhance the productivity of engineering teams across our organization. Within Infrastructure, you may join one of several specialized teams:
The Backend Platform team focuses on facilitating rapid, secure, and maintainable backend development at scale. This team creates frameworks, dependency management systems, and developer tools while promoting the adoption of Go across Robinhood through shared libraries and enhanced tooling for engineers building production-grade Go services.
The Provisioning team manages the lifecycle of our infrastructure across AWS and Kubernetes environments. This team develops systems and controllers that provision necessary cloud resources, assisting application teams and streamlining developer workflows for service deployment and management.
The Kubernetes Compute team is tasked with building and operating a highly available, scalable Kubernetes-based compute platform, ensuring that our container infrastructure supports dependable application deployments and integrates essential platform capabilities for multi-region scalability.
The Technical Assurance Platform team is responsible for creating and maintaining a centralized service catalog that tracks service ownership, performance, and reliability throughout Robinhood.

Mar 26, 2026
mlabs
Full-time|Remote|Remote — Toronto, Ontario, Canada

Join our innovative team at mlabs as a Senior Infrastructure Engineer, where you will play a crucial role in designing, implementing, and managing our cloud infrastructure. You will work collaboratively with other engineers to ensure high availability and performance, leveraging cutting-edge technologies to drive our projects forward.

Mar 18, 2026
BitGo
Full-time|On-site|Toronto, Ontario, Canada

Role overview
BitGo is hiring a Senior Infrastructure Engineer in Toronto, Ontario. This role focuses on designing and maintaining the company’s infrastructure. The position works closely with teams across the organization to improve system performance, scalability, and reliability.

Apr 14, 2026
Docker, Inc.
Full-time|CA$100K/yr - CA$100K/yr|Remote|Canada

At Docker, we simplify application development, allowing developers to focus on their core objectives. Our remote-first team is globally dispersed, driven by a collective passion for innovation and exceptional developer experiences. With over 20 million monthly users and 20 billion image pulls, Docker stands as the premier tool for building, sharing, and operating applications, trusted by both startups and Fortune 100 companies. We are experiencing rapid growth and are just getting started. Join us for an exciting journey!
The Infrastructure Engineering team is responsible for building and managing the cloud-native platform that powers Docker’s product suite. We design resilient services, automate processes where beneficial, and measure key metrics to ensure hundreds of engineers can deploy safely to millions of users every day.
A key focus of our team is self-service. We develop streamlined platform capabilities that empower internal teams to provision, deploy, observe, and manage services with minimal friction and robust guardrails. We treat our platform as a product, establishing clear contracts, well-defined defaults, and comprehensive documentation. Our success is evaluated based on user adoption and a reduction in support requests.
How We Operate
Documentation and Iteration: We emphasize thorough documentation, code reviews, and incremental releases.
Sustainable Reliability: Our priority is to address root causes, establish effective alerts, and implement automation, rather than relying on heroics.
Cross-Functional Collaboration: We work closely with product and security teams by default.
AI-Driven Execution: We create workflows that reduce manual tasks and enhance incident response, while ensuring guardrails, auditability, and human review.
What You Will Focus On
Minimizing manual work through automation, including AI-assisted operational workflows.
Creating self-service onboarding and deployment workflows that reduce ticket volume and accelerate delivery timelines.
Scaling Kubernetes foundations and evolving our traffic and ingress stack.
Key Responsibilities
1) Self-Service Platform Services
Develop and manage internal platform services and APIs using Go, focusing on provisioning, quotas, policies, cost insights, and platform workflows.
Establish streamlined pathways for self-service onboarding and ongoing operations, including access, deployment configurations, observability defaults, and governance frameworks.

Mar 25, 2026
jobgether
Full-time|Remote|Canada

Role overview
As a Senior Software Engineer focused on AI Infrastructure at jobgether, the main responsibility is to shape and support the technical foundation for the company’s AI initiatives. The work involves both creating new systems and refining current technology to help meet evolving business objectives.
What you will do
Design and develop infrastructure that enables AI systems to operate efficiently and reliably.
Maintain and enhance existing AI technology stacks to ensure ongoing performance and scalability.
Apply technical expertise to solve complex engineering problems related to AI infrastructure.
Work environment
This position is based in Canada and involves working closely with a collaborative team. The role provides opportunities to engage with advanced technologies in support of the company’s AI goals.

Apr 28, 2026
Dialpad
Full-time|$168K/yr - $194.3K/yr|On-site|Vancouver, Canada

About Dialpad
Dialpad stands at the forefront of innovation as a premier AI-driven customer communications platform, reshaping the way businesses engage with their clients. Trusted by over 50,000 organizations globally, including renowned names such as Netflix, RE/MAX, Uber, Randstad, and Tractor Supply, Dialpad empowers brands to strengthen customer relationships through real-time, AI-enhanced insights. Discover more by visiting dialpad.com.
Being a Dialer
At Dialpad, you will be an integral part of a dynamic team focused on our collective mission to ensure our customers and their employees achieve exceptional success. We believe every conversation is significant, and we enhance each interaction with a platform that delivers immediate insights and automation for our clients.
We thrive in an environment of continuous improvement, where each team member utilizes cutting-edge AI technology to refine both our platform and personal skills. We are in search of individuals who not only meet our high expectations but also surpass them. Our ambitious goals require a team that operates at the utmost level of excellence. We seek individuals who are driven and embody the essential qualities for our success: Resourceful, Inquisitive, Optimistic, Tenacious, & Compassionate.
Your Role
As a Senior Software Engineer within the Tel Cloud division, you will take ownership of the components that drive our global communications infrastructure. You will develop features such as call routing, SMS/MMS messaging, spam and fraud detection, fleet deployments, porting, and number management, ensuring our platform remains scalable, robust, and secure.
This position reports directly to an Engineering Manager in the Telephony Platform team.
We welcome candidates from diverse engineering backgrounds, even if they lack direct experience in communications. What matters most is your enthusiasm to learn, collaborate, and contribute to building a fast, reliable, and sophisticated product that delights our customers.

Mar 23, 2026
Loopio
Full-time|On-site|Toronto, ON Hub

Elevate Your Career with Loopio!
At Loopio, we are searching for a visionary Senior Engineering Leader to spearhead our Site Reliability Engineering (SRE), Infrastructure, and MLOps teams. In this pivotal role, you will architect the foundation of reliability, scalability, and cost efficiency for our platform's systems.
You will guide teams responsible for designing, building, and operating our production infrastructure, ensuring our services remain resilient, observable, and primed for expansion as we incorporate cutting-edge AI and automated workflows. Collaborating closely with Product Engineering, Security, and Data teams, you will facilitate rapid, secure delivery while upholding operational excellence.
Note: This position is an existing vacancy within our team.
Key Responsibilities
Leadership & Team Development
Lead and nurture multiple teams across SRE, Cloud Infrastructure, and MLOps.
Mentor engineering managers and senior contributors, cultivating a culture of ownership and high standards.
Foster a 'Platform-as-a-Product' mindset, ensuring infrastructure and ML tools empower the wider engineering organization.
Collaborate with Recruiting to attract and retain top-tier talent in the areas of cloud, reliability, and machine learning infrastructure.
Reliability & Operational Excellence
Oversee the operational health of production systems, focusing on availability, latency, and durability.
Define and refine SLIs, SLOs, and error budgets to promote data-driven reliability decisions.
Lead incident response efforts, championing blameless postmortems and systemic improvements to minimize

Feb 4, 2026
INTRALOT Canada
Full-time|On-site|Vancouver, British Columbia, Canada

Join INTRALOT as an AWS Cloud Infrastructure Platform Engineer – Powering Gaming Experiences!
At INTRALOT, we are at the forefront of revolutionizing the gaming industry through innovative technology. Our global presence and diverse teams foster a culture that prioritizes people. We are seeking a Platform Engineer who is passionate about advancing their career. At INTRALOT Canada, we are reshaping gaming with robust, scalable, and state-of-the-art systems. This is your opportunity to make a significant impact and grow with a collaborative and innovative team.
Your Role:
As a Platform Engineer – AWS Infrastructure, you will oversee the daily operations, reliability, and enhancement of our existing AWS infrastructure that supports production workloads.
The platform consists of Red Hat Enterprise Linux (RHEL) systems and an OpenShift container platform hosted on AWS EC2 instances, facilitating critical application workloads. Your primary focus will be on ensuring stability, security, automation, and operational excellence to maintain a robust, scalable, and user-friendly platform.
This position is part of the Operations team, dedicated to managing and improving AWS-based infrastructure and container platforms that drive our gaming services.
What You’ll Do:
Operate & Support Existing AWS Infrastructure
Manage and support production AWS environments across multiple accounts.
Oversee core AWS services including EC2, networking, storage, IAM, and other supporting infrastructure services.
Administer EC2-based Red Hat Enterprise Linux (RHEL) instances hosting application and platform components.
Conduct OS lifecycle management, patching, hardening, and configuration management for Linux systems.
Facilitate patch management and configuration automation using Ansible.
Operate OpenShift Platform on AWS
Manage the OpenShift container platform deployed on AWS EC2 instances.
Maintain and troubleshoot OpenShift cluster nodes, networking, storage integration, and workloads.
Assist with cluster lifecycle activities, including node maintenance, upgrades, and configuration changes.
Collaborate with development teams to ensure containerized applications operate efficiently and reliably on the platform.

Mar 11, 2026
Instacart
Full-time|$271K/yr - $286K/yr|Remote|Canada - Remote (ON, AB, BC, or NS Only)

Join Us in Revolutionizing the Grocery Sector
At Instacart, we believe that food brings people together. Our mission is to ensure everyone has access to the groceries they love while providing them with more time to savor those moments. What may seem like a simple grocery delivery service to some is, for us, an intricate web of opportunities to meet the diverse needs of our community. We are dedicated to delivering an essential service that our customers depend on for their groceries and household items while also creating safe and flexible earning opportunities for Instacart Personal Shoppers.
Instacart has become a vital resource for millions, and we are expanding our team to drive our mission forward. If you're ready to contribute to something impactful and do your best work, we invite you to join our team.
Flexibility at Instacart
We understand that there's no one-size-fits-all approach to productivity. Therefore, we provide our employees the freedom to choose their ideal work environment, whether it's from home, a local office, or a favorite café, while maintaining connectivity and community through regular in-person events. Discover more about our flexible work culture.
Role Overview
As a Senior Staff Software Engineer on the Data Infrastructure team at Instacart, you will play a pivotal role in shaping the technical landscape of our data platform, which is crucial for our company's data strategy. You'll be responsible for guiding the architecture roadmap that supports our storage and compute layers, streaming infrastructure, analytics tools, and governance systems. This position is ideal for a strategic thinker with deep technical expertise who can make a significant impact at a company-wide level.
You will lead long-term architectural planning for our core data platform, influence major investment decisions, and serve as a thought leader within the engineering community and beyond. Your contributions will directly affect how Instacart scales its decision-making processes and will shape the economic framework of one of the most data-driven tech companies in the grocery sector.

Mar 31, 2026
INTRALOT
Full-time|On-site|Vancouver, British Columbia, Canada

Become part of INTRALOT as a Cloud Infrastructure Platform Engineer - AWS, and help us revolutionize gaming experiences!
At INTRALOT, we are at the forefront of innovation in the gaming sector, leveraging technology to create engaging experiences. Our commitment to a diverse and inclusive culture empowers our teams to thrive and innovate. As a Cloud Infrastructure Platform Engineer, you will play a pivotal role in transforming the gaming industry with our state-of-the-art, scalable, and reliable systems. This is your opportunity to make a significant impact and develop your career within a vibrant team dedicated to collaboration and innovation.
Your Responsibilities:
In this role, you will oversee the operation, reliability, and ongoing development of our AWS infrastructure platform that supports critical production workloads.
Your responsibilities will include managing Red Hat Enterprise Linux (RHEL) systems and an OpenShift container platform hosted on AWS EC2 instances. You will emphasize platform stability, security, automation, and operational excellence to ensure our systems are robust, scalable, and user-friendly.
This position is integral to the Operations team, focusing on optimizing and enhancing our AWS-based infrastructure and container platforms that deliver our gaming services.
Key Duties:
Manage & Maintain Existing AWS Infrastructure
Administer and support production AWS environments across multiple accounts.
Oversee key AWS services such as EC2, networking, storage, IAM, and essential infrastructure services.
Assist with managing EC2-based Red Hat Enterprise Linux (RHEL) instances that host various application and platform components.
Execute OS lifecycle management, patching, hardening, and configuration management for Linux systems.
Implement patch management and configuration automation utilizing Ansible.
Operate OpenShift Platform on AWS
Support the OpenShift container platform deployed on AWS EC2 instances.
Maintain and troubleshoot OpenShift cluster nodes, networking, storage integration, and workloads.
Assist in cluster lifecycle activities, including node maintenance, upgrades, and configuration changes.
Collaborate with development teams to ensure containerized applications function efficiently and reliably on the platform.
Resolve issues related to operational performance and reliability.

Mar 11, 2026
Collabera
Full-time|On-site|Vancouver

We are seeking a highly skilled Senior Software Engineer specializing in Network Router Infrastructure. In this role, you will contribute to the design and development of robust networking solutions that enhance our clients' connectivity and performance. You will collaborate with cross-functional teams to deliver high-quality software that meets customer requirements and industry standards.

Feb 23, 2016
