Site Reliability Engineer Remote Opportunities Across Canada jobs in Toronto – Browse 1,271 openings on RoboApply Jobs
Open roles matching “Site Reliability Engineer Remote Opportunities Across Canada” with location signals for Toronto. 1,271 active listings on RoboApply Jobs.
Site Reliability Engineer - Remote Opportunities Across Canada
Qualifications
Strong understanding of systems architecture and cloud infrastructure. Proficiency in scripting and automation tools. Experience with monitoring and incident management tools. Excellent problem-solving skills and ability to work collaboratively. Familiarity with DevOps practices and methodologies.
About the job
Join our innovative team at Newton as a Site Reliability Engineer, where you'll play a crucial role in ensuring the reliability and performance of our systems. In this fully remote position, you will collaborate with engineering and operations teams to develop solutions that enhance system uptime and efficiency.
Your expertise will help us transition and maintain our infrastructure, ensuring our services are resilient and scalable. This is an exciting opportunity to contribute to a company that values innovation and teamwork.
About Newton
Newton is a forward-thinking technology company committed to building reliable and efficient systems. We prioritize employee growth and encourage a culture of innovation. Our diverse team thrives in a collaborative environment, where your ideas are valued, and your contributions make a real impact.
Full-time|CA$144K/yr - CA$200K/yr|Hybrid|Toronto; Vancouver
The Team
At MongoDB, our Platform Engineering division within Site Reliability Engineering (SRE) is tasked with managing essential infrastructure and operational functions that empower our engineering teams. This includes our robust, multi-cloud Kubernetes infrastructure, deployment systems, and advanced observability and alerting mechanisms.
The Fabric team is at the forefront of enabling secure communication across systems and from the public internet. Our responsibilities involve designing network architecture, implementing service mesh solutions, and optimizing edge load balancing to ensure the safety of customer data in transit. This team is vital in developing and maintaining a dependable and globally connected multi-cloud network that underpins MongoDB products.
This position can be based in our Toronto or Vancouver offices, or you can work completely remotely from anywhere in North America. We provide flexible hybrid work arrangements for those in our offices.
Veeva Systems is a mission-driven leader in industry cloud technology, dedicated to accelerating the delivery of therapies to patients in the life sciences sector. As one of the fastest-growing SaaS companies ever, we surpassed $2 billion in revenue last fiscal year with significant growth prospects ahead.
Central to Veeva's mission are our core values: Do the Right Thing, Customer Success, Employee Success, and Speed. Notably, we made history in 2021 by becoming a public benefit corporation (PBC), which legally commits us to balance the interests of our customers, employees, society, and investors.
As a Work Anywhere company, we empower you to choose your work environment, whether it's from home or in our office, enabling you to excel in your preferred setting.
Be part of our journey in transforming the life sciences industry and making a positive impact on our customers, employees, and communities.
The Role
We are seeking a talented Senior Site Reliability Engineer to join our Vault Platform team. In this role, you will be instrumental in ensuring the scalability and reliability of our enterprise applications. You will face complex challenges on a global scale, leveraging your extensive knowledge of Java and modern open-source technologies to create a meaningful impact on our production systems.
The ideal candidate will possess substantial experience with Java applications and cutting-edge open-source technologies, particularly within the context of enterprise software development or a high-growth tech environment. As a Senior SRE, you should have a natural curiosity and a strong aptitude for problem-solving. Your unique engineering perspective will be critical as you understand how systems integrate in production to function efficiently on a global scale, supporting hundreds of customers across North America, Europe, and Asia.
Pinterest is hiring a Senior Site Reliability Engineer in Toronto, ON, Canada. The focus of this role is to ensure that Pinterest’s services remain reliable, scalable, and perform well as the platform grows. Working closely with software engineers, this position involves designing and implementing solutions that strengthen system reliability and efficiency.
Key responsibilities
- Partner with engineering teams to maintain and enhance the reliability of Pinterest’s services
- Design and implement improvements to support scalability and performance
- Troubleshoot and resolve service issues to reduce downtime
Requirements
- Extensive experience in site reliability engineering or a closely related field
- Strong technical background with proven problem-solving abilities
- Comfort working alongside software engineers to improve systems
This position is located in Toronto, ON, Canada.
At Veeva Systems, we are driven by a mission to revolutionize the life sciences industry, empowering companies to bring therapies to patients at an accelerated pace. As one of the fastest-growing SaaS companies in history, we achieved over $2 billion in revenue last fiscal year and possess immense growth potential.
Our core values - Do the Right Thing, Customer Success, Employee Success, and Speed - define who we are. In 2021, we made history by becoming a public benefit corporation (PBC), committed to balancing the interests of our customers, employees, society, and investors.
As a Work Anywhere organization, we offer the flexibility for you to work remotely or from our office, allowing you to thrive in your preferred environment.
Join us in transforming the life sciences sector and making a positive impact on our customers, employees, and communities.
Full-time|CA$243K/yr - CA$297K/yr|On-site|Toronto, ON
At Relay, we empower self-made business owners with a digital banking platform that transforms financial management into a source of clarity, confidence, and control. Our mission is to replace financial uncertainty with genuine visibility, enabling entrepreneurs to convert their hard work into enduring success. By alleviating the stress of cash flow management, we provide the tools necessary for owners to operate robust and resilient businesses.
As Relay continues its growth trajectory, the reliability, performance, and resilience of our platform have become integral to both our customer experience and overall business success.
This senior leadership position is crucial in steering a team of Site Reliability Engineers while shaping how reliability strategies influence engineering and product decisions throughout the organization. You will determine the future direction of the SRE function, promote operational excellence, and assist the company in anticipating and managing scale challenges before they pose risks.
If you thrive on tackling complex systems, leading organizations, and building resilient platforms that customers depend on daily, we are eager to connect with you!
Key Responsibilities
- Lead and enhance Relay’s Site Reliability Engineering function, establishing strategic direction as the company scales.
- Define and implement a long-term reliability roadmap, making informed trade-offs under real business and capacity constraints.
- Act as the senior reliability voice in discussions involving engineering and product leadership.
- Influence the integration of reliability considerations into product planning, architectural decisions, and delivery processes.
- Serve as a senior escalation point during critical production incidents, ensuring effective communication and thorough follow-up actions.
- Enhance Relay’s observability, performance, and operational maturity practices across teams.
- Establish and uphold standards concerning SLOs, operational readiness, incident management, and continuous improvement.
- Collaborate with stakeholders in Engineering, Product, Data, and Finance to balance velocity, risk, performance, and cost.
- Build and nurture a high-performing SRE organization capable of supporting future growth.
Full-time|CA$144K/yr - CA$200K/yr|Hybrid|Montreal; Toronto
The Storage Layer Services (SLS) team at MongoDB is embarking on an innovative journey to re-architect our cloud storage layer, forming the core of our next-generation cloud storage architecture. This newly established team is dedicated to creating high-performance, multi-tenant distributed storage services that not only enhance our current Atlas storage stack but also enable more efficient customer workloads. As a Senior Site Reliability Engineer, you will collaborate closely with teams responsible for these storage services to establish Service Level Objectives (SLOs), develop capacity plans, and guarantee the reliability, durability, and operational safety of the foundational storage layer supporting Atlas. By joining our small team of seasoned SREs, you will play an integral role in executing a multi-year roadmap for MongoDB’s cloud storage architecture. This position is open to candidates based in our Toronto or Montreal offices or those working remotely from anywhere in Canada, provided they are located in the Eastern or Central time zones.
Join Tenstorrent as a Site Reliability Engineer, where you will play a crucial role in ensuring the reliability and performance of our cutting-edge systems. As a member of our dedicated engineering team, you will work on innovative solutions to enhance our infrastructure and streamline operations. Your expertise will help us deliver exceptional service and uptime to our customers.
Full-time|$211.5K/yr - $258.5K/yr|On-site|Toronto, ON
At Relay, we are revolutionizing the way self-made business owners manage their finances through our cutting-edge digital banking platform. Our mission is to empower entrepreneurs with the tools and knowledge they need to achieve financial clarity, confidence, and control over their earnings. By transforming cash flow management from a source of stress into a clear, actionable insight, we help our customers build stronger and more resilient businesses.
As we continue to grow, the reliability, performance, and resilience of our platform have become critical components of our customer experience and overall business success.
We are currently seeking an Engineering Manager to lead our Site Reliability Engineering (SRE) team. In this pivotal role, you will oversee the scalability, reliability, and robustness of Relay's systems. This position transcends infrastructure management and incident response; it is a leadership opportunity that sits at the nexus of technology, team dynamics, and business strategy. You will mentor and manage a talented SRE team, influence how reliability is integrated across the organization, and ensure our systems can safely scale in response to increasing customer demands and complexity.
If you thrive in technically demanding environments and are passionate about fostering strong teams, a healthy workplace culture, and effective cross-functional collaboration, this position is designed for you.
At Movable Ink, we empower marketers with cutting-edge content personalization through data-driven content creation and AI-driven decision-making. Our innovative platform is trusted by top global brands to enhance revenue, streamline workflows, and increase marketing agility. With our headquarters in New York City and a talented team of nearly 600 employees, Movable Ink has a presence across North America, Central America, Europe, Australia, and Japan.
As a Lead Site Reliability Engineer, you will leverage your technical expertise and leadership skills to oversee infrastructure and software development initiatives. You will play a pivotal role in designing and evolving key systems within our multi-cloud, multi-region content serving platform, which handles over 25 billion requests daily. By fostering architectural vision, cross-team collaboration, and mentorship, you will spearhead reliability initiatives and define the technical strategies necessary for scaling our platform to accommodate 50 billion requests per day and beyond.
Empower Every Identity, from AI to Human
Identity is the cornerstone of unlocking AI's potential. At Okta, we secure AI by creating a trustworthy, neutral infrastructure that allows organizations to confidently navigate this transformative era. This mission demands an unwavering commitment to addressing intricate challenges with significant real-world implications. We seek innovative builders who act with speed and urgency and execute with exceptional proficiency.
This is your chance to engage in work that can define your career. We are fully dedicated to this mission. If you share this passion, we want to hear from you.
Join Us in Securing Every Identity, from AI to Human
Okta is at the forefront of providing a superior authentication experience for hundreds of millions globally. Our focus on reliability forms the bedrock of our product, with a strong commitment to surpassing customer expectations for availability being a fundamental engineering priority. As a Senior Site Reliability Engineer, you will be part of our SRE team, ensuring our production systems are not only fully operational but also resilient, scalable, and poised for remarkable growth. This role goes beyond mere maintenance; it is about playing a significant role in enhancing the core robustness and resilience of our platform. You will be a proactive builder, developing solutions that inherently boost our system's reliability.
Your Responsibilities:
- Craft and develop custom software in Go to bolster the platform’s reliability and resilience.
- Collaborate with engineering teams to integrate reliability principles, enhancing the availability, performance, and observability of our services.
- Utilize your profound understanding of infrastructure and observability to pinpoint improvement opportunities within the product and implement effective solutions.
- Participate in our on-call rotation, providing swift, effective responses to critical incidents and utilizing your expertise to troubleshoot, mitigate, or accurately escalate production issues.
- Enhance our SRE tooling and processes, focusing on automation and operational efficiency.
- Establish, document, and promote reliability best practices throughout the organization.
About Rootly
At Rootly, we are dedicated to revolutionizing how organizations manage incidents. Our mission is to provide a reliable incident management platform that empowers companies to respond swiftly and effectively when challenges arise. Our innovative approach has established us as leaders in a new multi-billion dollar segment, and we are seeking exceptional talent to help us achieve our ambitious goals. Our customers, including industry giants like NVIDIA, Figma, Canva, and Tripadvisor, trust Rootly for their critical incident management needs. They appreciate our user-friendly platform and unique partnership approach, which has garnered us a stellar 5-star rating on G2. Join us in creating a reliable future for organizations worldwide. Backed by prestigious investors from Y Combinator to key operators in tech, we prioritize transparency and team involvement in our financial health. We conduct monthly business reviews and share updates through our weekly changelog.
About the Role
As a Senior Site Reliability Engineer at Rootly, you will play a crucial role in shaping our technical infrastructure. You will thrive in a dynamic environment where each day presents new challenges and opportunities for growth. This position is perfect for individuals who seek ownership, enjoy tackling complex technical problems, and are driven by a mission to enhance reliability. While the work will be demanding, it promises to be one of the most rewarding experiences in your career.
- Collaborate with product teams to enhance the observability, reliability, and performance of services.
- Take ownership of our CI/CD pipelines, observability tools, monitoring systems, and incident response processes.
- Develop tools and automation to reduce manual toil, enhance engineering velocity, and improve developer experience and system reliability.
- Engage deeply with engineering teams to gain insights into system performance and identify cross-functional reliability and scaling concerns.
- Design and scale our infrastructure while ensuring top-notch performance and operational excellence.
A Few Important Notes:
- Join a Profitable B2B SaaS company with teams primarily located in North America.
- This position is predominantly remote, with a requirement to meet in Toronto once a month.
- Candidates must possess the legal right to work in Canada; we are unable to provide visa sponsorship.
- As our platform continues to expand, we are actively seeking a Senior Site Reliability Engineer (SRE) / Cloud Engineer.
- Experience with Azure is highly prioritized as it is our primary cloud platform.
About Our Company:
We are recognized as one of the leading retail analytics platforms, empowering marketing teams and brands to decode retail data and execute targeted media campaigns without the need for coding. Our services enhance client understanding of customer behavior and maximize ROI on marketing campaigns, with notable clients including Home Depot.
Utilize a modern cloud stack, with a focus on Azure, CI/CD, containerization, and distributed computing technologies.
About You:
We are in search of a dynamic and skilled Senior SRE/Cloud Engineer who is eager to take on a pivotal role in managing our Cloud Operations, ensuring uptime, reliability, and automation.
Key Responsibilities:
- Collaborate with software engineering teams to design, implement, and maintain CI/CD pipelines for rapid and reliable software releases.
- Automate and optimize infrastructure provisioning, configuration, and management processes utilizing industry-standard tools and methodologies.
- Implement and manage containerization and orchestration technologies to enhance scalability and resource efficiency.
- Own the end-to-end availability and performance of our cloud infrastructure; proactively identify potential issues and implement automation to mitigate recurrence.
- Participate in an on-call rotation to ensure system stability and responsiveness during off-hours.
- Lead the development and implementation of service-level objectives crucial for maintaining product reliability.
Momentum Financial Services Group (MFSG) is the company behind Money Mart, Canada’s largest non-bank branch network. With over four decades of experience, MFSG delivers financial solutions for underserved communities, including short-term loans, money transfers, and prepaid cards. Each year, millions of customers rely on these services for timely financial support.
Role Overview: Site Reliability Engineer
The Site Reliability Engineer plays a key role in keeping MFSG’s digital banking and financial services platforms available, responsive, and resilient. This position centers on automating operational tasks, setting and maintaining service-level objectives, and engineering systems to withstand and recover from failures. Daily work involves close collaboration with engineering, DevOps, QA, cybersecurity, and compliance teams to ensure platform reliability meets both technical and regulatory requirements. The role also emphasizes proactive monitoring, incident response, and ongoing improvements to the software delivery process to reduce production risk.
Why Join Momentum Financial Services Group?
- Competitive compensation that reflects experience and current market rates
- Annual bonus based on individual and company achievements
- Comprehensive benefits including health and dental coverage with premiums fully paid, plus Employee Assistance Program access
- Retirement planning support to help prepare for the future
- Hybrid work model offering flexibility between remote work and in-office collaboration at the Toronto headquarters
- Employee perks such as tuition reimbursement, professional development, Perkopolis discounts, and recognition programs
Location
Toronto, Canada (hybrid work model)
Brafton Inc. is a global content marketing agency with teams in Boston, London, Toronto, and Sydney. The company delivers strategic content aimed at improving SEO, increasing social engagement, and generating leads for a variety of clients. Teams at Brafton work across multiple media, including video, blogging, infographics, and web development.
This full-time Content Writer position is fully remote and open to candidates based anywhere in Canada. The role centers on producing engaging, original content that supports client marketing strategies. Content Writers at Brafton work directly with clients and create a range of materials, such as white papers, case studies, landing pages, infographic outlines, video blog scripts, and long-form articles. Most clients are B2B organizations in sectors like technology, healthcare, finance, business, education, and marketing. Research and writing tailored to these industries, while meeting deadlines, is a key part of the job.
Main responsibilities
- Research, create, and deliver high-quality content to clients within set deadlines.
- Edit peer submissions using Brafton's editorial guidelines.
- Collaborate with account managers to ensure content meets client goals.
- Participate in virtual meetings for projects, client discussions, and team updates.
- Track writing and editing productivity daily using the internal project management system.
- Develop and propose monthly content topics based on client briefs and industry trends.
- Communicate directly with clients as needed.
Requirements
- Bachelor's degree in Journalism, Marketing, or Communications preferred.
- 1-3 years of relevant writing experience in content marketing.
- Strong writing and editing skills, with a portfolio of relevant samples considered highly valuable.
- Attention to detail.
- Familiarity with AP style guidelines.
- Understanding of SEO keywords and branded content principles.
- Strong time management and organizational skills.
- Comfort working in a high-volume, deadline-driven setting.
- Self-motivated, able to work independently and as part of a team.
- Creative approach that aligns with client marketing objectives.
We are seeking a detail-oriented Data Administrator/Specialist to join our dynamic team at Ample Insight Inc. This role is crucial in assisting our data scientists and engineers in managing, cleaning, and curating data from diverse sources. Day-to-day tasks may involve identifying objects in images, conducting online research, and entering data accurately. Strong communication skills are essential for articulating findings through clear written summaries and during conference calls.
Key Responsibilities:
- Execute data labeling tasks which include object identification and annotation in photographs.
- Update and maintain databases and spreadsheets efficiently.
- Summarize and communicate findings, addressing any issues promptly while proposing workflow improvements.
- Ensure the accuracy and completeness of collected data, adhering to established guidelines (spelling, grammar, etc.).
- Handle general inquiries and information requests.
- Provide administrative support to team members as needed.
- Research, gather, and compile information effectively.
- Collaborate closely with our data science and engineering teams.
- Perform additional duties as required.
ABOUT QUINCE
Established in 2018, Quince is revolutionizing the retail landscape by demonstrating that high-quality goods can be affordably priced. Our mission is straightforward: to provide premium essentials at accessible costs, produced ethically and sustainably. We believe everyone deserves exceptional craftsmanship and timeless design without the inflated prices typically associated with luxury. Quince operates on a direct-to-consumer (DTC) model that eliminates intermediaries, utilizing just-in-time manufacturing to reduce waste and enhance value.
Quince is a tech-driven company that is transforming the retail sector by integrating AI, analytics, and automation into our core operations. Our steadfast dedication to excellence and adherence to our company values shape our decisions and actions:
- Customer First: We prioritize customer satisfaction in every decision.
- High Quality: True quality means premium materials and rigorous production standards you can feel good about.
- Essential Design: We focus on timeless, functional essentials instead of chasing trends.
- Always a Better Deal: Innovation and transparency ensure value for both customers and partners.
- Social & Environmental Responsibility: We commit to sustainable materials, ethical production, and fair wages.
Quince collaborates with top-tier manufacturers worldwide, serving millions of satisfied customers. Backed by strong investors and a commitment to sustainable growth, we are rapidly expanding while upholding our focus on quality, simplicity, and radical price transparency.
JOIN OUR TEAM AND BE PART OF OUR SUCCESS
Join our innovative team at Smile Digital Health as a Technical Product Manager. In this fully remote role, you will lead the development and execution of digital health products that enhance patient care and drive business success.
As a key player in our organization, you will collaborate closely with cross-functional teams, including engineering, design, and marketing, to ensure product alignment with customer needs and business goals.
Join our dynamic team at Newton as a DevOps Engineer and be part of a transformative journey in the tech industry. This role is designed for innovative individuals who thrive in a fast-paced environment and are passionate about implementing cutting-edge solutions.
As a DevOps Engineer, you will collaborate closely with our development and operations teams to enhance the efficiency and reliability of our infrastructure. You will be responsible for designing, implementing, and maintaining our CI/CD pipelines, as well as monitoring system performance and troubleshooting issues. Your expertise will be crucial in optimizing our cloud services and automating processes to streamline development.
Join Smile Digital Health as an Intermediate Backend Developer. In this pivotal role, you will collaborate with engineering teams to design, implement, and maintain key components of our innovative platform. This position offers the opportunity for autonomy, allowing you to contribute throughout the entire development lifecycle—from requirements analysis to final delivery. As you grow in this role, you will deepen your technical expertise, preparing you for future senior-level responsibilities.
Join our innovative team at Newton as a Site Reliability Engineer, where you'll play a crucial role in ensuring the reliability and performance of our systems. In this fully remote position, you will collaborate with engineering and operations teams to develop solutions that enhance system uptime and efficiency.Your expertise will help us transition and maintain our infrastructure, ensuring our services are resilient and scalable. This is an exciting opportunity to contribute to a company that values innovation and teamwork.
Full-time|CA$144K/yr - CA$200K/yr|Hybrid|Toronto; Vancouver
The TeamAt MongoDB, our Platform Engineering division within Site Reliability Engineering (SRE) is tasked with managing essential infrastructure and operational functions that empower our engineering teams. This includes our robust, multi-cloud Kubernetes infrastructure, deployment systems, and advanced observability and alerting mechanisms.The Fabric team is at the forefront of enabling secure communication across systems and from the public internet. Our responsibilities involve designing network architecture, implementing service mesh solutions, and optimizing edge load balancing to ensure the safety of customer data in transit. This team is vital in developing and maintaining a dependable and globally connected multi-cloud network that underpins MongoDB products.This position can be based in our Toronto or Vancouver offices, or you can work completely remotely from anywhere in North America. We provide flexible hybrid work arrangements for those in our offices.
Veeva Systems is a mission-driven leader in industry cloud technology, dedicated to accelerating the delivery of therapies to patients in the life sciences sector. As one of the fastest-growing SaaS companies ever, we surpassed $2 billion in revenue last fiscal year with significant growth prospects ahead.Central to Veeva's mission are our core values: Do the Right Thing, Customer Success, Employee Success, and Speed. Notably, we made history in 2021 by becoming a public benefit corporation (PBC), which legally commits us to balance the interests of our customers, employees, society, and investors.As a Work Anywhere company, we empower you to choose your work environment, whether it's from home or in our office, enabling you to excel in your preferred setting.Be part of our journey in transforming the life sciences industry and making a positive impact on our customers, employees, and communities.The RoleWe are seeking a talented Senior Site Reliability Engineer to join our Vault Platform team. In this role, you will be instrumental in ensuring the scalability and reliability of our enterprise applications. You will face complex challenges on a global scale, leveraging your extensive knowledge of Java and modern open-source technologies to create a meaningful impact on our production systems.The ideal candidate will possess substantial experience with Java applications and cutting-edge open-source technologies, particularly within the context of enterprise software development or a high-growth tech environment. As a Senior SRE, you should have a natural curiosity and a strong aptitude for problem-solving. Your unique engineering perspective will be critical as you understand how systems integrate in production to function efficiently on a global scale, supporting hundreds of customers across North America, Europe, and Asia.
Pinterest is hiring a Senior Site Reliability Engineer in Toronto, ON, Canada. The focus of this role is to ensure that Pinterest’s services remain reliable, scalable, and perform well as the platform grows. Working closely with software engineers, this position involves designing and implementing solutions that strengthen system reliability and efficiency. Key responsibilities Partner with engineering teams to maintain and enhance the reliability of Pinterest’s services Design and implement improvements to support scalability and performance Troubleshoot and resolve service issues to reduce downtime Requirements Extensive experience in site reliability engineering or a closely related field Strong technical background with proven problem-solving abilities Comfort working alongside software engineers to improve systems This position is located in Toronto, ON, Canada.
At Veeva Systems, we are driven by a mission to revolutionize the life sciences industry, empowering companies to bring therapies to patients at an accelerated pace. As one of the fastest-growing SaaS companies in history, we achieved over $2 billion in revenue last fiscal year and possess immense growth potential. Our core values - Do the Right Thing, Customer Success, Employee Success, and Speed - define who we are. In 2021, we made history by becoming a public benefit corporation (PBC), committed to balancing the interests of our customers, employees, society, and investors. As a Work Anywhere organization, we offer the flexibility for you to work remotely or from our office, allowing you to thrive in your preferred environment. Join us in transforming the life sciences sector and making a positive impact on our customers, employees, and communities.
Full-time|CA$243K/yr - CA$297K/yr|On-site|Toronto, ON
At Relay, we empower self-made business owners with a digital banking platform that transforms financial management into a source of clarity, confidence, and control. Our mission is to replace financial uncertainty with genuine visibility, enabling entrepreneurs to convert their hard work into enduring success. By alleviating the stress of cash flow management, we provide the tools necessary for owners to operate robust and resilient businesses. As Relay continues its growth trajectory, the reliability, performance, and resilience of our platform have become integral to both our customer experience and overall business success. This senior leadership position is crucial in steering a team of Site Reliability Engineers while shaping how reliability strategies influence engineering and product decisions throughout the organization. You will determine the future direction of the SRE function, promote operational excellence, and assist the company in anticipating and managing scale challenges before they pose risks. If you thrive on tackling complex systems, leading organizations, and building resilient platforms that customers depend on daily, we are eager to connect with you! Key Responsibilities: Lead and enhance Relay’s Site Reliability Engineering function, establishing strategic direction as the company scales. Define and implement a long-term reliability roadmap, making informed trade-offs under real business and capacity constraints. Act as the senior reliability voice in discussions involving engineering and product leadership. Influence the integration of reliability considerations into product planning, architectural decisions, and delivery processes. Serve as a senior escalation point during critical production incidents, ensuring effective communication and thorough follow-up actions. Enhance Relay’s observability, performance, and operational maturity practices across teams. Establish and uphold standards concerning SLOs, operational readiness, incident management, and continuous improvement. Collaborate with stakeholders in Engineering, Product, Data, and Finance to balance velocity, risk, performance, and cost. Build and nurture a high-performing SRE organization capable of supporting future growth.
Full-time|CA$144K/yr - CA$200K/yr|Hybrid|Montreal; Toronto
The Storage Layer Services (SLS) team at MongoDB is embarking on an innovative journey to re-architect our cloud storage layer, forming the core of our next-generation cloud storage architecture. This newly established team is dedicated to creating high-performance, multi-tenant distributed storage services that not only enhance our current Atlas storage stack but also enable more efficient customer workloads. As a Senior Site Reliability Engineer, you will collaborate closely with teams responsible for these storage services to establish Service Level Objectives (SLOs), develop capacity plans, and guarantee the reliability, durability, and operational safety of the foundational storage layer supporting Atlas. By joining our small team of seasoned SREs, you will play an integral role in executing a multi-year roadmap for MongoDB’s cloud storage architecture. This position is open to candidates based in our Toronto or Montreal offices or those working remotely from anywhere in Canada, provided they are located in the Eastern or Central time zones.
Join Tenstorrent as a Site Reliability Engineer, where you will play a crucial role in ensuring the reliability and performance of our cutting-edge systems. As a member of our dedicated engineering team, you will work on innovative solutions to enhance our infrastructure and streamline operations. Your expertise will help us deliver exceptional service and uptime to our customers.
Full-time|$211.5K/yr - $258.5K/yr|On-site|Toronto, ON
At Relay, we are revolutionizing the way self-made business owners manage their finances through our cutting-edge digital banking platform. Our mission is to empower entrepreneurs with the tools and knowledge they need to achieve financial clarity, confidence, and control over their earnings. By transforming cash flow management from a source of stress into a clear, actionable insight, we help our customers build stronger and more resilient businesses. As we continue to grow, the reliability, performance, and resilience of our platform have become critical components of our customer experience and overall business success. We are currently seeking an Engineering Manager to lead our Site Reliability Engineering (SRE) team. In this pivotal role, you will oversee the scalability, reliability, and robustness of Relay's systems. This position transcends infrastructure management and incident response; it is a leadership opportunity that sits at the nexus of technology, team dynamics, and business strategy. You will mentor and manage a talented SRE team, influence how reliability is integrated across the organization, and ensure our systems can safely scale in response to increasing customer demands and complexity. If you thrive in technically demanding environments and are passionate about fostering strong teams, a healthy workplace culture, and effective cross-functional collaboration, this position is designed for you.
At Movable Ink, we empower marketers with cutting-edge content personalization through data-driven content creation and AI-driven decision-making. Our innovative platform is trusted by top global brands to enhance revenue, streamline workflows, and increase marketing agility. With our headquarters in New York City and a talented team of nearly 600 employees, Movable Ink has a presence across North America, Central America, Europe, Australia, and Japan. As a Lead Site Reliability Engineer, you will leverage your technical expertise and leadership skills to oversee infrastructure and software development initiatives. You will play a pivotal role in designing and evolving key systems within our multi-cloud, multi-region content serving platform, which handles over 25 billion requests daily. By fostering architectural vision, cross-team collaboration, and mentorship, you will spearhead reliability initiatives and define the technical strategies necessary for scaling our platform to accommodate 50 billion requests per day and beyond.
Empower Every Identity, from AI to Human. Identity is the cornerstone of unlocking AI's potential. At Okta, we secure AI by creating a trustworthy, neutral infrastructure that allows organizations to confidently navigate this transformative era. This mission demands an unwavering commitment to addressing intricate challenges with significant real-world implications. We seek innovative builders who act with speed and urgency and execute with exceptional proficiency. This is your chance to engage in work that can define your career. We are fully dedicated to this mission. If you share this passion, we want to hear from you. Join Us in Securing Every Identity, from AI to Human. Okta is at the forefront of providing a superior authentication experience for hundreds of millions globally. Our focus on reliability forms the bedrock of our product, with a strong commitment to surpassing customer expectations for availability being a fundamental engineering priority. As a Senior Site Reliability Engineer, you will be part of our SRE team, ensuring our production systems are not only fully operational but also resilient, scalable, and poised for remarkable growth. This role goes beyond mere maintenance; it is about playing a significant role in enhancing the core robustness and resilience of our platform. You will be a proactive builder, developing solutions that inherently boost our system's reliability. Your Responsibilities: Craft and develop custom software in Go to bolster the platform’s reliability and resilience. Collaborate with engineering teams to integrate reliability principles, enhancing the availability, performance, and observability of our services. Utilize your profound understanding of infrastructure and observability to pinpoint improvement opportunities within the product and implement effective solutions. Participate in our on-call rotation, providing swift, effective responses to critical incidents and utilizing your expertise to troubleshoot, mitigate, or accurately escalate production issues. Enhance our SRE tooling and processes, focusing on automation and operational efficiency. Establish, document, and promote reliability best practices throughout the organization.
About Rootly: At Rootly, we are dedicated to revolutionizing how organizations manage incidents. Our mission is to provide a reliable incident management platform that empowers companies to respond swiftly and effectively when challenges arise. Our innovative approach has established us as leaders in a new multi-billion dollar segment, and we are seeking exceptional talent to help us achieve our ambitious goals. Our customers, including industry giants like NVIDIA, Figma, Canva, and Tripadvisor, trust Rootly for their critical incident management needs. They appreciate our user-friendly platform and unique partnership approach, which has garnered us a stellar 5-star rating on G2. Join us in creating a reliable future for organizations worldwide. Backed by prestigious investors from Y Combinator to key operators in tech, we prioritize transparency and team involvement in our financial health. We conduct monthly business reviews and share updates through our weekly changelog. About the Role: As a Senior Site Reliability Engineer at Rootly, you will play a crucial role in shaping our technical infrastructure. You will thrive in a dynamic environment where each day presents new challenges and opportunities for growth. This position is perfect for individuals who seek ownership, enjoy tackling complex technical problems, and are driven by a mission to enhance reliability. While the work will be demanding, it promises to be one of the most rewarding experiences in your career. Collaborate with product teams to enhance the observability, reliability, and performance of services. Take ownership of our CI/CD pipelines, observability tools, monitoring systems, and incident response processes. Develop tools and automation to reduce manual toil, enhance engineering velocity, and improve developer experience and system reliability. Engage deeply with engineering teams to gain insights into system performance and identify cross-functional reliability and scaling concerns. Design and scale our infrastructure while ensuring top-notch performance and operational excellence.
A Few Important Notes: Join a profitable B2B SaaS company with teams primarily located in North America. This position is predominantly remote, with a requirement to meet in Toronto once a month. Candidates must possess the legal right to work in Canada; we are unable to provide visa sponsorship. As our platform continues to expand, we are actively seeking a Senior Site Reliability Engineer (SRE) / Cloud Engineer. Experience with Azure is highly prioritized as it is our primary cloud platform. About Our Company: We are recognized as one of the leading retail analytics platforms, empowering marketing teams and brands to decode retail data and execute targeted media campaigns without the need for coding. Our services enhance client understanding of customer behavior and maximize ROI on marketing campaigns, with notable clients including Home Depot. Utilize a modern cloud stack, with a focus on Azure, CI/CD, containerization, and distributed computing technologies. About You: We are in search of a dynamic and skilled Senior SRE/Cloud Engineer who is eager to take on a pivotal role in managing our Cloud Operations, ensuring uptime, reliability, and automation. Key Responsibilities: Collaborate with software engineering teams to design, implement, and maintain CI/CD pipelines for rapid and reliable software releases. Automate and optimize infrastructure provisioning, configuration, and management processes utilizing industry-standard tools and methodologies. Implement and manage containerization and orchestration technologies to enhance scalability and resource efficiency. Own the end-to-end availability and performance of our cloud infrastructure; proactively identify potential issues and implement automation to mitigate recurrence. Participate in an on-call rotation to ensure system stability and responsiveness during off-hours. Lead the development and implementation of service-level objectives crucial for maintaining product reliability.
Momentum Financial Services Group (MFSG) is the company behind Money Mart, Canada’s largest non-bank branch network. With over four decades of experience, MFSG delivers financial solutions for underserved communities, including short-term loans, money transfers, and prepaid cards. Each year, millions of customers rely on these services for timely financial support. Role Overview: Site Reliability Engineer. The Site Reliability Engineer plays a key role in keeping MFSG’s digital banking and financial services platforms available, responsive, and resilient. This position centers on automating operational tasks, setting and maintaining service-level objectives, and engineering systems to withstand and recover from failures. Daily work involves close collaboration with engineering, DevOps, QA, cybersecurity, and compliance teams to ensure platform reliability meets both technical and regulatory requirements. The role also emphasizes proactive monitoring, incident response, and ongoing improvements to the software delivery process to reduce production risk. Why Join Momentum Financial Services Group? Competitive compensation that reflects experience and current market rates. Annual bonus based on individual and company achievements. Comprehensive benefits including health and dental coverage with premiums fully paid, plus Employee Assistance Program access. Retirement planning support to help prepare for the future. Hybrid work model offering flexibility between remote work and in-office collaboration at the Toronto headquarters. Employee perks such as tuition reimbursement, professional development, Perkopolis discounts, and recognition programs. Location: Toronto, Canada (hybrid work model).
Brafton Inc. is a global content marketing agency with teams in Boston, London, Toronto, and Sydney. The company delivers strategic content aimed at improving SEO, increasing social engagement, and generating leads for a variety of clients. Teams at Brafton work across multiple media, including video, blogging, infographics, and web development. This full-time Content Writer position is fully remote and open to candidates based anywhere in Canada. The role centers on producing engaging, original content that supports client marketing strategies. Content Writers at Brafton work directly with clients and create a range of materials, such as white papers, case studies, landing pages, infographic outlines, video blog scripts, and long-form articles. Most clients are B2B organizations in sectors like technology, healthcare, finance, business, education, and marketing. Research and writing tailored to these industries, while meeting deadlines, is a key part of the job. Main responsibilities: Research, create, and deliver high-quality content to clients within set deadlines. Edit peer submissions using Brafton's editorial guidelines. Collaborate with account managers to ensure content meets client goals. Participate in virtual meetings for projects, client discussions, and team updates. Track writing and editing productivity daily using the internal project management system. Develop and propose monthly content topics based on client briefs and industry trends. Communicate directly with clients as needed. Requirements: Bachelor's degree in Journalism, Marketing, or Communications preferred. 1-3 years of relevant writing experience in content marketing. Strong writing and editing skills, with a portfolio of relevant samples considered highly valuable. Attention to detail. Familiarity with AP style guidelines. Understanding of SEO keywords and branded content principles. Strong time management and organizational skills. Comfort working in a high-volume, deadline-driven setting. Self-motivated, able to work independently and as part of a team. Creative approach that aligns with client marketing objectives.
We are seeking a detail-oriented Data Administrator/Specialist to join our dynamic team at Ample Insight Inc. This role is crucial in assisting our data scientists and engineers in managing, cleaning, and curating data from diverse sources. Day-to-day tasks may involve identifying objects in images, conducting online research, and entering data accurately. Strong communication skills are essential for articulating findings through clear written summaries and during conference calls. Key Responsibilities: Execute data labeling tasks which include object identification and annotation in photographs. Update and maintain databases and spreadsheets efficiently. Summarize and communicate findings, addressing any issues promptly while proposing workflow improvements. Ensure the accuracy and completeness of collected data, adhering to established guidelines (spelling, grammar, etc.). Handle general inquiries and information requests. Provide administrative support to team members as needed. Research, gather, and compile information effectively. Collaborate closely with our data science and engineering teams. Perform additional duties as required.
ABOUT QUINCE: Established in 2018, Quince is revolutionizing the retail landscape by demonstrating that high-quality goods can be affordably priced. Our mission is straightforward: to provide premium essentials at accessible costs, produced ethically and sustainably. We believe everyone deserves exceptional craftsmanship and timeless design without the inflated prices typically associated with luxury. Quince operates on a direct-to-consumer (DTC) model that eliminates intermediaries, utilizing just-in-time manufacturing to reduce waste and enhance value. Quince is a tech-driven company that is transforming the retail sector by integrating AI, analytics, and automation into our core operations. Our steadfast dedication to excellence and adherence to our company values shape our decisions and actions: Customer First: We prioritize customer satisfaction in every decision. High Quality: True quality means premium materials and rigorous production standards you can feel good about. Essential Design: We focus on timeless, functional essentials instead of chasing trends. Always a Better Deal: Innovation and transparency ensure value for both customers and partners. Social & Environmental Responsibility: We commit to sustainable materials, ethical production, and fair wages. Quince collaborates with top-tier manufacturers worldwide, serving millions of satisfied customers. Backed by strong investors and a commitment to sustainable growth, we are rapidly expanding while upholding our focus on quality, simplicity, and radical price transparency. JOIN OUR TEAM AND BE PART OF OUR SUCCESS
Join our innovative team at Smile Digital Health as a Technical Product Manager. In this fully remote role, you will lead the development and execution of digital health products that enhance patient care and drive business success. As a key player in our organization, you will collaborate closely with cross-functional teams, including engineering, design, and marketing, to ensure product alignment with customer needs and business goals.
Join our dynamic team at Newton as a DevOps Engineer and be part of a transformative journey in the tech industry. This role is designed for innovative individuals who thrive in a fast-paced environment and are passionate about implementing cutting-edge solutions. As a DevOps Engineer, you will collaborate closely with our development and operations teams to enhance the efficiency and reliability of our infrastructure. You will be responsible for designing, implementing, and maintaining our CI/CD pipelines, as well as monitoring system performance and troubleshooting issues. Your expertise will be crucial in optimizing our cloud services and automating processes to streamline development.
Join Smile Digital Health as an Intermediate Backend Developer. In this pivotal role, you will collaborate with engineering teams to design, implement, and maintain key components of our innovative platform. This position offers the opportunity for autonomy, allowing you to contribute throughout the entire development lifecycle—from requirements analysis to final delivery. As you grow in this role, you will deepen your technical expertise, preparing you for future senior-level responsibilities.
Apr 13, 2026