About the job
Join Our Team at Rainforest!
At Rainforest, we are pioneering payments-as-a-service, offering solutions that simplify payment monetization for specialized software platforms. We help small to mid-sized platforms deliver more value to their small business customers through seamless embedded payments, while taking on the operational and regulatory burden for them.
Backed by a seasoned fintech founder and a top-tier venture capital firm, we are positioned to make a significant impact in the fintech space. We invite you to join us on this exciting journey!
Your Role
We seek a proactive and hands-on Site Reliability Engineer who excels in building and scaling cloud infrastructure within a dynamic startup environment. This role offers you the opportunity to take ownership of systems from design to production reliability, collaborating closely with engineering teams to deliver secure and scalable payment platforms. If you are passionate about automation, performance, and continuous improvement while making a real impact in fintech, you will thrive at Rainforest.
Key Responsibilities
- Manage and scale our AWS-based cloud infrastructure utilizing Terraform and Infrastructure-as-Code (IaC) practices.
- Develop, operate, and enhance the Amazon Elastic Kubernetes Service (EKS) and serverless environments that underpin our payment services.
- Design and maintain modern Continuous Integration/Continuous Deployment (CI/CD) pipelines with GitLab to ensure rapid and secure deployments.
- Implement and refine monitoring, alerting, and observability practices, using tools like OpenTelemetry, Prometheus, and New Relic, to maintain high uptime and resolve incidents quickly.
- Automate infrastructure and operational processes to streamline workflows and expedite delivery.
- Collaborate closely with application engineers to enhance system performance, reliability, and scalability.
- Lead incident response, conduct postmortems, and drive a culture of continuous improvement.
- Contribute to defining and implementing SRE best practices, including Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
