About CI&T

CI&T brings together human expertise and AI to build scalable technology solutions. With a team of over 8,000 professionals worldwide and more than 1,000 client partnerships over the past 30 years, CI&T focuses on real-world artificial intelligence and digital transformation.

Location Requirement

Important: Candidates living in the Metropolitan Region of Campinas must work onsite at our city offices, following our current attendance policy.

Role Overview

We are hiring a Senior Site Reliability Engineer (SRE) based in Brazil to join CI&T and support one of our projects. This role calls for someone who takes ownership of applications, manages their own backlog, and collaborates closely with cross-functional teams. Strong communication and analytical skills are essential.

What You Will Do

Analyze the reliability, performance, and availability of applications.
Monitor deployments, address performance and security issues, and apply lessons learned to prevent future incidents.
Proactively manage and prioritize the task backlog, identify areas for improvement, and propose solutions in collaboration with the team.
Communicate effectively with teams across the application lifecycle to clarify needs and priorities.
Stay informed about industry trends, best practices, and new technologies in cloud computing and DevOps/SRE.

Technical Requirements

Previous experience as a Site Reliability Engineer (SRE) and an understanding of key reliability metrics.
Background in monitoring Java backend applications.
Strong experience with FinOps practices and cloud cost management.
Hands-on experience with observability tools such as Datadog, Grafana, Prometheus, and Thanos.
Experience working with AWS platforms (ECS, EKS), Kubernetes, and Docker.
Proficiency in Linux environments.
Familiarity with GitHub, Jenkins, and Splunk (desirable but not strictly required).
Experience building and maintaining CI/CD pipelines (GitHub Actions, CodeBuild, CodePipeline).
Knowledge of Infrastructure as Code using Terraform.
Strong analytical and problem-solving skills, with adaptability and willingness to learn.
Experience with performance and stress testing.
Understanding of chaos engineering, including what to test, how to validate, which failures to simulate, and how to analyze application impact.
Apr 17, 2026