Wrike stands as the premier work management platform designed for teams and organizations striving for enhanced collaboration, creativity, and achievement every day. Our platform unifies all work processes and teams, simplifying complexities, boosting productivity, and allowing individuals to concentrate on their most meaningful tasks.

Our Vision: A world where everyone can focus on their most meaningful work, together.

About the Role:
Join Wrike's Backend Reliability (BRE) team, a crucial component of our backend infrastructure and the guardian of our uptime. We aim to achieve and maintain 99.99% availability while developing the tools, components, and safety nets relied upon by our entire engineering organization. As a Senior / Staff Backend Engineer on this team, you will not merely address tickets but architect essential reliability solutions that influence how Wrike scales, operates, and recovers from failures.

Your Impact:
Design, build, and maintain vital reliability components including HTTP rate limiters, internal DB schema migration tools, circuit breakers, and distributed Redis-based caching.
Troubleshoot intricate production issues, optimize PostgreSQL usage, and ensure our distributed systems remain robust and stable under high load.
Lead initial investigations during significant production incidents to identify probable root causes, assess impacts, and suggest mitigation strategies. Long-term solutions are then implemented by the responsible teams based on your insights.
Develop scalable, reusable tools and frameworks aiding other engineering teams in building more resilient services.
Utilize AI-driven tools and coding agents to expedite development, scrutinize architectures, and automate repetitive or error-prone tasks.
Promote reliability best practices across engineering through knowledge sharing, design reviews, and establishing high technical standards.

Your Qualifications:
Proficient in Java/JVM, with experience in building scalable, high-performance backend systems; willing to adopt other languages as necessary.
Strong grasp of distributed systems concepts, including high availability, the CAP theorem, and fault tolerance.
Extensive experience with relational databases (PostgreSQL) and non-relational storage solutions.
Feb 17, 2026