About the role
At Braze, we pride ourselves on our vibrant culture and our friendly, supportive, and deeply passionate team. We fuel this passion by maintaining high standards, fostering collaboration, and promoting work-life balance as we navigate rapid global growth together. We also advocate for greater equity and opportunity, both within our organization and in the broader community.
To thrive in our environment, you must be ready to set high expectations for yourself and those around you. Our team values autonomy, accountability, and openness to new ideas as fundamental components of our ongoing success.
Our innate curiosity and willingness to share diverse interests enrich our culture and create a unique vibrancy. If you are motivated to tackle exciting challenges and take decisive action in the face of change, you will be empowered to make a significant impact here, supported by a talented and passionate team. If Braze resonates with you as a place where you can excel, we are eager to meet you.
WHAT YOU'LL DO
As a Senior Platform Software Engineer (PSWE), you will design and develop the distributed systems that drive Braze's expansive background processing platform. Our team is responsible for managing Sidekiq, which processes over a trillion jobs daily across Kubernetes clusters globally. Your work will involve autoscaling systems, metrics pipelines, reliable job execution, and internal frameworks that ensure safe distributed processing for application teams.
Operating at a monumental scale, Braze supports 3.3 billion monthly active users, gathers hundreds of billions of data points monthly, and sends billions of messages daily. Our technology stack includes Ruby on Rails, Go, MongoDB, Redis, and Kafka. As a PSWE, you will collaborate with application teams to enhance the Sidekiq platform, improving reliability, performance, and developer experience.
Main Responsibilities:
- Develop Braze's embedded frameworks that facilitate large-scale distributed processing.
- Design, build, and maintain internal software frameworks that power Braze’s asynchronous and background processing systems at a massive scale.
- Enhance and expand frameworks based on technologies like Sidekiq to reliably execute over a trillion jobs per day across a globally distributed platform.
- Take ownership of the scaling behavior, reliability guarantees, failure modes, and operational safety of these systems.
- Provide well-defined abstractions, tools, and guidelines that enable application teams to use distributed processing safely without managing the underlying complexity.
- Improve observability, debugging capabilities, and overall system performance.
