About the job
About Gusto
At Gusto, we are dedicated to empowering small businesses by managing essential services like payroll, health insurance, 401(k)s, and HR, allowing owners to focus on their passions and customers. With offices in Denver, San Francisco, and New York, we proudly support over 400,000 small businesses nationwide, fostering a workplace that reflects and celebrates the diverse customers we serve.
About the Role:
We are seeking a seasoned engineer with extensive knowledge of distributed data systems to help shape the future of Gusto's storage architecture. In this impactful role, you will oversee complex migrations, design high-scale systems, and establish benchmarks for automation, resilience, and security. Your work implementing distributed database solutions will support Gusto's ongoing growth and scalability.
About the Team:
The Datastores Infrastructure Engineering team is responsible for designing, building, and maintaining the data platforms that drive Gusto's products, including MySQL, Postgres, Redis, Kafka, and S3. We are committed to ensuring that our infrastructure is consistent, dependable, and equipped to support Gusto's expanding requirements. As we transition to self-hosted distributed databases, our focus lies in minimizing the blast radius, enhancing operational resilience, and enabling sustainable scalability.
Here’s what you’ll do day-to-day:
- Architect, deploy, and manage the complete lifecycle of distributed database systems (TiDB) on Kubernetes at scale, ensuring high availability, data consistency, and operational excellence.
- Coordinate complex, zero-downtime migrations from monolithic to distributed architectures, including vertical sharding to isolate Product Services.
- Define and implement efficiency enhancements across the storage infrastructure through query optimization, caching strategies, and workload management.
- Establish standards and develop reliable automation to maintain data consistency, integrity, and security across distributed systems.
- Continuously improve operations by reducing on-call burden with sustainable, long-term solutions.
- Collaborate with product engineering teams and technical partners to enable rapid and reliable product development.

