About the job
Speechify builds tools that turn written content into audio, helping over 50 million people read and learn in new ways. Our products span iOS, Android, Mac, Chrome, and the web. Google named us Chrome Extension of the Year, and Apple recognized our design in 2025.
Our fully remote team includes nearly 200 people with backgrounds at Amazon, Microsoft, Google, Stanford, Stripe, Vercel, and other top tech companies and universities.
Role Overview
The Data team within Speechify’s AI division is looking for a Software Engineer focused on data infrastructure and acquisition. This position centers on building and maintaining the systems that gather the large-scale datasets needed for model training. The work blends infrastructure, engineering, and research to deliver high-quality data at petabyte scale while keeping costs in check.
What You Will Do
- Find and connect new sources of audio data to our ingestion pipeline.
- Maintain and improve our cloud infrastructure for data ingestion, currently running on Google Cloud Platform and managed with Terraform.
- Work closely with scientists to optimize for cost, throughput, and data quality, supporting richer datasets for our models.
- Collaborate with the AI team and company leadership to shape the roadmap for datasets that power future consumer and enterprise products.
Location
Remote. The company operates without a physical office; Menlo Park, CA, USA is listed as the company location.
