About the job
Speechify builds text-to-speech tools that help over 50 million people turn written content into audio. From PDFs and books to news articles and websites, our products make reading more accessible and efficient. Our apps span iOS, Android, Mac, Chrome, and web, earning recognition such as Chrome Extension of the Year from Google and the 2025 Apple Design Award for Inclusivity.
Our team of nearly 200 works remotely across the globe. We bring together frontend and backend engineers, AI researchers, and specialists from companies like Amazon, Microsoft, and Google, as well as alumni of top PhD programs and startups.
Role Overview
Speechify is hiring a Software Engineer for the AI data division in Hyderabad, India. This role covers every aspect of the data collection that powers our model training. The work centers on building and maintaining large-scale, high-quality datasets efficiently, drawing on engineering, infrastructure, and research.
What You Will Do
- Identify and bring in new audio data sources to expand our ingestion pipeline.
- Manage and improve the cloud infrastructure supporting the ingestion pipeline (currently on GCP, configured with Terraform).
- Partner with scientists to optimize dataset cost, throughput, and quality for advanced model development.
- Work with AI team members and company leadership to shape the strategic roadmap for datasets used in future consumer and enterprise products.

