Qualifications
Responsibilities:
Innovatively source new audio data and integrate it into our ingestion pipeline.
Manage and enhance our cloud infrastructure on GCP, utilizing Terraform for deployment.
Work closely with our AI scientists to optimize the cost, throughput, and quality of our data, enabling the development of our next-generation models.
Collaborate with the AI Team and Speechify leadership to shape the dataset roadmap that will drive our upcoming consumer and enterprise products.

Ideal Candidate Requirements:
BS/MS/PhD in Computer Science or a related discipline.
A minimum of 5 years of professional experience in software development.
Strong proficiency in bash/Python scripting within Linux environments.
Expertise in Docker and Infrastructure-as-Code, with practical experience using a major cloud provider (GCP preferred).
Familiarity with web crawlers and large-scale data processing workflows is advantageous.
Able to manage multiple tasks and adapt to evolving priorities.
Excellent communication skills, both verbal and written.
About the job
Speechify builds technology that removes barriers to learning. Over 50 million people use our text-to-speech tools to turn PDFs, books, Google Docs, news articles, and websites into audio. Our products span iOS, Android, Mac, a Chrome Extension, and a Web App. Recognition includes Chrome Extension of the Year from Google and the 2025 Design Award for Inclusivity from Apple.
Our fully remote team numbers nearly 200, including frontend and backend engineers, AI research scientists, and professionals from companies like Amazon, Microsoft, and Google. Many hold advanced degrees from top programs such as Stanford.
Role overview
Speechify is hiring a Software Engineer for the AI Data team. This position focuses on the entire data collection pipeline that supports model training. The work combines infrastructure, engineering, and research to build large-scale, high-quality datasets efficiently and cost-effectively.
Location
Berlin, Germany
About Speechify
Speechify is dedicated to transforming the way people interact with text. By providing tools that let users listen to written content, we foster a more inclusive learning environment. Our technology has earned industry recognition and a loyal user base.