Software Engineer - Data Infrastructure & Acquisition at Speechify | Kochi, India

Speechify | Kochi, India
Remote | Full-time

Experience Level

Mid to Senior

Qualifications

Ideal Candidate Qualifications

  • Bachelor's, Master's, or PhD in Computer Science or a related field.
  • 5+ years of software development experience.
  • Strong proficiency in Bash/Python scripting within Linux environments.
  • Experience with Docker and Infrastructure-as-Code, along with professional experience in a major cloud platform (preferably GCP).
  • Familiarity with web crawlers and large-scale data processing workflows is advantageous.
  • Demonstrated ability to manage multiple priorities and adapt to changing demands.
  • Excellent verbal and written communication skills.

About the job

Speechify builds text-to-speech tools that help over 50 million people turn written content, like PDFs, books, Google Docs, news, and web pages, into audio. Our mission is to make reading accessible for everyone. Industry leaders have recognized our work: Google named us Chrome Extension of the Year, and Apple awarded us the 2025 Design Award for Inclusivity.

Our distributed team spans nearly 200 professionals worldwide. Engineers, AI researchers, and specialists join us from organizations such as Amazon, Microsoft, Google, Stripe, Vercel, Bolt, and top academic programs including Stanford. We operate fully remotely, with no central office.

Role Overview

Speechify’s AI division is hiring a Software Engineer focused on Data Infrastructure & Acquisition. This position centers on building and maintaining the systems that collect and manage the vast datasets needed for training our machine learning models. The work blends infrastructure, engineering, and research to support data operations at petabyte scale.

What You Will Do

  • Find and connect new audio data sources to our ingestion pipeline.
  • Maintain and improve our data ingestion infrastructure, using Google Cloud Platform (GCP) and Terraform.
  • Collaborate with scientists to optimize data cost, throughput, and quality for model improvement.
  • Work with the AI team and company leadership to shape the dataset roadmap for future products.

Location

This role is based in Kochi, India.

About Speechify

Speechify is dedicated to breaking down the barriers to reading and learning, offering innovative text-to-speech solutions that empower users worldwide. With a strong commitment to inclusivity and accessibility, Speechify continues to expand its product offerings and enhance user experiences.
