Software Engineer - Data Infrastructure & Acquisition

Speechify, Santa Clara, CA, USA
Remote, Full-time, $140K/yr - $200K/yr

About the job

Speechify’s mission is to remove reading barriers and make learning accessible for everyone.

Over 50 million people use Speechify’s text-to-speech tools to turn PDFs, books, Google Docs, news articles, and web pages into audio. These products help users read faster, understand more, and remember what they learn. Speechify has been named Chrome Extension of the Year by Google and received Apple’s 2025 Design Award for Inclusivity.

The team includes nearly 200 professionals around the world, working fully remotely. Engineers, AI researchers, and leaders from companies like Amazon, Microsoft, and Google, as well as alumni from top PhD programs and fast-growing startups, all contribute to Speechify’s growth.

Role Overview

Speechify is hiring a Software Engineer for the AI team’s data group. This engineer will help manage every aspect of data collection for model training. The team builds large, high-quality datasets at petabyte scale, combining infrastructure, engineering, and research to do so efficiently.

What You’ll Do

  • Source and identify new audio data for ingestion pipelines.
  • Manage and improve cloud infrastructure on Google Cloud Platform (GCP) using Terraform.
  • Work with scientists to improve cost, throughput, and data quality to support advanced model development.
  • Collaborate with AI team members and company leadership to plan a dataset roadmap for future consumer and enterprise products.

Qualifications

  • Bachelor’s, Master’s, or PhD in Computer Science or a related field.
  • At least 5 years of professional software development experience.
  • Expertise in bash or Python scripting in Linux environments.
  • Strong skills with Docker and Infrastructure-as-Code, with hands-on experience in a major cloud provider (GCP preferred).
  • Experience with web crawling and large-scale data processing is a plus.
  • Comfort managing multiple priorities and adapting as things change.
  • Clear written and verbal communication skills.

Location

Santa Clara, CA, USA (fully distributed team).

About Speechify

Speechify is dedicated to transforming the reading experience by making it accessible to millions. Our award-winning text-to-speech technology empowers users to consume information more efficiently and effectively, fostering a world where learning barriers are removed.
