Software Engineer - Data Infrastructure & Acquisition

Speechify, Curitiba, Brazil
Remote, Full-time

Experience Level

Mid to Senior

Qualifications

Ideal Candidate Profile

  • Degree (BS/MS/PhD) in Computer Science or a closely related field.
  • A minimum of 5 years of experience in software development.
  • Strong proficiency in bash/Python scripting within Linux environments.
  • Expertise in Docker and Infrastructure-as-Code principles, with professional experience in at least one major cloud provider (we use GCP).
  • Experience with web crawlers and large-scale data processing workflows is an advantage.
  • Ability to manage multiple tasks effectively and adapt to shifting priorities.
  • Excellent written and verbal communication skills.

About the job

Speechify builds text-to-speech tools used by over 50 million people worldwide. Our products help users turn reading materials (PDFs, books, Google Docs, news articles, and websites) into audio, making information more accessible and improving learning and retention. The product suite spans iOS, Android, Mac, a Chrome extension, and the web. Recent recognition includes Google’s Chrome Extension of the Year and Apple’s 2025 Design Award for Inclusivity.

The Speechify team works fully remotely, with nearly 200 people collaborating from locations around the globe. Team members bring experience from Amazon, Microsoft, Google, Stripe, Vercel, Bolt, and top academic programs such as Stanford.

Role overview: Software Engineer - Data Infrastructure & Acquisition

This role sits within the AI team’s Data division. The engineer will own data collection processes that support model training, helping Speechify build and scale high-quality datasets efficiently. The team’s infrastructure enables petabyte-scale dataset creation by combining engineering, infrastructure, and research.

What you will do

  • Identify and source new audio data for integration into Speechify’s ingestion pipeline.
  • Manage and improve cloud infrastructure for the ingestion pipeline using Google Cloud Platform (GCP) and Terraform.
  • Work with data scientists to boost cost efficiency, throughput, and dataset quality, supporting the development of next-generation models.
  • Collaborate with AI team members and company leadership to shape the dataset roadmap for future consumer and enterprise products.

Location

This position is based in Curitiba, Brazil, with remote collaboration as part of Speechify’s global team.

About Speechify

Speechify is on a mission to transform the way people engage with reading. Through our cutting-edge text-to-speech technology, we strive to make information accessible and learning more efficient for everyone. Join us in our journey to redefine reading and learning.
