Software Engineer - Data Infrastructure & Acquisition

Speechify, Nairobi, Kenya
Remote, Full-time

Experience Level

Mid to Senior

Qualifications

Ideal Candidate Profile

  • BS/MS/PhD in Computer Science or a related discipline.
  • A minimum of 5 years of professional software development experience.
  • Strong proficiency in Bash/Python scripting within Linux environments.
  • Experience with Docker and Infrastructure-as-Code principles, along with professional experience in at least one major cloud provider (GCP preferred).
  • Familiarity with web crawlers and large-scale data processing workflows is advantageous.
  • Excellent multitasking abilities and adaptability to shifting priorities.
  • Exceptional communication skills, both written and verbal.

About the job

Speechify builds technology that turns written content into audio, helping over 50 million users learn and access information in new ways. Our text-to-speech tools work with PDFs, books, Google Docs, news articles, and websites, making reading more accessible and efficient.

Our suite of products spans iOS, Android, Mac, and Chrome. Speechify has earned recognition from Google as Chrome Extension of the Year and received Apple’s 2025 Design Award for Inclusivity.

The team at Speechify is fully distributed, with nearly 200 professionals worldwide. Members include frontend and backend engineers, AI research scientists, and leaders from companies such as Amazon, Microsoft, and Google, as well as Stanford alumni and veterans of startups like Stripe and Vercel. There is no central office; everyone works remotely.

Role Overview

The Data team within Speechify’s AI division is seeking a Software Engineer focused on Data Infrastructure & Acquisition. This position centers on managing and improving the systems that collect and prepare data for model training. The team’s mission is to assemble large-scale, high-quality datasets efficiently and cost-effectively, combining infrastructure, engineering, and research expertise.

What You Will Do

  • Find and secure new sources of audio data, then integrate them into the data ingestion pipeline.
  • Maintain and improve the cloud infrastructure for the ingestion pipeline, which runs on Google Cloud Platform and uses Terraform for management.
  • Partner with Scientists to optimize cost, throughput, and data quality, enabling richer datasets at scale for next-generation models.
  • Work with the AI team and company leadership to shape the dataset roadmap for both consumer and enterprise product development.

Location

This role is based in Nairobi, Kenya, as part of Speechify’s distributed team.

About Speechify

Speechify is at the forefront of innovation in the field of text-to-speech technology, revolutionizing the way individuals interact with written content. We pride ourselves on our inclusivity and commitment to learning, making reading accessible to everyone, everywhere.
