About the job
Join Our Vision
At Snapp, we are revolutionizing urban mobility. Our cutting-edge ride-hailing and transportation platform connects millions of users daily, offering safe, dependable, and efficient travel solutions. Leveraging real-time data and robust infrastructure, we aim to make urban commuting faster, easier, and more eco-friendly.
With the spirit of a global tech leader and the nimbleness of a startup, we create scalable services that adapt to diverse markets while addressing local requirements.
Your Contribution
As a Senior Data Engineer, you will architect, develop, and manage large-scale data infrastructure and pipelines that process billions of records daily. Your expertise will ensure fast, reliable, high-quality data flows across our lakehouse architecture, supporting both streaming and batch workloads. Your role is pivotal in enabling reliable data access, driving analytics, and supporting AI initiatives throughout the company.
Key Responsibilities
- Design and maintain large-scale ETL/ELT pipelines using Apache Flink, Airflow, and Spark for both streaming and batch processing.
- Develop and enhance real-time streaming architectures using Kafka.
- Implement scalable ingestion frameworks for Delta Lake, Iceberg, and Hudi.
- Manage and optimize Ceph-based object storage in our data lakehouse.
- Operate and tune ClickHouse to ensure high-performance analytical querying.
- Promote reliability, scalability, and cost-effectiveness across systems managing billions of daily records.
- Write production-grade code in Python, Go, or Java.
- Establish data quality, monitoring, and observability frameworks.
- Collaborate with ML/AI teams to facilitate model training, feature pipelines, and inference workflows.
- Reduce data pipeline latency by implementing more efficient solutions.

