About the job
Join Our Team as a Senior Data Engineer (AWS-native | Spark | Tokenization & Claims Data)
About Source Meridian
Source Meridian is a leading software development firm dedicated to addressing the most pressing challenges in the healthcare industry. We apply cutting-edge technologies in healthcare, artificial intelligence, and interoperability to drive innovation.
About the Role
We are seeking a talented Senior Data Engineer to lead the development and operation of an AWS-native data platform that processes healthcare claims data and tokenized identifiers. You will design and deploy Spark-based data pipelines that transform, intersect, and enhance tokenized datasets stored primarily as Parquet on S3 and queried through Athena and other AWS services. This role centers on hands-on data engineering built directly on AWS services, intentionally avoiding managed lakehouse platforms such as Databricks and Snowflake.
What You’ll Do
Develop and maintain Spark pipelines that process large Parquet datasets on S3.
Implement end-to-end tokenization workflows, including transit-token to real-token conversion and dataset intersection.
Process and deliver healthcare claims datasets for matched individuals, ensuring precise identity mapping and data integrity.
Orchestrate data pipelines using Airflow and/or appropriate AWS-native orchestration tools.
Build reliable, testable, and observable ETL/ELT processes with retries, idempotency, monitoring, and reprocessing support.

