About the Opportunity
As a Senior Data Engineer at Hinge Health, you will lead the development and maintenance of the data architecture that powers real-time experiences for our users. You will build and optimize data pipelines that aggregate data from numerous upstream services, including Kafka event streams and transactional databases, into a cohesive data platform that supports both real-time APIs and analytical workloads on Databricks. Your work will directly support AI-driven coaching assistants and physical therapy applications that use real-time member data, such as engagement logs and clinical information, to provide tailored recommendations. This role sits at the intersection of data engineering and artificial intelligence, building the dependable, low-latency data infrastructure these systems require.
You will work with a modern tech stack: Python, Flink, and PySpark for pipeline development; Kafka for event streaming; Delta Lake for scalable data storage; and Aurora PostgreSQL for operational data management.
This role emphasizes high ownership. You will collaborate closely with application engineers, data scientists, and AI teams across the organization to define the data flow from its inception to its consumption. Furthermore, you will assist in establishing standards and practices that empower product teams to manage their own data in a HIPAA-compliant environment. If you are passionate about creating the data infrastructure that underpins AI systems impacting health outcomes, we encourage you to apply.
Our technology stack includes: Python, SQL, dbt, Airflow, PostgreSQL, MySQL, REST, Aptible, Docker, Tonic.ai, Terraform, Spark, Kafka, Flink, Fivetran, Databricks, and AWS (S3, Lambda, Kinesis, RDS, Glue).
Key Responsibilities
Design and construct the data architecture for AI-enhanced health solutions - You will build and maintain data pipelines spanning both streaming and batch processing, from ingestion to serving. This includes creating ingestion layers that pull data from Kafka event streams and transactional databases, implementing transformations in Flink, Databricks, and dbt, and deciding where in the stack each transformation belongs. You will own pipelines across the entire stack: from raw ingestion to normalized staging, to aggregated analytical models, to the serving layer relied on by downstream consumers, including real-time APIs, BI tools, and AI applications. You will establish data contracts, manage schema evolution, and ensure the data you deliver is well-documented and reliable for others to use.
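The data-contract work described above can be illustrated with a minimal sketch. The field names and contract shape here are purely illustrative, not an actual Hinge Health schema; a production setup would more likely enforce contracts through a schema registry (e.g., Avro schemas with compatibility rules) rather than hand-rolled checks:

```python
# Minimal data-contract check for an event record. Additive schema
# evolution (extra fields) is tolerated; removing or retyping a
# contracted field is flagged, since that breaks downstream consumers.
# Field names are hypothetical, for illustration only.

CONTRACT = {
    "member_id": str,
    "event_type": str,
    "occurred_at": str,  # ISO-8601 timestamp as a string
}

def validate_record(record: dict) -> list[str]:
    """Return a list of contract violations for one event record."""
    violations = []
    for field, expected_type in CONTRACT.items():
        if field not in record:
            violations.append(f"missing required field: {field}")
        elif not isinstance(record[field], expected_type):
            violations.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    # Fields beyond the contract are allowed: additive evolution
    # should never break existing consumers.
    return violations
```

A record carrying all contracted fields (plus any extras) passes with an empty violation list; a missing or retyped field yields one violation per problem, which an ingestion layer could route to a dead-letter queue or alert on.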
Maintain platform reliability and data integrity - Define and monitor SLAs (Service Level Agreements) to ensure system performance and data quality.

