About the job
About Us
At Abacus Insights, we are revolutionizing the use of data for health plans. Our mission is straightforward: to make healthcare data actionable, enabling those who make care and cost decisions to act swiftly and with confidence.
We help health plans dismantle data silos and establish a unified, reliable data foundation. That foundation supports better decision-making, allowing plans to improve outcomes, reduce waste, and deliver better experiences for both members and providers.
Backed by $100 million from leading investors, we are tackling significant challenges in an industry ready for transformation. Our platform enables GenAI applications by delivering clean, connected, and reliable healthcare data that supports automation, prioritization, and decision-making workflows, which is why we are at the forefront of the industry.
Innovation begins with our people. We are daring, inquisitive, and collaborative, and we believe the best ideas emerge from teamwork. Ready to make a difference? Join us as we shape the future together.
About the Role
We are looking for an experienced Data Engineer to join our dynamic, rapidly growing Tech Ops division. With significant growth anticipated, this is an opportunity to make a substantial technological impact. In this role, you will work closely with customers, data vendors, and internal engineering teams to design, implement, and enhance complex data integration solutions in a modern, large-scale cloud environment.
You will apply deep expertise in distributed computing, data architecture, and cloud-native engineering to build scalable, resilient, high-performance data ingestion and transformation pipelines. Acting as a trusted technical advisor, you will help customers adopt Abacus's core data management platform and ensure high-quality, compliant data operations throughout the lifecycle.
Your Day-to-Day Responsibilities
- Architect, design, and implement high-volume batch and real-time data pipelines using PySpark, SparkSQL, Databricks Workflows, and distributed processing frameworks.
- Create end-to-end ingestion frameworks that integrate with Databricks, Snowflake, AWS services (S3, SQS, Lambda), and vendor data APIs, ensuring data quality, lineage, and schema evolution.
- Develop data modeling frameworks, including star and snowflake schemas, to support analytics and reporting needs.
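As a concrete illustration of the star schema mentioned in the modeling bullet, here is a minimal sketch in Python: a single fact table referencing two dimension tables, with a typical aggregate query joining them. An in-memory sqlite3 database stands in for Databricks or Snowflake, and all table, column, and value names are hypothetical, not part of the Abacus platform.

```python
import sqlite3

# Minimal star-schema sketch: one fact table (claims) referencing two
# dimension tables (members, providers). All names are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
CREATE TABLE dim_member   (member_id   INTEGER PRIMARY KEY, plan TEXT);
CREATE TABLE dim_provider (provider_id INTEGER PRIMARY KEY, specialty TEXT);
CREATE TABLE fact_claim (
    claim_id    INTEGER PRIMARY KEY,
    member_id   INTEGER REFERENCES dim_member(member_id),
    provider_id INTEGER REFERENCES dim_provider(provider_id),
    amount      REAL
);
INSERT INTO dim_member   VALUES (1, 'PPO'), (2, 'HMO');
INSERT INTO dim_provider VALUES (10, 'cardiology');
INSERT INTO fact_claim   VALUES (100, 1, 10, 250.0), (101, 2, 10, 80.0);
""")

# Analytics queries join the central fact table out to its dimensions.
rows = cur.execute("""
    SELECT m.plan, SUM(f.amount)
    FROM fact_claim f
    JOIN dim_member m ON f.member_id = m.member_id
    GROUP BY m.plan
    ORDER BY m.plan
""").fetchall()
print(rows)  # [('HMO', 80.0), ('PPO', 250.0)]
```

A snowflake schema differs only in that the dimension tables themselves are further normalized (e.g. `dim_provider` split into provider and specialty tables); the fact table and join pattern stay the same.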

