Qualifications
Key Responsibilities:
Design, develop, and maintain ELT/ETL pipelines using Synapse Pipelines, Data Factory, SQL, and notebooks to move data from D365 to ADLS and Synapse while ensuring job reliability through scheduling, monitoring, and troubleshooting.
Write efficient SQL queries (including joins, views, CTEs, and window functions) and utilize basic PySpark/Python for data transformation and modeling.
Implement and oversee data quality checks, collaborating with Data Stewards to address issues and contribute to root cause analysis and resolution.
Apply foundational optimizations for queries and file formats (e.g., partitioning) under supervision and escalate platform-level tuning or performance concerns to the Lead.
Maintain comprehensive metadata, data lineage, and clear documentation for tables, transformations, business rules, and dataset logic.
Utilize Git and Azure DevOps for version control, code reviews, task management, and release coordination.
Assist analysts and report authors during user acceptance testing, addressing defects and facilitating iterative improvements.
Contribute to basic AI/ML enablement by preparing features, managing datasets, running training or inference notebooks, and aiding in experiment tracking under guidance.
About the job
Join Thorlabs, a leading innovator in photonics technology, where we design and manufacture advanced components, instruments, and systems that are pivotal in transforming the world. With a dedicated team of over 3,000 employees globally, we are at the forefront of cutting-edge research and real-world innovation.
At Thorlabs, we offer you the chance to grow your career, take ownership of your projects, and make meaningful contributions from day one. We appreciate the unique talents and perspectives that each employee brings to our dynamic and fast-paced culture, and we seek motivated individuals who are eager to make a significant impact.
The Data Engineer II will play a critical role in building, maintaining, and enhancing robust data pipelines and models that facilitate analytics, reporting, and data-driven decision-making. This position involves developing ETL/ELT processes, optimizing SQL and compute logic, documenting data lineage, and collaborating with Data Analysts, Data Stewards, and business partners to ensure the delivery of reliable and high-performance datasets. A foundational understanding of Python/PySpark and basic AI/ML concepts is required to support data preparation and experimentation. While this position is based in Newton, NJ, occasional duties at other Thorlabs locations may be necessary.
About Thorlabs
Thorlabs is a premier manufacturer of photonics products, dedicated to advancing technology in light-based research and applications. Our commitment to innovation and excellence drives our mission to support groundbreaking scientific endeavors globally.