About the job
Join Our Team
We invite you to become a vital part of Evolv as a Senior Data Infrastructure Engineer within our Machine Learning & Sensors organization. In this pivotal role, you will design, build, and maintain robust, secure, and scalable data pipelines that drive our AI/ML research and production systems. You will own the complete data lifecycle: ingestion from thousands to millions of edge devices, cloud processing, and a centralized data factory that supports model training, evaluation, and continuous improvement.
Data is at the core of our mission to revolutionize AI-based weapon detection systems. Your expertise will ensure seamless data flow across various geographies, devices, and cloud systems, while adhering to stringent standards for quality, privacy, security, and scalability. This position is perfect for someone who is passionate about the intersection of distributed systems, cloud pipelines, and ML-driven data requirements.
Success in the Role: Your First Year
In the first 30 days:
- Gain an in-depth understanding of our existing edge-to-cloud data pipelines and deployment environments.
- Evaluate current data ingestion processes, governance frameworks, and cloud infrastructure.
- Identify challenges related to data reliability, quality, and operational scalability.
- Establish rapport with AI/ML, data science, field operations, and cloud engineering teams.
- Design and prototype both cloud and edge data processing pipelines.
Within the first three months:
- Implement enhancements to critical ingestion, validation, and processing pipelines.
- Deploy scalable data pipelines using AWS services such as S3, EC2, Lambda, Glue, and Step Functions, with SageMaker integrations.
- Develop automated validation workflows to identify data corruption, missing metadata, or malformed data.
- Create automated model evaluation, training, and improvement pipelines to accelerate experimentation.
- Collaborate with field operations to enhance data reliability, observability, and coverage.
By the end of the first year:
- Oversee the entire lifecycle of mission-critical data pipelines that support AI/ML research and production.
- Architect advanced edge-to-cloud data systems capable of scaling across millions of devices.
- Establish and enforce data governance frameworks, including retention, access control, privacy, and lineage.
- Enable ML teams to quickly conduct experiments with high-quality, discoverable, versioned datasets.

