About the job
As a hands-on leader, you will combine deep technical knowledge of ML systems and performance optimization with strong leadership and people-management skills. You will recruit, mentor, and develop a high-performing team of engineers while fostering a culture of innovation, collaboration, and continuous improvement.
Your Responsibilities:
- Team Leadership & Management:
- Build, lead, and manage a high-performing team of ML and infrastructure engineers focused on ML training acceleration.
- Offer technical direction, mentorship, and career development pathways for team members.
- Encourage a collaborative and inclusive team atmosphere.
- Establish team objectives, priorities, and roadmaps that align with organizational goals.
- Technical Strategy & Execution:
- Articulate the technical vision and strategy for ML acceleration throughout the organization.
- Identify and evaluate state-of-the-art technologies and methods to accelerate ML training, including data pipeline and data loader optimization, large-scale distributed training, hardware acceleration, and model optimization techniques.
- Design, develop, and deploy scalable and efficient ML acceleration solutions.
- Cross-functional Collaboration:
- Work closely with ML research, ML training platform, and product teams to understand their requirements and seamlessly integrate acceleration solutions.
- Communicate intricate technical concepts and strategies to both technical and non-technical stakeholders.
- Serve as a technical authority and proponent for ML acceleration initiatives across the company.
- Impact & Measurement:
- Regularly assess and report on the effectiveness of acceleration initiatives.
- Continuously seek out further opportunities for optimization and innovation.

