About the Role
Join our dynamic team, affectionately known as MBMB (More Big More Better), where you will play a crucial role in optimizing our training and on-robot inference stacks. We are seeking bold innovations that drive substantial improvements rather than incremental changes.
Your Responsibilities Will Include:
- Maximizing GPU performance through innovative strategies
- Deploying machine learning, hardware, and software modifications that yield significant advancements
- Enhancing both inference and training stacks for optimal performance
Ideal Candidates Will:
- Possess proficiency in the latest machine learning techniques, particularly for training and inference optimizations within transformer and diffusion-based architectures
- Have a relentless pursuit of ML optimizations across domains: CUDA kernels, ML architecture, frontend and backend network bottlenecks, CPU inefficiencies, NVLink and communication protocols, and optimizations in libraries and languages such as Torch, NumPy, and Python
