Experience Level
Mid to Senior
Qualifications
We are looking for candidates who possess the following:
A solid understanding of modern machine learning techniques and toolsets.
Experience in debugging performance issues in training runs from start to finish.
Low-level GPU knowledge including PTX, SASS, warps, cooperative groups, Tensor Cores, and the memory hierarchy.
Proficiency in debugging and optimization using tools such as CUDA-GDB, Nsight Systems, and Nsight Compute.
Familiarity with libraries like Triton, CUTLASS, CUB, Thrust, cuDNN, and cuBLAS.
An intuitive grasp of the latency and throughput characteristics of CUDA graph launches, Tensor Core arithmetic, and asynchronous memory loads.
Experience with InfiniBand, RoCE, GPUDirect, PXN, rail optimization, and NVLink, and with using these networking technologies in GPU clusters.
Knowledge of collective algorithms for distributed GPU training using NCCL or MPI.
A creative approach and a willingness to question and refine our methodologies and tools.
Fluency in English.
About the job
Join our dynamic Machine Learning team at Jane Street as a Machine Learning Performance Engineer. We are seeking an innovative engineer with a strong background in low-level systems programming and performance optimization.
Machine learning is fundamental to Jane Street's operations, providing a unique rapid-feedback environment for experimentation and real-time application in our trading strategies. Your role will focus on optimizing the performance of our machine learning models, enhancing both training efficiency and inference speed.
We prioritize efficient large-scale training, low-latency inference for real-time systems, and high-throughput inference for research purposes. Your contributions will involve not just improving CUDA performance but also adopting a holistic systems approach that encompasses storage systems, networking, and GPU-level considerations. We aim to ensure that our infrastructure is optimized at every level, from memory access to overall throughput.
If you have a curious mind and a passion for tackling complex problems, you will find a welcoming environment here at Jane Street, whether or not you have a background in finance.
About Jane Street
At Jane Street, we leverage cutting-edge machine learning to enhance our trading strategies and operations. We pride ourselves on fostering an environment that encourages experimentation and innovation, allowing our teams to adapt quickly to new ideas and challenges. With a commitment to collaboration and continuous improvement, we are at the forefront of financial technology.