About the job
At d-Matrix, we are dedicated to unlocking the potential of generative AI and driving the evolution of technology. Positioned at the cutting edge of software and hardware innovation, we constantly challenge the limits of what can be achieved. Our corporate culture revolves around respect and collaboration, where humility and open communication are highly valued.
We foster an inclusive team environment where diverse perspectives lead to superior solutions. We are looking for passionate individuals who are eager to tackle challenges and who excel in execution. Are you ready to discover your playground? Together, we can shape the infinite possibilities of AI.
Location:
Santa Clara, CA headquarters or any of our regional offices. Remote work is an option.
The Role: Staff Software Engineer - SIMD Kernels
What You Will Do:
As part of the SIMD Kernels team, you will contribute to the development of the software stack for our AI compute engine. Your responsibilities will include creating, enhancing, and maintaining software kernels for machine learning operators—such as softmax, layer normalization, and activation functions—for our next-generation AI hardware. You will also develop solutions that enhance our SDK, making it user-friendly for developers and facilitating performance analysis.
You should possess experience in constructing software kernels for modern hardware architectures and understand how to effectively map algorithms and AI-framework computational graphs to those architectures. Your expertise will enable you to navigate hardware-software co-design trade-offs and deliver high-quality software efficiently in a fast-paced development environment.
What You Will Bring:
Minimum Requirements:
MS or PhD in Computer Engineering, Mathematics, Physics, or a related discipline with 5+ years of industry experience.
Strong understanding of computer architecture, data structures, system software, and machine learning principles.
Proficiency in C/C++ and Python development within a Linux environment, utilizing standard development tools.
Experience in implementing algorithms using C/C++ and Python.
Familiarity with specialized hardware such as FPGAs, DSPs, GPUs, and AI accelerators, and with their programming platforms, such as CUDA.
Experience implementing operators commonly used in ML workloads, such as GEMMs, convolutions, softmax, layer normalization, and pooling.
Self-motivated team player with a robust ability to collaborate and innovate.

