About the job
About the Algorithm Team - Model Compression Division
It is widely recognized that LLM quantization can significantly improve inference efficiency, yet applying it in real-world deployments remains challenging. The Model Compression Division is dedicated to developing user-friendly model compression tools that address these challenges and help customers get the most out of their NPUs.
Model compression tools achieve greater efficiency when they incorporate hardware-specific optimizations. To meet this demand, we have built proprietary tools with optimizations tailored to our NPU, providing an essential software stack that maximizes NPU performance.
The FuriosaAI Model Compression tool is continuously evolving toward greater automation, scalability, and reliability, and demand for these capabilities keeps growing. We are therefore seeking talented engineers with substantial software engineering experience who aspire to advance their careers as Model Compression Engineers.

