Lead Principal Machine Learning Engineer

Bjak, China
On-site Full-time

Experience Level

Mid to Senior

Qualifications

Requirements

  • Solid foundation in deep learning and transformer-based architectures.

  • Hands-on experience in training, fine-tuning, or deploying large-scale ML models in a production environment.

  • Proficient in at least one modern ML framework (e.g. PyTorch, JAX) with a demonstrated ability to quickly master new tools.

  • Experience with distributed training and inference frameworks (e.g. DeepSpeed, FSDP, Megatron, ZeRO, Ray).

  • Strong software engineering principles: capable of building robust, maintainable, production-level systems.

  • Familiarity with GPU optimization techniques, including memory efficiency, quantization, and mixed precision.

  • Ability to take ownership of ambiguous, zero-to-one ML systems from concept to execution.

  • A proactive approach to rapid iteration, learning, and system enhancement.

About the job

About the Role

At Bjak, we are building an advanced AI system designed to understand context in conversations, orchestrate actions, and keep work progressing over time.

As a Principal Machine Learning Engineer, you will play a pivotal role in turning research initiatives into robust, production-ready machine learning systems. This position covers the execution layer of our AI system: training pipelines, inference systems, evaluation tools, and deployment processes.

Key Responsibilities

  • Develop and manage comprehensive ML pipelines covering data processing, training, evaluation, inference, and deployment.

  • Refine and customize models utilizing cutting-edge techniques such as LoRA, QLoRA, SFT, DPO, and distillation.

  • Design and operate scalable inference systems, optimizing for latency, cost, and reliability.

  • Create and sustain data systems that ensure high-quality synthetic and real-world training datasets.

  • Implement evaluation pipelines that assess performance, robustness, safety, and bias in collaboration with research leadership.

  • Oversee production deployment, focusing on GPU optimization, memory efficiency, latency reduction, and scaling strategies.

  • Work closely with application engineering teams to seamlessly integrate ML systems into backend, mobile, and desktop applications.

  • Make pragmatic decisions to enhance systems rapidly, leveraging insights from real-world usage.

  • Operate effectively under real production pressures: latency, cost, reliability, and safety.

About Bjak

Bjak is at the forefront of artificial intelligence innovation, dedicated to developing proactive AI systems that not only understand user context but also manage workflows effectively over time. Join us in shaping the future of AI technology.
