
Technical Staff Member - Multimodal Post-training and RL

Hark, San Jose
On-site, Full-time

Experience Level

Mid to Senior

Qualifications

Responsibilities

Design and implement effective reinforcement learning algorithms and training methodologies, including PPO, GRPO, and RLHF, to achieve state-of-the-art performance in multimodal foundation models.

Lead research and development efforts to enhance real-time multimodal intelligence, with a particular focus on audio and video modeling capabilities.

Improve data quality for large-scale post-training by developing techniques for data filtering, curation, and synthetic data generation.

Create evaluation frameworks and internal benchmarks to assess model capabilities, reliability, and user experience across modalities.

Collaborate closely with product and engineering teams to translate research breakthroughs into real-world AI experiences.

Requirements

A proven track record of leading research that significantly enhances neural network capabilities through advancements in data, modeling, or training.

Extensive experience in data-driven experimentation, systematic analysis, and iterative model debugging.

Experience building or working with large-scale models in various modalities.

About the job

About Hark

Hark is at the forefront of artificial intelligence innovation, dedicated to creating advanced, personalized intelligence that is proactive, multimodal, and able to engage with the world through speech, text, vision, and persistent memory.

We are combining this intelligence with cutting-edge hardware to establish a universal interface between humans and machines. While existing AI typically operates through chat boxes and outdated devices, Hark is pioneering the future: intelligent systems that naturally interact with people and their environments.

To achieve this, we are developing multimodal models and state-of-the-art AI hardware, designed from the ground up as a cohesive interface for a new era of intelligent systems.

About the Role

The Omni team at Hark is creating the next generation of AI experiences that extend beyond text, enabling models to comprehend and generate content across diverse modalities, including text, audio, and vision. Our mission is to develop seamless, real-time multimodal intelligence that enhances intuitive and immersive user experiences.

As a member of the Omni team, you will play a critical role in advancing real-time audio, video, and multimodal models. This position spans full-stack development, from data and modeling to training, serving, and product integration. You will contribute to both pre-training and post-training initiatives while collaborating closely with product teams to push the limits of model capabilities and deliver outstanding end-to-end user experiences.

