About the job
About Hark
Hark is a pioneering artificial intelligence company dedicated to creating advanced and personalized intelligence systems. Our focus is on building proactive, multimodal AI capable of engaging with the world through speech, text, vision, and persistent memory.
We are merging this intelligence with cutting-edge hardware to establish a universal interface between humans and machines. While current AI largely relies on chat boxes and outdated devices, Hark is at the forefront of developing the next generation of agentic systems that interact naturally with users and their environment.
To achieve our ambitious goals, we are developing multimodal models and next-generation AI hardware together, designing them from the ground up as a single, integrated interface for a new era of intelligent systems.
About the Role
The Omni team at Hark is taking AI experiences beyond text, enabling models to comprehend and generate content across modalities, including text, audio, and vision. Our mission is to create seamless, real-time multimodal intelligence that makes user experiences more intuitive and immersive.
As a key member of the Omni team, you will be responsible for developing large-scale pretraining systems and foundational models. This entails working across the entire stack, from data curation and large-scale training infrastructure to model architecture and optimization. You will significantly contribute to advancing the core capabilities of our models through extensive pretraining efforts.
