About Us
Join us in setting global standards for AI in video understanding!
At Twelve Labs, we create world-class AI models specialized in processing vast video datasets to provide advanced features such as search, analysis, summarization, and insights generation.
Our models are used by the world's largest sports leagues to highlight key moments quickly and accurately, enhancing the viewing experience. In domestic integrated control centers, our technology assists in efficiently navigating CCTV footage for rapid crisis response, while major broadcasters and studios worldwide rely on our models to produce content for billions of viewers.
Twelve Labs is a deep tech startup with offices in San Francisco and Seoul, recognized by CB Insights as one of the top 100 AI startups globally for four consecutive years. We've secured over $110 million in investments from leading VCs and companies such as NVIDIA, NEA, Index Ventures, Databricks, and Snowflake. Our AI models, developed in Korea, are uniquely available through Amazon Bedrock. We are dedicated to creating exceptional products alongside outstanding colleagues and growing with our global clientele.
Our core values include:
A reflective and honest attitude towards oneself and the team.
Perseverance and humility in the face of failure and feedback.
A commitment to continuous learning and team empowerment.
If you enjoy solving challenging problems and growing through the process, the opportunity awaits you at Twelve Labs.
About the Team
You will be part of the Marengo team, responsible for the research and development of our multimodal embedding models. We integrate modalities such as video, audio, and text into a single embedding space.
Our work encompasses a variety of research topics, including contrastive learning, temporal video understanding, and multimodal representation learning. We manage the entire model development lifecycle, from constructing large-scale training data pipelines to designing model architectures, optimizing distributed training, and developing evaluation systems. With access to top-tier GPU resources like the NVIDIA B300, we efficiently conduct large-scale experiments.
In a fast-paced environment with a short research-to-production gap, we collaborate closely with the Search, Product, and Infrastructure teams to continuously enhance the quality of models used by thousands of customers worldwide.
About the Role
As a Senior ML Research Engineer on the Marengo team, you will spearhead the research and development of Twelve Labs' multimodal embedding models, focusing on data strategy, training pipeline optimization, and model architecture experimentation and evaluation.
This is a research-intensive engineering role at the intersection of multimodal representation learning, large-scale distributed training, and data engineering. We seek a proficient engineer-researcher capable of tackling well-defined research problems with moderate ambiguity, designing rigorous experiments, and delivering reproducible results ready for production deployment.
