
Member of Technical Staff, Pre-Training Data

Cohere, Toronto
On-site Full-Time

Qualifications

  • Bachelor's or Master's degree in Computer Science, Data Science, or a related field.

  • Proficiency in programming languages such as Python, and experience with data manipulation libraries.

  • Strong understanding of machine learning concepts, particularly in natural language processing.

  • Experience with data ablation and performance evaluation methods.

  • Ability to work collaboratively in a fast-paced, dynamic environment.

  • Excellent problem-solving skills and a passion for data-driven decision making.

About the job

About Us

At Cohere, our mission is to scale intelligence for the betterment of humanity. We specialize in training and deploying cutting-edge models for developers and enterprises, enabling them to create AI-powered experiences such as content generation, semantic search, retrieval-augmented generation (RAG), and intelligent agents. Our commitment to excellence is key to fostering the widespread adoption of AI technologies.

We take great pride in our craft. Each team member is dedicated to enhancing the capabilities of our models and maximizing the value delivered to our clients. We thrive in a fast-paced environment, working diligently to prioritize our customers' needs.

Cohere is a dynamic team of researchers, engineers, and designers, all driven by a shared passion for innovation. We believe that a diverse range of perspectives is essential for creating exceptional products.

Join us on this exciting journey and help shape the future of AI!

Why this role matters

As a Machine Learning Engineer focusing on pre-training data, you will be instrumental in developing the data pipeline that supports Cohere’s advanced language models. This role involves conducting data ablations to evaluate data quality and crafting pre-training data mixtures aimed at optimizing model performance. By merging research with engineering, you will play a crucial role in transforming raw data into state-of-the-art AI models and will directly influence key training metrics such as throughput and accelerator utilization.

Your contributions will be vital to our mission of delivering reliable and efficient language understanding and generation capabilities, driving innovation in natural language processing. If you're passionate about harnessing data to build foundational AI systems, this position presents a unique opportunity to make a significant impact.

Note: Our offices are located in London, Paris, Toronto, San Francisco, and New York, and we fully support remote work. This role is open to candidates located in time zones from EST through the EU.

As a Machine Learning Engineer, Pre-Training Data, your responsibilities will include:

  • Conducting data ablations to assess data quality and experimenting with data mixtures to boost model performance.

  • Developing robust data modeling techniques to ensure datasets are structured and formatted for optimal training efficiency.

  • Researching and implementing innovative data curation methods, utilizing Cohere’s infrastructure to propel advancements in natural language processing.

  • Collaborating with cross-functional teams to ensure the integration of high-quality data into our models.

About Cohere

Cohere is at the forefront of AI innovation, dedicated to creating transformative language models that empower developers and enterprises. Our commitment to diversity, excellence, and collaboration drives our success in delivering groundbreaking AI solutions. Join us as we revolutionize the way intelligence is harnessed for the benefit of humanity.
