About the job
Join Our Team at ai&
ai& is a global AI technology firm addressing the surging demand for artificial intelligence solutions. Our dual mission is to build a leading AI research lab focused on localization and to provide comprehensive global infrastructure and computing services. We are developing a cohesive, state-of-the-art platform that combines next-generation data centers, diverse computing resources, and advanced model services. We believe that owning the entire technology stack is essential to building and scaling AI solutions effectively.
At ai&, we empower small, agile teams with the autonomy to take on significant challenges. We break complex problems into manageable components and solve hard issues together. We are looking for driven, mission-oriented individuals with strong personal initiative. Curiosity is the cornerstone of our team, and we want colleagues who are excited to grow alongside our technology and expanding business.
We are actively recruiting talented individuals globally, with offices in Tokyo, San Francisco, Austin, and Toronto. We are eager to connect with exceptional people wherever they are located.
As an Inference & Serving Engineer, your mission is to build a high-performance, multi-tenant serving architecture that maximizes utilization across heterogeneous hardware. You will work across state-of-the-art inference frameworks and engines, optimizing the runtime for specific workloads. Your responsibilities extend beyond Large Language Models to emerging Generative AI applications, including high-throughput video generation and multimodal systems with demanding memory and compute requirements.
Your role goes beyond deploying models at scale; you will build a robust system that unites specialized, high-performance clusters with large multi-node deployments as the company grows. A deep understanding of the 'Inference Triangle' is essential: continually tuning the stack to strike the right balance between low latency (TTFT/ITL), high throughput, and inference quality (precision/quantization). The ideal candidate is a hands-on engineer who treats the entire GPU fleet as a single, programmable compute fabric and is eager to work at every level of the stack.

