About the Position
Character.AI is seeking a skilled Technical Program Manager for our AI Infrastructure team to lead programs that strengthen our model development and serving systems at scale. In this role, you'll collaborate closely with engineering, research, and product teams to define infrastructure strategy, align project roadmaps, and drive end-to-end execution of critical initiatives spanning training, evaluation, and inference. Your work will help ensure our systems are robust, efficient, and ready to support rapid iteration as we improve AI experiences for millions of users.
This opportunity is ideal for individuals who excel in technically complex environments, enjoy bringing structure to ambiguous challenges, and can build alignment across diverse, cross-functional teams. You will frequently work at the intersection of research and production, helping teams balance trade-offs among speed, quality, and reliability. The ability to influence without formal authority, identify risks early, and maintain momentum on large, interdependent initiatives will be crucial to your success.
Your Responsibilities
Oversee the planning and execution of significant AI infrastructure projects that encompass training pipelines, data systems, model evaluation, and inference/serving processes.
Establish frameworks that keep teams aligned on scope, objectives, requirements, timelines, risks, and success metrics.
Collaborate with engineering, research, and product teams to translate model and product requirements into actionable infrastructure roadmaps and priorities.
Promote accountability and effective communication across teams working on interrelated systems.
Monitor critical infrastructure metrics (e.g., reliability, latency, throughput, cost efficiency) and create reports that highlight progress and risks.
Identify workflow bottlenecks in infrastructure processes and spearhead improvements in tooling, automation, and developer efficiency.
Assist in capacity planning and resource management to ensure infrastructure growth aligns with model and product advancements.
Develop scalable frameworks and operational practices that enhance execution quality across infrastructure initiatives.
Act as a strategic ally to leadership regarding prioritization, sequencing, and trade-offs in infrastructure investments.
Engage with AI cloud service providers and serve as the key liaison between internal engineering teams and external partnership initiatives.

