
Director of AI Inference and MLOps

Deeter Analytics · Austin Area
On-site · Full-time


Experience Level

Senior

Qualifications

  • Proven track record in AI infrastructure management and MLOps

  • Deep expertise in real-time inference systems

  • Strong understanding of GPU architectures and high-performance computing

  • Experience in revenue generation and marketplace strategy

  • Exceptional leadership and team management skills

  • Ability to innovate and adapt in a rapidly changing technological landscape

About the job

Location: Austin, Texas area / On-site preferred
Project: 7MW Phase I AI Datacenter -> 50MW Campus Expansion
Reports to: Founders / Executive Team

About the Project

We are building a cutting-edge, high-density AI datacenter campus just outside Austin, Texas, launching with roughly 7MW of NVIDIA GB300 NVL72 infrastructure and scaling to over 50MW. Our focus is real-time inference, reasoning, and high-value AI serving workloads: monetizing our infrastructure directly in active markets rather than simply leasing out space.

This role transcends traditional datacenter operations.

We are seeking a visionary leader who will strategically transform our GPU racks into a profitable inference operation.

As the head of this initiative, you will be responsible for defining and executing strategies that enhance revenue, uptime, and utilization through careful selection of models, orchestration stacks, pricing strategies, customer segments, and marketplace partnerships.

The ideal candidate will understand that the essence of our business is not raw compute; it is monetized tokens, latency-adjusted utilization, and gross margin.

The Role

We are looking for a senior operator-builder who can bridge multiple domains:

  • AI infrastructure

  • Inference performance engineering

  • Model serving and routing

  • Marketplace monetization

  • Customer and partner integration

  • Revenue optimization

You will architect and manage the inference platform that dictates how our GB300 NVL72 racks are monetized in real-time. This could involve direct enterprise workloads, marketplace distribution, API-based reselling, model hosting, fine-tuned/private deployments, and novel inference channels.

You should possess a keen understanding of profitable applications on modern inference hardware, and be prepared to answer critical questions such as:

  • Which open-weight and commercially viable models should be prioritized on this hardware?

  • How should workloads be balanced across premium low-latency serving, bulk throughput, reserved capacity, and experimental capacity?

  • Should we leverage third-party marketplaces for routing?

About Deeter Analytics

At Deeter Analytics, we are pioneering the future of AI infrastructure, focused on delivering cutting-edge solutions that empower businesses to leverage the full potential of artificial intelligence. Our mission is to provide robust and scalable AI capabilities that drive profitability and operational excellence.
