About the job
TL;DR: We are looking for a skilled Machine Learning Engineer / MLOps Specialist to own our inference stack: optimizing serving engines and building vector search pipelines. You will bridge Research and Product to ship models that are efficient, cost-effective, and production-ready.
About Us
White Circle is an AI safety company dedicated to making AI systems safer and more reliable. Our platform centers on defining straightforward, natural-language policies that specify how AI models should and should not behave. We continually test, enforce, and improve these policies at scale.
We have successfully raised $11M from leading investors, founders, and executives from prominent organizations, including OpenAI, Anthropic, Hugging Face, Mistral, DeepMind, Datadog, and Sentry.
Our infrastructure handles over 100 million API calls each month.
We fine-tune and train our own large language models (LLMs) to run faster and cheaper than any competing open or proprietary models.
Your Responsibilities
Own the end-to-end inference infrastructure: improve latency, throughput, and cost efficiency across our model fleet.
Develop and scale model serving solutions using TensorZero, vLLM/SGLang/TRT, and Kubernetes.
Design and maintain vector search pipelines on top of vector storage technologies.
Define service health KPIs, informed by support metrics (SLAs, FCR, deflection).
Turn research outcomes into production: collaborate with the research team, assess production readiness, and own deployments, including formatting and sampling parameters.
Your Qualifications
At least 3 years of experience deploying high-performance ML systems in production, beyond training notebooks.
In-depth experience with inference optimization: you have debugged latency issues and understand the gap between theoretical and actual throughput.
Proficient across the tech stack: from CUDA kernels to Kubernetes manifests and Grafana dashboards.
Preferred Qualifications: Experience with Rust, custom Triton kernels, and performance benchmarking.
Why Join White Circle?
Competitive salary ranging from $100,000 to $150,000 plus equity options.
20 days of paid vacation annually.
Opportunity to work from Paris with a hybrid work model, including a relocation package.
Access to top-tier medical insurance in France.