About the job
Our Vision
At Reflection AI, we are on a mission to develop open superintelligence and democratize its access for everyone.
Our team, hailing from renowned organizations like DeepMind, OpenAI, Google Brain, Meta, Character.AI, and Anthropic, is dedicated to creating open-weight models that serve individuals, enterprises, and even nations.
Role Overview
Design, build, and operate state-of-the-art GPU infrastructure for high-throughput model inference and mid-training workloads.
Develop systems that facilitate synthetic data generation and reinforcement learning pipelines at scale.
Create high-performance inference platforms capable of serving and evaluating models across thousands of GPUs.
Optimize throughput, latency, and GPU utilization for large language model inference and deployment tasks.
Build infrastructure that powers reinforcement learning pipelines, including large-scale rollout generation, evaluation, and policy improvement loops.
Collaborate closely with research teams to support distributed reinforcement learning workloads and extensive model evaluation infrastructure.
Improve model execution performance through kernel-level optimization, model parallelism strategies, and GPU runtime improvements.
Develop distributed systems that enable large-scale synthetic data generation and reinforcement learning-driven training workflows.
Identify and address performance bottlenecks across inference runtimes, GPU kernels, networking, and distributed computing systems.