Your Responsibilities:
Lead and direct the development of next-generation model inference and feature serving systems capable of supporting models up to 100x larger, significantly enhancing Pinterest’s monetization strategies.
Design and improve low-latency, high-throughput inference pipelines to adhere to stringent SLOs while optimizing performance, efficiency, and cost.
Collaborate with Ads ML and product teams to productionize innovative model architectures, including LLMs and multi-stage ranking models, ensuring reliable scalability for global traffic.
Advance the online feature platform (feature computation, caching, and retrieval) to enhance coverage, freshness, and consistency for Ads models.
Assess and incorporate new technologies (e.g., GPU acceleration, model compression, Triton, vLLM, Dynamo) to enhance our inference stack.
Forge strong partnerships with other infrastructure and ML teams to boost end-to-end reliability, observability, and developer productivity for Ads ML.
Mentor and guide fellow engineers, assisting them in technical decision-making, system design, and career progression.
About the job
About Pinterest:
At Pinterest, we inspire millions globally to explore creative ideas, envision new possibilities, and create lasting memories. Our mission is to empower everyone to craft a life they adore, and this begins with the talented individuals shaping our product.
Join us in a career that fuels innovation for millions, transforms passion into growth opportunities, and celebrates diverse experiences while embracing flexibility to perform at your best. Crafting a fulfilling career? It’s possible with us.
The Ads ML Inference Infrastructure team is responsible for the online inference and feature-serving systems that enable real-time model scoring and delivery for all Ads models at Pinterest. We are seeking a staff engineer with extensive hands-on experience in large-scale ML inference systems, with a knack for resolving ambiguous technical challenges and spearheading strategic, cross-functional initiatives.
About Pinterest
Pinterest is a vibrant platform where individuals come together to discover, share, and bring to life their creative aspirations. With a commitment to innovation and a supportive work environment, we strive to inspire our users and our employees alike. Join us to make a difference in how people connect with ideas and each other.
This job posting is no longer active and is not accepting applications.