Experience Level
Entry Level
Qualifications
Proficiency in machine learning systems and deep learning frameworks such as PyTorch, TensorFlow, and ONNX. Familiarity with prevalent LLM architectures and inference optimization methods, including continuous batching and quantization. Solid understanding of GPU architectures, along with experience in GPU kernel programming using CUDA.
About the job
Join our dynamic team at Perplexity as an AI Inference Engineer, where you will be at the forefront of deploying cutting-edge machine learning models for real-time inference. Our tech stack includes Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes, providing you with a chance to work on large-scale applications that make a real impact.
Key Responsibilities
Design and develop APIs for AI inference that cater to both internal and external stakeholders.
Conduct benchmarking and identify bottlenecks within our inference stack to enhance performance.
Ensure the reliability and observability of our systems while promptly addressing any outages.
Investigate innovative research and implement optimizations for LLM inference.
About Perplexity
Perplexity is a forward-thinking technology company based in San Francisco, dedicated to harnessing the power of artificial intelligence to transform industries. Our innovative team thrives on collaboration and creativity, driving advancements in AI and machine learning.
Similar jobs
About the Role
We are seeking a talented Inference Engineering Manager to spearhead our AI Inference team at Perplexity. This is a remarkable opportunity to design and expand the infrastructure that drives Perplexity's innovative products and APIs, catering to millions of users with cutting-edge AI capabilities.
You will take charge of the technical direction and implementation of our inference systems while cultivating and leading a high-caliber team of inference engineers. Our technology stack encompasses Python, PyTorch, Rust, C++, and Kubernetes. You will play a crucial role in architecting and scaling the large-scale deployment of machine learning models for Perplexity's Comet, Sonar, Search, and Deep Research products.
Why Perplexity?
Develop state-of-the-art systems that are among the fastest in the industry using leading-edge technology.
Engage in high-impact work within a smaller team, enjoying considerable ownership and autonomy.
Seize the chance to create infrastructure from the ground up instead of maintaining outdated systems.
Work across the entire spectrum: minimizing costs, scaling traffic, and advancing the capabilities of inference.
Make a significant impact on the technical roadmap and team culture at a rapidly expanding company.
Responsibilities
Lead and nurture a high-performing team of AI inference engineers.
Develop APIs for AI inference utilized by both internal and external clients.
Design and scale our inference infrastructure for enhanced reliability and efficiency.
Benchmark and resolve bottlenecks across our inference stack.
Drive large sparse/MoE model inference at rack scale, including sharding strategies for extensive models.
Innovate by developing inference systems that support sparse attention and disaggregated pre-fill/decoding serving.
Enhance the reliability and observability of our systems and lead incident response efforts.
Make technical decisions regarding batching, throughput, latency, and GPU utilization.
Collaborate with ML research teams on model optimization and deployment.
Recruit, mentor, and develop engineering talent.
Establish team processes, engineering standards, and operational excellence.
Qualifications
5+ years of engineering experience, with at least 2 years in a technical leadership or management capacity.
Proficiency in programming languages and tools such as Python, PyTorch, Rust, and C++.
Experience with Kubernetes and cloud infrastructure.
Strong understanding of machine learning model deployment and optimization.
Exceptional problem-solving and communication skills.

At Perplexity, we are on the lookout for a talented and experienced AI Security Engineer to bolster our security team. This pivotal role involves safeguarding cutting-edge AI systems from adversarial threats. You will be responsible for creating and implementing strong security measures for self-hosted models, LLM APIs, agents, MCPs, and the essential AI infrastructure. Your expertise will empower our developers with the necessary tools and guidance, enabling them to innovate while ensuring that AI security remains a top priority.
Our technology stack comprises Python, NextJS, TypeScript, Docker, AWS, Kubernetes, and PostgreSQL.

Join Perplexity as a skilled Software Engineer, where you will play a pivotal role in developing the next-generation AI Foundation and Platform. Our mission is to transform how individuals search and engage online. In this exciting position, you will contribute to building Perplexity's comprehensive AI data, evaluation, and personalization infrastructure, which underpins nearly all of our agent products.
Technology Stack: Spark | AWS Data Stack (S3, RDS, DynamoDB, Docker, EKS, Kinesis) | PyTorch | Databricks | Snowflake | LLM APIs
As we continue to expand our user base and diverse use cases, our data stack ensures that millions around the globe receive fast, personalized answers.

About the Role
Join Perplexity as a dynamic Software Engineer specializing in security, where you will play a pivotal role in developing and enhancing the software, automation, and systems that drive our security operations. This position focuses on engineering innovative security tools and AI-driven agents aimed at improving our detection and response capabilities, vulnerability management, and overall security posture across our products and infrastructure.
Responsibilities
Design, build, and maintain software and automation solutions that enhance our detection and response capabilities, including alert enrichment, triage workflows, and investigation tools.
Implement and refine internal AI agents and security bots that facilitate monitoring, investigations, reporting, and other security operations tasks.
Develop and manage systems and workflows that support our bug bounty and vulnerability disclosure program, covering intake, triage, prioritization, and remediation tracking.
Collaborate with product and engineering teams to perform threat modeling on new features and systems, propose mitigations, and integrate security guardrails into designs and implementations.
Contribute to secure-by-default libraries, services, and patterns that empower teams to build secure features effortlessly.
Integrate security signals from cloud services, endpoints, SaaS, and applications into unified pipelines and data models that bolster detection and analysis.
Automate processes to minimize manual effort in incident response, containment, and remediation.
Work closely with security engineers and fellow software engineers to review designs and code, continuously enhancing our security tools and platforms.
Qualifications
A minimum of 4 years of experience as a software engineer, particularly in developing security-related tools, platforms, or automation, or in a security engineering role with a strong emphasis on software development.
Proficiency in at least one major programming language (e.g., Python, Go, or TypeScript) with experience in building production services, command-line interfaces, or internal tools.
Experience with integration of security-relevant systems such as logging pipelines, SIEMs, EDR, cloud APIs, or identity platforms.
Hands-on experience in threat modeling, secure design, or conducting application security reviews for services or features.
Experience in operating or contributing to bug bounty or vulnerability management programs is a plus.

Join Perplexity AI as a skilled iOS Engineer and play a pivotal role in transforming the way users navigate the web. You'll be instrumental in developing and enhancing Comet, our innovative browser designed specifically for iOS.
We seek a candidate with exceptional programming expertise, a keen interest in artificial intelligence and large language models, and a dedication to providing an outstanding user experience supported by a sophisticated user interface.
Key Responsibilities
Craft a high-performance native iOS application that will be enjoyed by millions globally.
Maintain a rigorous standard of quality in both user and developer experiences.
Collaborate closely with design teams to create fast, intuitive user interfaces.
Engage with data science and machine learning teams to evaluate and enhance the overall user journey.
Partner with infrastructure and QA teams to streamline deployment processes, including testing, release, and monitoring.
Qualifications
A minimum of 5 years of industry experience.
Solid understanding of Swift and proven experience with a modern iOS tech stack, including SwiftUI (iOS 16+) and UIKit.
A passion for creating beautiful user interfaces, excellent user experiences, and writing reusable, testable code.
Strong grasp of low-level details and the ability to profile and optimize app performance.
Comfortable working in a small, agile team, demonstrating ownership and initiative.
A genuine enthusiasm for iOS development and exploring the latest advancements in iOS and iPadOS.
(Bonus) Experience in browser development is a plus.

Join Perplexity as an AI Research Tech Lead, where you'll spearhead our research initiatives and oversee the advancement of our proprietary Online LLMs, the Sonar models. In this pivotal leadership position, you will define the overarching research strategy across various modalities, mentor a talented team of researchers, and leverage our extensive query/answer dataset to enhance Sonar model performance, delivering a state-of-the-art Online LLM experience for our users.
Key Responsibilities
Research Leadership & Strategy
Establish and implement the overarching research strategy across diverse modalities, including post-training LLMs for agent trajectories and future mid-training projects.
Lead the strategic planning and roadmap development to enhance Sonar model functionalities.
Innovate in supervised and reinforcement learning techniques aimed at optimizing query answering.
Collaborate with executive leadership to align research goals with product and business strategies.
Team Development & Mentorship
Guide and mentor a team of AI research scientists and engineers, nurturing their technical and professional development.
Set the long-term research direction for the team, encompassing various modalities.
Lead the recruitment and onboarding of new research talent.
Foster a collaborative atmosphere that promotes knowledge sharing and innovative thinking.
Technical Excellence
Post-train cutting-edge LLMs focused on query answering using advanced supervised and reinforcement learning techniques.
Own and enhance the complete data, training, and evaluation pipelines necessary for LLM post-training.
Deliver Sonar models that achieve top-notch query answering performance.
Lead research efforts into agent trajectories and multi-modal capabilities.
Steer the technical roadmap for future mid-training investments.
Cross-Functional Collaboration
Collaborate closely with engineering teams to integrate Sonar models into our products.
Work with product teams to discern user needs and translate them into research priorities.
Partner with data teams to effectively utilize our unique query/answer dataset.
Communicate research progress and findings to stakeholders and broader teams.

Perplexity is on the lookout for an exceptional and proactive Application Security Engineer to enhance our innovative security team. Join us in transforming how individuals search and engage with the internet. You will be instrumental in developing systems, tools, and processes that seamlessly integrate security for developers, fostering rapid innovation while safeguarding our users on a large scale.
Key Responsibilities
Design and deploy scalable, developer-friendly security solutions that seamlessly incorporate into engineering workflows.
Lead threat modeling exercises, design evaluations, and code assessments for new features and significant product launches.
Develop and enhance secure-by-default frameworks for authentication, authorization, input validation, and secrets management.
Create and integrate automated security tools within CI/CD pipelines (including linters, dependency scanners, and policy enforcement).
Collaborate with product and engineering teams to address vulnerabilities and contribute to incident response and postmortems.
Oversee, manage, and enhance our third-party penetration testing engagements and bug bounty program, working closely with external security researchers to detect and fix vulnerabilities.
Stay updated on prevalent threats and attack strategies, driving the continuous improvement of our application security posture.

Join Perplexity AI, a pioneering company in the field of AI-driven search, as a Senior iOS Engineer. You will play a crucial role in transforming how users interact with the internet by developing innovative features and enhancing the performance of our iOS application.
We are seeking a talented individual with a solid programming background, enthusiasm for search technologies and large language models, and a strong commitment to crafting exceptional user experiences complemented by an elegant user interface.

At Perplexity, we embody the future of AI, bringing transformative solutions to people who demand more. Our data team is at the forefront of this revolution, strategically integrating AI into every facet of our operations.
We seek a passionate individual with a strong background as a data scientist, analytics engineer, or data engineer. You understand the significance of key metrics, can expertly design A/B tests that address core questions, dive deep into data models to solve discrepancies, and are eager to take on the challenge of building AI systems that will revolutionize the data science landscape.
This is not just another text-to-SQL bot or a simple dashboard. You will create AI agents capable of conducting comprehensive analyses autonomously, from hypothesis formation and query execution to result interpretation and actionable recommendations. Your work will ensure that our entire data warehouse is accessible to AI systems, enabling precise queries across the board. You will develop self-healing data pipelines that proactively identify and resolve issues before they disrupt workflows. In doing so, you will empower our small data team to operate with the efficiency and output of a much larger organization.
Join a forward-thinking data team that is already leveraging AI to enhance its processes, with full support from leadership to expand these efforts. Together, we will build a world-class team focused on creating scalable systems, innovative tools, and an AI-centric working environment that not only elevates our standards but also drives the entire industry forward.

Join Perplexity as an IT Systems Administrator and play a pivotal role in transforming how users engage with the internet. As an early addition to our innovative team, you will have a unique opportunity to build and optimize our technology infrastructure from the ground up, ensuring seamless operations and cutting-edge performance.
This position requires you to work in-person at our bustling San Francisco office.
Key Responsibilities
Procure, maintain, and manage computers, networking devices, and office technologies to ensure operational excellence.
Oversee and optimize corporate software systems for enhanced productivity.
Enhance and manage our Mobile Device Management (MDM) infrastructure.
Implement and uphold security policies and procedures to safeguard our digital assets.
Provision and administer user accounts, ensuring appropriate access permissions.
Deliver responsive technical support and effectively troubleshoot complex issues.
Lead IT initiatives aimed at improving endpoint management, security, and overall infrastructure.
Provide technical training and support on IT systems to staff, both in-person and remotely.

Join Perplexity as a Senior Backend Software Engineer and help transform how users engage with the internet. As a key member of our dynamic team, you will lead the design, implementation, and scaling of backend systems that drive our web, mobile, and browser applications.
Our Technology Stack: Python | Go | Rust | TypeScript | FastAPI | PostgreSQL | Redis | Docker | vLLM | AWS
Teams Hiring
File Agent
The File Agent team is at the forefront of building a scalable and secure platform for intelligent file editing and processing. Your expertise will help design the infrastructure and APIs that empower agents to autonomously edit and generate files across various formats.
Enterprise Growth
The Enterprise Growth team develops core platform capabilities that ensure Perplexity is a trusted solution for enterprise customers. This includes managing enterprise authentication, onboarding processes, and providing in-depth admin control and visibility.
Growth
The Growth team influences how millions interact with Perplexity by rapidly experimenting and implementing new features aimed at enhancing user experience and promoting user retention and revenue growth.
Commerce
The Commerce team is responsible for the complete commerce stack, including payments infrastructure and monetization strategies. Your role will involve scaling billing systems across consumer and enterprise plans.

Perplexity seeks a Software Engineer in San Francisco to focus on computer monetization. This position involves developing and enhancing software that drives the company's monetization strategies.
Role overview
This role centers on building and refining systems that support Perplexity's revenue efforts. Projects in this area have a direct impact on the company's growth within the technology sector.
What you will do
Develop software solutions that contribute to monetization goals
Refine and improve existing systems to optimize revenue streams
Work on projects that influence Perplexity's expansion and success
Location
This position is based in San Francisco.

Your Role
Accelerate Revenue and Identify Opportunities
Manage both inbound and outbound tasks to enhance pipeline growth.
Assess inbound enterprise prospects utilizing BANT criteria to pinpoint high-value opportunities for transfer to Account Executives.
Implement targeted outbound initiatives to engage specific industry sectors.
Strategic Outreach and Market Expansion
Leverage social selling techniques to promote the Perplexity brand: creating content, engaging with audiences, and exploring innovative formats to remain memorable to prospects.
Conduct research and map accounts within designated verticals to establish thorough prospect lists.
Craft vertical-specific messaging that addresses industry challenges and showcases Perplexity Enterprise solutions.
Contribute to top-of-funnel strategies.
Sales Operations and Excellence
Ensure data integrity in Salesforce by meticulously documenting all prospect interactions and qualification criteria.
Respond promptly to inbound inquiries with consultative, value-focused communication.
Develop and refine outbound sequences that consistently generate pipeline results.

At Perplexity, we are revolutionizing the way enterprises integrate AI into their operations. We are in search of a talented Solutions Product Marketing Manager who will take charge of our go-to-market strategy for key industries including Finance, Healthcare, Legal, and Consulting. In this pivotal role, you will delve into buying behaviors, use cases, and competitive landscapes, transforming insights into robust campaigns, engaging content, and effective sales tools that consistently drive success across these sectors. This position merges the realms of marketing, sales strategy, and product innovation.
Your Responsibilities:
Develop and manage an integrated vertical go-to-market strategy that encompasses ongoing initiatives, not just isolated campaigns.
Analyze how enterprise buyers in each sector assess, acquire, and implement AI solutions, including identifying personas, decision-making processes, and competitive options.
Create and execute vertical awareness campaigns utilizing various channels such as paid advertising, organic outreach, and proprietary content, including webinars and tailored landing pages.
Equip our enterprise sales teams with comprehensive battle cards, objection handling strategies, tailored messaging, and deal-stage content that accelerates revenue generation.
Establish thought leadership that positions Perplexity as the authoritative voice in AI adoption for each targeted industry.
Track and evaluate vertical performance metrics: sourced pipeline, conversion rates, and engagement, making iterative improvements based on data analysis.

Join the innovative team at Perplexity as an AI Infrastructure Engineer. In this role, you will leverage your expertise in Kubernetes, Slurm, Python, C++, and PyTorch, primarily utilizing AWS. Collaborate closely with our Inference and Research teams to design, deploy, and optimize our extensive AI training and inference clusters.
Responsibilities
Architect, deploy, and manage scalable Kubernetes clusters tailored for AI model inference and training workloads.
Oversee and enhance Slurm-based HPC environments for distributed training of large language models.
Create robust APIs and orchestration systems for training pipelines and inference services.
Implement effective resource scheduling and job management systems across diverse compute environments.
Evaluate system performance, identify bottlenecks, and implement enhancements across both training and inference infrastructures.
Develop monitoring, alerting, and observability solutions specifically designed for ML workloads running on Kubernetes and Slurm.
Quickly respond to system outages and collaborate with multiple teams to ensure high uptime for critical training runs and inference services.
Optimize cluster utilization and execute autoscaling strategies to meet dynamic workload demands.
Qualifications
Extensive experience in Kubernetes administration, including custom resource definitions, operators, and cluster management.
Proficient in Slurm workload management, encompassing job scheduling, resource allocation, and cluster optimization.
Demonstrated experience in deploying and managing distributed training systems at scale.
In-depth knowledge of container orchestration and the architecture of distributed systems.
Solid familiarity with LLM architecture and training processes, including Multi-Head Attention, Multi/Grouped-Query, and distributed training strategies.
Experience in managing GPU clusters and optimizing compute resource utilization.
Required Skills
Advanced Kubernetes administration and YAML configuration management skills.
Expertise in Slurm job scheduling, resource management, and cluster configuration.
Proficiency in Python and C++ programming with a focus on systems and infrastructure automation.

At Perplexity, we are at the forefront of transforming how organizations achieve their goals. We are on the lookout for a dynamic Product Marketing Manager to become an integral part of our team, acting as the key connection between our innovative products and their market influence. In this role, you will craft compelling narratives that elevate the perception of our offerings, stimulating customer engagement and driving sustainable growth.
If you possess a knack for simplifying complexity, excel at turning uncertainty into a clear vision, and aspire to create a robust marketing framework that treats product launches as an ongoing strategy rather than isolated events, then this opportunity is tailored for you.

Join Perplexity as a Frontend Engineer specializing in Design Systems, where you will play a pivotal role in transforming the future of online search and interaction. In this innovative position, you will be at the forefront of developing cutting-edge AI products.
Tech Stack: Tailwind | React | TypeScript | CSS
Key Responsibilities
Collaborate with the design systems team to create an outstanding user interaction layer for all features, including both reusable components and foundational elements of generative UI.
Enhance and refine the components that constitute the core of Perplexity's frontend.
Continuously seek ways to elevate interaction quality, aesthetics, and team productivity.
Essential Qualifications
Proven experience in building and maintaining user interface systems on a large scale.
Solid coding fundamentals with some cross-stack development experience.
Skilled at creating foundational systems for others to build upon.
Hands-on experience with highly interactive React applications that utilize strongly typed code.
Deep understanding of design and UI patterns applicable at scale.
A genuine passion for prototyping, experimentation, and crafting accessible user experiences.
Demonstrates an extreme ownership mentality.
Takes pride in precision and attention to detail.
At least 4 years of relevant industry experience.
At Perplexity, AI is central to our mission. We expect all team members to effectively leverage AI in their roles. During the interview process, we will assess your thought process and decision-making abilities, which are crucial to our AI development. Please refrain from using AI tools unless instructed otherwise.

Join Cartesia as an Inference Engineer
At Cartesia, our vision is to create the next evolution of AI: an interactive, omnipresent intelligence that operates seamlessly across all environments. Currently, even the most advanced models struggle to continuously analyze a year's worth of audio, video, and text data (comprising 1 billion text tokens, 10 billion audio tokens, and 1 trillion video tokens), much less perform these tasks on-device.
We are at the forefront of developing the model architectures that will make this a reality. Our founding team, who met as PhD candidates at the Stanford AI Lab, pioneered State Space Models (SSMs), a groundbreaking framework for training efficient, large-scale foundation models. Our talented team merges deep expertise in model innovation and systems engineering with a design-focused product engineering approach, enabling us to build and launch state-of-the-art models and user experiences.
Supported by leading investors such as Index Ventures and Lightspeed Venture Partners, along with contributions from Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks, and others, we are fortunate to be guided by numerous exceptional advisors and over 90 angel investors from diverse industries, including some of the world's foremost experts in AI.
About the Role
We are actively seeking an Inference Engineer to propel our mission of creating real-time multimodal intelligence.
Your Impact
Develop and implement a low-latency, scalable, and dependable model inference and serving stack for our innovative foundation models utilizing Transformers, SSMs, and hybrid models.
Collaborate closely with our research team and product engineers to efficiently deliver our product suite in a fast, cost-effective, and reliable manner.
Construct robust inference infrastructure and monitoring systems for our product offerings.
Enjoy substantial autonomy in shaping our products and directly influencing how cutting-edge AI is integrated across diverse devices and applications.
What You Bring
At Cartesia, we prioritize strong engineering skills due to the complexity and scale of the challenges we tackle.
Proficient engineering skills with a comfort level in navigating intricate codebases, and a commitment to producing clean, maintainable code.
Experience in developing large-scale distributed systems with strict performance, reliability, and observability requirements.
Proven technical leadership, capable of executing and delivering results from zero to one amidst uncertainty.
A background in or experience with inference pipelines, machine learning, and generative models.

Join Perplexity as an Applied Machine Learning Engineer and be at the forefront of innovation in artificial intelligence. You will design, develop, and refine advanced AI models that enhance user experiences globally. Your expertise in machine learning will allow you to create scalable solutions for user personalization, query comprehension, and content discovery, catering to the curiosity of millions.
Key Responsibilities
Utilize cutting-edge ML and LLM techniques to address challenges in:
Personalization (LLM memory, context summarization, retrieval, and ranking);
Query Understanding (intent modeling, rewriting, agentic decomposition);
Content Discovery (feed ranking and surfacing).
Conduct thorough evaluations of LLM/ML models through offline and online methods, designing comprehensive experiments and metrics that yield insights into quality and impact.
Manage the full model lifecycle from research to deployment, including data analysis, modeling, evaluation, A/B testing, and iterative enhancements.
Collaborate with cross-functional teams, including engineers, PMs, data scientists, and designers, to ensure AI implementations yield significant product enhancements.
Stay updated on ML/AI advancements by assessing and integrating new research and algorithms into the product lifecycle.
Preferred Qualifications
Over 5 years of experience in developing and deploying robust ML/AI models for large-scale, user-centric or data-driven applications.
Extensive knowledge in deep learning frameworks (PyTorch, TensorFlow, JAX), LLMs, information retrieval, content summarization, recommendation systems, NLP, and ranking.
Proficient software engineering skills (Python, production-level codebases, collaborative development).
Comprehensive experience across the entire ML lifecycle: data analysis, feature engineering, model development, evaluation, and ongoing monitoring/improvement.
Effective collaborator and communicator; thrives in fast-paced, cross-functional environments.
Inquisitive, motivated by user/product impact, and passionate about advancing applied ML and AI.
Bachelor's, Master's, or PhD in Computer Science, Engineering, or a related field (or equivalent experience).

Sep 19, 2025