About the role
Join DigitalOcean and take your career to new heights. Collaborate with a vibrant community of exceptionally skilled professionals dedicated to creating the most user-friendly and scalable cloud solutions. If you possess a growth mindset, have the courage to think ambitiously, and thrive in a dynamic environment characterized by innovation, you’ll fit right in. We celebrate achievements together, while simultaneously learning, enjoying our work, and making a significant impact for the dreamers and builders of the world.
Scope and Mission:
- Lead the design and operation of the Gradient AI platform, emphasizing a seamless and innovative agent development experience with top-notch scalability, performance, and predictability.
- Drive the architectural vision, ensuring technical excellence and innovation across backend systems and customer interactions.
Main Responsibilities:
- Architect and Build
  - Design and refine the architecture for our agent development experience, including code integration, evaluations, observability, tools, and cross-agent interactions.
  - Initiate projects to create an architecture optimized for scalability, reliability, low latency, and cost efficiency.
  - Oversee and enhance our benchmarking system to continuously improve our performance standards.
  - Lead the rollout of new services, taking a hands-on approach to ensure timely delivery.
- Technical Leadership
  - Define and uphold technical standards, coding practices, tools, and infrastructure guidelines across AI/ML engineering teams.
  - Establish best practices for design, testing, deployments, instrumentation, and performance tuning.
  - Mentor senior engineers, fostering a culture of architectural rigor and operational excellence within the team.
- Cross-functional Collaboration
  - Collaborate with product managers, stakeholders, and business leaders to translate strategic objectives into scalable technical roadmaps.
  - Guide customer-facing teams (e.g., consultants, support, sales engineers) to shape AI modernization initiatives through agents.
- Reliability, Performance & Scaling
  - Lead operational excellence for our agent development platform, establishing scalable mechanisms and processes for the engineering organization.

