About the role
Join us at DigitalOcean, where you can unlock your full potential and create the most impactful work of your career. Collaborate with a vibrant community of exceptional professionals who are passionate about simplifying cloud technology. If you are driven by innovation, possess a bold vision, and thrive in a dynamic environment, you’ll find a welcoming place here. We celebrate success together, fostering a culture of learning, enjoyment, and meaningful impact for innovators and creators worldwide.
Scope and Mission:
- Lead the design and operation of the Gradient AI platform, delivering an intuitive and innovative agent development experience that excels in scalability, performance, and reliability.
- Shape the architectural vision, uphold technical excellence, and drive innovation across backend systems and customer-facing interactions.
Key Responsibilities:
- Architect and Build
  - Design and enhance the architecture for our agent development experience, encompassing code integration, evaluations, observability, tools, and cross-agent interactions.
  - Initiate and implement an architecture that is optimized for scalability, reliability, low latency, and cost efficiency.
  - Oversee and improve our benchmarking system to continuously elevate our development experience.
  - Take a hands-on leadership role in rolling out new services to ensure timely delivery.
- Technical Leadership
  - Establish and uphold technical standards, coding practices, tooling, and infrastructure guidelines across AI/ML engineering teams.
  - Develop best practices for design, testing, deployments, instrumentation, and performance tuning.
  - Mentor and guide senior engineers, fostering a culture of architectural rigor and operational excellence within the team.
- Cross-functional Collaboration
  - Collaborate with product managers, stakeholders, and business leaders to translate strategic objectives into actionable technical roadmaps.
  - Advise customer-facing teams (e.g., consultants, support, sales engineers) in shaping AI modernization initiatives through agent development.
- Reliability, Performance & Scaling
  - Lead operational excellence for our agent development platform, establishing scalable mechanisms and processes to support the engineering organization.

