About the job
Eram Talent is seeking an exceptional AI Infrastructure Engineer to become a key player in our forward-thinking team. The successful candidate will design, build, and maintain scalable and resilient infrastructure solutions that underpin AI and machine learning operations. This position requires close collaboration with data scientists, machine learning engineers, and software developers to enhance infrastructure performance and streamline AI model development and deployment.
Key Responsibilities:
- Design, implement, and manage high-performance computing environments tailored for AI and machine learning applications.
- Deploy and maintain GPU-accelerated clusters, cloud-based AI platforms, and parallel processing systems.
- Collaborate with data scientists and ML engineers to understand infrastructure requirements for various AI projects.
- Optimize resource allocation and scalability of AI infrastructure to support large datasets and complex models.
- Automate infrastructure provisioning and deployment using Infrastructure as Code (IaC) tools.
- Ensure security, compliance, and reliability of AI infrastructure.
- Monitor system performance and troubleshoot issues to minimize downtime and maximize productivity.
- Stay current with emerging technologies and best practices in AI infrastructure, and propose improvements on an ongoing basis.
