Hitachi Digital Services

MLOps Engineer - 24/7 Support

On-site, Full-time

Qualifications

  • Experience with Dataiku and cloud platforms such as AWS.
  • Proficiency in managing CI/CD pipelines and containerized applications.
  • Strong problem-solving skills and ability to perform effective troubleshooting.
  • Experience with monitoring tools such as Prometheus and Grafana.
  • Ability to work independently and as part of a team in a fast-paced environment.

About the job

About Us

At Hitachi Digital Services, we are pioneers in the realm of digital solutions and transformation. Our vision is to unlock the immense potential of our world, and we are driven by a people-centric approach that aims to create positive change. Every day, we innovate to future-proof urban spaces, conserve natural resources, protect vital ecosystems, and enhance lives. Our unique blend of innovation, technology, and expertise empowers us to lead both our company and clients into the future.

We believe that diverse experiences and perspectives are invaluable. We value your character, life experiences, and passion just as much as your qualifications.

Join Our Team

We are seeking a dedicated MLOps L2 Support Engineer who will play a crucial role in providing 24/7 production support for our machine learning (ML) and data pipelines. This role involves on-call support, including weekends, to ensure the high availability and reliability of our ML workflows. You will work with technologies such as Dataiku, AWS, CI/CD pipelines, and containerized deployments to maintain and troubleshoot ML models in production.

Key Responsibilities:

  • Deliver L2 support for MLOps production environments, ensuring maximum uptime and reliability.
  • Troubleshoot issues related to ML pipelines, data processing jobs, and APIs.
  • Monitor logs, alerts, and performance metrics using tools like Dataiku, Prometheus, Grafana, or AWS CloudWatch.
  • Conduct root cause analysis (RCA) and resolve incidents within agreed SLAs.
  • Escalate unresolved issues to L3 engineering teams as necessary.

Dataiku Platform Management:

  • Manage Dataiku DSS workflows, troubleshoot job failures, and optimize performance.
  • Monitor and support Dataiku plugins, APIs, and automation scenarios.
  • Collaborate with Data Scientists and Data Engineers to debug ML model deployments.
  • Maintain version control and ensure proper documentation.

About Hitachi Digital Services

Hitachi Digital Services is a global leader in digital solutions and transformation, committed to driving innovation and sustainable change. We empower communities and organizations to leverage technology for a better future.
