About the job
Join our team as a Platform Engineer and play a key role in supporting the high-performance computing (HPC) platform used by computational scientists in Research & Development. The position emphasizes expertise in AWS infrastructure, DevOps automation, container management, and high-throughput storage, with a strong focus on infrastructure-as-code practices. You will take ownership of both cloud and HPC infrastructure, collaborating closely with scientists and engineers to deliver robust, scalable, and automated platform solutions.
Key Responsibilities:
- Design, implement, and maintain scalable cloud infrastructure on AWS.
- Manage infrastructure as code with tools such as Terraform, Terragrunt, and CloudFormation.
- Create immutable infrastructure utilizing Packer.
- Develop and oversee CI/CD pipelines using GitLab CI/CD.
- Run containerized applications across several platforms, including:
  - Amazon EKS
  - Docker on EC2
  - Singularity (Apptainer) for HPC workloads
- Configure systems effectively using Ansible.
- Design and manage high-throughput cloud and HPC storage solutions.
- Monitor performance, troubleshoot issues, and optimize platforms for reliability and cost efficiency.
- Document architectural designs and operational best practices.

