Senior ML Infrastructure / DevOps Engineer

Pathway
Remote (Stockholm, Stockholm County, Sweden), Full-time

Experience Level

Senior

Qualifications

  • Extensive experience in managing Linux systems and distributed application architectures.
  • Proficient in managing GPU clusters and using orchestration tools like Kubernetes.
  • Strong background in infrastructure-as-code practices and tools such as Terraform or CloudFormation.
  • Experience with CI/CD tools and methodologies.
  • Demonstrated expertise in building and maintaining ML pipelines.
  • Excellent problem-solving skills and a passion for continuous learning and improvement.

About the job

About Pathway

Pathway is revolutionizing artificial intelligence with its pioneering post-transformer model that mimics human thought processes.

Our groundbreaking architecture (BDH) surpasses traditional Transformer models, providing enterprises with comprehensive insights into model functionality. By integrating this foundational model with the fastest data processing engine available, Pathway empowers organizations to transcend mere incremental improvements and embrace genuinely contextualized, experience-driven intelligence. Trusted by prestigious entities like NATO, La Poste, and Formula 1 teams, we are at the forefront of technological advancement.

Founded by complexity scientist Zuzanna Stamirowska, our leadership team includes AI visionaries such as CTO Jan Chorowski, who pioneered Attention in speech processing and collaborated with Nobel laureate Geoff Hinton at Google Brain, and CSO Adrian Kosowski, a distinguished computer scientist and quantum physicist who earned his PhD at 20.

Pathway is backed by investors and advisors including TQ Ventures and Lukasz Kaiser, a co-author of the Transformer model behind ChatGPT and a key figure at OpenAI. The company operates out of Palo Alto, California.

The Opportunity

We are seeking a passionate Senior ML Infrastructure / DevOps Engineer who would rather work on Linux environments, distributed systems, and GPU cluster scalability than in notebooks. You will be responsible for the infrastructure that drives our machine learning training and inference workloads across multiple cloud platforms, managing everything from basic Linux setups to advanced container orchestration and CI/CD pipelines.

Your role will be integral to the R&D team, focusing on production infrastructure, including clusters, networks, storage, observability, and automation. Your contributions will directly influence the speed and efficiency of model training, deployment, and iteration.

Why This Role is Unique

  • Manage and scale GPU-intensive clusters utilized daily by the R&D team for high-scale training and rapid inference.
  • Design, build, and automate our ML platform, moving beyond the execution of predefined playbooks.
  • Collaborate across multiple major cloud providers to tackle intriguing challenges in networking, scheduling, and cost/performance optimization at scale.

Your Responsibilities

  • Architect, operate, and scale GPU and CPU clusters for ML training and inference (Slurm, Kubernetes, autoscaling, queue management, quota management).
  • Automate infrastructure provisioning and configuration using infrastructure-as-code (Terraform, CloudFormation, cluster tooling) and configuration management techniques.
  • Create and uphold robust ML pipelines (data ingestion, training, evaluation, deployment) with strong assurances around reproducibility, traceability, and rollback capabilities.
  • Implement and maintain monitoring and observability solutions to ensure maximum uptime and performance.
