Qualifications
As an HPC Operations Engineer, your responsibilities will include:
- Delivering 24/7 operational support for Linux HPC compute, storage, and interconnect systems, utilizing technologies such as RDMA fabrics, parallel filesystems, HPC batch schedulers, and cybersecurity measures.
- Addressing and resolving inquiries from Jump's research community, overseeing the entire problem lifecycle.
- Responding promptly to alerts and participating in coordinated maintenance operations, including evenings and weekends.
- Engaging in global infrastructure projects and developing diagnostic and automation scripts.
- Collaborating across teams to enhance both new and existing codebases in various programming languages.
- Maintaining vendor relationships, traveling as necessary.
- Implementing monitoring systems to track performance and faults, alongside improving documentation for systems and users.
- Ensuring adherence to company cybersecurity and IT policies and participating in an on-call rotation as needed.
About the job
At Jump Trading Group, we are dedicated to pioneering research that sets industry standards. We bring together talented people in Mathematics, Physics, and Computer Science to explore scientific frontiers, pushing beyond current limits and translating innovative research into impactful solutions in global financial markets. Our workplace culture thrives on creativity, intellectual honesty, and a bold competitive spirit, fostering collaboration and mutual respect. Together, we not only achieve exceptional risk-adjusted returns but also develop technologies that redefine our world, invest in diverse startups, and collaborate with leading research institutions and universities to tackle significant challenges.
We seek a proactive and detail-oriented HPC Operations Engineer who is enthusiastic about managing extensive Linux HPC environments. This role will focus on navigating complex operational tasks and adapting to unpredictable challenges in a dynamic setting.