High Performance Computing Software Engineer - Supercomputing
Experience Level
Experience
Qualifications
About Institute of Foundation Models
The Institute of Foundation Models is a research lab committed to building, understanding, and leveraging foundation models to enhance AI development. Our focus is on advancing research, nurturing future AI innovators, and fostering contributions to a knowledge-driven economy.
Similar jobs
Wayve Technologies
Join Wayve Technologies as a Staff Machine Learning Performance Engineer, specializing in Training Efficiency. In this pivotal role, you will be responsible for enhancing the performance of our machine learning models and algorithms, ensuring they operate at peak efficiency. You will collaborate with cross-functional teams to develop innovative solutions that improve training processes, optimize model performance, and drive impactful results in autonomous vehicle technology.
Cerebras Systems
Join Cerebras Systems as an Engineering Manager specializing in Inference ML Runtime, where you will lead a dedicated team in developing groundbreaking machine learning solutions. Your expertise will guide the design and implementation of our inference runtime, ensuring efficiency and performance at scale. As a pivotal leader in our innovative environment, you will collaborate with cross-functional teams, driving the development of state-of-the-art algorithms and systems that push the boundaries of artificial intelligence.
Cerebras Systems
Cerebras Systems is at the forefront of AI technology, developing the world’s largest AI chip that is 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational power of dozens of GPUs within a single chip, simplifying programming and enhancing performance. This unique capability enables Cerebras to provide unparalleled training and inference speeds, allowing machine learning practitioners to execute large-scale ML applications seamlessly without the complexities of managing extensive GPU or TPU infrastructures.
Cerebras serves a diverse clientele, including top-tier model labs, global enterprises, and pioneering AI-native startups. OpenAI has recently partnered with Cerebras to leverage 750 megawatts of power, significantly enhancing key workloads through ultra high-speed inference.
Our cutting-edge wafer-scale architecture has made Cerebras Inference the fastest Generative AI inference solution globally, achieving speeds over ten times faster than GPU-based hyperscale cloud inference services. This revolutionary speed is transforming the user experience of AI applications, facilitating real-time iteration and boosting intelligence through enhanced computational capabilities.
About The Role
We invite you to join Cerebras as a Performance & Reliability Engineer within our dynamic Co-Design and Next Generation Team. Our groundbreaking CS-3 system has established benchmarks for high-performance ML training and inference solutions, utilizing a chip the size of a dinner plate with 44GB of on-chip memory that exceeds traditional hardware capabilities. In this role, you will focus on characterizing and optimizing the performance and reliability of state-of-the-art AI models operating on Cerebras' revolutionary hardware.
Responsibilities
Characterize and enhance the performance and reliability of advanced ML hardware/software systems, focusing on minimizing power and thermal fluctuations.
Analyze ML workloads, software kernels, and hardware architecture for their power and performance impacts, synthesizing high-level insights across these layers.
Develop innovative software solutions to enhance system performance and efficiency.
Applied Intuition
Applied Intuition is hiring a Software Engineer in Sunnyvale, California, with a focus on the Axion Data Engine and machine learning operations. This role centers on building and supporting the systems that power advanced data processing and ML workflows.
Key Responsibilities
Collaborate with cross-functional teams to design, build, and deploy data solutions for the Axion Data Engine.
Maintain and enhance machine learning operations, aiming to improve system reliability and performance.
Develop data processing capabilities that meet high standards for efficiency and accuracy.
Team and Impact
This position works closely with engineers and specialists from multiple areas. The work directly supports the quality and precision needed in industries that rely on advanced data and machine learning tools.
Join Wayve, a pioneering company at the forefront of robotic software development, as a Software Engineer specializing in System Performance. In this role, you will be instrumental in optimizing our advanced robotic systems to enhance their efficiency and reliability. Collaborate with a talented team to push the boundaries of what is possible in the field of robotics.
Intuitive Surgical, Inc.
Join Intuitive Surgical as a Staff Value Engineer, where you'll have the opportunity to shape the future of minimally invasive surgery. Our team is dedicated to advancing surgical technology and improving patient outcomes. In this role, you will leverage your engineering expertise to analyze and optimize the value of our surgical systems.
Intuitive Surgical, Inc.
We are seeking a talented and motivated Staff Supplier Engineer to join our dynamic team at Intuitive Surgical, Inc. In this role, you will play a crucial part in managing supplier relationships and ensuring the highest quality of materials and components for our innovative surgical systems. You will be responsible for evaluating suppliers, conducting audits, and collaborating closely with cross-functional teams to drive continuous improvement.
Intuitive Surgical, Inc.
Join our dynamic team as a Staff Quality Engineer at Intuitive Surgical, where you will play a pivotal role in ensuring the highest standards of quality in our innovative medical devices. You will collaborate with cross-functional teams to enhance product reliability and maintain compliance with industry regulations. Your expertise will contribute to our mission of advancing minimally invasive surgical technologies.
Intuitive Surgical, Inc.
Join Intuitive Surgical, a pioneering company at the forefront of robotic-assisted surgery, as a Staff Electrical Engineer. In this role, you will collaborate on innovative projects that enhance surgical precision and patient safety. Your expertise will help drive the development of cutting-edge medical devices that are transforming healthcare.
Intuitive Surgical, Inc.
Join Intuitive Surgical, a pioneering leader in minimally invasive robotic-assisted surgery, as a Managing Staff Engineer. In this critical role, you will oversee engineering projects and lead a talented team of engineers to innovate and improve our surgical systems. You will have the opportunity to drive advancements in technology and contribute to transforming surgical practices across the globe.
Intuitive Surgical, Inc.
Join Intuitive Surgical as a Staff Research Engineer and become a vital member of our innovative team. In this role, you will contribute to the development of advanced robotic systems designed to enhance surgical procedures. Your expertise will be crucial in pushing the boundaries of technology and improving patient outcomes.
Institute of Foundation Models
Join Our Innovative Team at the Institute of Foundation Models
At IFM, we are pioneers in developing, understanding, and managing foundation models. Our mission is to advance research, cultivate the next generation of AI innovators, and contribute significantly to a knowledge-driven economy. As a member of our esteemed team, you will engage in the forefront of cutting-edge foundation model training, collaborating with top-tier researchers, data scientists, and engineers. Together, we will address the most significant and impactful challenges in AI development. You will play a crucial role in creating revolutionary AI solutions that have the potential to transform entire industries. Your strategic and innovative problem-solving abilities will be essential in establishing MBZUAI as a global leader in high-performance computing for deep learning, facilitating discoveries that will inspire future AI pioneers.
The Role
IFM is developing the foundational compute infrastructure that will drive future breakthroughs in AI and computational science. We are seeking a High Performance Computing Software Engineer to collaborate in designing, developing, and operating the software systems that manage our extensive AI workloads. In this position, you will work at the crossroads of high-performance computing and machine learning. You will be part of a dedicated team focused on creating the software stack that supports the training of advanced ML models using over 1000 GPUs, while ensuring our infrastructure remains robust, efficient, and user-friendly.
Illumio develops technology to contain ransomware and breaches, helping organizations guard against cyberattacks and maintain business continuity. The company’s breach containment platform uses the Illumio AI Security Graph to detect and isolate threats across hybrid and multi-cloud environments. Illumio is recognized for its leadership in microsegmentation and its commitment to the Zero Trust model, serving critical infrastructure and organizations worldwide. This Senior Staff Engineer - Cybersecurity position is based onsite in Sunnyvale, California, with work expected in the office five days a week.
Role overview
The engineering team at Illumio values leadership, autonomy, and ownership. Members work with a modern technology stack that spans operating systems, distributed applications, and advanced UI and visualization tools. The team is focused on building products that address today’s cybersecurity challenges, drawing on diverse perspectives and a shared drive for innovation.
What you will do
Develop containerized microservices for a distributed, multi-tenant system that processes data, real-time events, and network telemetry from multiple public clouds. These services provide customers with real-time insights, visibility, and security recommendations to reduce cloud risks.
Design and architect platform components and subsystems, working through technical details, presenting and defending designs to peers, and ensuring thorough implementation.
Mentor junior engineers, recent graduates, and interns to support their professional development and integration into the team.
Write code primarily in Go and manage data pipelines using SQL or similar technologies. Familiarity with Kubernetes for service infrastructure is considered a plus. The team welcomes candidates with varied programming backgrounds who are eager to learn.
Take ownership of critical features and subsystems, overseeing the software development lifecycle from requirements to deployment and customer adoption.
Contribute to operational excellence and help drive engineering efforts toward greater innovation and efficiency.
Intuitive Surgical, Inc.
Role Overview
Intuitive Surgical, Inc. is hiring a Staff Software Engineer in Sunnyvale. This position focuses on designing, building, and maintaining software that supports surgical robotics and improves patient care.
What You Will Do
Develop and refine software solutions for surgical robotics systems.
Work closely with teams from different disciplines to deliver reliable, high-quality products.
Contribute technical expertise to projects that advance healthcare technology.
Join our dynamic team as a Staff Software Engineer specializing in Frontend development. We are seeking a talented individual with a robust background in building scalable e-commerce applications or mobile software. Your expertise in modern JavaScript frameworks and attention to detail will be instrumental in delivering high-quality web applications that enhance user experience.
Join Us in Shaping the Future!
At Illumio, we are pioneers in ransomware and breach containment, transforming the way organizations tackle cyber threats and enhance operational resilience. Our innovative breach containment platform, fueled by the Illumio AI Security Graph, effectively identifies and mitigates threats across hybrid multi-cloud environments, preventing potential disasters before they escalate.
As a recognized leader in the Forrester Wave™ for Microsegmentation, Illumio champions Zero Trust principles, bolstering cyber resilience for the critical infrastructure and systems that sustain global operations.
Our Vision:
Our Engineering team thrives on a culture of visionary leadership, autonomy, and ownership, fostering a collaborative environment that propels us forward in the rapidly evolving cybersecurity landscape.
By joining our team, you will contribute to the leader in Zero Trust Segmentation, utilizing a cutting-edge technology stack that encompasses operating systems, distributed applications, and immersive UI/visualization tools.
Together, we are shaping the future of cybersecurity, crafting world-class products led by diverse perspectives and a shared commitment to innovation amidst unprecedented cybersecurity challenges.
Your Responsibilities:
Architect cloud solutions that effectively address business challenges while balancing architectural integrity and business margins.
Collaborate with cross-functional teams, including product managers, developers, and DevOps engineers, to comprehend business requirements and design scalable cloud architectures.
Design, deploy, and manage cloud architectures adhering to industry best practices, with a focus on efficiency, scalability, availability, performance, and security.
Assess and select suitable cloud technologies and platforms, including Kubernetes, to fulfill organizational needs and foster innovation.
Enhance cloud-based systems for high availability, fault tolerance, and disaster recovery capabilities.
Implement and oversee monitoring, logging, and alerting systems to ensure optimal health and performance of cloud infrastructure.
Identify and rectify performance bottlenecks, security vulnerabilities, and operational challenges within the cloud environment.
Stay abreast of the latest trends, technologies, and best practices in cloud computing, distributed systems, and cybersecurity.
Join Us in Shaping the Future of Cybersecurity!
Illumio stands at the forefront of ransomware and breach containment, revolutionizing the way organizations defend against cyberattacks while fostering operational resilience. Our advanced breach containment platform, powered by the Illumio AI Security Graph, adeptly identifies and mitigates threats across hybrid multi-cloud environments, preventing attacks from escalating into catastrophic events.
As a recognized Leader in the Forrester Wave™ for Microsegmentation, Illumio empowers Zero Trust principles, enhancing cyber resilience across critical infrastructure, systems, and organizations that underpin our global society.
Work Environment:
This role requires on-site presence five days a week at our Sunnyvale, CA Headquarters.
Our Engineering Vision:
Our engineering team thrives in a culture defined by visionary leadership, individual autonomy, and a sense of ownership, fostering a dynamic synergy that propels us through the rapidly evolving cybersecurity landscape.
By joining us, you will contribute to the leader in Zero Trust Segmentation, utilizing a cutting-edge technology stack that includes operating systems, distributed applications, and sophisticated UI/visualization tools.
Together, we are forging the future of cybersecurity and creating world-class products, driven by diverse perspectives and a commitment to innovation in an era marked by unprecedented cyber threats.
Your Contributions:
Develop innovative methods to orchestrate Zero Trust Segmentation at the application/pod level, effectively identifying and blocking attack pathways within the container ecosystem.
Enhance and understand modern container platforms such as Kubernetes, Istio, OpenShift, AKS, EKS, GKE, and others.
Own and design essential features and subsystems, meticulously working through details and presenting your designs to peers.
Deliver robust implementations that are elegant, straightforward, scalable, stable, secure, and maintainable. Our product is vital for large enterprises and must operate flawlessly.
Mentor junior engineers, new graduates, and interns, fostering their growth as engineers and integrating them into the team.
Collaborate with field organizations and key customers to refine this groundbreaking product.
Your Qualifications:
A Bachelor's degree in Computer Science or a related field; a Master's degree is a plus.
Experience in software engineering, particularly with container security solutions.
Strong problem-solving skills and the ability to work effectively in a team-oriented environment.
Illumio’s engineering team works at the intersection of cloud security and breach containment, supporting organizations as they defend against cyber threats. The company’s platform uses the Illumio AI Security Graph to detect and contain breaches across hybrid and multi-cloud environments. Illumio has been recognized as a Leader in the Forrester Wave™ for Microsegmentation and is committed to advancing Zero Trust security for critical infrastructure worldwide.
Role overview
The Staff Engineer, Cloud Security, will design and build containerized microservices for distributed, multi-tenant systems. These systems handle data, real-time events, and network telemetry from multiple public clouds, providing customers with insights, visibility, and actionable security recommendations to help reduce risk in the cloud.
What you will do
Design service architecture, document and present design decisions, and deliver strong implementations.
Write code primarily in Go, and work with data pipelines using SQL or similar interfaces. Experience with Kubernetes is valued, though candidates with other language backgrounds who are open to learning are encouraged to apply.
Own critical features and subsystems throughout the software development lifecycle, from clarifying requirements to deployment and user adoption.
Mentor junior engineers, recent graduates, and interns, supporting their growth and integration within the team.
Work environment
The team values collaboration, autonomy, and ownership. Illumio encourages leadership at every level and welcomes new ideas to keep pace with changes in cybersecurity. Engineers work with modern technology stacks, including distributed applications, operating systems, and advanced UI and visualization tools. Diverse perspectives and ongoing innovation are central to the company’s approach.
Location
This position is based at Illumio headquarters in Sunnyvale, California.
Illumio develops solutions that help organizations contain ransomware and breaches, strengthening operational resilience against cyber threats. The company’s breach containment platform, powered by the Illumio AI Security Graph, identifies and mitigates risks across hybrid and multi-cloud environments. Illumio’s approach aims to stop cyberattacks before they escalate into major incidents, and its work in microsegmentation has been recognized by the Forrester Wave™. The company supports the Zero Trust model to protect critical infrastructure and global operations.
Location and Work Environment
This Staff Engineer, Container Security role is based at Illumio’s Sunnyvale, California headquarters. Onsite presence is required five days a week. The engineering team values leadership, autonomy, and ownership, working with a variety of operating systems, distributed applications, and advanced UI and visualization tools. Illumio encourages diverse perspectives to drive innovation in addressing cybersecurity challenges.
What You Will Do
Design and implement new methods for orchestrating Zero Trust Segmentation at both the application and pod level, focusing on identifying and blocking attack paths in container environments.
Work extensively with modern container platforms such as Kubernetes, Istio, OpenShift, AKS, EKS, and GKE.
Own critical features and subsystems, refining technical details and advocating for your designs within the team.
Deliver solutions that are scalable, stable, secure, and maintainable to protect large enterprise infrastructure.
Mentor junior engineers, recent graduates, and interns to support their growth and integration into the team.
Collaborate with field teams and key customers to help shape the direction of the product.
Role overview
The Staff Mechanical Engineer at Ceribell plays a key part in shaping and advancing the mechanical systems behind the company’s technology. This position involves hands-on design and development work, as well as ongoing improvement of existing systems. The role requires both creative problem-solving and a strong technical foundation, working alongside a team of experienced professionals in Sunnyvale.
What you will do
Design mechanical components and systems that are integral to Ceribell’s products.
Develop and refine mechanical solutions to align with project objectives.
Work closely with engineers and other team members to address technical challenges.
Apply mechanical expertise to support innovation across projects.
Location
This role is based in Sunnyvale.

