Software Engineer Pre Training Systems At Magic San Francisco jobs in San Francisco – Browse 11,518 openings on RoboApply Jobs

Open roles matching “Software Engineer Pre Training Systems At Magic San Francisco” with location signals for San Francisco. 11,518 active listings on RoboApply Jobs.

1 - 20 of 11,518 Jobs
Full-time|On-site|San Francisco

At Magic, we are dedicated to creating safe artificial general intelligence (AGI) that propels humanity forward in tackling the most pressing global challenges. We believe that the most effective route to achieving safe AGI involves automating the research and code generation processes to enhance models and resolve alignment issues more reliably than humans can achieve independently. Our methodology incorporates cutting-edge pre-training at scale, domain-specific reinforcement learning (RL), ultra-long context capabilities, and optimized inference-time computations.

Role Overview
In your role as a Software Engineer on the Pre-training Systems team, you will be responsible for designing and managing the distributed infrastructure necessary for training Magic’s long-context models at scale.

This position emphasizes large-scale model training utilizing extensive GPU clusters. You will operate at the intersection of deep learning and distributed systems, ensuring that training processes are efficient, reliable, and reproducible under extreme conditions.

Magic’s long-context models present complex systems challenges, such as sustained memory usage, communication overhead across thousands of devices, long-duration jobs requiring fault tolerance, and efficient sequence packing within hardware limitations. You will take ownership of the systems that ensure large-scale pre-training is both stable and rapid.

Your Contributions
Scale distributed training across large GPU clusters, implementing data, tensor, and pipeline parallelism.
Optimize communication strategies and gradient synchronization.
Enhance checkpointing, fault tolerance, and job recovery mechanisms.
Profile and resolve performance bottlenecks across computing, networking, and storage.
Advance experiment reproducibility and orchestration workflows.
Boost hardware utilization and overall training throughput.
Collaborate with Kernel and Research teams to align model architecture with system capabilities.

Qualifications We Seek
Solid foundation in software engineering and distributed systems.
Experience with training large models in multi-node GPU environments.
In-depth understanding of parallelism techniques and performance trade-offs.
Experience in debugging cross-layer issues within production ML systems.
Demonstrated ownership mentality and capability to manage critical infrastructure.
Proven track record in enhancing the performance or reliability of large-scale systems.

Feb 28, 2026
Full-time|On-site|San Francisco

At Magic, we are on a mission to develop safe AGI that propels humanity's progress in addressing the world's most significant challenges. We believe that automating research and code generation is the most promising pathway to achieving safe AGI, enabling us to enhance models and address alignment issues more reliably than humans can achieve alone. Our innovative approach integrates frontier-scale pre-training, domain-specific reinforcement learning, ultra-long context, and inference-time computing to realize our vision.

About the Role
As a Software Engineer at Magic, you will engage in developing core systems and product surfaces that directly influence model capabilities and enhance user experience.

This position can align with areas such as Pre-training Data, RL Research & Environments, or Product Development, depending on your background and expertise. Regardless of placement, you will be expected to take full ownership of your work: identifying problems, crafting solutions, deploying to production, and iterating based on real-world results.

Working with Magic's long-context models presents unique technical challenges, including large-scale data acquisition, long-horizon post-training loops, and developing product workflows that make complex model behaviors understandable and manageable. You will work closely with these constraints, creating systems that are both technically sound and production-ready.

This role has the potential to evolve into a deeper specialization in data systems, post-training capability enhancement, or product engineering leadership based on your strengths and interests.

What You'll Work On
Depending on your team assignment, your tasks may include:
Developing and scaling large distributed data pipelines for pre-training
Designing filtering, mixture, and dataset versioning systems
Creating post-training datasets, evaluation frameworks, and reward pipelines
Conducting ablations that translate capability goals into quantifiable improvements
Building comprehensive product interfaces that integrate seamlessly with the model
Designing APIs, backend services, and frontend workflows for AI-first experiences
Enhancing the reliability, observability, and performance of production systems

What We’re Looking For
Solid foundation in software engineering principles
High ownership and comfort in navigating ambiguous problem domains
Proven experience in building scalable production systems
Ability to reason through complex technical challenges

Feb 28, 2026
Magical
Full-time|Hybrid|San Francisco

About Magical
Magical is a cutting-edge automation platform that integrates advanced AI technology into the healthcare sector, providing AI agents that deliver tangible results in production environments.

Our mission is to create AI-driven "employees" that streamline tedious, time-intensive workflows that hinder team productivity. We focus on the healthcare industry, a $4 trillion sector entangled in administrative challenges, by automating processes like claims processing, prior authorizations, and eligibility checks, allowing healthcare providers to dedicate more time to patient care.

Our Achievements
The move towards agentic automation in healthcare is on the horizon, and we are at the forefront:
Significant revenue growth as clients expand into new workflows prior to renewal
Rapid 7-day proof-of-concept implementations that showcase real value, unlike the typical months-long processes in the industry
Self-healing automations that are reliable and scalable in production environments, a feat where many competitors struggle

Unlike many AI companies that make grand claims, we deliver dependable solutions that yield measurable outcomes. Our funding partners include Greylock, Coatue, and Lightspeed, with a total of $41M raised. Our founder, Harpaul Sambhi, has previously achieved success by selling his first company to LinkedIn.

About the Role
As the Engineering Manager for our Autonomous team, you will lead and grow a talented group of engineers committed to shaping the future of AI agent development, continually pushing the limits of AI and backend system capabilities.

Your passion for management will shine as you nurture the professional growth of your engineers. You possess the technical expertise necessary to engage in intricate architectural discussions and translate complex technical hurdles into clear business strategies. In this position, you will be a vital link between our product vision and technical implementation.

This role offers a hybrid work environment, requiring 2 days a week in our San Francisco office.

Mar 6, 2026
Magical
Full-time|Hybrid|San Francisco

About Magical
Magical is at the forefront of agentic automation, revolutionizing the healthcare landscape with cutting-edge AI technology. Our platform is designed to empower healthcare providers by automating labor-intensive tasks, allowing them to concentrate on what truly matters: patient care.

By streamlining processes like claims management, prior authorizations, and eligibility assessments in an industry plagued by administrative hurdles, we're facilitating a transformative shift, one that is both necessary and inevitable.

Our Achievements
We are leading the charge in agentic automation, evidenced by:
Significant revenue growth as clients expand their usage into new workflows.
Quick proof-of-concept demonstrations within just 7 days, far faster than industry norms.
Reliable, self-healing automation solutions that excel where others falter.

Unlike other AI companies, we deliver dependable solutions that yield tangible outcomes. With $41 million raised from renowned investors like Greylock, Coatue, and Lightspeed, our founder Harpaul Sambhi brings a wealth of expertise, having previously sold his startup to LinkedIn.

About the Role
As a Senior Software Engineer, Product on our Builder Experience team, you will harness your full-stack expertise to develop features that enable teams to create, configure, and deploy AI agents seamlessly. You will oversee the entire product interface, from user-friendly no-code tools for agent setup to dynamic dashboards for real-time monitoring and assessment.

This position is crafted for engineers passionate about creating exceptional user experiences, understanding that stellar UX is crucial for making advanced technology accessible. Collaborating closely with customers and our design team, you'll deliver features that enhance agent development, all while maintaining a firm grasp of the underlying systems to create effective abstractions.

This is a hybrid role, requiring you to be in our San Francisco office three days a week.

Oct 13, 2025
Magical
Full-time|On-site|San Francisco

At Magical, we are transforming the way work is accomplished.

Our cutting-edge AI platform introduces "AI employees" to the workplace, tackling monotonous and draining tasks that hinder team efficiency. This empowers organizations to operate more swiftly and effectively, ultimately enhancing outcomes in critical areas such as patient care.

As we spearhead the shift towards agentic work, we are rapidly scaling our product from $0 to $XM ARR in just a few months. We are seeking innovative engineers to help us achieve $XXM ARR. Joining our founding team means you will not only be coding but also influencing the future of work with a small, driven team at the forefront of AI advancements.

Supported by prominent investors such as those behind OpenAI, Anthropic, Huggingface, and Notion, including Greylock, Coatue, and Lightspeed, we have a robust runway and a vast market waiting to be explored.

Dec 9, 2025
Specter
Full-time|On-site|San Francisco

Company Overview:
Specter is revolutionizing how businesses perceive their physical environments by developing a software-defined control plane. Our mission is to enhance the security of American enterprises by providing them with comprehensive visibility over their physical assets.

We are pioneering a connected hardware-software ecosystem that leverages multi-modal wireless mesh sensing technology, reducing the deployment costs and time for sensors by a factor of ten. Our platform aims to be the perception engine for a company’s physical presence, facilitating real-time visibility of perimeters and enabling autonomous operational management.

Founded by passionate innovators from Anduril, Tesla, Uber, and the U.S. Special Forces, our co-founders, Xerxes and Philip, are dedicated to empowering our partners in the rapidly evolving landscape of physical AI and robotics.

Oct 3, 2025
Magic Patterns
Full-time|On-site|San Francisco

Hello! I’m Alex, co-founder of Magic Patterns. We're thrilled to announce an exciting opportunity for a Head of Growth to join our dynamic team. Our product-led growth strategy is thriving, and we’re ready to accelerate our momentum. Currently, we engage on platforms like X, LinkedIn, Reddit, and YouTube, but we envision expanding our outreach significantly.

At Magic Patterns, you will play a pivotal role in transforming the software development landscape. Our innovative platform is already empowering thousands of teams to deploy software more rapidly. Our mission is to assist product teams in taking their ideas from inception to production, which has attracted Fortune 500 clients and fostered a passionate community. However, we believe it's always day one, and your contribution is crucial!

If you’re passionate about startups, AI, and thrive in a fast-paced environment, we can't wait to collaborate with you!

Oct 29, 2025
Braintrust
Full-time|On-site|San Francisco

About braintrust
Braintrust is at the forefront of AI observability. By merging evaluation and observability into a singular workflow, we empower developers with the insights needed to comprehend AI behavior in production environments, along with the tools to enhance it.

Leading teams at Notion, Stripe, Zapier, Vercel, and Ramp utilize Braintrust to compare models, test prompts, and monitor regressions, transforming production data into superior AI with each new release.

About the role
We are in search of a passionate software engineer dedicated to crafting high-performance data processing systems. Our clientele consists of large enterprises handling complex, semi-structured data, which they require for real-time processing and analysis. Our distinct architecture enables these organizations to keep data on-premises while creating intricate visualizations that load without delay. Explore our Brainstore blog post.

If you have experience with database systems, compilers, networks, or storage systems and aspire to pivot your expertise into the AI sector, this role could be your ideal fit. You will significantly influence foundational system architecture, technology selection, and implementation. Our founding team possesses extensive knowledge in database and ML systems, and you will have the autonomy to collaborate closely with them while exploring your innovative ideas.

Your Responsibilities
As a systems engineer at Braintrust, you’ll contribute to the core systems that empower Braintrust’s capability to process and query vast amounts of unstructured data at an enterprise scale. Key areas of responsibility include:
Enhancing the storage, indexing, and query execution performance of Brainstore.
Developing Braintrust's btql query language.
Optimizing query patterns to boost performance across our platform.

Qualifications
Deep understanding of systems programming (C++ or Rust, concurrency, databases, operating systems).
Experience in founding or working at startups is advantageous.
Familiarity with writing prompts or experimenting with GPT models and applications.

Benefits
Comprehensive medical, dental, and vision insurance.
Daily lunch, snacks, and beverages provided.
Flexible time off policy.
Competitive salary with equity options.

Mar 29, 2024
Midstream
Full-time|On-site|San Francisco

Midstream is an innovative, AI-driven financial operating system tailored for healthcare systems. Founded by a team of seasoned entrepreneurs and supported by prestigious investors, we empower finance, supply chain, and managed care teams with real-time insights into margin risks, enabling them to act swiftly to protect their margins.

Designed specifically for the complexities of healthcare, Midstream converts structured, unstructured, and external data into immediate, contract-aware insights that enhance decision-making. Our AI-driven agents integrate spending and revenue operations across the entire back-office, continuously learning and adapting to ensure a level of intelligence that surpasses any standalone solution.

We are revolutionizing the pace of healthcare finance, compressing lengthy processes into minutes and transforming retrospective insights into proactive foresight. Midstream is at the forefront of this change.

The Opportunity
Join Midstream at a pivotal moment in our growth and contribute to establishing the technical backbone of our platform. In this role, you will operate at the intersection of product development, systems architecture, and cloud infrastructure, crafting and building distributed systems that enable our agile team to operate efficiently as we expand.

You will collaborate closely with engineers and leadership to identify current infrastructural pain points and anticipate potential challenges before they arise. From multi-tenant architectures to security protocols, you will transform uncertainty into robust systems that minimize operational burdens and enhance engineering productivity.

We are seeking a software engineer with a systems-oriented mindset who is passionate about long-term maintainability and developer experience, while also enjoying the process of creating backend services and delivering tangible software solutions. The ideal candidate will be adept at reasoning about distributed systems, making practical compromises, and developing infrastructure that seamlessly integrates into the background, allowing the team to focus on delivering impactful product value.

What You’ll Do
Develop shared platform patterns and tools that enable engineers to launch new backend services and workflows efficiently, securely, and reliably.
Enhance the reliability of our production systems by improving observability, debugging capabilities, and resilience as we scale.
Architect infrastructure that is clean, reviewable, and repeatable, minimizing unique configurations and facilitating rapid iteration.
Establish clear, scalable multi-tenant boundaries across data, compute, and identity to support...

Jan 29, 2026
LatchBio
Full-time|On-site|San Francisco

About Us
At LatchBio, we are at the forefront of transforming biological discovery through the fusion of laboratory automation, high-throughput assays, and machine learning. Our innovative platform is designed to store, visualize, and analyze the next wave of scientific discoveries. Trusted by teams across pharmaceutical, biotech, and solution provider sectors, our technology plays a crucial role in enhancing, informing, and delivering groundbreaking products.

Our dedicated team of engineers has spent over four years developing and marketing cutting-edge technology in a challenging market that is often hesitant to embrace newcomers. We cater to a diverse clientele with varying product expectations, necessitating close collaboration and nuanced communication with both technical and non-technical users. Our systems routinely handle computational tasks involving multiple terabytes of data.

Our commitment and perseverance have resulted in significant market validation, with revenue more than tripling over the past year. Looking ahead, we aim to achieve a sustainable growth trajectory, targeting a repeatable sales process and reaching $50 million in annual recurring revenue (ARR) within the next three years.

Explore our core product offerings:
Distributed file system with metadata on Postgres and blobs on S3, featuring a web UI and a FUSE driver.
Workflow orchestrator built on Kubernetes.
On-demand interactive compute instances based on Kubernetes containers.
Statically-typed tabular data storage engine.
Reactive Python-based web application framework for data analysis and visualization.
Upcoming: A cluster orchestrator and workflow engine designed to accept compute nodes from anywhere on the internet.

While various startups focus on niche solutions, our comprehensive approach sets us apart in this tech-heavy industry.

Jun 22, 2024
Sazabi
Full-time|Remote|San Francisco

Join Our Innovative Team at Sazabi
As we approach the year 2026, the tech world faces a looming "infinite software crisis." How do we effectively support, maintain, and manage the vast surge in application development?

Our solution is Sazabi: the AI-native observability platform designed specifically for dynamic engineering teams.

Sazabi empowers teams to inquire about their production systems in straightforward language, visualize operations automatically, and identify root causes up to 10 times faster. Forget about tedious instrumentation, complex dashboard setups, and alert configurations: just get the answers you need.

We are proud to be supported by innovators from industry-leading AI companies, including Vercel, Graphite, Daytona, Browserbase, LangChain, and Replit.

Mar 23, 2026
The Bot Company
Full-time|On-site|San Francisco

The Bot Company
At The Bot Company, we are revolutionizing the home experience by creating an intelligent robot that serves as a helpful companion in every household.

Our dynamic team, comprised of talented engineers, designers, and operators, is based in the heart of San Francisco. We boast a rich background with team members previously at industry giants such as Tesla, Cruise, OpenAI, Google, and Pixar. Collectively, we have a proven track record of delivering groundbreaking products to millions of users, understanding deeply what it takes to craft exceptional experiences.

We pride ourselves on a streamlined team structure that fosters quick decision-making and eliminates unnecessary bureaucracy. Each team member is empowered to take ownership of their work and has the autonomy to drive their projects forward. Our culture encourages rapid iteration and agile execution across all levels of development.

Jan 13, 2026
Baseten
Full-time|On-site|San Francisco

Baseten supports companies like Cursor, Notion, and Writer in running AI inference at scale. The team blends AI research, adaptive infrastructure, and developer tools to help organizations deploy advanced AI models efficiently. Backed by investors such as BOND, IVP, and Greylock, Baseten recently raised a $300M Series E. The company aims to be the trusted platform for engineers launching AI products.

Role overview
The Software Engineer - Realtime Systems (Voice AI) role focuses on building and deploying production-ready Voice AI systems. Baseten’s Voice AI team works with open-source models to power applications in productivity, customer support, clinical conversations, creative tools, and education. Engineers in this group influence how people use voice to interact with technology, shaping products that impact multiple industries.

This position involves leading Voice AI projects, setting both product direction and technical strategy. Collaboration is a key part of the work: expect to partner with Forward Deployed Engineers, Model Performance Engineers, and other teams to advance Baseten’s Voice AI capabilities.

Sample projects
The world's fastest Whisper, with streaming and diarization
Orpheus TTS inference partnership with Canopy Labs
Collaborate with the Core Product team to build a multi-model voice agent using Baseten’s orchestration framework
Work alongside the Training Platform team to support ongoing training of voice models
Design APIs and SDKs that make Baseten Voice AI products accessible for developers

Location
This role is based in San Francisco.

Apr 26, 2026
Omni
Full-time|Hybrid|San Francisco, CA

About Omni
Omni is an innovative business intelligence and embedded analytics platform dedicated to helping our customers effectively explore, understand, and leverage their data.

Based in San Francisco with additional hubs in EMEA and APAC, we are supported by prominent investors such as ICONIQ Growth, Theory Ventures, First Round Capital, Redpoint Ventures, Google Ventures, Snowflake Ventures, and Databricks Ventures.

About the Role
As a Pre-Sales Solutions Engineer, you will gain in-depth knowledge of the Omni platform, equipping our clients to maximize their data usage throughout their data lifecycle. You will engage with businesses across diverse industries, regions, and sizes. Our high-touch, forward-deployed model ensures that you will continuously learn and discover new use cases and requirements. You will advocate for our users and contribute to shaping the product's future by relaying field insights and feedback.

You will:
Collaborate with the sales team to deliver technical expertise throughout the sales process, identifying and validating use cases that demonstrate the value of the Omni platform.
Develop and present Proofs of Concept using the Omni Platform during evaluation phases, guiding potential customers on how to effectively utilize Omni with their data.
Facilitate seamless transitions from trials to successful customer status by partnering with post-sales teams and coordinating with our partners to ensure ongoing support where necessary.
Work closely with Product, Design, and Engineering teams to provide user feedback and influence product direction.

About You
3+ years of experience in data analytics or business intelligence.
Proven experience in a client-facing role with a strong desire to work closely with customers.
Robust SQL skills, utilizing SQL for analytical purposes.
A genuine passion for working with data.
A love for learning and an enthusiasm for helping and educating others.
Previous experience with business intelligence and/or enterprise data analytics tools such as Looker, Tableau, PowerBI, Mode, Sisense, etc., is preferred.

This position offers a hybrid work schedule, with three days per week in the office at our SF headquarters.

Oct 20, 2025
Cartesia
Full-time|On-site|*HQ - San Francisco, CA

About Cartesia
At Cartesia, we are on a mission to revolutionize artificial intelligence by creating interactive intelligence that is accessible and effective in any environment. We have identified a gap in current AI capabilities; existing models struggle to continuously process extensive streams of audio, video, and text data. Our vision is to bridge this gap by developing pioneering model architectures.

Founded by PhD experts from the Stanford AI Lab, we are the creators of State Space Models (SSMs), a groundbreaking approach to training efficient, large-scale foundation models. Our team merges profound expertise in model innovation with systems engineering and product design to deliver advanced models and user experiences.

Backed by leading investors such as Index Ventures and Lightspeed Venture Partners, along with an array of esteemed advisors, we are well-positioned to push the boundaries of AI.

About the Role
We are seeking a talented Software Engineer specializing in database systems to architect and scale Cartesia’s data infrastructure. You will play a crucial role in implementing robust data governance and developing user-friendly, secure database tools that empower both engineers and non-engineers.

Your Impact
Design and enhance database platforms to ensure scalability to over 100 times current capacity while maintaining uptime, latency, and accuracy.
Construct data storage architectures that function seamlessly across various environments including AWS, GCP, on-premises systems, and third-party deployments.
Facilitate accelerated development across the organization by providing high-quality database tools and resources to both technical and non-technical users.
Implement secure access control mechanisms to ensure sensitive data is restricted to authorized personnel only.
Develop scalable data governance systems focused on permissions, auditing, and compliance, utilizing IAM policies, ACLs, and security controls across a large user base.

What You Bring
Expertise with cloud services such as AWS, GCP, or Azure, along with experience using infrastructure-as-code tools like Terraform.
A proven history of managing database systems during periods of rapid growth in dynamic environments.

Feb 3, 2026
Archil
Full-time|On-site|San Francisco

Role Overview
Join our innovative team as a Distributed Systems Engineer at Archil, where you will play a pivotal role in developing cutting-edge storage solutions. You will work across the entire technology stack, tackling challenges as they arise and significantly shaping our product's technical and strategic direction.

Your responsibilities will include:
Being on-call for our production systems to assist customers promptly in case of issues.
Innovating and implementing unprecedented features in our storage services.
Designing interactions within distributed systems to ensure atomicity and idempotency.
Deploying and standardizing infrastructure across various cloud environments.
Navigating evolving customer requirements amidst ambiguity.

Jun 2, 2025
Magical
Full-time|Hybrid|San Francisco

About Magical
Magical is a cutting-edge automation platform harnessing advanced AI to transform the healthcare landscape. We provide AI agents that are not just theoretical but are proven to work effectively in real-world scenarios.

Our vision is to create 'AI employees' that streamline tedious, repetitive tasks that hinder productivity. With healthcare being a $4 trillion sector mired in administrative hurdles, we focus on automating critical processes such as claims processing, prior authorizations, and eligibility checks, allowing healthcare providers to prioritize patient care.

Our Traction
The transition to agentic automation in healthcare is unavoidable, and we are at the forefront of this movement:
Rapid revenue growth as clients expand their use of our solutions beyond initial workflows.
7-day proof-of-concept implementations that showcase real value quickly, in contrast to industry norms that often take months.
Self-healing automations that maintain production-grade reliability at scale, where many competitors struggle to deliver.

Unlike many AI firms that make grand promises, we deliver dependable solutions that yield quantifiable results. Our company is backed by notable investors Greylock, Coatue, and Lightspeed, with a total of $41 million raised. Our founder, Harpaul Sambhi, is a seasoned entrepreneur who successfully sold his previous venture to LinkedIn.

About the Role
As the Software Engineering Manager for the Builder Experience team, you will lead and expand a talented group of engineers dedicated to optimizing our user-facing interfaces that simplify agent configuration.

You should possess a deep passion for management and derive satisfaction from fostering the professional growth of your engineers. Your technical expertise will empower you to navigate intricate architectural discussions and convert complex technical challenges into effective business strategies. In this position, you will be the vital link between our product vision and its technical realization.

This role requires a hybrid work model with 2 days per week in our San Francisco office.

Mar 6, 2026
Anrok
Full-time|On-site|San Francisco

Anrok builds an AI-powered tax automation platform designed to help businesses manage compliance as they scale globally. The platform integrates with billing and payment systems to automate tax monitoring, calculations, and filing, covering complex regulations across municipal, state, and federal levels. Anrok serves a wide range of clients, including 40% of the top 50 AI companies and 20% of the top 100 Cloud companies recognized by Forbes, as well as organizations like Notion, Anthropic, and Cursor. The company is backed by over $50 million from investors such as Sequoia, Index, and Khosla Ventures.

Role overview
The Solutions Engineer - Pre Sales role in San Francisco focuses on supporting revenue growth by guiding strategic customers through technical integrations. This position works closely with sales, product, and engineering teams to help close deals and ensure smooth implementations. The role requires deep understanding of Anrok’s API, quote-to-cash workflows, and expertise in sales tax compliance and third-party billing integrations such as Stripe and NetSuite.

What you will do
Assist customers with complex technical integrations during the pre-sales process
Collaborate with Sales, Activation, and Product/Engineering teams to connect technical solutions with business goals
Develop expertise in Anrok’s API and the sales tax compliance landscape
Support the adoption of third-party billing integrations

Requirements
Ability to work cross-functionally with sales, product, and engineering
Technical proficiency with APIs and billing integrations (such as Stripe, NetSuite, and similar platforms)
Strong communication skills to translate technical details into business value

Apr 29, 2026
Zyphra
Full-time|On-site|San Francisco

Zyphra is an innovative leader in artificial intelligence, located in the heart of San Francisco, California.

Role Overview:
As a Research Engineer specializing in Language Model Pre-Training, you will play a pivotal role in defining our language model strategy through comprehensive pretraining development. Your close collaboration with our pretraining team will ensure that your insights contribute to the advancement of our next-generation models.

Key Responsibilities:
Conduct large-scale training runs and implement model parallelization techniques.
Optimize the performance of our pretraining stack.
Oversee dataset collection, processing, and evaluation.
Research architecture and methodologies, including optimizer ablations.

Qualifications:
Demonstrated engineering prowess in developing reliable and robust systems.
A quick learner with a passion for implementing innovative ideas.
Exceptional communication and collaboration skills, capable of working effectively on both research and engineering implementations at scale.

Preferred Skills:
Profound expertise in addressing machine learning challenges and training models.
Experience training on large-scale (multi-node) GPU clusters.
In-depth understanding of model training pipelines, including model/data parallelism and distributed optimizers.
Strong methodology for conducting rigorous ablations and hypothesis testing.
Familiarity with large-scale, high-performance data processing pipelines.
High proficiency in PyTorch and Python programming.
Ability to navigate and understand extensive pre-existing codebases swiftly.
Published research in machine learning in reputable venues is an advantage.
Postgraduate degree in a relevant scientific field (Computer Science, Electrical Engineering, Mathematics, Physics).

Why Join Zyphra?
We value a research methodology that emphasizes thoughtful, methodical progress towards ambitious objectives. Both deep research and engineering excellence are given equal importance.

Join us in an environment that fosters innovation, collaboration, and professional growth.

Aug 28, 2025
Siftstack
Full-time|$200K/yr - $250K/yr|On-site|San Francisco, CA

At Sift, we are revolutionizing the way sophisticated machines are constructed, tested, and managed. Our innovative platform provides engineers with instantaneous visibility over high-frequency telemetry, effectively removing bottlenecks and fostering swifter, more dependable development.

Originating from our extensive experience at SpaceX on projects such as Dragon, Falcon, Starlink, and Starship, Sift was created to address the challenges of scaling telemetry, debugging flight systems, and ensuring mission reliability, which necessitated the development of groundbreaking infrastructure. Established by a talented team from SpaceX, Google, and Palantir, Sift is tailored for mission-critical systems where accuracy and scalability are imperative.

As a key early engineer concentrating on our data infrastructure, your role will extend beyond mere coding: you will shape foundational architecture and assist in scaling a real-time telemetry platform from its inception. You will engage with intricate backend systems designed to process, store, and deliver millions of high-frequency data points each second, facilitating rapid iteration cycles for some of the world's leading engineering teams.

Oct 27, 2025
