Big Data Engineer With Spark jobs in Palo Alto – Browse 598 openings on RoboApply Jobs

Big Data Engineer with Spark

Cygnus Professionals Inc., Palo Alto

On-site Contract


Experience Level

Mid to Senior

Qualifications

Requirements: A minimum of 2 years of experience in distributed systems. Proficiency in Java, Spark, and Hadoop. Familiarity with distributed system design, data pipelining, and implementation. Understanding of machine learning algorithms. Experience in building large-scale applications and data modeling. Knowledge of distributed computing environments (Hadoop/Spark/Cloud) or parallel processing techniques (CUDA/threads/MPI). Ability to work effectively in a collaborative, technical environment. A strong problem-solver and quick learner with excellent interpersonal skills.

About the job

Hello from Cygnus Professionals,

We hope this message finds you well!

We are currently seeking a talented Big Data Engineer specializing in Spark to join our dynamic team. If you possess the necessary skills and are interested, please send us your updated resume along with your contact information and desired rate.

Position: Big Data Engineer with Spark

Location: Palo Alto, CA

Duration: Potential Contract to Hire

Interview Mode: Skype following a phone interview

Important Note: We are looking for Green Card holders, U.S. Citizens, or EAD GC candidates only.

Position Overview:

As a Big Data Engineer, you will utilize your expertise in Java, Spark, and Hadoop to design and implement scalable data solutions. You will work with distributed systems, and your responsibilities will include:

Developing data pipelines and implementing distributed system designs.
Applying machine learning algorithms to enhance data processes.
Building large-scale applications utilizing various software design patterns and object-oriented design principles.
Engaging in research and analysis to convert raw data into structured datasets for product development.
Managing data warehousing and parallel processing of large datasets using technologies such as Hadoop and cloud solutions.
Participating in Agile, Scrum, and SDLC methodologies.
Thriving in a fast-paced, research-oriented environment while demonstrating strong communication and collaboration skills.

About Cygnus Professionals Inc.

Cygnus Professionals Inc., headquartered in New Jersey, is a next-generation global IT solutions and consulting firm. Our leadership team brings over 30 years of combined experience. We have established a strong presence across four countries and serve over 25 satisfied clients. Our mission is to expand our industry-focused excellence across various sectors and regions. We are proud to be recognized by the US Pan Asian American Chamber of Commerce Education Foundation (USPAACC) as one of the 'Fast 100 Asian American Businesses', reflecting our impressive revenue growth over the past two years.

1 - 20 of 598 Jobs


Big Data Engineer with Spark

Cygnus Professionals Inc.

Contract|On-site|Palo Alto

Hello from Cygnus Professionals, We hope this message finds you well! We are currently seeking a talented Big Data Engineer specializing in Spark to join our dynamic team. If you possess the necessary skills and are interested, please send us your updated resume along with your contact information and desired rate. Position: Big Data Engineer with Spark. Location…

Apr 20, 2017


Hadoop Developer - Transform Data into Insights

Sonsoft Inc.

Full-time|On-site|Palo Alto

Join our dynamic team at Sonsoft Inc. as a Hadoop Developer, where you'll leverage your skills to transform vast amounts of data into actionable insights. This role presents an exciting opportunity to work with cutting-edge technologies in a fast-paced environment.

Mar 16, 2017


Senior Software Engineer, Big Data

ZipRecruiter

Hybrid|On-site|Palo Alto, CA

Join our dynamic team as a Senior Big Data Software Engineer, where you'll play a pivotal role in developing scalable applications and data processing systems that serve millions of job seekers and thousands of employers. In this hybrid work environment, you'll engage in both remote and in-office collaboration to build fast, efficient, and intelligent solutions that enhance the job search experience. Your expertise will help shape our cutting-edge marketplace technology, enabling seamless connections between talent and opportunity.

Oct 29, 2025


Senior Data Engineer

Rivian and Volkswagen Group Technologies

Full-time|On-site|Palo Alto, California

Rivian and Volkswagen Group Technologies brings together two leaders in the automotive sector to develop new technology for electric vehicles. The partnership focuses on solutions ranging from advanced operating systems and zonal controllers to secure cloud connectivity. By combining expertise in connectivity, artificial intelligence, and security, the team works to shape a more connected and sustainable future in mobility.

Role overview

The Senior Data Engineer joins the Core Data team, which builds and scales the data platform supporting analytics and decision-making. Within this team, the Operational Insights group handles data from production environments, turning complex operational and quality data into dependable data products and performance metrics that drive business results.

What you will do

Design and maintain scalable data ingestion pipelines for analytics initiatives across multiple programs and locations
Create well-structured datasets to support analytics and AI-driven insights
Ensure data integrity and reliability across key operational systems
Bridge operational systems, data platform engineering, and analytics to convert raw data into governed, high-value datasets

Who thrives in this role

This role suits those who enjoy building data infrastructure that connects operational systems and analytics, and who take pride in transforming raw data into reliable, actionable insights for a broad range of business needs.

Apr 28, 2026


Vice President of Data Engineering

pebl

Full-time|On-site|Palo Alto, CA

Join pebl as the Vice President of Data Engineering, a pivotal role where you will lead our data engineering team in building robust data infrastructures that drive our innovative solutions. You will be responsible for overseeing the development and optimization of data pipelines, ensuring data integrity, and enabling our organization to leverage data for decision-making and strategic initiatives.

As a key member of the executive team, you will collaborate closely with other departments to align data strategies with business objectives. Your expertise in data architecture and engineering will help shape our products and services, fostering a data-driven culture across the organization.

Mar 24, 2026


Hadoop Developer

Sonsoft Inc.

Full-time|On-site|Palo Alto

Join our dynamic team at Sonsoft Inc. as a Hadoop Developer! In this role, you will be responsible for designing and implementing scalable data processing systems using Hadoop technologies. You will collaborate with data scientists and analysts to optimize data workflows and ensure the integrity and availability of large datasets.

Mar 2, 2017


Data Software Engineer

xAI

Full-time|Remote|Palo Alto, CA

Join xAI as a Data Software Engineer, where you will be at the forefront of revolutionizing data-driven solutions. As part of our innovative team, your expertise will help in designing and implementing robust software systems that leverage data to drive insights and improve decision-making.

Apr 29, 2026


Senior ML & Data Infrastructure Engineer

Rhoda AI

Full-time|On-site|Palo Alto

At Rhoda AI, we are pioneering the development of a comprehensive technology stack for the future of humanoid robotics. Our focus ranges from high-performance, software-defined hardware to cutting-edge foundational models and video world models that govern these systems. Our robots are engineered as versatile generalists, adept at navigating complex, real-world scenarios that extend beyond conventional training environments. Collaborating at the forefront of large-scale learning, robotics, and systems, our research team comprises distinguished experts from renowned institutions such as Stanford, Berkeley, and Harvard. With an impressive funding of over $400 million, we are committed to substantial investments in research and development, hardware innovation, and the scaling of manufacturing processes to bring our vision to life.

Position Overview

We are currently seeking a Senior ML & Data Infrastructure Engineer to take ownership of and enhance our data model training pipeline. This role encompasses the entire lifecycle, from raw data ingestion and storage to sophisticated indexing, retrieval, and throughput optimization at an unprecedented scale.

Key Responsibilities

Design, develop, and scale a robust data infrastructure capable of processing and managing billions of video clips while ensuring reliability, low latency, and cost-effectiveness.
Create and optimize large-scale storage solutions, including cloud object storage and databases, tailored for multimodal datasets.
Develop high-performance indexing and retrieval systems to facilitate rapid dataset querying, filtering, and iteration for both research and production applications.
Establish observability frameworks for data pipelines that encompass monitoring, alerting, failure recovery, and performance enhancements.
Implement intelligent workload distribution and throughput enhancements across distributed compute and storage infrastructures.
Oversee data artifacts, versioning, and lineage to guarantee reproducibility and traceability throughout training cycles.
Create user-friendly internal interfaces and lightweight tools that empower researchers and engineers to explore, query, and analyze extensive datasets efficiently.
Facilitate the integration and scalable deployment of vision-language models (VLMs) within data pipelines for purposes such as screening, enrichment, or metadata generation.

Qualifications

A minimum of 5 years of experience in data infrastructure, distributed systems, machine learning infrastructure, or a closely related field.
Proven expertise in developing and managing large-scale data pipelines and storage solutions.
Strong programming skills in languages such as Python, Java, or Scala, and proficiency with data processing frameworks.
Experience with cloud-based storage solutions and databases, as well as knowledge of multimodal data management.
Ability to work collaboratively in a fast-paced, innovative environment.

Mar 10, 2026


Data Engineer at mindrobotics | Palo Alto

mindrobotics

Full-time|On-site|Palo Alto

Join mindrobotics as a Data Engineer, where you will play a pivotal role in developing the data systems that drive large-scale AI training for robots. This position is centered around transforming extensive data into well-organized, high-quality datasets. You will take ownership of data pipelines, labeling workflows, storage, and cloud infrastructure from start to finish. If you thrive in scaling environments, enjoy creating bespoke tools when existing solutions are inadequate, and aim to enhance data reliability for rapid iteration, you'll find a welcoming home here. Collaborate closely with researchers and engineers to ensure that our data propels swift and impactful advancements in robot training and evaluation.

Jan 26, 2026


Senior Data Engineer at Mudflap | Palo Alto, CA

Mudflap

Full-time|$175K/yr - $215K/yr|Hybrid|Palo Alto, CA

Mudflap is revolutionizing the $800 billion trucking industry, which serves as a vital pillar of the U.S. economy. Our innovative payment solutions empower truckers to save significantly on their primary operational cost: fuel. Additionally, we connect our network of fuel stop partners with new, hard-to-reach clients. As a rapidly expanding marketplace, we are on the lookout for a passionate, customer-focused individual to join our dynamic team.

In this pivotal role as a Senior Data Engineer, you will be instrumental in crafting the data infrastructure that enables faster and smarter decision-making across our organization. You will design, scale, and manage our modern, cloud-based data platform, ensuring that our data flows seamlessly and is effectively utilized by teams throughout the company.

This is a rare opportunity to establish data standards, influence technical direction, and guarantee the quality, consistency, and scalability of the data platform relied upon by hundreds of thousands of small and medium-sized trucking companies each day.

Our Palo Alto office offers a hybrid work environment that encourages collaboration in the office while also providing the flexibility to work remotely.

Apr 28, 2025


Software Engineer - Data Platform

xAI

Full-time|$180K/yr - $440K/yr|On-site|Palo Alto, CA

Join xAI as a Software Engineer on our Data Platform team, where you'll design, build, and operate scalable distributed systems that handle vast data processing and transport. Be part of a dynamic environment focused on engineering excellence, working with cutting-edge technologies like Apache Kafka, Spark, and Flink to drive real-time machine learning and analytics at a petabyte scale.

Dec 29, 2025


Data Platform Software Engineer

PsiQuantum

Full-time|$140K/yr - $165K/yr|On-site|Palo Alto, California, United States

Join PsiQuantum, a pioneering force dedicated to creating the first practical quantum computers that will revolutionize various industries. Since our inception in 2016, we have been unwavering in our mission to construct and implement million-qubit, fault-tolerant quantum systems.

Our innovative quantum computers leverage the principles of quantum mechanics to tackle problems that are beyond the capabilities of the most sophisticated supercomputers and AI technologies. The transformative potential of these machines will benefit fields such as energy, pharmaceuticals, finance, agriculture, transportation, and materials science.

Utilizing silicon photonics as the foundation of our architecture, we capitalize on advanced semiconductor manufacturing techniques, partnering with leaders like GlobalFoundries. This approach allows us to employ high-volume processes that already produce billions of chips for telecommunications and consumer electronics. The benefits of photonics are clear: they are immune to heat, unaffected by electromagnetic interference, and seamlessly integrate with current cryogenic cooling systems and standard fiber-optic infrastructures.

In 2024, we announced significant government-funded projects aimed at establishing our first utility-scale quantum computers in Brisbane, Australia, and Chicago, Illinois. These initiatives illustrate the growing acknowledgment of quantum computing's strategic and economic significance, emphasizing the urgency to scale our efforts.

At PsiQuantum, we are also focused on developing the algorithms and software necessary for these systems to achieve commercial viability. Our application and software teams collaborate with prestigious Fortune 500 companies, including Lockheed Martin, Mercedes-Benz, Boehringer Ingelheim, and Mitsubishi Chemical, to optimize quantum solutions for real-world applications.

Quantum computing signifies not merely an extension of classical computing but a radical transformation, offering pathways to tackle challenges that cannot be addressed through any other means. The opportunities are vast, and we are committed to making this vision a tangible reality.

We invite you to be a part of this groundbreaking journey.

Mar 31, 2026


Data Engineer at xAI | Palo Alto, CA

xAI

Full-time|On-site|Palo Alto, CA

Role overview

xAI seeks a Data Engineer based in Palo Alto, CA. The position centers on building and refining data infrastructure to align with company objectives. Collaboration with teams throughout the organization is a key part of this role.

What you will do

Design and develop data solutions that inform business decisions
Collaborate with cross-functional teams to create and enhance data systems
Contribute to shaping data processes that support company results

Apr 24, 2026


Vehicle Modeling and Data Analyst Engineer

ALSO

Full-time|$160K/yr - $220K/yr|On-site|Palo Alto

Join the Revolution at ALSO.

At ALSO, we are pioneers in electric mobility, initially established as a part of Rivian. Our dynamic team is made up of builders, dreamers, and innovators dedicated to crafting revolutionary vertically integrated small electric vehicles (EVs) that tackle the mobility challenges of today and tomorrow. Our mission is to encourage everyone to switch to ALSO, transforming local car, truck, and SUV travel into rides on more affordable, enjoyable, and significantly more efficient vehicles, boasting efficiency improvements of 10-50 times.

We are actively seeking a Vehicle Modeling and Data Analyst Engineer to play a pivotal role in ensuring the reliability of our cutting-edge electric mobility vehicles throughout their development lifecycle. As a Physical Systems Modeling & Data Analyst, you will connect theoretical modeling with actual performance. Your responsibilities will include building and enhancing physics-based and data-driven models of vehicle systems such as powertrains, batteries, thermal management, and structural components. You will analyze telemetry and fleet data to guide engineering design choices, testing protocols, and product development.

Oct 22, 2025


Staff Data Platform Engineer (Hybrid)

Fiddler AI

Full-time|$190K/yr - $300K/yr|Hybrid|Palo Alto

Our Purpose

At Fiddler AI, we recognize the profound implications of artificial intelligence and its impact on human lives. Our mission is to instill trust in AI technologies. With the emergence of Generative AI and intelligent agents, the potential for generalized intelligence has expanded, but so have the associated risks. Fiddler is dedicated to assisting organizations in navigating these challenges by providing reliable and transparent AI solutions.

We collaborate with AI-centric organizations to establish a sustainable framework for responsible AI practices, fostering trust among their users. AI Engineers, Data Scientists, and business teams leverage Fiddler AI to monitor, evaluate, secure, analyze, and enhance their AI solutions, facilitating improved outcomes. Our platform empowers engineering teams and business stakeholders to comprehend the 'what', 'why', and 'how' behind AI results.

Our Founders

Fiddler AI was co-founded by Krishna Gade, a distinguished engineering leader from Facebook, Pinterest, Twitter, and Microsoft, and Amit Paka, a product visionary with a history at Microsoft, Samsung, PayPal, and as a two-time founder. Our venture is supported by prominent investors including Insight Partners, Lightspeed Venture Partners, and Lux Capital.

Why Join Us

Joining our innovative team means contributing to the mission of embedding trust into AI, thereby helping society harness its transformative power. You will play a crucial role in ensuring that AI applications deployed at scale across various industries maintain operational transparency and security. As an early-stage startup, we are rapidly expanding and proud of our dynamic team composed of intelligent and empathetic doers, thinkers, creators, and builders. The AI and ML sector is characterized by swift innovation, offering monumental learning opportunities. This is your chance to lead the way as a pioneer in this field.

Fiddler is recognized as a trailblazer in AI Observability, having earned numerous accolades such as the 2022 a16z Data50 list, 2021 CB Insights AI 100 most promising startups, 2020 WEF Technology Pioneer, and a 2019 Gartner Cool Vendor in Enterprise AI Governance and Ethical Response. By joining our talented team, you will contribute to shaping the future of AI Observability.

The Mission:

As a Staff Data Platform Engineer, you will significantly impact the safety and return on investment of large language models and agentic applications across diverse verticals and domains. You will be at the forefront of designing and developing innovative tools that enhance the performance and reliability of AI systems.

Oct 23, 2025


Senior Software Engineer, Data Platforms

Mudflap

Full-time|$185K/yr - $250K/yr|Hybrid|Palo Alto, CA

Join Mudflap, a pioneering force in the $800 billion trucking industry, where our innovative payment solutions empower truckers to save significantly on fuel costs, their largest operational expense. We connect fuel stop partners with new customers, creating a vibrant marketplace that is rapidly expanding. We are actively seeking a customer-focused Senior Software Engineer to help us shape our exciting future.

As a Senior Software Engineer specializing in Data Platforms, you will be instrumental in constructing a robust data infrastructure that ensures reliable and scalable data flow throughout Mudflap's systems. Your contributions will be vital in designing and managing the frameworks and services that facilitate efficient data ingestion, processing, and accessibility on a large scale.

In this role, you will architect and develop high-performance platform systems that drive data ingestion, orchestrate pipelines, and manage large-scale processing. Your efforts will lay the groundwork for high-availability data systems that empower Mudflap's teams to operate with speed and confidence.

This position is based in the Bay Area and offers a hybrid work model, allowing for a blend of in-office collaboration and remote work.

Mar 17, 2026


Software Engineer - Data Foundations

Glean

Full-time|$140K/yr - $265K/yr|On-site|San Francisco Bay Area

About Glean:

Established in 2019, Glean is a pioneering AI-driven knowledge management platform that empowers organizations to swiftly locate, organize, and disseminate information throughout their teams. By seamlessly integrating with tools such as Google Drive, Slack, and Microsoft Teams, Glean ensures that employees have timely access to essential knowledge, thereby enhancing productivity and collaboration. Our state-of-the-art AI technology streamlines knowledge discovery, facilitating a more efficient utilization of collective intelligence across teams.

Glean originated from Founder & CEO Arvind Jain’s profound insight into the challenges that employees encounter while seeking and comprehending information at work. Witnessing firsthand how fragmented knowledge and a multitude of SaaS tools hindered productivity, he aimed to create an enhanced solution – an AI-powered enterprise search platform that enables rapid and intuitive access to necessary information. Since its inception, Glean has transformed into a leading Work AI platform, merging enterprise-grade search, an AI assistant, and robust application and agent-building capabilities to fundamentally reshape the modern workplace.

About the Role

We are in search of a Software Engineer to become a vital member of Glean’s Data Foundations team, the unit responsible for the comprehensive data ingestion and management layer that powers Glean’s Search, AI Assistant, and Agent offerings across a multitude of enterprise applications and vast quantities of documents.

Your contributions will directly influence the quality, timeliness, and reliability of the knowledge that each Glean user engages with daily.

Your Responsibilities Include:

Ingestion & Connectivity

Develop and enhance connectors for a diverse array of SaaS and on-premise systems (including Google Workspace, Microsoft 365, Slack, Salesforce, Jira, ServiceNow, GitHub, etc.).
Ensure the seamless integration and continuous improvement of data flows to maintain system performance and reliability.

Dec 8, 2025


Senior Data Engineering Analyst at Rubrik | Palo Alto, CA

Rubrik

Full-time|$175.5K/yr - $263.3K/yr|On-site|Palo Alto, CA

About the Team:

The Information Technology team at Rubrik is pivotal in shaping business processes, enhancing employee experience, and leveraging technologies to scale our organization beyond $1 billion. This dedicated team drives operational efficiencies across the enterprise by centralizing the management of Infrastructure, Technology, and Data. The IT team guarantees all stages of the software development lifecycle occur within a secure environment while meticulously overseeing the implementation of essential processes and governance. They are not just champions of Rubrik but also serve as primary users of the Engineering teams' offerings. Rubrik Corp IT operates entirely on a Software as a Service (SaaS) model, with no on-premises solutions. The IT team accelerates the enhancement of business value and optimizes daily operations through a diverse portfolio of SaaS applications, including Salesforce.com, Oracle Netsuite, Workday, Snowflake, Etrade, Jitterbit, and Allocadia. Their commitment to delivering high-velocity business outcomes is backed by a 100% system uptime, supported by agile, streamlined, and cohesive cloud architectures.

About the Role:

Become a part of our Data Engineering and Analytics team at our Palo Alto headquarters, where you will design and implement the data infrastructure that supports our General and Administrative (G&A) business functions. In this impactful role, you will manage the entire lifecycle of data initiatives, transforming complex business requirements into scalable data models and delivering insightful analytics that guide critical decision-making. You will work closely with stakeholders across Go-To-Market (GTM) programs, Product Operations, Pricing, and Treasury, facilitating precise reporting, operational transparency, and strategic planning. Your contributions will directly aid directors and senior leaders within Sales and G&A, shaping revenue growth, strategy, and operational effectiveness through reliable, data-informed decisions.

What You'll Do:

Business Partnership & Delivery: Act as a data partner across GTM and G&A business functions, leading high-impact projects from analysis to successful completion.
Requirements Discovery: Conduct in-depth engagement sessions with stakeholders to convert business queries into actionable technical roadmaps, ensuring data solutions effectively address core strategic challenges.
Architectural Excellence: Design and refine scalable data pipelines and advanced dbt models in Snowflake. Develop a robust Semantic layer to standardize metric definitions and ensure alignment across diverse G&A data sources.
Advanced Analytics & Visualization: Produce executive-level narratives using Tableau. Transform raw data into proactive analytical tools that enhance insight and decision-making.

Feb 4, 2026


Backend Engineer - Data Systems and APIs

vinci4d

Full-time|On-site|Palo Alto HQ

Join our innovative team at vinci4d as a Backend Engineer specializing in Data Systems and APIs. In this role, you will play a crucial part in designing, implementing, and maintaining robust data systems that power our applications. You will collaborate with cross-functional teams to build scalable APIs that enhance user experiences and support our mission to deliver cutting-edge technology solutions.

Mar 26, 2026


Software Engineer - Data Infrastructure & Acquisition

Speechify

Full-time|Remote|Palo Alto, CA, USA

Role overview

Speechify seeks a Software Engineer specializing in Data Infrastructure and Acquisition at its Palo Alto, CA office. This position focuses on building and refining the data pipelines and backend systems that support Speechify’s text-to-speech products.

What you will do

Design, develop, and improve data pipelines to meet product and business requirements
Collaborate with engineering, product, and data teams to maintain reliable data flows
Contribute to systems that support data-driven decisions and ongoing product improvements

Apr 25, 2026
