Qualifications
We are looking for candidates who:
Possess expertise in Java and a strong understanding of SRE practices.
Have a passion for messaging platforms, including their setup, monitoring, and maintenance.
Demonstrate effective collaboration with infrastructure teams.
Are eager to take on complex challenges and learn from experiences.
Value diversity and contribute positively to our inclusive culture.
About the job
Join our dynamic team at PIMCO, a premier global asset management firm with a commitment to helping millions of investors achieve their financial aspirations. With over 3,000 employees across 20 offices in 15 countries, we seek innovative thinkers who thrive in a collaborative environment. At PIMCO, we value diversity, hard work, and a continuous learning ethos.
As a Java Site Reliability Engineer (SRE) specializing in Messaging Platforms, you will play a critical role in shaping our technology strategies to enhance operational efficiency. Your responsibilities will include supporting various messaging platforms such as MQ, AMPS, and Kafka, ensuring optimal tool selection and sustainable messaging strategies. You will also focus on improving operational efficiency through advanced tools and monitoring systems.
This position requires a passion for messaging systems, collaborative problem-solving skills, and a strong foundation in software development. You will have the opportunity to contribute to critical business solutions that align with our strategic vision for trading applications.
About PIMCO
PIMCO is a globally recognized leader in asset management, dedicated to helping investors achieve their financial goals through innovative solutions and a robust collaborative culture. We believe in the power of diverse perspectives and continuous improvement in technology and processes.
mks2technologies seeks an On-site IT Customer Service Engineer to join the team in Austin, TX. This position acts as the primary contact for IT support, assisting clients with technical issues to help keep their daily operations running smoothly.
Key responsibilities
Diagnose and troubleshoot technical problems directly at client sites.
Offer clear, practical solutions and support.
Maintain attentive and timely customer service with every client interaction.
Work location
This role is fully on-site in Austin, TX. Regular presence at client locations is required.
Join Saronic as a Civil/Site Engineer specializing in Infrastructure, where you will play a pivotal role in designing and implementing innovative engineering solutions. You will collaborate with a diverse team of professionals to ensure the successful execution of infrastructure projects, enhancing the quality and sustainability of civil engineering.
For over 25 years, Realtor.com® has stood as the premier online platform trusted by real estate professionals, seamlessly connecting buyers, sellers, and renters with invaluable insights and expert advice to discover their ideal home. Our comprehensive suite of tools not only transforms the real estate landscape, but also aids consumers in navigating one of life's most significant decisions, making it simple, intuitive, and empowering.
Join us in our mission to enable more individuals to find their way home by dismantling barriers, fostering meaningful connections, and instilling confidence with expert guidance.
About the Role
We are on the lookout for a Staff Site Reliability Engineer to become a vital member of our newly established Operations Excellence organization, reporting directly to the Director of Operations Excellence. This pivotal position will define the reliability, observability, and operational excellence of our platform infrastructure that serves millions of users. As a Staff SRE, you will take on a technical leadership role, mentoring others and establishing best practices, while influencing architectural decisions to empower our team of 600+ engineers in delivering outstanding customer experiences.
You will engage with crucial platform systems, including EKS infrastructure, Skyway (CI/CD), Frontdoor (Tyk API Gateway), Pantheon (Apollo GraphQL Federation), and our observability stack, all while implementing chaos engineering practices and spearheading cost optimization initiatives that yield measurable ROI.
We are committed to employing the best tools to expedite problem-solving. You will be expected to adeptly utilize AI coding assistants and LLMs to enhance development speed, generate boilerplate code, and troubleshoot intricate debugging scenarios. In addition to basic usage, this role demands the critical judgment to assess AI-generated outputs for security, performance, and accuracy. You should be comfortable incorporating AI tools into your daily tasks to minimize repetitive work, allowing you to concentrate on high-impact architectural and strategic engineering challenges.
What You'll Do
Platform Reliability & Infrastructure
Design and maintain highly available AWS infrastructure, including EKS clusters, Fargate (ECS), and multi-region architectures.
Take ownership of the reliability of essential services: Skyway (CI/CD), Frontdoor (Tyk), Pantheon (Apollo GraphQL), and associated infrastructure.
Establish SLIs, SLOs, and error budgets for Tier 1/2/3 systems; lead architectural reviews focused on reliability and cost-efficiency.
Drive...
About Base Power Base Power is a US-based power company focused on transforming the energy grid. The team works to build a decentralized power system by deploying distributed batteries across the country. Engineers, operators, and problem-solvers at Base Power address major challenges in the energy sector together. Role Overview: Deployment Engineer – Site Survey This Deployment Engineer position connects field operations with systems engineering. The role centers on improving how Base Power evaluates, approves, and executes hardware deployments at multiple locations. The engineer will refine site survey processes and set configuration standards to keep deployments consistent, secure, and reliable. Key Responsibilities Design and maintain internal tools and automated workflows to scale site survey reviews and make data ingestion across systems more efficient. Act as the technical authority for hardware configurations, setting and enforcing criteria for deployment approvals. Define, document, and uphold high standards for site survey reviews, supporting safety, consistency, and operational efficiency as deployment volume grows. Use SQL and analytics tools to examine field data and installation results, spot process bottlenecks, and drive improvements in deployment operations. Build internal dashboards with tools such as Python, JavaScript, or Retool to provide real-time insights into the site survey pipeline and key metrics. Work closely with Field Operations, Hardware Engineering, and Software teams to turn deployment challenges into engineering solutions and technical requirements. Develop and maintain detailed documentation for review criteria, internal tools, configuration standards, and operational processes. Location: Austin, TX
Full-time|On-site|Austin, TX/Akron, Ohio/Irvine, CA
Join Restaurant365 as a Site Reliability Engineer II, where you'll play a vital role in ensuring the availability, performance, and reliability of our systems. You will collaborate with cross-functional teams to design, implement, and maintain robust infrastructure solutions that enhance our operational efficiency.
Join Fluidstack as a Sourcing Manager for Contractors & Site Services
At Fluidstack, we're revolutionizing the infrastructure for advanced intelligence by collaborating with leading AI laboratories, government agencies, and enterprises, including Mistral, Poolside, Black Forest Labs, and Meta. Our mission is to enable compute capabilities at unprecedented speeds.
With a strong commitment to making Artificial General Intelligence (AGI) a reality, our highly driven team is dedicated to building world-class infrastructure. We view our customers' success as our own and take immense pride in the systems we create and the trust we cultivate. If you are passionate about impactful work, dedicated to excellence, and eager to help accelerate the future of intelligence, we welcome you to join our innovative journey.
About the Role
As a Sourcing Manager for Contractor Workforce & Site Services at Fluidstack, you will spearhead the strategy for sourcing, onboarding, and sustaining the skilled workforce essential for our rapid deployment across various sites. This role encompasses the procurement of contract labor from regional trade unions and national staffing agencies, as well as the management of site support services including amenities such as hotels, transportation, and personal protective equipment (PPE). You will also collaborate with our Recruiting team on contract-to-hire processes and develop workforce training programs that enhance community trade capabilities.
Close collaboration with Data Center Operations, Construction, and Finance departments will be critical to ensure labor resources align with our ambitious deployment schedule.
Your Responsibilities
Establish and oversee a multi-tier supplier network for contract labor including low-voltage electrical contractors, CDU installation and maintenance technicians, facility operations staff, security personnel, structured cabling teams, and general site services contractors.
Create preferred supplier lists and MSA/SOW frameworks to facilitate swift mobilization at new site locations, incorporating pre-negotiated rate cards and qualification standards.
Stay informed about union jurisdiction maps (IBEW, NECA, BOMA-affiliated trades) and ensure compliance with applicable prevailing wage laws, Davis-Bacon, and local labor agreements.
In 2024, cybercrime rates are anticipated to escalate, as evidenced by the FBI's IC3 report, which highlighted a staggering loss of over $16 billion. The real estate sector, unfortunately, remains a prime target for cybercriminals, particularly through investment fraud and BEC scams. At CertifID, we are committed to combating this threat by offering a secure platform that authentically verifies the identities of transaction participants, validates wire transfer instructions, and identifies potential fraud attempts. Our innovative technology is engineered to reduce risks, ensuring that every transaction is executed with utmost confidence and security.
Our success hinges on our exceptional team. Recognized as one of the Best Startups to Work in Austin, we proudly made the Inc. 5000 list and received the award for Best Culture by Purpose Jobs for three consecutive years. Our core values and vision for a world without wire fraud guide us as we strive to create a dynamic work environment where every team member can make a significant impact in enhancing security and combating fraud.
Position Overview:
We are on the lookout for a Senior Site Reliability Engineer (Senior SRE) to spearhead reliability enhancements within our production SaaS environment. You will play an essential role in developing scalable infrastructure models, advancing our observability efforts, optimizing incident response, and collaborating with engineering teams to integrate reliability into system design and deployment.
This position is tailor-made for a seasoned Senior SRE who thrives on tackling intricate operational challenges, building automation solutions, and mentoring fellow engineers.
Full-time|$110K/yr - $128K/yr|Hybrid|Austin, Texas, United States
Join Striveworks as a Site Reliability Engineer
At Striveworks, we empower organizations to leverage artificial intelligence to tackle real-world challenges in national security and business. Our mission is to serve as the command center where data, models, and business outcomes converge.
Founded by a team of passionate data scientists and engineers, Striveworks simplifies the journey from deployment to ongoing optimization. We ensure that our clients aren’t just deploying AI; they’re establishing robust systems that are reliable, adaptable, and poised to scale in an ever-changing landscape.
As a Site Reliability Engineer, you will play a pivotal role in implementing and managing corporate systems from day one. You will work with an array of systems and infrastructure automation tools while having the opportunity to innovate and enhance our toolset. Your focus will be on developing sustainable solutions that prevent future issues, thereby minimizing operational toil.
Your daily responsibilities will include:
Developing and maintaining infrastructure as code across private (OpenStack) and commercial (AWS, Azure, GCP) cloud environments.
Creating configuration management automation for Windows laptops and Linux servers.
Providing comprehensive user support for all corporate systems.
This role is based in a hybrid/on-site setting at our northwest Austin office.
Site Reliability Engineer Overview: Join Weedmaps as a Site Reliability Engineer and collaborate with diverse teams across application development, infrastructure, and quality assurance to elevate the performance, reliability, and scalability of our web services at Weedmaps.com. As a fully cloud-native organization, we operate all our services within Docker containers on Kubernetes, hosted on AWS. Our culture promotes observability, proactive monitoring, and CI/CD automation, enabling us to release multiple production updates daily. In this role, you will utilize your engineering expertise to improve system monitoring, streamline CI workflows, and refine our deployment pipelines. You will serve as a knowledge resource for development teams, guiding them in utilizing standardized tools for metrics, logging, and deployment processes. Collaborate closely with both development and infrastructure teams to identify key service metrics that go beyond the basics, working with application teams to develop libraries that facilitate easy instrumentation of their services. Your Impact: Collaborate with stakeholders to establish best practices in monitoring and CI/CD pipelines. Troubleshoot issues within our deployment CI pipeline. Promote and support a strong DevOps culture within Weedmaps. Identify automation opportunities and advocate for codification across all processes. Share best practices regarding collaboration, reliability, security, and performance with all partner teams. Take responsibility for the configuration and scaling of applications, ensuring adherence to organizational practices. Develop and enhance synthetic monitoring workflows.
qodeworld is seeking a Senior Site Reliability Architect to join the team in Austin, Texas. This position focuses on unified observability, proactive detection, AIOps, and GenAI-driven operations for distributed financial services platforms. The role requires deep technical expertise in designing and maintaining reliable, high-performance systems across complex architectures. Role overview The Senior Site Reliability Architect will drive enhancements in platform reliability and performance. This includes building SLI/SLO-driven monitoring, implementing dynamic thresholds, and developing intelligent alerting and AI/ML-based anomaly detection. The position is central to evolving operational practices from reactive alerting to proactive, insight-driven approaches. Key responsibilities Design and deploy unified observability dashboards that integrate metrics, logs, traces, events, and system topology. Establish and manage SLIs, SLOs, and error budgets aligned with business goals. Create actionable dashboards for operational, engineering, and leadership teams. Implement advanced alerting strategies using both static and dynamic thresholds. Apply AI/ML/AIOps technologies to detect anomalies, forecast incidents, and reduce MTTR. Shift monitoring practices from reactive alerting to proactive insights. Incorporate noise reduction, alert correlation, and root cause analysis. Use baseline modeling, seasonality detection, and anomaly scoring. Oversee and resolve issues in multi-service architectures, including microservices, APIs, Kafka/streaming platforms, and cloud infrastructure (Terraform, Infrastructure as Code). Analyze and trace issues across upstream/downstream dependencies, streaming platforms, infrastructure, and application code. Work extensively with Dynatrace (mandatory requirement). Utilize tools such as OpenTelemetry, Prometheus/Grafana, ELK/EFK, and cloud-native monitoring solutions (AWS, Azure, GCP). Manipulate and enrich telemetry using JSON. 
Apply GenAI/LLMs for incident summarization, root cause explanations, runbook recommendations, and auto-remediation suggestions. Collaborate with platform teams to operationalize GenAI technologies safely. Requirements 15+ years of experience in Site Reliability Engineering or Production Engineering. Strong background in unified observability, AIOps, and related fields. Proven experience with AI/ML technologies and cloud-native environments.
As the leading online platform for real estate professionals for over 25 years, Realtor.com® connects buyers, sellers, and renters with trusted insights and expert guidance to find their ideal home. Our comprehensive suite of tools significantly impacts the real estate industry and enhances the consumer experience, making it simple, understandable, and empowering for individuals navigating one of life's biggest purchases.
Join us in our mission to help people find their way home by dismantling barriers to entry, establishing the right connections, and fostering confidence through expert guidance.
About the Role
We are looking for a Senior Site Reliability Engineer to become a crucial member of our newly established Operations Excellence organization, reporting directly to the Director of Operations Excellence. In this pivotal role, you will enhance the reliability, observability, and operational excellence of our platform infrastructure that serves millions of users. As a Senior SRE, you will be a key technical contributor, implementing best practices, addressing complex challenges, and empowering our team of over 600 engineers to deliver outstanding customer experiences.
Your responsibilities will include working on critical platform systems such as EKS infrastructure, Skyway (CI/CD), Frontdoor (Tyk API Gateway), Pantheon (Apollo GraphQL Federation), and our observability stack. You will also play a part in chaos engineering practices and cost optimization initiatives, ensuring measurable ROI.
We believe in employing the best tools to solve problems efficiently. You will be expected to adeptly use AI coding assistants and LLMs to accelerate development speed, generate boilerplate code, and resolve complex debugging issues. Beyond mere usage, this role demands the critical judgment to evaluate AI-generated outputs for security, performance, and accuracy. You should be comfortable incorporating AI tools into your daily routines to reduce repetitive tasks, allowing you to concentrate on high-impact architectural and strategic engineering challenges.
What You'll Do
Platform Reliability & Infrastructure
Design, implement, and maintain highly available AWS infrastructure, including EKS clusters, Fargate (ECS), and multi-region architectures.
Ensure the reliability of essential services: Skyway (CI/CD), Frontdoor (Tyk), Pantheon (Apollo GraphQL), and their supporting infrastructure.
Monitor SLIs, SLOs, and error budgets for Tier 1/2/3 systems; participate in architectural reviews focused on reliability and cost-efficiency.
Implement reliability patterns such as circuit breakers, graceful degradation, and automatic failover strategies.
At BetterUp, we believe in the power of human transformation, and our approach to the employer-employee relationship reflects that belief.
From the moment you engage with us, you will notice a distinct experience. It's not just about filling a position; it's about joining a mission-driven team.
Upon accepting an offer, you gain more than just a paycheck: you will receive a dedicated BetterUp Coach, a personalized development plan, and a supportive manager. You'll also be part of an extraordinary team, each member accompanied by their own BetterUp Coach, working on projects that make a real impact.
This unique environment fosters a focused and fulfilling work experience. While it may not be for everyone, for those who are passionate and driven, this role represents a transformative career opportunity.
Join us for an intense and rewarding journey, where you'll engage in meaningful work within a vibrant and creative culture.
If this resonates with you and the job description aligns with your skills, let’s start a conversation.
As a hybrid company, we emphasize in-person collaboration when necessary. Employees must be available to work from one of our office hubs a minimum of two days per week, totaling eight days per month. Our US hubs include: Austin, TX; Chicago, IL; New York City, NY; San Francisco, CA; and the Washington, DC metro area. For roles based in Europe, our hubs are located in London, UK, and Amsterdam, NL. Please ensure you can commit to this structure before applying.
Key Responsibilities:
Utilize AI-driven tools and automation to enhance monitoring, troubleshooting, and maintenance of production systems.
Develop and manage cloud infrastructure on AWS, employing Terraform for codifying and version-controlling our environments.
Oversee and scale Kubernetes clusters that support BetterUp's platform, ensuring optimal availability and performance.
Create intelligent alerting and observability frameworks.
Collaborate with engineering teams to integrate reliability into the development lifecycle, proactively addressing operational concerns.
Automate incident response processes and establish self-healing infrastructure.
Explore and implement cutting-edge AI tools for log analysis, anomaly detection, and predictive maintenance.
Join Ramboll Group as a Principal in Site Investigation and Remediation, where you will lead high-impact projects aimed at environmental sustainability and remediation solutions. Utilize your extensive expertise to oversee complex site investigations, implement innovative remediation strategies, and collaborate with multidisciplinary teams to drive project success. You will be the key point of contact for clients, ensuring regulatory compliance and delivering exceptional results.
About Base Power Company
At Base Power Company, we are at the forefront of revolutionizing the energy landscape in America. Our mission is to redefine the future of electricity by implementing a widespread network of distributed battery systems. We are not just a power company; we are a collective of engineers, operators, and creative thinkers dedicated to addressing the intricate challenges of our time and fostering a resilient and sustainable energy grid.
About the Role
We are currently seeking a detail-oriented and skilled Technical Site Surveyor to evaluate customer sites for battery installation compatibility. This role requires thorough analysis of submitted site photos, collaboration on installation configurations, and ensuring adherence to electrical codes. The Site Surveyor will also produce specialized drawings for unique installations and may occasionally conduct field surveys.
Key Responsibilities
Analyze customer-provided site documentation and photos to assess installation feasibility and compatibility.
Work closely with customers to clarify site specifics and finalize installation plans.
Interpret and implement relevant electrical codes to ensure safe installations.
Create precise drawings for complex or non-standard installation scenarios.
Communicate installation requirements and potential challenges to internal teams effectively.
Maintain accurate records of site survey evaluations and customer interactions.
Occasionally perform in-person site surveys to troubleshoot issues or verify conditions.
Provide technical support and insights to installation teams based on survey findings.
Propose enhancements to the site survey review process to improve efficiency.
Join Alpha Insight Inc. as a Customer Service Trainer and play a pivotal role in enhancing the skills and knowledge of our customer service team. Your expertise will help shape our approach to customer interactions, ensuring a world-class experience for our clients. You will develop training materials, facilitate workshops, and provide ongoing support to team members to foster a culture of excellence and continuous improvement.
About Future Secure AI Future Secure AI develops solutions in artificial intelligence for real-world business challenges. The company values courage, precision, and curiosity, and supports an entrepreneurial culture where every team member is recognized. Leadership is experienced and approachable, with a focus on supporting individual growth. Team members work alongside colleagues from diverse backgrounds and contribute to projects that have impact across industries. Role Overview: Site Reliability Engineer The Site Reliability Engineer will design, build, and maintain the platforms that power Future Secure AI's AI Co-Workers. This is a hands-on position with responsibility for reliability throughout the product lifecycle. The role involves close collaboration with product, AI, and engineering teams to ensure platform stability and performance.
Join Alpha Insight Inc. as a Customer Service Representative, where you will be the first point of contact for our valued customers. In this role, you will handle inquiries, resolve issues, and provide top-notch support to enhance customer satisfaction. Your ability to communicate effectively and empathize with customers will be key to your success.
Join our dynamic team at BCforward as a Customer Service Representative in Austin! We are seeking enthusiastic individuals who are passionate about providing exceptional customer support. In this role, you will assist customers with inquiries, troubleshoot issues, and ensure a positive experience with our services.
Join our dynamic team as an Inbound Customer Service Representative! In this role, you will assist our valued customers, providing them with the support they need in various areas. Our company offers a professional environment where you can thrive with consistent hours and a set schedule. Enjoy the flexibility to wear casual attire, including jeans and sneakers, while working in a fun and collaborative team atmosphere.
We are seeking individuals who possess outstanding customer service skills and are eager to develop their careers in a supportive setting. Our full-time positions come with competitive pay and additional incentives for certain groups based on performance.
Sep 25, 2014