Qualifications
To thrive in this role, candidates should possess a strong background in IT support, excellent problem-solving skills, and a passion for customer service. Ideal candidates will have experience with hardware and software troubleshooting, as well as knowledge of networking concepts. Strong communication skills are essential.
About the job
mks2technologies seeks an On-site IT Customer Service Engineer to join the team in Austin, TX. This position acts as the primary contact for IT support, assisting clients with technical issues to help keep their daily operations running smoothly.
Key responsibilities
Diagnose and troubleshoot technical problems directly at client sites
Offer clear, practical solutions and support
Maintain attentive and timely customer service with every client interaction
Work location
This role is fully on-site in Austin, TX. Regular presence at client locations is required.
About mks2technologies
mks2technologies is a forward-thinking technology company dedicated to providing innovative IT solutions. Our team is committed to excellence, and we foster a collaborative environment that encourages professional growth and development.
Similar jobs
Search for Java Site Reliability Engineer Messaging Platforms
Join our dynamic team at PIMCO, a premier global asset management firm with a commitment to helping millions of investors achieve their financial aspirations. With over 3,000 employees across 20 offices in 15 countries, we seek innovative thinkers who thrive in a collaborative environment. At PIMCO, we value diversity, hard work, and a continuous learning ethos. As a Java Site Reliability Engineer (SRE) specializing in Messaging Platforms, you will play a critical role in shaping our technology strategies to enhance operational efficiency. Your responsibilities will include supporting various messaging platforms such as MQ, AMPS, and Kafka, ensuring optimal tool selection and sustainable messaging strategies. You will also focus on improving operational efficiency through advanced tools and monitoring systems. This position requires a passion for messaging systems, collaborative problem-solving skills, and a strong foundation in software development. You will have the opportunity to contribute to critical business solutions that align with our strategic vision for trading applications.
Full-time|On-site|Austin, TX/Akron, Ohio/Irvine, CA
Join Restaurant365 as a Site Reliability Engineer II, where you'll play a vital role in ensuring the availability, performance, and reliability of our systems. You will collaborate with cross-functional teams to design, implement, and maintain robust infrastructure solutions that enhance our operational efficiency.
For over 25 years, Realtor.com® has stood as the premier online platform trusted by real estate professionals, seamlessly connecting buyers, sellers, and renters with invaluable insights and expert advice to discover their ideal home. Our comprehensive suite of tools not only transforms the real estate landscape, but also aids consumers in navigating one of life's most significant decisions, making it simple, intuitive, and empowering. Join us in our mission to enable more individuals to find their way home by dismantling barriers, fostering meaningful connections, and instilling confidence with expert guidance. About the Role: We are on the lookout for a Staff Site Reliability Engineer to become a vital member of our newly established Operations Excellence organization, reporting directly to the Director of Operations Excellence. This pivotal position will define the reliability, observability, and operational excellence of our platform infrastructure that serves millions of users. As a Staff SRE, you will take on a technical leadership role, mentoring others and establishing best practices, while influencing architectural decisions to empower our team of 600+ engineers in delivering outstanding customer experiences. You will engage with crucial platform systems, including EKS infrastructure, Skyway (CI/CD), Frontdoor (Tyk API Gateway), Pantheon (Apollo GraphQL Federation), and our observability stack, all while implementing chaos engineering practices and spearheading cost optimization initiatives that yield measurable ROI. We are committed to employing the best tools to expedite problem-solving. You will be expected to adeptly utilize AI coding assistants and LLMs to enhance development speed, generate boilerplate code, and troubleshoot intricate debugging scenarios. In addition to basic usage, this role demands the critical judgment to assess AI-generated outputs for security, performance, and accuracy.
You should be comfortable incorporating AI tools into your daily tasks to minimize repetitive work, allowing you to concentrate on high-impact architectural and strategic engineering challenges. What You'll Do – Platform Reliability & Infrastructure: Design and maintain highly available AWS infrastructure, including EKS clusters, Fargate (ECS), and multi-region architectures. Take ownership of the reliability of essential services: Skyway (CI/CD), Frontdoor (Tyk), Pantheon (Apollo GraphQL), and associated infrastructure. Establish SLIs, SLOs, and error budgets for Tier 1/2/3 systems; lead architectural reviews focused on reliability and cost-efficiency. Drive...
In 2024, cybercrime rates are anticipated to escalate, as evidenced by the FBI's IC3 report, which highlighted a staggering loss of over $16 billion. The real estate sector, unfortunately, remains a prime target for cybercriminals, particularly through investment fraud and BEC scams. At CertifID, we are committed to combating this threat by offering a secure platform that authentically verifies the identities of transaction participants, validates wire transfer instructions, and identifies potential fraud attempts. Our innovative technology is engineered to reduce risks, ensuring that every transaction is executed with utmost confidence and security. Our success hinges on our exceptional team. Recognized as one of the Best Startups to Work in Austin, we proudly made the Inc. 5000 list and received the award for Best Culture by Purpose Jobs for three consecutive years. Our core values and vision for a world without wire fraud guide us as we strive to create a dynamic work environment where every team member can make a significant impact in enhancing security and combating fraud. Position Overview: We are on the lookout for a Senior Site Reliability Engineer (Senior SRE) to spearhead reliability enhancements within our production SaaS environment. You will play an essential role in developing scalable infrastructure models, advancing our observability efforts, optimizing incident response, and collaborating with engineering teams to integrate reliability into system design and deployment. This position is tailor-made for a seasoned Senior SRE who thrives on tackling intricate operational challenges, building automation solutions, and mentoring fellow engineers.
At Braze, we pride ourselves on our exceptional team dynamics. Our employees are not only highly skilled but also approachable and genuinely kind, which fosters a positive work environment. We aim to channel this passion into our work by establishing high standards, encouraging collaboration, and promoting a healthy work-life balance as we navigate our rapid global expansion while advocating for equity and opportunity both within and outside our organization. To succeed in this environment, you should be ready to set ambitious goals for yourself and your colleagues. We believe in taking initiative, embracing responsibility, and welcoming diverse viewpoints, all of which are vital to our ongoing success. Our insatiable curiosity and willingness to share our unique passions contribute to a vibrant company culture that thrives on balance. If you’re motivated to tackle exhilarating challenges and possess a proactive mindset in times of change, you’ll find the support to make a meaningful impact here, backed by a dedicated and passionate team. If Braze resonates with your aspirations, we look forward to meeting you. WHAT YOU'LL DO: We are seeking a Senior Site Reliability Engineer for our Currents team, responsible for the development, maintenance, and enhancement of Currents, our scalable data export system. This Kafka-based event pipeline processes tens of billions of messages daily, enabling our clients to analyze user behavior in near real-time. You will play a vital role on a collaborative and skilled team, guiding projects from inception to production while optimizing our existing high-scale systems. Your expertise and teamwork will be crucial in addressing the significant engineering challenges associated with managing a critical data streaming system.
As a Senior Site Reliability Engineer, you will primarily focus on observability, scalability, and reliability strategies for every project. Key responsibilities include: Troubleshooting and resolving live performance and reliability issues while implementing strategies to prevent recurrence. Writing and reviewing code, mentoring engineers, and fostering a culture of reliability. Implementing sustainable incident response practices and conducting blameless postmortems. Establishing and promoting standards for monitoring, reliability, and performance. Facilitating collaboration between infrastructure and platform engineering teams. Enhancing services by planning for scalability and reliability. Mentoring junior engineers in SRE best practices and agile project management.
At Braze, we pride ourselves on our approachable and passionate team. We foster an environment where high standards, teamwork, and work-life balance are paramount as we navigate rapid global growth while striving for equity and opportunity both within and outside our organization. To thrive at Braze, you must hold yourself and your colleagues to high standards. Autonomy, accountability, and openness to new perspectives are crucial for our continued success. Our culture is vibrant, driven by a deep curiosity to learn and a willingness to share diverse passions. If you enjoy tackling exciting challenges and are proactive in the face of change, you will have the opportunity to make a significant impact with our dedicated team behind you. If this resonates with you, we look forward to meeting you! WHAT YOU'LL DO: As a Senior Site Reliability Engineer (SRE), your primary responsibility will be ensuring the seamless operation of all internal-facing services and platforms, essentially maintaining site uptime. SREs are a unique blend of system administrators and software engineers who apply sound engineering principles and operational discipline, along with sophisticated automation, to our environments and infrastructure services. You will play a vital role in enhancing automation, infrastructure reliability, and empowering Braze’s engineering teams to effectively leverage our infrastructure products and platforms. With over 3.3 billion monthly active users and the processing of hundreds of billions of data points each month, Braze operates at an impressive scale, sending billions of messages daily. Our technology stack includes Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more.
In your position, you will collaborate with both your team and consumer engineering teams to continuously enhance our infrastructure, automation, and tooling for internal products. Main responsibilities include: Collaborating with Braze’s engineering teams to: Architect products that effectively utilize infrastructure platforms in a scalable and reliable manner. Debug reliability and scalability issues across all stack layers, including the products built using our infrastructure. Enhance monitoring, and...
Full-time|$110K/yr - $128K/yr|Hybrid|Austin, Texas, United States
Join Striveworks as a Site Reliability Engineer. At Striveworks, we empower organizations to leverage artificial intelligence to tackle real-world challenges in national security and business. Our mission is to serve as the command center where data, models, and business outcomes converge. Founded by a team of passionate data scientists and engineers, Striveworks simplifies the journey from deployment to ongoing optimization. We ensure that our clients aren’t just deploying AI; they’re establishing robust systems that are reliable, adaptable, and poised to scale in an ever-changing landscape. As a Site Reliability Engineer, you will play a pivotal role in implementing and managing corporate systems from day one. You will work with an array of systems and infrastructure automation tools while having the opportunity to innovate and enhance our toolset. Your focus will be on developing sustainable solutions that prevent future issues, thereby minimizing operational toil. Your daily responsibilities will include: Developing and maintaining infrastructure as code across private (OpenStack) and commercial (AWS, Azure, GCP) cloud environments. Creating configuration management automation for Windows laptops and Linux servers. Providing comprehensive user support for all corporate systems. This role is based in a hybrid/on-site setting at our northwest Austin office.
Site Reliability Engineer Overview: Join Weedmaps as a Site Reliability Engineer and collaborate with diverse teams across application development, infrastructure, and quality assurance to elevate the performance, reliability, and scalability of our web services at Weedmaps.com. As a fully cloud-native organization, we operate all our services within Docker containers on Kubernetes, hosted on AWS. Our culture promotes observability, proactive monitoring, and CI/CD automation, enabling us to release multiple production updates daily. In this role, you will utilize your engineering expertise to improve system monitoring, streamline CI workflows, and refine our deployment pipelines. You will serve as a knowledge resource for development teams, guiding them in utilizing standardized tools for metrics, logging, and deployment processes. Collaborate closely with both development and infrastructure teams to identify key service metrics that go beyond the basics, working with application teams to develop libraries that facilitate easy instrumentation of their services. Your Impact: Collaborate with stakeholders to establish best practices in monitoring and CI/CD pipelines. Troubleshoot issues within our deployment CI pipeline. Promote and support a strong DevOps culture within Weedmaps. Identify automation opportunities and advocate for codification across all processes. Share best practices regarding collaboration, reliability, security, and performance with all partner teams. Take responsibility for the configuration and scaling of applications, ensuring adherence to organizational practices. Develop and enhance synthetic monitoring workflows.
As the leading online platform for real estate professionals for over 25 years, Realtor.com® connects buyers, sellers, and renters with trusted insights and expert guidance to find their ideal home. Our comprehensive suite of tools significantly impacts the real estate industry and enhances the consumer experience, making it simple, understandable, and empowering for individuals navigating one of life's biggest purchases. Join us in our mission to help people find their way home by dismantling barriers to entry, establishing the right connections, and fostering confidence through expert guidance. About the Role: We are looking for a Senior Site Reliability Engineer to become a crucial member of our newly established Operations Excellence organization, reporting directly to the Director of Operations Excellence. In this pivotal role, you will enhance the reliability, observability, and operational excellence of our platform infrastructure that serves millions of users. As a Senior SRE, you will be a key technical contributor, implementing best practices, addressing complex challenges, and empowering our team of over 600 engineers to deliver outstanding customer experiences. Your responsibilities will include working on critical platform systems such as EKS infrastructure, Skyway (CI/CD), Frontdoor (Tyk API Gateway), Pantheon (Apollo GraphQL Federation), and our observability stack. You will also play a part in chaos engineering practices and cost optimization initiatives, ensuring measurable ROI. We believe in employing the best tools to solve problems efficiently. You will be expected to adeptly use AI coding assistants and LLMs to accelerate development speed, generate boilerplate code, and resolve complex debugging issues. Beyond mere usage, this role demands the critical judgment to evaluate AI-generated outputs for security, performance, and accuracy.
You should be comfortable incorporating AI tools into your daily routines to reduce repetitive tasks, allowing you to concentrate on high-impact architectural and strategic engineering challenges. What You'll Do – Platform Reliability & Infrastructure: Design, implement, and maintain highly available AWS infrastructure, including EKS clusters, Fargate (ECS), and multi-region architectures. Ensure the reliability of essential services: Skyway (CI/CD), Frontdoor (Tyk), Pantheon (Apollo GraphQL), and their supporting infrastructure. Monitor SLIs, SLOs, and error budgets for Tier 1/2/3 systems; participate in architectural reviews focused on reliability and cost-efficiency. Implement reliability patterns such as circuit breakers, graceful degradation, and automatic failover strategies.
At Apptronik, we are pioneering the future of robotics with our innovative, AI-driven humanoid robot, Apollo. Designed to collaborate with humans in vital sectors such as manufacturing and logistics, Apollo is set to expand into healthcare, home environments, and beyond, enhancing the quality of life across various domains. As a leader in embodied AI, our team is committed to addressing some of the most pressing challenges in society, focusing on safety, commercialization, and scalable production to bring Apollo to market effectively. JOB SUMMARY: We are on the lookout for a highly skilled Senior Site Reliability Engineer to take charge of and uphold the deployment of our cloud-based infrastructure at customer locations. This role requires close collaboration with our Applications Engineers, IT, and software teams to facilitate smooth deployments of our solutions that gather training data and deploy models into real-time robotic systems. Your contributions will be pivotal in integrating Google DeepMind’s Gemini Robotics Model with our humanoid robot hardware.
At BetterUp, we believe in the power of human transformation, and our approach to the employer-employee relationship reflects that belief. From the moment you engage with us, you will notice a distinct experience. It's not just about filling a position; it's about joining a mission-driven team. Upon accepting an offer, you gain more than just a paycheck: you will receive a dedicated BetterUp Coach, a personalized development plan, and a supportive manager. You'll also be part of an extraordinary team, each member accompanied by their own BetterUp Coach, working on projects that make a real impact. This unique environment fosters a focused and fulfilling work experience. While it may not be for everyone, for those who are passionate and driven, this role represents a transformative career opportunity. Join us for an intense and rewarding journey, where you'll engage in meaningful work within a vibrant and creative culture. If this resonates with you and the job description aligns with your skills, let’s start a conversation. As a hybrid company, we emphasize in-person collaboration when necessary. Employees must be available to work from one of our office hubs a minimum of two days per week, totaling eight days per month. Our US hubs include: Austin, TX; Chicago, IL; New York City, NY; San Francisco, CA; and the Washington, DC metro area. For roles based in Europe, our hubs are located in London, UK, and Amsterdam, NL.
Please ensure you can commit to this structure before applying. Key Responsibilities: Utilize AI-driven tools and automation to enhance monitoring, troubleshooting, and maintenance of production systems. Develop and manage cloud infrastructure on AWS, employing Terraform for codifying and version-controlling our environments. Oversee and scale Kubernetes clusters that support BetterUp's platform, ensuring optimal availability and performance. Create intelligent alerting and observability frameworks. Collaborate with engineering teams to integrate reliability into the development lifecycle, proactively addressing operational concerns. Automate incident response processes and establish self-healing infrastructure. Explore and implement cutting-edge AI tools for log analysis, anomaly detection, and predictive maintenance.
qodeworld is seeking a Senior Site Reliability Architect to join the team in Austin, Texas. This position focuses on unified observability, proactive detection, AIOps, and GenAI-driven operations for distributed financial services platforms. The role requires deep technical expertise in designing and maintaining reliable, high-performance systems across complex architectures. Role overview: The Senior Site Reliability Architect will drive enhancements in platform reliability and performance. This includes building SLI/SLO-driven monitoring, implementing dynamic thresholds, and developing intelligent alerting and AI/ML-based anomaly detection. The position is central to evolving operational practices from reactive alerting to proactive, insight-driven approaches. Key responsibilities: Design and deploy unified observability dashboards that integrate metrics, logs, traces, events, and system topology. Establish and manage SLIs, SLOs, and error budgets aligned with business goals. Create actionable dashboards for operational, engineering, and leadership teams. Implement advanced alerting strategies using both static and dynamic thresholds. Apply AI/ML/AIOps technologies to detect anomalies, forecast incidents, and reduce MTTR. Shift monitoring practices from reactive alerting to proactive insights. Incorporate noise reduction, alert correlation, and root cause analysis. Use baseline modeling, seasonality detection, and anomaly scoring. Oversee and resolve issues in multi-service architectures, including microservices, APIs, Kafka/streaming platforms, and cloud infrastructure (Terraform, Infrastructure as Code). Analyze and trace issues across upstream/downstream dependencies, streaming platforms, infrastructure, and application code. Work extensively with Dynatrace (mandatory requirement). Utilize tools such as OpenTelemetry, Prometheus/Grafana, ELK/EFK, and cloud-native monitoring solutions (AWS, Azure, GCP). Manipulate and enrich telemetry using JSON.
Apply GenAI/LLMs for incident summarization, root cause explanations, runbook recommendations, and auto-remediation suggestions. Collaborate with platform teams to operationalize GenAI technologies safely. Requirements: 15+ years of experience in Site Reliability Engineering or Production Engineering. Strong background in unified observability, AIOps, and related fields. Proven experience with AI/ML technologies and cloud-native environments.
About Future Secure AI: Future Secure AI develops solutions in artificial intelligence for real-world business challenges. The company values courage, precision, and curiosity, and supports an entrepreneurial culture where every team member is recognized. Leadership is experienced and approachable, with a focus on supporting individual growth. Team members work alongside colleagues from diverse backgrounds and contribute to projects that have impact across industries. Role Overview: Site Reliability Engineer. The Site Reliability Engineer will design, build, and maintain the platforms that power Future Secure AI's AI Co-Workers. This is a hands-on position with responsibility for reliability throughout the product lifecycle. The role involves close collaboration with product, AI, and engineering teams to ensure platform stability and performance.
Full-time|On-site|Austin, TX, Reston, VA, Boston, MA
Join our dynamic App Platform team as a Senior Platform Engineer, where you'll wear multiple hats including Architect, Developer, Consultant, and Leader. You'll design robust systems, write code for scalable applications, and collaborate with server teams to deliver high-quality infrastructure solutions. Your ability to articulate your experiences through blog posts and presentations will contribute to our open-source initiatives, enhancing our presence in the tech community.
About Telnyx: Telnyx is a trailblazer in the realm of global connectivity, actively constructing the future rather than just envisioning it. Our innovative solutions, from designing a private, global, multi-cloud IP network to delivering hyperlocal edge technology via user-friendly APIs, are revolutionizing seamless interconnections among people, devices, and applications. We are motivated by a commitment to revamping outdated processes, automating manual tasks, and addressing genuine challenges through advanced connectivity solutions. Our financial stability and profitability empower us to invest in cutting-edge technologies and cultivate a culture of continuous learning and advancement for our team. Our vision is a world where unrestricted connectivity drives boundless innovation. By joining us, you will play a pivotal role in laying the groundwork for this interconnected future. We are currently on the lookout for enthusiastic individuals eager to contribute to an industry-defining company while enhancing their own skills and career trajectories. The Role: As a Messaging Compliance Specialist, you will be our key expert on Application-to-Person (A2P) messaging standards. This role involves bridging the gap between regulatory requirements and technical implementation while ensuring our platform and clients stay compliant with the rapidly changing messaging landscape. You'll also assist in developing tools that streamline the compliance process.
Join ICON as a Reliability Engineer II on the innovative Titan Team, where we create cutting-edge print systems. Your expertise will be crucial in guiding the Titan machine into Serial Production. In this role, you will evaluate system performance, pinpoint vulnerabilities, and develop strategies to enhance the overall reliability and consistency of our products. This position is based at our Austin, TX office.
Join Saronic as a Civil/Site Engineer specializing in Infrastructure, where you will play a pivotal role in designing and implementing innovative engineering solutions. You will collaborate with a diverse team of professionals to ensure the successful execution of infrastructure projects, enhancing the quality and sustainability of civil engineering.
Full-time|Remote|Remote (Atlanta, Austin, San Francisco, Seattle)
Role overview: ditto is seeking a Senior Platform Engineer, Operator for a fully remote role. This position is open to candidates located in Atlanta, Austin, San Francisco, or Seattle. The focus is on designing, building, and maintaining systems that keep company operations running smoothly and efficiently at scale. What you will do: Design and implement systems that improve platform scalability, performance, and reliability. Maintain and enhance the existing infrastructure to support ongoing business operations. Work closely with cross-functional teams to address technical challenges and streamline processes. Use technical expertise and leadership to drive key platform initiatives. Requirements: Extensive experience in platform engineering. Strong problem-solving abilities and a collaborative mindset. Proven ability to contribute technical insights and lead engineering projects. Location: This is a remote position. Candidates must be based in Atlanta, Austin, San Francisco, or Seattle.
We are seeking a talented and motivated Platform Engineer to join our dynamic team at Allen Control Systems. In this role, you will be responsible for designing, implementing, and maintaining scalable and robust platform solutions that meet our company's needs. Your expertise will contribute to our innovative projects and help shape the future of our technology stack.
About Base Power: Base Power is a US-based power company focused on transforming the energy grid. The team works to build a decentralized power system by deploying distributed batteries across the country. Engineers, operators, and problem-solvers at Base Power address major challenges in the energy sector together. Role Overview: Deployment Engineer – Site Survey. This Deployment Engineer position connects field operations with systems engineering. The role centers on improving how Base Power evaluates, approves, and executes hardware deployments at multiple locations. The engineer will refine site survey processes and set configuration standards to keep deployments consistent, secure, and reliable. Key Responsibilities: Design and maintain internal tools and automated workflows to scale site survey reviews and make data ingestion across systems more efficient. Act as the technical authority for hardware configurations, setting and enforcing criteria for deployment approvals. Define, document, and uphold high standards for site survey reviews, supporting safety, consistency, and operational efficiency as deployment volume grows. Use SQL and analytics tools to examine field data and installation results, spot process bottlenecks, and drive improvements in deployment operations. Build internal dashboards with tools such as Python, JavaScript, or Retool to provide real-time insights into the site survey pipeline and key metrics. Work closely with Field Operations, Hardware Engineering, and Software teams to turn deployment challenges into engineering solutions and technical requirements. Develop and maintain detailed documentation for review criteria, internal tools, configuration standards, and operational processes. Location: Austin, TX
Apr 17, 2026