About the job
Join PayNearMe as a Staff Technical Program Manager, where you'll spearhead our Quality and Reliability initiatives across vital systems and services. This impactful individual contributor position is tailored for a seasoned professional adept at establishing order amidst uncertainty and achieving results across diverse teams and domains.
In this role, you will lead cross-functional programs aimed at enhancing system reliability, scalability, and operational quality. Your responsibilities include refining incident response strategies, ensuring production readiness, and improving software testing and deployment practices. We seek a TPM with substantial technical expertise and a track record of influencing system-level quality and delivery culture at scale.
Key Responsibilities
- Oversee the Quality & Reliability Program: Articulate and implement the vision for quality, encompassing proactive practices (testing, deployment, observability), reactive processes (incident response, external communications), and cultural norms (quality ownership, readiness).
- Lead Cross-Functional Initiatives: Propel reliability and quality projects across Engineering, Product, Operations, and Customer Success teams.
- Ensure Production Readiness: Manage the Production Readiness Review (PRR) process, validating that all releases adhere to reliability standards prior to going live.
- Establish and Monitor SLOs: Define and track Service Level Objectives (SLOs), enhancing visibility into reliability metrics and leading initiatives to meet or exceed targets.
- Streamline Incident Management: Optimize incident response and postmortem processes, driving improvements in tooling, communication, and accountability.
- Enhance Tooling & Automation: Collaborate with teams to advance observability, alerting, testing automation, and incident response tools.
- Proactively Manage System Risk: Identify risk factors early, develop mitigation strategies, and drive prompt resolution.
- Foster Cross-Departmental Alignment: Influence Engineering, Product, Operations, and GTM teams to prioritize reliability, integrating quality into every project.
- Monitor Progress: Utilize tools such as Atlas, Jira, and internal dashboards to maintain clarity on objectives, risks, and outcomes.
- Promote Continuous Learning: Develop programs that capture lessons from every incident, exercise edge cases through testing, and continuously strengthen our systems.

