About the job
Main Responsibilities:
Efficiently manage and resolve production incidents to ensure minimal disruption.
Coordinate and monitor technical operations and changes, including electrical shutdowns, server modes, and disaster recovery procedures.
Engage in change management for applications and infrastructure by collaborating with development, sysadmin, DBA, and networking teams.
Maintain comprehensive documentation of application components and operations, including architecture and operating procedures.
Oversee backup data management and conduct regular restoration tests.
Implement effective monitoring of applications and infrastructure in compliance with established SLAs across all business lines.
Assess and manage application load capacity.
Contribute continuously to automating and improving existing procedures.
Collaborate closely with development teams to define the architecture of new applications while adhering to infrastructure standards.
Technical Areas: Windows and Linux servers, scripting, Agile methodology.
Technologies: SQL, Python, Networking, PowerShell, Linux Bash, XLDeploy, Jenkins, Control-M, Git, Monitoring Tools, Elastic Stack.
