About the job
- Identify and troubleshoot technical and functional issues at L2/L3 levels, ensuring swift service recovery and comprehensive root-cause analysis.
- Run the daily operations of management chains, including batch processing, data transfers, and workflow oversight.
- Coordinate and execute software releases and hardware infrastructure upgrades within your area of responsibility, minimizing disruption to business operations.
- Deploy and operate monitoring tools (Dynatrace, Prometheus/Grafana) to assess load, performance, and overall application health.
- Develop automation scripts in Python/Bash and use DevOps tools such as Ansible to streamline repetitive tasks, log analysis, and health checks.
- Maintain and enhance application documentation, create runbooks, and define operational procedures to ensure team effectiveness.
- Evaluate application performance and user feedback to recommend and implement necessary functional or technical improvements.
