About the job
Paytm is seeking a Hadoop Administrator to manage and improve on-premises Hadoop clusters at our Noida, Uttar Pradesh location. This position focuses on Apache Hadoop versions 2.7.1 and 3.4.x, with responsibility for cluster stability, performance, and scalability. The role also involves managing key ecosystem components such as Hive and YARN.
Main Responsibilities
- Cluster Administration:
  - Install, configure, and manage on-prem Hadoop clusters (Apache Hadoop 2.7.1 and 3.4.x).
  - Oversee the Hadoop Distributed File System (HDFS), including storage, replication, and data balancing.
  - Manage resource allocation and scheduling using Apache YARN.
- Upgrade & Migration:
  - Lead Hadoop upgrades from version 2.7.1 to 3.4.x in on-prem environments.
  - Handle cluster migrations, validate data integrity (HDFS checksums, block integrity), and maintain backward compatibility for existing jobs.
  - Use migration tools and manage changes between distributions.
- Ecosystem Management:
  - Administer and optimize Apache Hive, including query tuning and metastore management.
  - Work with ecosystem tools such as Spark (basic admin exposure preferred) and Tez (Hive execution engine).
  - Maintain high availability for Hive Metastore and other critical services.
- Performance & Optimization:
  - Monitor cluster health, job performance, and resource utilization.
  - Tune YARN queues, the Capacity Scheduler, Hive queries, and execution plans.
  - Manage disk, memory, and network usage for optimal performance.
  - Implement enhancements for large-scale data processing.
- Security & Governance:
  - Configure authentication and authorization protocols.
  - Ensure compliance with data security policies.
  - Manage access controls and maintain audit logs.
- Troubleshooting & Support:
  - Provide support for Hadoop clusters and resolve cluster issues.
  - Troubleshoot node failures, job failures (Hive/YARN), and data inconsistencies.
  - Analyze logs and identify root causes.
- Backup & Disaster Recovery:
  - Develop and manage backup and disaster recovery strategies.
  - Implement DistCp-based replication and set up DR clusters (active-passive or active-active).
Location
Noida, Uttar Pradesh
