Hadoop Administration training for System Administrators is designed for technical operations personnel who install and maintain production Hadoop clusters. We will cover the Hadoop architecture and its components, the installation process, and the monitoring and troubleshooting of complex Hadoop issues. The training focuses on practical hands-on exercises and encourages open discussion of how enterprises that deal with large data sets are using Hadoop.
Introduction to Big Data
• Characteristics of Big Data
• Why is parallel computing important
• Discuss various products developed by vendors
• Components of Hadoop Architecture
• Starting Hadoop
• MapReduce Framework
• Hadoop Cluster
• Identify various processes
Working with HDFS
• Basic file commands
• Web Based User Interface
• Reading & Writing to files
• Run a word count program
• View jobs in the Web UI
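As a rough illustration of what the word count exercise computes, here is a minimal sketch in plain Python. It does not use Hadoop or its API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only mimic the map, shuffle and reduce steps in memory on a single machine.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the list of counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three steps in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

On a real cluster the map and reduce steps run as separate tasks on different nodes and the input is read from HDFS; the sketch only shows the logic of the computation.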
Installation & Configuration of Hadoop
• Installation and initial configuration.
• Set up ‘ssh’ for the Hadoop cluster
• Discuss block size and replication factor
• Discuss Hadoop modes: standalone, pseudo-distributed and multi-node cluster.
• Hands on
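The block size and replication factor discussion can be made concrete with a little arithmetic. The sketch below is a hypothetical helper, not part of Hadoop; it assumes a 128 MB block size and a replication factor of 3, which are common Hadoop 2.x defaults, and both values should be replaced with whatever the cluster is actually configured to use.

```python
def hdfs_footprint(file_size_bytes, block_size_bytes=128 * 1024**2, replication=3):
    """Estimate how a single file is laid out in HDFS.

    Returns (blocks, block_replicas, raw_bytes):
      blocks         - number of HDFS blocks the file is split into
      block_replicas - total block copies stored across the cluster
      raw_bytes      - raw disk space consumed (the last block is not padded)
    The default block size and replication factor are assumptions.
    """
    # Integer ceiling division avoids floating point rounding on large sizes.
    blocks = -(-file_size_bytes // block_size_bytes)
    return blocks, blocks * replication, file_size_bytes * replication


if __name__ == "__main__":
    one_gib = 1024**3
    blocks, replicas, raw = hdfs_footprint(one_gib)
    print(blocks, replicas, raw / one_gib)
```

Under these assumptions a 1 GiB file is split into 8 blocks, stored as 24 block replicas, and occupies 3 GiB of raw disk space.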
Advanced administration activities
• Hadoop server roles and usage.
• Rack awareness.
• Adding and de-commissioning nodes
• Purpose of the Secondary NameNode
• Recovery from a failed NameNode
• Managing quotas
• Enabling trash
• Hands on
Hadoop Cluster Planning and Management
• Managing and scheduling jobs.
• Types of schedulers in Hadoop (Oozie).
• Running and Monitoring MapReduce Jobs.
• Upgrade Hadoop Cluster.
• Hands on.
High availability in Hadoop 2.0, HDFS Federation and advanced security
• Managing security with Kerberos.
• HDFS Federation and Log Management.
• Quorum Journal Manager (QJM)
Other Components of the Hadoop ecosystem
• Discuss Hive, Sqoop, Pig, HBase, Flume.
• Use cases of each.
• Hive Administration.
• HBase Architecture
• Performance Optimization.