Big Data Hadoop
Hadoop Course Content:
1.Introduction to Hadoop
- What is Hadoop
- History of Hadoop
- How Hadoop got its name
- Problems with Traditional Large-Scale Systems and Need for Hadoop
- Where Hadoop is being used
- Understanding distributed systems and Hadoop
- Hands-On Exercise: Using HDFS commands
- Distributed computing
- Parallel computing
- Concurrency
- Data Past, Present and Future
- Computing Past, Present and Future
- NoSQL
- Hadoop Streaming
- Distributing Debug Scripts
- Getting Started With Eclipse
2.Hadoop Stack
- Hive and Pig
3.Hadoop Hands-on
- Installing a Hadoop Single Node Cluster (CDH4)
- Understanding Hadoop configuration files
4.HDFS Introduction
- Architecture
- File System
5.Daemons of Hadoop
6.Name Node and its functionality
7.Data Node and its functionality
8.Job Tracker and its functionality
9.Task Tracker and its functionality
10.Secondary Name Node and its functionality
11.Data Storage in HDFS
12.Introduction about Blocks
13.Data Replication
14.Understanding MapReduce
- How MapReduce Works
- Data flow in MapReduce
- Map operation
- Reduce operation
- MapReduce Program In JAVA using Eclipse
- Counting words with Hadoop—Running your first program
- Writing MapReduce Drivers, Mappers and Reducers in Java
15.Creating Input and Output Formats in MapReduce Jobs
- Text Input Format
- Key Value Text Input Format
- Sequence File Input Format
- Real-world “MapReduce” problems
- MapReduce Job
- Java WordCount Code Walkthrough
- How to debug MapReduce jobs in local and pseudo-distributed mode
- Combiner (Mini Reducer) and Partitioner
16.Hadoop Ecosystem
- Sqoop
- HBase
17.Extended Subjects On Sqoop
- Installing Sqoop
- Configure Sqoop
- Import RDBMS data to Hive using Sqoop
- Export from Hive to RDBMS using Sqoop
- Hands-On Exercise: Import data from RDBMS to HDFS and Hive
- Hands-On Exercise: Export data from HDFS/Hive to RDBMS
18.Hive Introduction, Installation and Configuration
- Running Hive
- Configuration management overview
- Runtime configuration
- Hive, Map-Reduce and Local-Mode
19.DDL Operations
- Metadata Store
20.DML Operations and SQL Operations
- Queries
- Selects and filters
- Group by
- Multitable insert
- Streaming
21.Exercise
- MovieLens
- Apache log
22.Hive Architecture
- Data store
- Meta store
- Architecture
- Interface
- Compiler
- Optimizer
23.HBase
- HBase introduction
- HBase use cases
- HBase basics: column families, scans
- HBase architecture
24.Pig Introduction
- Pig and Dataflow
- Pig Philosophy
- Pig and Hadoop
- Pig vs Hive
- Why Pig
25.Installing and Configuring Pig
- Download and Install from Apache
- Running Pig
- Local
- Cluster
- Cloud
- Command Line Options
26.Grunt
- Understanding Grunt
- Entering Pig Latin scripts in Grunt
- HDFS Commands in Grunt
- Controlling Pig from Grunt
27.Pig Data Model
- Problem Statement and Data Model
- Input and Output
- Store
- Relational Operators
- Foreach
- Filter
- Group
- Order By
- Distinct
- Limit
- Sample
- Parallel
- User Defined Functions
- Registering UDF
- Defining UDF
- Calling Static Java Functions
28.Oozie and Flume Introduction
29.Planning and Cloud Manager Setup: Hadoop Multi-Node Cluster Setup
- Installation and Configuration
- Running MapReduce Jobs on a Multi-Node Cluster
30.Real Time Project
- Clear explanation of a real-time project using real-time data
- Taking data from different source systems such as text files, CSV files, and RDBMS
- Loading the data into Hadoop and running analytics using MapReduce, Hive and Pig
31.Test
- This test helps you prepare to clear the Hadoop certification.