Big Data Hadoop


    Hadoop Course Content:

    1.Introduction to Hadoop

    • What is Hadoop
    • History of Hadoop
    • How Hadoop got its name
    • Problems with Traditional Large-Scale Systems and Need for Hadoop
    • Where Hadoop is being used
    • Understanding distributed systems and Hadoop
    • Hands-On Exercise: Using HDFS commands
    • Distributed computing
    • Parallel computing
    • Concurrency
    • Data Past, Present and Future
    • Computing Past, Present and Future
    • NoSQL
    • Hadoop Streaming
    • Distributing Debug Scripts
    • Getting Started With Eclipse

    2.Hadoop Stack

    • Hive and Pig

    3.Hadoop Hands-on

    • Installing a Hadoop single-node cluster (CDH4)
    • Understanding Hadoop configuration files

    4.HDFS Introduction

    • Architecture
    • File System

    5.Daemons of Hadoop

    6.Name Node and its functionality

    7.Data Node and its functionality

    8.Job Tracker and its functionality

    9.Task Tracker and its functionality

    10.Secondary Name Node and its functionality

    11.Data Storage in HDFS

    12.Introduction about Blocks

    13.Data Replication

    14.Understanding MapReduce

    • How MapReduce Works
    • Data flow in MapReduce
    • Map operation
    • Reduce operation
    • MapReduce program in Java using Eclipse
    • Counting words with Hadoop—Running your first program
    • Writing MapReduce Drivers, Mappers and Reducers in Java

    15.Creating Input and Output Formats in MapReduce Jobs

    • Text Input Format
    • Key Value Input Format
    • Sequence File Input Format
    • Real-world “MapReduce” problems
    • MapReduce Job
    • Java WordCount Code Walkthrough
    • How to debug MapReduce jobs in local and pseudo-distributed cluster mode
    • Combiner (Mini Reducer) and Partitioner

    16.Hadoop Ecosystem

    • Sqoop
    • HBase

    17.Extended Subjects on Sqoop

    • Installing Sqoop
    • Configure Sqoop
    • Import RDBMS data to Hive using Sqoop
    • Export from Hive to RDBMS using Sqoop
    • Hands-On Exercise: Import data from RDBMS to HDFS and Hive
    • Hands-On Exercise: Export data from HDFS/Hive to RDBMS

    18.Hive Introduction, Installation and Configuration

    • Running Hive
    • Configuration management overview
    • Runtime configuration
    • Hive, MapReduce and Local Mode

    19.DDL Operations

    • Metadata Store

    20.DML Operations and SQL Operations

    • Queries
    • Selects and filters
    • Group by
    • Multitable insert
    • Streaming


    • Movie Lens
    • Apache log

    22.Hive Architecture

    • Data store
    • Meta store
    • Architecture
    • Interface
    • Compiler
    • Optimizer


    • HBase introduction
    • HBase use cases
    • HBase basics: column families, scans
    • HBase architecture

    24.Pig Introduction

    • Pig and Dataflow
    • Pig Philosophy
    • Pig and Hadoop
    • Pig vs Hive
    • Why Pig

    25.Installing and Configuring Pig

    • Download and Install from Apache
    • Running Pig
    • Local
    • Cluster
    • Cloud
    • Command Line Options


    • Understanding Grunt
    • Entering Pig Latin scripts in Grunt
    • HDFS Commands in Grunt
    • Controlling Pig from Grunt

    27.Pig Data Model

    • Problem Statement and Data Model
    • Input and Output
    • Store
    • Relational Operators
    • Foreach
    • Filter
    • Group
    • Order By
    • Distinct
    • Limit
    • Sample
    • Parallel
    • User Defined Functions
    • Registering UDF
    • Defining UDF
    • Calling Static Java Functions

    28.Oozie and Flume Introduction

    29.Planning, Cloud Manager Set-up and Hadoop Multi-Node Cluster Setup

    • Installation and Configuration
    • Running MapReduce jobs on a multi-node cluster

    30.Real Time Project

    • Clear explanation of a real-time project using real-time data
    • Taking data from different source systems such as text files, CSV files and RDBMS
    • Loading the data into Hadoop and running analytics using MapReduce, Hive and Pig


    • This test helps you clear the Hadoop certification.


    • 1 month, 1 week
    Contact Us

    +1 475-212-0075

        Course Features

        Live Instructor-led Classes

        This isn't canned learning. It's dynamic, it's interactive, it's effective.

        Expert Educators

        Only the best or they're out. We are constantly evaluating our trainers.

        24/7 Support

        We never sleep. Need something answered at 3 am? No problem.

        Flexible Schedule

        You don't learn as per our calendar. We work according to yours.

        Customized Training

        For the most part self-managed and adaptable to suit each person's particular technology learning needs.

        Priority-Based Training

        Real-time, scenario-based assignments and case studies.