Big Data Hadoop Online Training


    Hadoop Course Content:

    Introduction to Big Data & Hadoop

    • The Big Data Problem
    • What is Big Data?
    • Challenges in processing Big Data
    • What is Hadoop?
    • Why Hadoop?
    • History of Hadoop
    • Hadoop Components Overview
      1. HDFS
      2. Map Reduce
    • Hadoop Ecosystem Introduction
    • NoSQL Database Introduction

    Understanding Hadoop Architecture

    • Hadoop 2.x Architecture
    • Introduction to YARN
    • Hadoop Daemons
    • YARN Architecture
      • Resource Manager
      • Application Master
      • Node Manager

    Introduction to HDFS (Hadoop Distributed File System)

    • Rack Awareness
    • HDFS Daemons
    • Writing Files to HDFS
      • Blocks & Splits
      • Input Splits
      • Data Replication
    • Reading Files from HDFS
    • Introduction to HDFS Configuration Files

    Working with HDFS

    • HDFS Commands
    • Accessing HDFS
      • CLI Approach
      • JAVA Approach [Introducing HDFS JAVA API]

    Introduction to Map Reduce Paradigm

    • What is Map Reduce?
    • Detailed Map Reduce Flow
      • Introduction to Key/Value Approach
      • Detailed Mapper Functionality
      • Detailed Reducer Functionality
      • Details of Partitioner
      • Shuffle & Sort Process
    • Understanding Map Reduce Flow with Word Count Example

    Map Reduce Programming

    • Introduction to Map Reduce API [New Map Reduce API]
    • Map Reduce Data Types
    • File Formats
    • Input Formats – Input Splits & Records, text input, binary input
    • Output Formats – Text Output, Binary Output
    • Configuring Development Environment – Eclipse
    • Developing a Map Reduce Application using Default Functionality
      • Identity Mapper
      • Identity Reducer
      • ToolRunner API Introduction
    • Developing Word Count Application
      • Writing Mapper, Reducer & Driver Code
      • Building Application
      • Deploying Application
    • Running the Map Reduce Application
      • Local Mode of Execution
      • Cluster Mode of Execution
    • Monitoring Map Reduce Application
    • Map Reduce Combiner
    • Map Reduce Counters
    • Map Reduce Partitioner
    • File Merge Utility

    Programming with HIVE

    • Introduction to HIVE
    • Hive Architecture
    • Types of Metastore
    • Introduction to Hive Configuration Files
    • Hive Data Types
      • Simple Data Types
      • Collection Data Types
    • Types of Hive Tables
      • Managed Table
      • External Table
    • Hive Query Language (HQL or HIVE QL)
      • Creating Databases
      • Creating Tables
      • Loading Data into Tables
      • Joins in Hive
      • GROUP BY and DISTINCT Operations
      • Partitioning
        • Static Partitioning
        • Dynamic Partitioning
      • Bucketing
      • Lateral View & Explode [Introduction to Hive UDFs → UDF, UDAF & UDTF]
      • XML Processing in HIVE
      • JSON processing in HIVE
      • URL Processing in HIVE
    • Hive File Formats [Introduction to Hive SERDE]
      • Parquet
      • ORC
      • AVRO
    • Storage Formats
    • Introduction to HIVE Query Optimizations
    • Developing Hive UDFs in JAVA
    • Hive Views

    Programming with PIG

    • Introduction to PIG
    • PIG Architecture
    • Introduction to PIG Configuration Files
    • PIG vs. HIVE vs. Map Reduce
    • Introduction to Data Flow Language
    • Pig Data Types
    • Pig Programming Modes
    • Pig Access Modes
    • Detailed PIG Latin Programming
    • PIG UDFs & UDF Development in JAVA
    • Hive – PIG Integration → Introduction to HCATALOG
    • Introduction to PIG Optimization


    NoSQL Databases & HBASE

    • Introduction to NoSQL Databases
    • Types of NoSQL Databases
    • Introduction To HBASE
    • HBASE Architecture
    • HBASE Shell Interface
      • Creating Databases and Tables
      • Inserting Data into Tables
      • Accessing Data from Tables
      • HBase Filters
    • Hive & HBASE Integration
    • PIG & HBASE Integration
    • Document Store – MongoDB Overview

    Introduction to Streaming & FLUME

    • Introduction to Streaming
    • Introduction to FLUME
    • FLUME Architecture
    • Flume Agent Setup
    • Types of Source, Channel & Sinks
    • Developing Sample Flume Applications


    SQOOP

    • Introduction to SQOOP
    • Connecting to RDBMS Using SQOOP
    • SQOOP Import
      • Import to HDFS
      • Import to HIVE
      • Import to HBASE
      • Bulk Import
        • Full Table
        • Subset of a Table
        • All tables in DB
      • Incremental Import
        • Incremental Append
        • Incremental Last Modified
    • SQOOP Export
      • Export from HDFS
      • Export from Hive


    Zookeeper

    • Introduction to Zookeeper
    • Distributed Coordination
    • Zookeeper Data Model
    • Zookeeper Service
    • Zookeeper Commands

    Apache Kafka

    • Introduction to Kafka
    • Kafka Internals
    • Kafka Cluster Architecture
    • Kafka Producer
    • Kafka Consumer
    • Kafka Broker
    • Introduction to Kafka API
    • Kafka Stream Processing
    • Integrating Kafka with various Hadoop Systems

    Introduction to Scala Programming

    • Introduction to Functional Programming & Scala
    • Comparing Java and Scala
    • Setting Up Scala in UNIX
    • Setting Up SBT
    • Introduction to Scala REPL
    • Setting up Scala on Eclipse (Scala IDE)

    Scala Programming Fundamentals

    • Scala Data Types
    • Variable Declarations
    • Variable Type Inference
    • Operators
    • Scala Control Structures
    • Scala Looping Structures
    • Scala Functions
    • Scala Collections
      • Array
      • List
      • Map
      • Tuples
      • Set

    Functional Programming in Scala

    • Introduction to Functional Programming
    • Difference between OOP & Functional Programming
    • Higher Order Functions
    • Anonymous Functions
    • Closures and Currying
    • Functional Programming on Collections
      • Iteration, Mapping, Filtering and Reduce
    • Maps, Sets, Group By, Flatten and Flat Map
    • File Access and File Processing
    • Scala Pattern Matching

    Object Oriented Programming in Scala

    • Concept of Classes in Scala
    • Implementing Getters and Setters
    • Concept of Objects in Scala
    • Singleton Objects
    • Companion Objects
    • Case Classes
    • Primary Constructor
    • Auxiliary Constructor
    • Overriding Methods
    • Apply Method
    • Traits and Abstract Classes
    • Exception Handling in Scala

    Introduction to Spark

    • What is Apache Spark
    • Spark Unified Stack
      • Spark Core
      • Spark SQL
      • Spark Streaming
      • MLlib
      • GraphX
      • Cluster Managers
    • Users of Spark
    • Spark vs. Map Reduce
    • Introduction to Spark Shell
    • Introduction to Spark Core API for Spark Application Development

    Programming with Spark RDDs

    • Introduction to RDDs
    • Creating RDDs
    • RDD Operations
      • Transformations
      • Actions
      • Lazy Evaluation
    • Passing Functions to Spark
    • Common Transformations and Actions on RDDs
    • Concept of Pair RDDs
    • Transformations and Actions on Pair RDDs
    • Data Partitioning in RDDs
    • Concept of Persistence/Caching in RDDs
    • Accumulators and Broadcast Variables
    • Loading and Saving Data Using RDDs

    File Formats:

    • Text Files
    • CSV and Tab Separated Files
    • JSON
    • Sequence Files
    • Parquet Files
    • Compression Techniques – Snappy, Gzip

    Programming with Spark Data Frames & Spark SQL

    • Introduction to Spark Data Frames
    • Data Frames vs. RDDs
    • Introduction to Spark SQL
    • Understanding HiveContext
    • Operations on Data Frames
    • Schema RDDs and Converting Schema RDDs to DataFrames (Custom Case Classes)
    • Temp Tables vs. Persistent Tables
    • Loading and Saving Data in DFs
      • Apache Hive
      • JSON
      • Parquet
      • ORC Files
    • User Defined Functions (UDFs)
      • Spark SQL UDF
      • Hive UDF

    Spark Streaming

    • Introduction to Spark Streaming Architecture
    • Introduction to Discrete Streams (DStreams)
    • Streaming Operations
    • Integrate Spark Streaming with Kafka

    PySpark Overview (Time Permitting)

    • Introduction to PySpark & PySpark Shell
    • Using Python to develop Spark Applications
    • Running PySpark Application

    Our trainers have relevant experience implementing real-time solutions across the topics covered. Keen Technologies also verifies their technical background and expertise.

    We record each live class session you attend and share the recordings with you.

    If you have any queries, you can contact our dedicated 24/7 support team to raise a ticket. We provide email support and solutions to your queries. If a query is not resolved by email, we can arrange a one-on-one session with our trainers.

    You will work on real-world projects in which you can apply the knowledge and skills you acquired through our training. We have multiple projects that thoroughly test your skills and knowledge of various aspects and components, preparing you for industry work.

    Our trainers will provide environment/server access to students, and we ensure practical, real-time experience by providing all the utilities required for an in-depth understanding of the course.

    If you are enrolled in classes and/or have paid fees but want to cancel the registration for any reason, you can do so within 48 hours of initial registration. Please note that refunds will be processed within 30 days of the request.

    The training itself is real-time and project-oriented.

    Yes. All training sessions are streamed live online through either WebEx or GoToMeeting, which promotes one-on-one interaction between trainer and student.

    Group discounts are available if there are more than 2 participants.

    As one of the leading providers of online training, we have customers from the USA, UK, Canada, Australia, India and other parts of the world.

    • 1 month, 1 week
    Contact Us

    +1 475-212-0075

    Course Features

    Live Instructor-led Classes

    This isn't canned learning. It's dynamic, it's interactive, it's effective.

    Expert Educators

    Only the best or they're out. We are constantly evaluating our trainers.

    24/7 Support

    We never sleep. Need something answered at 3 am? No problem.

    Flexible Schedule

    You don't learn according to our calendar. We work according to yours.

    Customized Training

    For the most part self-managed, and adaptable to suit each person's particular technology learning needs.

    Priority-Based Training

    Real-time Scenario based Assignments and Case Studies


    This website and team are not associated with SAP AG or any other product. We provide this service for education and training purposes only. We charge for the support services only, not for actual SAP system access. Keen Technologies does not provide official SAP training courses or certifications and does not provide any access to SAP software. SAP and its product names, including HANA, S/4HANA, HYBRIS, and LEONARDO, are trademarks or registered trademarks of SAP in Germany and other countries.

    IBM® is a registered trademark of IBM in the United States.

    Keen Technologies offers training in the following locations: Bangalore, Hyderabad, Chennai, Delhi, Kolkata, UK, London, Texas, Chicago, San Francisco, Dallas, Washington, New York, Houston, Atlanta, Orlando, Boston, Toronto, Ottawa, Windsor, Australia, Dubai, Leeds.

