Hadoop Online Training

    HADOOP ONLINE TRAINING COURSE INTRODUCTION:

    Hadoop is an open-source software project that enables the distributed processing of large data sets across clusters of commodity servers. In doing so, it changes the economics and the dynamics of large-scale computing. As a result, Hadoop-certified professionals are in high demand worldwide, with competitive pay and strong growth potential.

    HADOOP TRAINING COURSE CONTENT:
    1. BASICS OF HADOOP

    • What is the motivation for Hadoop
    • Large-scale systems
    • Survey of data storage literature
    • Survey of data processing literature
    • Overview of networking constraints
    • Requirements for a new approach

    2. BASIC CONCEPTS OF HADOOP

    1. Hadoop Introduction
    2. Distributed file system of Hadoop
    3. How Hadoop MapReduce works
    4. Hadoop cluster and its anatomy
    5. Hadoop daemons
    6. Master daemons
    7. NameNode
    8. JobTracker
    9. Secondary NameNode
    10. Slave daemons
    11. TaskTracker
    12. Hadoop Distributed File System (HDFS)
    13. Splits and blocks
    14. Input splits
    15. HDFS splits
    16. Replication of data
    17. Hadoop rack awareness
    18. High availability of data
    19. Block placement and cluster architecture
    20. Hadoop case studies
    21. Best practices and performance tuning
    22. Development of MapReduce programs
    23. Local mode
    24. Running without HDFS
    25. Pseudo-distributed mode
    26. All daemons running on a single node
    27. Fully distributed mode
    28. Daemons running on dedicated nodes

    3. HADOOP ADMINISTRATION

    1. Setup of a Hadoop cluster
    2. Configure and install Apache Hadoop on a multi-node cluster
    3. Configure and install the Cloudera distribution in distributed mode
    4. Configure and install the Hortonworks distribution in fully distributed mode
    5. Configure the Greenplum distribution in fully distributed mode
    6. Monitor the cluster
    7. Get familiar with the Cloudera and Hortonworks management consoles
    8. NameNode in safe mode
    9. Data backup
    10. Case studies

    4. HADOOP DEVELOPMENT

    • What is a MapReduce program
    • Sample MapReduce program
    • API concepts and their basics
    • Driver code
    • Mapper
    • Reducer
    • Hadoop Streaming API
    • Running several Hadoop jobs
    • The configure and close methods
    • Sequence files
    • Record reader
    • Record writer
    • Reporter and its role
    • Counters
    • Output collector
    • Accessing HDFS
    • ToolRunner
    • Using the distributed cache
    • Several MapReduce jobs (in detail)
    • Search using MapReduce
    • Generating recommendations using MapReduce
    • Processing log files using MapReduce
    • Identity mapper
    • Identity reducer
    • Exploring problems using MapReduce applications
    • Debugging the MapReduce Programs
    • Unit testing with MRUnit
    • Logging
    • Debugging strategies
    • Advanced MapReduce Programming
    • Secondary sort
    • Output and input format customization
    • Mapreduce joins
    • Monitoring & debugging on a Production Cluster
    • Counters
    • Skipping Bad Records
    • Running in local mode
    • MapReduce performance tuning
    • Reducing network traffic with a combiner
    • Partitioners
    • Reducing the amount of input data
    • Using compression
    • Reusing the JVM
    • Speculative execution
    • Performance Aspects
    • CASE STUDIES
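
As a rough illustration of the mapper, reducer, and driver topics listed above, here is a minimal sketch of a word count in plain Python. It only simulates the map, shuffle, and reduce phases in memory and does not use the Hadoop API; the function names `mapper`, `shuffle`, `reducer`, and `run_job` are hypothetical and chosen for this example.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce step: sum the counts collected for a single word."""
    return word, sum(counts)


def run_job(lines):
    """Driver: wire the map, shuffle, and reduce steps together."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(["Hadoop stores data", "Hadoop processes data"]))
```

On a real cluster the framework performs the shuffle and runs many mapper and reducer tasks in parallel; the sketch keeps everything in one process so the data flow is easy to follow.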

    5. CDH4 ENHANCEMENTS

    • NameNode high availability
    • NameNode federation
    • Fencing
    • MapReduce

    6. HADOOP ANALYST

    • Hive concepts
    • Hive and its architecture
    • Installing and configuring Hive on a cluster
    • Types of tables in Hive
    • Hive library functions
    • Buckets
    • Partitions
    • Joins (inner joins and outer joins)
    • Hive UDFs

    7. PIG

    • Basics of Pig
    • Installing and configuring Pig
    • Pig library functions
    • Pig vs. Hive
    • Writing sample Pig Latin scripts
    • Modes of running
        1. Grunt shell
        2. Java program
    • Pig UDFs
    • Pig macros
    • Debugging Pig

    8. IMPALA

    • Differences between Pig, Hive, and Impala
    • Does Impala give good performance?
    • Exclusive features
    • Impala and its challenges
    • Use cases

    9. NOSQL

    • Introduction to HBase
    • HBase concepts
    • Overview of HBase architecture
    • Server architecture
    • File storage architecture
    • Column access
    • Scans
    • HBase use cases
    • Installation and configuration of HBase on a multi-node cluster
    • Creating a database, developing and running sample applications
    • Accessing data stored in HBase using clients such as Python, Java, and Perl
    • MapReduce client
    • HBase and Hive integration
    • HBase administration tasks
    • Defining a schema and its basic operations
    • Cassandra basics
    • MongoDB basics

    10. ECOSYSTEM COMPONENTS

    • Sqoop
    • Configuring and installing Sqoop
    • Connecting to an RDBMS
    • Installation of MySQL
    • Importing data from Oracle/MySQL into Hive
    • Exporting data to Oracle/MySQL
    • Internal mechanism

    11. OOZIE

    • Oozie and its architecture
    • XML file
    • Installing and configuring Apache Oozie
    • Workflow specification
    • Action nodes
    • Control nodes
    • Job coordinator
    • Avro, Scribe, Flume, Chukwa, Thrift
        1. Concepts of Flume and Chukwa
        2. Use cases of Scribe, Thrift, and Avro
        3. Installation and configuration of Flume
        4. Creation of a sample application

     

    Contact Us

    +1 475-212-0075

    Course Features

    Live Instructor-led Classes

    This isn't canned learning. It's dynamic, it's interactive, it's effective.

    Expert Educators

    Only the best, or they're out. We are constantly evaluating our trainers.

    24/7 Support

    We never sleep. Need something answered at 3 am? No problem.

    Flexible Schedule

    You don't learn according to our calendar. We work according to yours.

    Customized Training

    Largely self-managed and adaptable to suit each person's particular technology learning needs.

    Priority-Based Training

    Real-time, scenario-based assignments and case studies.

    COPYRIGHT © 2020 KEEN IT TECHNOLOGIES PVT.LTD, ALL RIGHTS RESERVED