Hadoop online training with all modules, Hadoop training, Hadoop training in Hyderabad, Hadoop training with placement assistance, big data Hadoop training, big data training, project development on big data

Introduction to Hadoop

· What is big data?

o Big data challenges?

o How is Hadoop related to big data?

o Problems with storing/processing of big data

o Working with traditional large scale systems

· What is Hadoop?

o Hadoop core components – HDFS & MR

o Hadoop ecosystem – other tools

o Hadoop distributions and differences: Cloudera, Hortonworks, MapR

o Real-time Hadoop scenarios with various use cases

· HDFS (Hadoop Distributed File System)

· DFS vs HDFS and Cluster vs Hadoop Clusters

· Features of HDFS

· HDFS Architecture

· HDFS storage

o Blocks, configuring blocks, default vs custom block sizes

o HDFS architecture

o Replication in HDFS

· Failover mechanism

· Custom replication and configuring replication factors

· Daemons of Hadoop 1.x:

o NameNode and functionality

o DataNode and functionality

o Secondary NameNode and functionality

o JobTracker and functionality

o TaskTracker and functionality

· Daemons of Hadoop 2.x:

· NameNode, DataNode, Secondary NameNode, ResourceManager, NodeManager

· Hadoop cluster modes

· Single-node vs multi-node

· HDFS federation

· High availability
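
The HDFS storage topics above (block sizes and replication factors) come down to simple arithmetic. The following is a minimal sketch, not Hadoop code: the function name `hdfs_footprint` is hypothetical, and the defaults assume a 128 MB block size (the Hadoop 2.x default; Hadoop 1.x used 64 MB) and a replication factor of 3.

```python
MB = 1024 * 1024

def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (number of blocks, raw bytes stored across the cluster).

    Hypothetical helper for illustration only; it is not part of any Hadoop API.
    """
    if file_size_bytes == 0:
        return 0, 0
    # Ceiling division: a partially filled last block still counts as a block.
    blocks = -(-file_size_bytes // block_size_bytes)
    # The last block is not padded, so raw usage is file size times replication.
    return blocks, file_size_bytes * replication

# A 1000 MB file with the assumed Hadoop 2.x defaults.
blocks, raw = hdfs_footprint(1000 * MB)
print(blocks, raw // MB)
```

Passing a custom block size or replication factor, for example `hdfs_footprint(1000 * MB, 64 * MB, 2)`, shows how the block count and raw storage change when those settings are overridden.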

MAP REDUCE

Ø Map Reduce life cycle

o Communication mechanism of processing daemons

o Input format and Record reader classes

o Success case vs Failure case scenarios

o Retry mechanism in Map Reduce

Ø Map Reduce programming

o Different phases of Map Reduce algorithm

o Different data types in Map Reduce

o Primitive data types Vs Map Reduce data types

o How to write map reduce programs

Ø Driver Code

o Importance of driver code in a Map Reduce program

o How to identify the driver code in Map Reduce program

o Different sections of driver code

Ø Mapper Code

o Importance of Mapper Phase in Map Reduce

o How to write a Mapper class, methods in Mapper class

Ø Reducer Code

o Importance of Reduce Phase in Map Reduce

o How to write a Reducer class, methods in Reducer class

Ø Input split

o Need for input splits in Map Reduce

o Input Split size vs block size

o Input split vs mappers

Ø Identity Mapper & Identity Reducer

Ø Input formats in Map Reduce

o Text input format

o Key value text input format

o Sequence file input format

o How to use the specific input format in Map Reduce

o Custom input formats and their record readers

Ø Output formats in Map Reduce

o Text output format

o Key value text output format

o Sequence file output format

o How to use the specific output format in Map Reduce

o Custom output formats and their record writers

Ø Map Reduce API

o New API vs Deprecated API

Ø Combiner in Map Reduce

o Usage of combiner class in map reduce

o Performance trade-offs

Ø Partitioner in map reduce

o Importance of partitioner class in map reduce

o Writing custom partitioners

Ø Compression techniques in map reduce

o Importance of compression in map reduce

o What is a codec?

o Compression types

ü GZipCodec

ü BZip and BZip2 Codec

ü LZOCodec

ü Snappy Codec

o map reduce streaming

o data locality

o secondary sorting using map reduce

o enable and disable these techniques for all jobs

o enable and disable these techniques for a particular job

Ø join in Map Reduce

o map side vs reduce side join

o performance trade-offs

Ø distributed cache

Ø counters

Ø map reduce schedulers

Ø Debugging map reduce jobs

Ø Chain mappers and reducers

Ø Setting the number of reducers
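
The Map Reduce topics above (mapper, combiner, partitioner, shuffle and sort, reducer, number of reducers) can be sketched as a small in-memory simulation. This is an illustration only and does not use the Hadoop API: all function names are hypothetical, and the partitioner uses a simple byte-sum hash as a stand-in for Hadoop's default hash partitioning on the key's `hashCode()`.

```python
from collections import defaultdict

def mapper(line):
    # Map phase: emit (word, 1) for every word in the input line.
    for word in line.lower().split():
        yield word, 1

def combiner(pairs):
    # Combiner: pre-aggregate one mapper's output to cut shuffle traffic.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return sorted(local.items())

def partition(key, num_reducers):
    # Partitioner: choose the reducer for a key (stand-in hash, an assumption).
    return sum(key.encode("utf-8")) % num_reducers

def reducer(key, values):
    # Reduce phase: sum all counts received for one key.
    return key, sum(values)

def run_job(splits, num_reducers=2):
    # One bucket per reducer; each bucket groups values by key (the shuffle).
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # one mapper per input split
        mapped = [pair for line in split for pair in mapper(line)]
        for key, value in combiner(mapped):
            buckets[partition(key, num_reducers)][key].append(value)
    output = []
    for bucket in buckets:
        for key in sorted(bucket):  # keys reach each reducer in sorted order
            output.append(reducer(key, bucket[key]))
    return output

splits = [["big data big"], ["data hadoop"]]
result = dict(run_job(splits))
print(result)
```

Changing `num_reducers` changes how keys are spread across reducers but not the final counts, which is the property a custom partitioner must preserve.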

Apache Pig

Ø Introduction to pig

o Introduction to pig

o Installing and running pig

o Pig Latin scripts

o Pig console: grunt shell

o Data types

o Writing evaluation

o Filter

o Load and store functions

Ø Relational operators in pig

o COGROUP

o CROSS

o DISTINCT

o FILTER

o FOREACH

o GROUP

o JOIN(INNER)

o JOIN(OUTER)

o LIMIT

o LOAD

o ORDER

o SAMPLE

o SPLIT

o STORE

o UNION

Ø Diagnostic operators in pig

o describe

o dump

o explain

o illustrate

Ø eval functions in pig

o AVG

o CONCAT

o COUNT

o DIFF

o IsEmpty

o MAX

o MIN

o SIZE

o SUM

o TOKENIZE

Ø MR Vs Pig

Ø Different modes of execution

Ø Comparison with RDBMS(SQL)

Ø Pig User Defined Functions(UDF)

Ø Need of using UDF

Ø How to use UDFs

Ø REGISTER keyword

HIVE

Ø Hive introduction

Ø comparison with traditional database

Ø need of apache hive in hadoop

Ø SQL vs HiveQL

Ø map reduce and local mode

Ø hive architecture

o driver

o compiler

o executor (semantic analyser)

Ø Metastore in Hive

o Importance of the Hive metastore

o External metastore configuration

o Communication mechanism with the metastore

Ø Hive query language (HiveQL)

Ø HiveQL: data types

Ø Operators and functions

Ø Hive tables (managed tables and external tables)

Ø HiveQL data manipulations

o Loading data in tables

o Exporting data

o Different types of joins

Ø Hive scripting

Ø Indexing

Ø Views

Ø Appending data into existing hive table

Ø Data Slicing mechanisms

o Partitions in hive

o Buckets in hive

o Partitioning Vs Bucketing

o Real time use cases

Ø User-defined functions (UDFs) in Hive

o UDFs

ü How to write UDFs

ü Importance of UDFs

HBASE

Ø HBase introduction

Ø HDFS vs HBase

Ø HBase use cases

Ø HBase basics

o Column families

o HTable

Ø HBase architecture

Ø HBase tables

Ø HBase storage handlers

Ø HBase usage

o Key design

o Bloom filters

o Versioning

o Coprocessor

o Filters

SQOOP

Ø Introduction to sqoop

Ø MySQL client and server installation

Ø How to connect to relational database using sqoop

Ø Different sqoop commands

Ø Different flavours of imports

Ø Different flavours of exports

OOZIE

Ø OOZIE introduction

Ø OOZIE architecture

Ø OOZIE Execution

o workflow.xml

o coordinator.xml

o Job coordinator.properties

Ø OOZIE as a scheduler

Ø OOZIE as a workflow designer

Hadoop Administration

Ø Hadoop single node cluster setup

o Operating system installation

o JDK installation

o SSH configuration

o Dedicated group and user creation

o Hadoop installation

o Different configuration file settings

o NameNode format

o Starting the hadoop daemons

Ø PIG installation (local mode, cluster mode)

Ø SQOOP installation

o Sqoop installation with MySQL client

Ø Hive installation

Ø Hbase installation (local mode and clustered mode)

Ø OOZIE installation
