Six Weeks Big Data and Hadoop Summer Training in Kanpur
About Hadoop Training
Hadoop is an open-source framework for storing and processing big data in a distributed environment, across clusters of computers, using
simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
This brief tutorial provides a quick introduction to Big Data, the MapReduce algorithm, and the Hadoop Distributed File System (HDFS).
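As a rough illustration of the MapReduce model mentioned above, the following is a minimal single-machine sketch in plain Python. It does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. It shows the three steps a word count goes through: map each line to (word, 1) pairs, group the pairs by word, then sum each group.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, imitating what happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

On a real cluster the map and reduce steps would run in parallel on different machines, and the grouping step would move data across the network; this sketch only mirrors the data flow.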
Introduction to Big Data
Types of data and their significance
Need for Big Data analytics.
Why Big Data with Hadoop?
History of Hadoop.
Node, Rack, Cluster.
Architecture of Hadoop.
Characteristics of NameNode.
Significance of JobTracker and TaskTrackers.
hase coordination with JobTracker.
Secondary NameNode usage and workaround.
Hadoop releases and their significance.
Workaround with DataNodes.
YARN architecture.
Significance of scalability of operation.
Use cases where not to use Hadoop.
Use cases where Hadoop is used.
Facebook, Twitter, Snapdeal, Flipkart.
Hadoop Java API
Hadoop classes. What is MapReduceBase?
Mapper Class and its Methods.
What is a Partitioner, and its types.
Hadoop-specific data types
Working on unstructured data analytics.
What is an iterator and its usage techniques.
Types of mappers and reducers.
What is OutputCollector and its significance.
Workaround with Joining of datasets.
Complications with MapReduce.
MapReduce anatomy.
Anagram example, TeraGen example, TeraSort example
WordCount example
Working with multiple mappers.
Working with weather data on multiple DataNodes in a fully distributed architecture.
Use cases where MapReduce anatomy fails.
Interview questions based on Java MapReduce
Working with Pig Latin - I
Introduction to Pig Latin
History and evolution of Pig Latin
Why Pig is used only with Big Data
Pig architecture and overview of Compiler and Execution Engine.
Pig releases and their significance, with bug fixes.
Pig-specific data types
Complex data types
Bags, Tuples, Fields
Pig-specific methods.
Comparison Between Yahoo Pig & Facebook Hive.
Working with Grunt Shell.
Grunt commands (total 17)
Pig data input techniques for flat files (comma-separated, tab-delimited and fixed-width).
Working with a schemaless approach
How to attach a schema to a file/table in Pig.
Schema referencing for similar tables and files.
Working with delimiters
Working with Pig Latin - II
Working with BinStorage and TextLoader.
Big Data operations and read/write analogy.
Filtering Datasets
Filtering rows with a specific condition
Filtering rows with multiple conditions
Filtering rows with string-based conditions
Sorting Datasets
Sorting rows by a specific column or columns
Multilevel Sort
Analogy of a sort operation
Grouping datasets and Co-grouping data
Joining Datasets
Types of Joins supported by Pig Latin
Aggregate operations such as average, sum, min, max, count
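The dataset operations listed in this module (filtering, multilevel sorting, grouping and aggregation) can be pictured with a small plain-Python sketch. This is not Pig Latin and does not use any Pig API; the sample `records` (city, year, temperature) and all function names are hypothetical, made up only to show what each operation does to a set of rows.

```python
from collections import defaultdict

# Hypothetical sample rows: (city, year, temperature)
records = [
    ("kanpur", 2019, 31.5),
    ("kanpur", 2020, 33.0),
    ("delhi", 2019, 30.0),
    ("delhi", 2020, 29.0),
]


def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate returns True."""
    return [row for row in rows if predicate(row)]


def multilevel_sort(rows):
    """Sort by city ascending, then by temperature descending within a city."""
    return sorted(rows, key=lambda row: (row[0], -row[2]))


def group_by_city(rows):
    """Collect rows into one list per city."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return dict(groups)


def aggregate(rows):
    """Compute count, sum, min, max and average of the temperature column."""
    temps = [row[2] for row in rows]
    return {
        "count": len(temps),
        "sum": sum(temps),
        "min": min(temps),
        "max": max(temps),
        "avg": sum(temps) / len(temps),
    }


if __name__ == "__main__":
    # Filter with multiple conditions: year 2020 and temperature above 30
    print(filter_rows(records, lambda r: r[1] == 2020 and r[2] > 30))
    print(multilevel_sort(records))
    for city, rows in group_by_city(records).items():
        print(city, aggregate(rows))
```

In Pig the same steps would be written as relational statements over bags of tuples and executed across the cluster; the sketch only shows the row-level effect of each operation.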