Hi All,
We are providing Classroom & Online training on Hadoop in Marathahalli, Bangalore.
If you are interested please contact us.
Email: [email protected]
Phone: 9066830093
Address:
1st Floor, Dharneesh Building,
5th Cross, Ramanjaneya Layout,
Marathahalli, Bangalore.
Land Mark: Opp. Kalamandir, Beside Vinus Bakery Lane.
Hadoop Course Content
A) Big Data – Motivation & Basics.
B) Hadoop Administration – Architecture, Setup, Management & Maintenance (prerequisite: Linux).
C) Hadoop Development – a) MapReduce (Basics) b) Real-World MapReduce (Advanced) (prerequisite: Java).
D) Corporate Technologies – Hive, Pig, HBase, Oozie, Flume, Sqoop, Zookeeper, Mahout (prerequisite: SQL).
E) Cloud Computing – Concepts & deploying Hadoop on the cloud (AWS: EC2, S3, EMR & others, as per project requirements).
F) Beyond Hadoop – Storm, Spark, Mesos, etc., and the future scope of Hadoop alongside these emerging technologies.
———————————
A) Big Data
What is big data?
Challenges in big data
Challenges in traditional Applications
New Requirements
Introducing Hadoop
Brief History of Hadoop
Features of Hadoop
Overview of Hadoop Ecosystem
Overview of MapReduce
B) Hadoop Administration
1) Linux –
Basic architecture
Important commands
File permission and ownership
Administration
Communication
Pipes, etc.
2) Setting up a Single-Node (Pseudo-Distributed) Cluster
Important Directories,
Configuring HDFS & Important Configuration Properties.
3) Interacting with HDFS.
Common Example Operations
HDFS Command Reference
DFSAdmin Command Reference
Using HDFS For MapReduce
HDFS Web Interface
Setting up a Multi-Node Cluster.
Hands-on Exercises and Assignment.
4) Additional HDFS Tasks
Rebalancing Blocks
Copying Large Sets of Files
Decommissioning Nodes
Verifying File System Health
Rack Awareness
Cluster Configuration
Small Clusters: 2-10 Nodes
Medium Clusters: 10-40 Nodes
Large Clusters: Multiple Racks
Performance Monitoring
Ganglia
Nagios
Hands-on Exercises and Assignment
C) Hadoop Development
1) MapReduce -1
Java – basic OOP concepts, serialization, I/O, collections, sorting, etc.
Configuring the Eclipse environment for MapReduce development & running the first program.
Hands-on Exercises and Assignment.
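The course's first program is written in Java inside Eclipse; as a rough sketch of the same word-count logic in Hadoop Streaming style (the sample data and function names are invented here, and these are plain functions rather than the two stdin/stdout scripts real Streaming jobs use):

```python
from itertools import groupby

# Word-count "first program" sketch: the mapper emits (word, 1) pairs,
# the reducer sums the counts per word. Illustrative only -- the course's
# actual first program is a Java Mapper/Reducer.

def mapper(lines):
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    # Hadoop delivers each reducer's input sorted by key; sorted() + groupby
    # stands in for that here.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    text = ["the quick brown fox", "the lazy dog"]
    print(dict(reducer(mapper(text))))
```

In a real Streaming job the same mapper and reducer would read lines from stdin and write tab-separated key/value pairs to stdout, with Hadoop handling the sort between them.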
2) MapReduce -2
Detailed explanation of the first program, describing the Mapper, Reducer, and Driver.
MapReduce algorithms and the whole process flow – map, partition, sort, shuffle, reduce.
Related terms – Input Formats, Input Splits, Speculative Execution, etc.
Other related algorithms – Combiner, Partitioner.
Hands-on Exercises and Assignment.
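As a plain-Python illustration (not Hadoop code) of the process flow listed above, the phases can be simulated end to end; the two simulated reducers and hash-based routing mimic the idea behind Hadoop's default HashPartitioner:

```python
from collections import defaultdict

# Simulated MapReduce flow: map -> partition -> shuffle/sort -> reduce.
# Splits, data, and the reducer count are invented for illustration.

NUM_REDUCERS = 2

def map_phase(split):
    # map: each input line becomes (word, 1) pairs
    return [(w, 1) for line in split for w in line.split()]

def partition(key, num_reducers=NUM_REDUCERS):
    # partition: route each key to one reducer (HashPartitioner-style)
    return hash(key) % num_reducers

def shuffle(mapped):
    # shuffle: group pairs by target reducer, then sort each group by key,
    # mirroring how each reducer receives its input sorted
    buckets = defaultdict(list)
    for key, value in mapped:
        buckets[partition(key)].append((key, value))
    return {r: sorted(pairs) for r, pairs in buckets.items()}

def reduce_phase(pairs):
    # reduce: sum the values for each key
    out = {}
    for key, value in pairs:
        out[key] = out.get(key, 0) + value
    return out

splits = [["the quick brown fox"], ["the lazy dog the end"]]
mapped = [kv for s in splits for kv in map_phase(s)]
result = {}
for pairs in shuffle(mapped).values():
    result.update(reduce_phase(pairs))
print(result)
```

Which key lands on which reducer varies (Python salts string hashes between runs), but the merged counts are the same either way, just as a MapReduce job's output is independent of how keys were partitioned.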
3) MapReduce -3
Discussion and solutions of various programs and their real-world use cases.
Local runner and usage of ToolRunner.
Setup/cleanup methods in the mapper/reducer.
Passing parameters to the mapper and reducer.
Searching algorithms.
Distributed cache.
Hands-on Exercises and Assignment.
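The setup/cleanup and distributed-cache items above fit a common pattern: load a small side table once per task, then consult it on every record. A minimal Python sketch of that pattern (the class, lookup table, and record fields are invented for illustration; in Java this would be a file shipped via the distributed cache and read in Mapper.setup()):

```python
# Stand-in for a small side file distributed to every node.
COUNTRY_LOOKUP = {"IN": "India", "US": "United States"}

class EnrichMapper:
    def setup(self, cache):
        # Runs once per task before any map() call, like Mapper.setup() in Java;
        # this is where a cached side file would be parsed into memory.
        self.lookup = cache

    def map(self, record):
        # Enrich each "user,country_code" record using the cached table.
        user, code = record.split(",")
        yield user, self.lookup.get(code, "Unknown")

mapper = EnrichMapper()
mapper.setup(COUNTRY_LOOKUP)
rows = ["alice,IN", "bob,US", "eve,XX"]
print([kv for r in rows for kv in mapper.map(r)])
```

The same shape also covers parameter passing: values set in the job configuration are read once in setup() instead of being re-parsed for every record.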
4) Real-World MapReduce -1 (Advanced)
Creating custom keys and values.
Creating a custom partitioner.
Writing a custom input format.
Hands-on Exercises and Assignment.
5) Real-World MapReduce -2
Implementing custom comparators.
Secondary sorting.
Relational manipulation – map-side and reduce-side joins.
Real-world data mining.
Hands-on exercises and Assignment.
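Secondary sorting ties together the custom keys and comparators above: sort on a composite key (natural key, secondary field) but group on the natural key alone, so each reducer sees its values in order. A pure-Python sketch of that idea, with made-up (symbol, timestamp, price) trade data:

```python
from itertools import groupby

# Invented sample data: (stock symbol, timestamp, price).
trades = [("ACME", 3, 10.5), ("ACME", 1, 10.1), ("ZETA", 2, 7.0), ("ACME", 2, 10.3)]

# Sort comparator: order by the composite key (symbol, timestamp),
# the role of the custom comparator on the composite key.
ordered = sorted(trades, key=lambda t: (t[0], t[1]))

# Grouping comparator: group by the natural key (symbol) only, so each
# group's prices arrive already time-ordered.
for symbol, group in groupby(ordered, key=lambda t: t[0]):
    prices = [price for _, _, price in group]
    print(symbol, prices)
```

In Hadoop the same effect needs a custom WritableComparable key plus separate sort and grouping comparators, since the framework, not the reducer, performs the sorting.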
D) Corporate Technologies
1) Sqoop, HBase & Other Components
Introduction to Sqoop, Hive, Pig, Oozie, Flume, and Mahout – their use cases and installation.
Introduction to HBase – architecture, MapReduce integration, the different client APIs, features, and administration.
Hands-on exercises and Assignment.
2) Hive
Understanding Hive – architecture, physical model, data types.
HiveQL – DDL, DML, and other operations.
Understanding tables in Hive – partitioning, indexes, bucketing, joining tables, data loading, etc.
Hands-on Exercises and Assignment.
3) Pig
Understanding Pig – its different modes and data model.
Advanced Pig Latin – evaluation and filter functions, etc.
Real-time use cases.
When to use Pig and when to use Hive.
Hands-on Exercises and Assignment.
E) Cloud Computing
Introduction, options, and how to use them.
AWS (Amazon Web Services) – registration and AMI setup.
Creating a multi-node cluster using S3 and EC2, and running MapReduce, Pig, and Hive programs.
F) Beyond Hadoop
Current world scenario and near future expectations.
Storm, Spark, Mesos, etc., and the future scope of Hadoop alongside these emerging technologies.
Job exploration.