Big Data With Hadoop

Hadoop is an open-source software project that enables the distributed processing of large data sets across clusters of commodity servers, changing the economics and dynamics of large-scale computing. The impact of Hadoop training and expertise can be boiled down to four salient characteristics. For obvious reasons, Hadoop-certified professionals are now in huge demand all over the world, with attractive pay packages and growth potential where the sky is the limit.

By the end of this training you will:
- Understand the core concepts of the Big Data module.
- Be able to apply the knowledge learned to progress in your career as a Big Data Developer.

Essential:
Minimum knowledge of object-oriented languages such as Core Java, Python or Ruby.

Recommended/Additional:
Experience in the above-mentioned languages is recommended but not essential.

Classroom Training: Instructor-led training in our dynamic learning environment at our office in West London. The classroom is fitted with all the essential amenities needed to ensure a comfortable training experience, and this format gives you the opportunity to network with other learners, share experiences and build social connections.

Online: Unlike most organisations, our online training is tutor-led and mirrors the classroom-based training in every aspect, making it more convenient for students in any location around the world as well as more cost-effective.

Onsite: This training is designed for corporate clients who wish to train their staff in different technologies. Clients can tailor the duration of the course to their requirements, and the training can be delivered in house, at a location of your choice, or online.

Customised one to one: A tailored course for students looking for the tutor's undivided attention at all times. The duration and contents of the course are customised to suit the student's requirements. In addition, the timing of the training can be adjusted based on the availability of both the tutor and the student.

5 Weekends, 40 Hours, 10AM - 2PM

Contractors can expect to earn between £300 and £500 per day, depending on experience. Permanent roles on average offer a salary of between £30k and £60k per annum, again depending on the experience required for the job.

Although there is no guarantee of a job on course completion, we are confident that you should be able to find a suitable position within a few weeks of successfully completing the course. As part of our placement service, we offer CV reviewing: your CV is reviewed by our experts, who recommend the changes needed so that it matches the kind of training you have taken.

Course Preview

- What is the motivation for Hadoop?
- Large-scale systems
- Survey of data storage literature
- Survey of data processing literature
- Overview of networking constraints
- Requirements for a new approach

- Hadoop introduction
- Hadoop's distributed file system
- How Hadoop MapReduce works
- Anatomy of a Hadoop cluster
- Hadoop daemons
- Master daemons
- NameNode
- JobTracker
- Secondary NameNode
- Slave daemons
- TaskTracker
- Hadoop Distributed File System (HDFS) (a minimal Java client sketch follows this list)
- Splits and blocks
- Input splits
- HDFS splits
- Data replication
- Hadoop rack awareness
- High availability of data
- Block placement and cluster architecture
- Hadoop case studies
- Best practices and performance tuning
- Developing MapReduce programs
- Local mode
- Running without HDFS
- Pseudo-distributed mode
- All daemons running on a single node
- Fully distributed mode
- Daemons running on dedicated nodes
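
To make the split, block and replication topics above concrete, here is a minimal sketch in Java. It is an illustrative example only, not part of the official course material: it assumes a standard Hadoop 2.x+ client on the classpath and uses a hypothetical /user/demo/sample.txt path.

```java
// Minimal HDFS client sketch (illustrative assumption): writes a small file
// into HDFS and prints its block locations and replication factor.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBlocksDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();       // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);           // HDFS in (pseudo-)distributed mode

        Path file = new Path("/user/demo/sample.txt");  // hypothetical path
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.writeUTF("hello hadoop");
        }

        // Show how the file is split into blocks and where the replicas live
        FileStatus status = fs.getFileStatus(file);
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("block hosts: " + String.join(",", loc.getHosts()));
        }
        System.out.println("replication factor: " + status.getReplication());
    }
}
```

In local mode the same code runs against the local filesystem, while in pseudo- or fully distributed mode FileSystem.get(conf) resolves to HDFS via fs.defaultFS, which ties directly to the deployment modes listed above.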

- Setting up a Hadoop cluster
- Overview of a Hadoop cluster setup
- Install and configure Apache Hadoop on a multi-node cluster
- Install and configure the Cloudera distribution in distributed mode
- Install and configure the Hortonworks distribution in fully distributed mode
- Configure the Greenplum distribution in fully distributed mode
- Monitor the cluster (see the sketch after this list)
- Get familiar with the Hortonworks and Cloudera management consoles
- NameNode safe mode
- Data backup
- Case studies
- Cluster monitoring
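
As a taste of basic cluster monitoring from code, the following sketch (again an illustrative assumption, not the distribution-specific consoles mentioned above) reads the configured default filesystem and reports HDFS capacity and usage through the standard FileSystem API.

```java
// Minimal cluster-health sketch (illustrative assumption): reports the
// configured default filesystem, default replication and block size, and
// overall HDFS capacity/usage.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;
import org.apache.hadoop.fs.Path;

public class ClusterStatusDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();          // reads *-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);

        System.out.println("fs.defaultFS        : " + conf.get("fs.defaultFS"));
        System.out.println("default replication : " + fs.getDefaultReplication(new Path("/")));
        System.out.println("default block size  : " + fs.getDefaultBlockSize(new Path("/")));

        FsStatus status = fs.getStatus();                  // capacity/used/remaining across the cluster
        System.out.println("capacity (bytes)    : " + status.getCapacity());
        System.out.println("used (bytes)        : " + status.getUsed());
        System.out.println("remaining (bytes)   : " + status.getRemaining());
    }
}
```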

- What is a MapReduce program?
- A sample MapReduce program (see the WordCount sketch after this list)
- Basic API concepts
- Driver code
- Mapper
- Reducer
- Hadoop Streaming
- Running several Hadoop jobs
- The configure and close methods
- Sequence files
- Record reader
- Record writer
- The Reporter and its role
- Counters
- Output collector
- Accessing HDFS
- ToolRunner
- Using the distributed cache
- Several MapReduce jobs in detail:
- Search using MapReduce
- Generating recommendations using MapReduce
- Processing log files using MapReduce
- Identifying mappers
- Identifying reducers
- Exploring problems using these applications
- Debugging MapReduce programs
- Unit testing with MRUnit
- Logging
- Debugging strategies
- Advanced MapReduce programming
- Secondary sort
- Customizing input and output formats
- MapReduce joins
- Monitoring and debugging on a production cluster
- Counters
- Skipping bad records
- Running in local mode
- MapReduce performance tuning
- Reducing network traffic with a combiner
- Partitioners
- Reducing input data
- Using compression
- Reusing the JVM
- Speculative execution
- Performance aspects
- Case studies
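
The driver, mapper, reducer and combiner pieces listed above fit together as in the classic WordCount example. Below is a minimal sketch, assuming the Hadoop 2.x MapReduce (org.apache.hadoop.mapreduce) API; input and output paths are taken from the command line.

```java
// Classic WordCount: counts word occurrences across the input files.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every token in the input split
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reducer (also used as a combiner to cut shuffle traffic): sum the counts per word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: wires the job together
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);   // combiner reduces network traffic
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

You would typically package this into a jar and run it with `hadoop jar wordcount.jar WordCount /input /output` (the paths here are hypothetical); the combiner line illustrates the "reducing network traffic with a combiner" tuning point from the list above.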

- NameNode availability
- NameNode federation
- Fencing
- MapReduce

- Introduction to HBase
- HBase concepts explained
- Overview of the HBase architecture
- Server architecture
- File storage architecture
- Column access
- Scans
- HBase use cases
- Installing and configuring HBase on a multi-node cluster
- Creating a database; developing and running sample applications (see the client sketch after this list)
- Accessing data stored in HBase using clients such as Python, Java and Perl
- MapReduce client
- HBase and Hive integration
- HBase administration tasks
- Defining a schema and its basic operations
- Cassandra basics
- MongoDB basics
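
To show what the HBase client topics above look like in code, here is a minimal Java sketch. It assumes the HBase 1.x+ client API and a hypothetical, pre-created table named `users` with a column family `info`; it writes one cell and reads it back.

```java
// Minimal HBase Java-client sketch (illustrative assumption): put and get one cell.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("users"))) {

            // Write one cell: row "row1", column family "info", qualifier "name"
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            // Read the same cell back
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println("info:name = " + Bytes.toString(value));
        }
    }
}
```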

- Sqoop
- Installing and configuring Sqoop
- Connecting to an RDBMS
- Installing MySQL
- Importing data from Oracle/MySQL into Hive
- Exporting data to Oracle/MySQL
- Internal mechanisms

- Oozie and its architecture
- The workflow XML file
- Installing and configuring Apache Oozie
- Workflow specification
- Action nodes
- Control nodes
- The job coordinator
- Avro, Scribe, Flume, Chukwa, Thrift
1. Concepts of Flume and Chukwa
2. Use cases of Scribe, Thrift and Avro
3. Installation and configuration of Flume
4. Creating a sample application