Custom Input Format in MapReduce

Custom Input Format: Before implementing a custom input format, it helps to understand what an InputFormat is. InputFormat describes the input specification for a MapReduce job (from the Hadoop wiki). The MapReduce framework relies on the job's InputFormat to: validate the input specification of the job, and split the input file(s) into logical InputSplits, each of which is then assigned to an individual Mapper. … More Custom Input Format in MapReduce
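As a rough sketch of what the full post covers: a custom input format in the new MapReduce API (`org.apache.hadoop.mapreduce`) typically extends `FileInputFormat` and supplies a matching `RecordReader`. The class names `WholeFileInputFormat` and `WholeFileRecordReader` below are illustrative, not from the post; the example assumes the `hadoop-mapreduce-client-core` library on the classpath and reads each whole file as a single record.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Illustrative custom InputFormat: each whole file becomes one record.
public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false; // one file == one split == one record
    }

    @Override
    public RecordReader<NullWritable, BytesWritable> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        return new WholeFileRecordReader();
    }
}

// The RecordReader turns a split into (key, value) pairs for the Mapper.
class WholeFileRecordReader extends RecordReader<NullWritable, BytesWritable> {
    private FileSplit split;
    private TaskAttemptContext context;
    private final BytesWritable value = new BytesWritable();
    private boolean processed = false;

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context) {
        this.split = (FileSplit) split;
        this.context = context;
    }

    @Override
    public boolean nextKeyValue() throws IOException {
        if (processed) {
            return false;
        }
        // Read the entire file backing this split into the value.
        byte[] contents = new byte[(int) split.getLength()];
        Path file = split.getPath();
        FileSystem fs = file.getFileSystem(context.getConfiguration());
        FSDataInputStream in = null;
        try {
            in = fs.open(file);
            IOUtils.readFully(in, contents, 0, contents.length);
            value.set(contents, 0, contents.length);
        } finally {
            IOUtils.closeStream(in);
        }
        processed = true;
        return true;
    }

    @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
    @Override public BytesWritable getCurrentValue() { return value; }
    @Override public float getProgress() { return processed ? 1.0f : 0.0f; }
    @Override public void close() { }
}
```

A job would wire this in with `job.setInputFormatClass(WholeFileInputFormat.class)`; returning `false` from `isSplitable` is what keeps the framework from cutting a file across Mappers.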

Hadoop Installation : 2.6.0 Part II

This post is a continuation of Part I; please check Part I here. We have downloaded Hadoop and configured SSH. Now we are going to start with the Hadoop configuration files. 3. /usr/local/hadoop/etc/hadoop/core-site.xml: The core-site.xml file contains configuration properties that Hadoop uses when starting up. This file can be used to override the default … More Hadoop Installation : 2.6.0 Part II
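For context, a minimal core-site.xml for a single-node setup usually looks like the fragment below. The property names (`fs.defaultFS`, `hadoop.tmp.dir`) are standard Hadoop 2.x keys, but the host, port, and temp directory are example values, not ones taken from the post:

```xml
<?xml version="1.0"?>
<!-- /usr/local/hadoop/etc/hadoop/core-site.xml (example values) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <description>NameNode URI; daemons and clients use this as the default file system.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>Base directory for Hadoop's temporary files.</description>
  </property>
</configuration>
```

Anything set here overrides the built-in defaults from core-default.xml.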

Open Data Platform

The Open Data Platform (ODP) initiative is an industry effort focused on simplifying adoption of Apache Hadoop for the enterprise, and enabling big data solutions to flourish through improved ecosystem interoperability. It relies on the governance of the Apache Software Foundation community to innovate and deliver the Apache project technologies included in the ODP core … More Open Data Platform

Moving Big Data from Mainframe to Hadoop

A post from the Cloudera blog. Apache Sqoop provides a framework to move data between HDFS and relational databases in parallel using Hadoop's MapReduce framework. As Hadoop becomes more popular in enterprises, there is a growing need to move data from non-relational sources, such as mainframe datasets, into Hadoop. The following are possible reasons for this: HDFS … More Moving Big Data from Mainframe to Hadoop
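To make the parallel-transfer point concrete, a typical Sqoop import from a relational source into HDFS looks like the command below. The JDBC URL, database, table, and target directory are illustrative placeholders; the flags themselves (`--connect`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop import options:

```shell
# Illustrative Sqoop import: pull one table into HDFS using 4 parallel map tasks.
# Connection string, credentials, table, and paths are example values.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table transactions \
  --target-dir /data/sales/transactions \
  --num-mappers 4
```

Each map task imports a disjoint slice of the table, which is how Sqoop gets its parallelism from the MR framework.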