
Eclipse Setup for Hadoop Development

by Shanti Subramanyam

Objectives

We will learn the following things with this Eclipse setup for Hadoop tutorial:
  • Setting up the Eclipse plugin for Hadoop
  • Testing the setup by running Hadoop MapReduce jobs

Prerequisites

The following are the prerequisites for setting up Eclipse for Hadoop program development using MapReduce and its extensions.
  • You should have the latest stable build of Hadoop installed (1.0.3 as of this writing).
  • You should have Eclipse installed on your machine. Versions of Eclipse up to 3.6 are compatible with the Eclipse plugin (it does not work with Eclipse 3.7). Please refer to Link 6 of the Appendix for details on how to get and install Eclipse for your platform of development.
  • It is expected that you have preliminary knowledge of Java programming and are familiar with the concepts involved in Java programming, such as classes and objects, inheritance, and interfaces/abstract classes.
  • Please refer to the blog post describing Hadoop installation (refer here).

Procedure

1. Download the Eclipse plugin from the link given here.
We are using this plugin because it supports the newer versions of Eclipse and Hadoop; the Eclipse plugins packaged with earlier versions of Hadoop do not work.
2. After downloading the plugin, copy it into the plugins folder of Eclipse.
  • Windows: Go to the Eclipse folder, located by default at C:\eclipse (you may have installed it elsewhere). Copy the downloaded plugin into the C:\eclipse\plugins folder.
  • Mac OS X: Eclipse comes as an archive. Unarchive it and find the folder ../eclipse/plugins/. Copy the downloaded plugin into it.
  • Linux: The plugins folder is in the Eclipse installation directory, normally /usr/share/eclipse on Linux machines.
3. After you have copied the plugin, start Eclipse (restart it if it was already running) so that the changes are reflected in the Eclipse environment.
Go to the Window option of the menu bar in Eclipse, open the Open Perspective submenu, and select the option "Other" from the drop-down list.
Select the option Map/Reduce from the list of Perspectives.
As you select the "Map/Reduce" perspective, you will notice a few additions to the Eclipse views. Notice the perspective name shown in the Eclipse window.
The DFS Locations are now directly accessible through Eclipse. Use the Map/Reduce Locations view to configure Hadoop access from Eclipse.
4. You have now set up Eclipse for MapReduce programming.
Right-click in the Eclipse Package Explorer view, or click the File option in the menu bar of your Eclipse distribution.
Select the New option and then the "Other" option in it (Ctrl+N or Command+N).
Select Map/Reduce Project
Fill in the details in the Project Wizard.
The Hadoop library location must be set to the directory specified by $HADOOP_HOME.
Now, after the project has been created, create the Mapper, Reducer, and Driver classes by right-clicking the project and choosing them from the New option.
Create a Mapper class.
After creating a Mapper class, the code snippet generated is as follows:

… // Necessary classes imported

public class BookCrossingMapper extends MapReduceBase implements Mapper {

    public void map(WritableComparable key, Writable values,
            OutputCollector output, Reporter reporter) throws IOException {

    }

}

Please replace the above code snippet with the one below:

… // Necessary classes imported

public class BookCrossingMapper extends MapReduceBase implements Mapper<K, V, K, V> {

    @Override
    public void map(K arg0, V arg1, OutputCollector<K, V> arg2,
            Reporter arg3) throws IOException {
        // TODO Replace K and V with the suitable data types
    }

}

Replace K and V with the required key and value data types, such as LongWritable, Text, etc. Repeat the same steps when creating the new Reducer class; a concrete sketch with the types filled in is shown below.
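For illustration, here is a minimal sketch of what the Mapper and Reducer might look like once K and V are filled in, using the old org.apache.hadoop.mapred API that the plugin generates against. The concrete types (LongWritable/Text input, Text/IntWritable output) and the placeholder logic are assumptions for this example, not the actual BookCrossing logic.

// BookCrossingMapper.java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class BookCrossingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);

    @Override
    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Placeholder logic: emit the whole input line as the key with a count of 1.
        output.collect(new Text(value.toString()), ONE);
    }
}

// BookCrossingReducer.java (a separate source file)
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class BookCrossingReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Sum the counts emitted by the mapper for each key.
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}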

You are now ready to start programming in MapReduce.
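For completeness, here is a minimal sketch of the Driver class mentioned in step 4, using the old JobConf API from org.apache.hadoop.mapred that matches the templates above. The job name, output types, and the use of command-line arguments for the input and output paths are assumptions for illustration.

// BookCrossingDriver.java (hypothetical driver for the sketches above)
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class BookCrossingDriver {

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(BookCrossingDriver.class);
        conf.setJobName("book-crossing-example");

        conf.setMapperClass(BookCrossingMapper.class);
        conf.setReducerClass(BookCrossingReducer.class);

        // Key/value types emitted by the reducer (and, in this sketch, also by the mapper).
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // Input and output paths are taken from the command line.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

With the plugin installed, such a driver can typically be launched from Eclipse via the Run As menu, which the plugin extends with a "Run on Hadoop" option.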

Next Steps
My next blog article will be an intro to MapReduce programming, so stay tuned.

 
