Lab 2: Running a Hadoop Application
Cloud Computing Workshop 2013, ITU
Zubair Nabi
April 18, 2013
Zubair Nabi 2: Running a Hadoop Application April 18, 2013 1 / 8
Running Hadoop
The first order of the day is to format the Hadoop DFS
Jump to the Hadoop directory and execute: bin/hadoop namenode -format
To run Hadoop and HDFS: bin/start-all.sh
To terminate them: bin/stop-all.sh
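
Once bin/start-all.sh returns, it can help to confirm that the daemons actually came up. A minimal sketch, assuming a single-node Hadoop 1.x setup in which start-all.sh launches NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker, and assuming the JDK's jps tool is on the PATH. The name missing_daemons is a hypothetical helper, not part of Hadoop:

```shell
# Reads `jps` output ("<pid> <name>" per line) on stdin and prints every
# expected Hadoop 1.x daemon that does not appear in it.
# Assumption: single-node setup where all five daemons run on this machine.
missing_daemons() {
  awk '
    { seen[$2] = 1 }   # second column of jps output is the process name
    END {
      n = split("NameNode DataNode SecondaryNameNode JobTracker TaskTracker", want, " ")
      for (i = 1; i <= n; i++)
        if (!(want[i] in seen)) print want[i]
    }'
}

# Usage, after bin/start-all.sh:
#   jps | missing_daemons
# No output means all five daemons were found.
```

Matching on the exact process name avoids counting SecondaryNameNode as NameNode, which a plain substring search would do.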
Generating a dataset
Create a temporary directory to hold the data: mkdir /tmp/gutenberg
Jump to it: cd /tmp/gutenberg
Download text files:
  wget www.gutenberg.org/etext/20417
  wget www.gutenberg.org/etext/5000
  wget www.gutenberg.org/etext/4300
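
The dataset steps above can be sketched as one small script. This is only an illustration: DATA_DIR, BOOK_IDS, book_url and fetch_books are hypothetical names introduced here, and the sketch does not check whether the Project Gutenberg URLs still resolve.

```shell
# Assumption: the same directory and book IDs as in the steps above.
DATA_DIR="${DATA_DIR:-/tmp/gutenberg}"
BOOK_IDS="20417 5000 4300"

# Build the download URL for one book ID (same form as the wget lines above).
book_url() {
  printf 'www.gutenberg.org/etext/%s\n' "$1"
}

# Create the directory, enter it, and download every book.
# The body runs in a subshell so the caller's working directory is unchanged.
fetch_books() (
  mkdir -p "$DATA_DIR" && cd "$DATA_DIR" || exit 1
  for id in $BOOK_IDS; do
    wget "$(book_url "$id")" || echo "download failed for book $id" >&2
  done
)

# Run the downloads with:
#   fetch_books
```

Using mkdir -p means the script can be re-run without failing when /tmp/gutenberg already exists.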
Copying the dataset to the HDFS
Jump to the Hadoop directory and execute: bin/hadoop dfs -copyFromLocal /tmp/gutenberg /ccw/gutenberg
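
A possible wrapper for the copy step, which also lists the target directory afterwards to confirm the files arrived. It is a sketch under assumptions: HADOOP_HOME is assumed to point at the Hadoop directory (defaulting to the current directory), and upload_dataset is a hypothetical name.

```shell
# Copy the local dataset into HDFS, then list it to verify the upload.
# Assumption: HADOOP_HOME is the Hadoop 1.x directory; defaults to ".".
upload_dataset() {
  hadoop_bin="${HADOOP_HOME:-.}/bin/hadoop"
  if [ ! -x "$hadoop_bin" ]; then
    echo "bin/hadoop not found under ${HADOOP_HOME:-.}" >&2
    return 1
  fi
  "$hadoop_bin" dfs -copyFromLocal /tmp/gutenberg /ccw/gutenberg &&
    "$hadoop_bin" dfs -ls /ccw/gutenberg
}

# Usage, from the Hadoop directory with HDFS running:
#   upload_dataset
```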
Running Wordcount
Execute: bin/hadoop jar hadoop-examples-1.0.4.jar wordcount /ccw/gutenberg /ccw/gutenberg-output
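
For intuition about what the job computes, here is a rough local approximation in plain shell. It assumes the bundled wordcount example splits its input on whitespace, counts words case-sensitively, and writes one word, a tab, and its count per line. local_wordcount is a hypothetical helper for sanity-checking small samples, not a replacement for the Hadoop job.

```shell
# Count whitespace-separated words in the given files and print
# "<word><TAB><count>" lines, sorted by word in byte order.
local_wordcount() {
  cat "$@" |
    tr -s '[:space:]' '\n' |   # one word per line
    grep -v '^$' |             # drop the empty line left by leading whitespace
    LC_ALL=C sort |
    uniq -c |                  # "<count> <word>"
    awk '{ print $2 "\t" $1 }' # reorder to "<word><TAB><count>"
}

# Usage on the downloaded files:
#   local_wordcount /tmp/gutenberg/* | head
```

Comparing a few lines of this output with the job's output for the same input is a quick way to check that the cluster run behaved as expected.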
Retrieving results from the HDFS
Copy to the local FS: bin/hadoop dfs -getmerge /ccw/gutenberg-output /tmp/gutenberg-output
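
Once the merged file is on the local FS, ordinary shell tools can inspect it. A minimal sketch, assuming each line of /tmp/gutenberg-output has the form word, tab, count; top_words is a hypothetical helper name.

```shell
# Print the N most frequent words from a "<word><TAB><count>" file.
# $1 = file, $2 = how many lines to show (default 10).
top_words() {
  tab="$(printf '\t')"
  sort -t "$tab" -k2,2nr "$1" | head -n "${2:-10}"
}

# Usage:
#   top_words /tmp/gutenberg-output 20
```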
Accessing the web interface
JobTracker: http://localhost:50030
TaskTracker: http://localhost:50060
Reference(s)
Running Hadoop on Ubuntu Linux (Single-Node Cluster): http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/