Lab 2: Running a Hadoop Application

Cloud Computing Workshop 2013, ITU

2: Running a Hadoop Application

Zubair Nabi

April 18, 2013

Zubair Nabi 2: Running a Hadoop Application April 18, 2013 1 / 8

Running Hadoop

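First, format the Hadoop Distributed File System (HDFS). Do this once, before the first start; reformatting erases whatever HDFS already holds.

Jump to the Hadoop directory and execute: bin/hadoop namenode -format

To start the HDFS and MapReduce daemons: bin/start-all.sh

To stop them: bin/stop-all.sh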
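A quick way to confirm that bin/start-all.sh brought everything up is to look at the running Java processes with jps (a tool shipped with the JDK). The sketch below assumes a Hadoop 1.x single-node setup, where five daemons are expected; the function name missing_daemons is made up for illustration.

```shell
# Hypothetical helper: reads `jps`-style lines ("<pid> <name>") on stdin
# and prints the name of every expected Hadoop 1.x daemon that is absent.
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # awk exits 0 only if some line has exactly this name in field 2
    if ! printf '%s\n' "$jps_output" |
      awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Typical use is `jps | missing_daemons`; no output means all five daemons were found. If a name is printed, the matching log file under the Hadoop logs directory is the usual place to look.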

Generating a dataset

Create a temporary directory to hold the data: mkdir /tmp/gutenberg

Jump to it: cd /tmp/gutenberg

Download text files:
  wget www.gutenberg.org/etext/20417
  wget www.gutenberg.org/etext/5000
  wget www.gutenberg.org/etext/4300

Copying the dataset to the HDFS

Jump to the Hadoop directory and execute: bin/hadoop dfs -copyFromLocal /tmp/gutenberg /ccw/gutenberg

Running Wordcount

Execute: bin/hadoop jar hadoop-examples-1.0.4.jar wordcount /ccw/gutenberg /ccw/gutenberg-output

The output directory /ccw/gutenberg-output must not already exist, or the job fails.
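To get a feel for what the job computes, the same count can be approximated locally with standard Unix tools. This is only a sketch for small inputs: it assumes, as the stock WordCount example does, that a word is any whitespace-separated token, and the function name local_wordcount is made up for illustration.

```shell
# Hypothetical local stand-in for the wordcount job (small inputs only).
# Reads text on stdin; prints "<word><TAB><count>" lines sorted by word.
local_wordcount() {
  tr -s '[:space:]' '\n' |        # put one token on each line
    grep -v '^$' |                # drop the empty line left by leading whitespace
    LC_ALL=C sort |               # bring identical tokens together
    uniq -c |                     # count each run of identical tokens
    awk '{ print $2 "\t" $1 }'    # reorder to word, tab, count
}
```

For example, `cat /tmp/gutenberg/* | local_wordcount | head` prints the first few words and their counts. The Hadoop job does the same work in two phases: mappers emit a count of 1 per token, and reducers sum the counts per word.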

Retrieving results from the HDFS

Copy to the local FS: bin/hadoop dfs -getmerge /ccw/gutenberg-output /tmp/gutenberg-output

getmerge concatenates the part files in the HDFS output directory into a single local file.
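Once the results are on the local file system, the most frequent words can be listed with sort. The sketch below assumes the default wordcount output layout of one "word, tab, count" pair per line; the function name top_words is made up for illustration.

```shell
# Hypothetical helper: print the N most frequent words from a merged
# wordcount output file whose lines are "<word><TAB><count>".
top_words() {
  file=$1
  n=${2:-10}   # number of lines to show; defaults to 10
  # Sort on the second tab-separated field, numerically, largest first.
  sort -t "$(printf '\t')" -k2,2nr "$file" | head -n "$n"
}
```

For example, `top_words /tmp/gutenberg-output 5` lists the five most frequent tokens. Expect common words and punctuation-laden tokens near the top, since the job does no normalization.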

Accessing the web interface

JobTracker: http://localhost:50030

TaskTracker: http://localhost:50060

Reference(s)

Running Hadoop on Ubuntu Linux (Single-Node Cluster): http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
