youngil kim awalin sopan sonia ng zeng. introduction concept of the project system architecture ...

P2P Control System based on Map/Reduce Youngil Kim Awalin Sopan Sonia Ng Zeng

Post on 17-Jan-2016



P2P Control System based on Map/Reduce

Youngil Kim
Awalin Sopan
Sonia Ng Zeng

Outline

• Introduction
• Concept of the Project
• System architecture
• Implementation – HDFS
• Implementation – System Analysis
  ◦ System Information Logger (SIL)
  ◦ System Information Gatherer (SIG)
  ◦ Map/Reduce
• Implementation – Visualization
• Implementation – P2P Application
• Demo

Introduction

• How can we collect system information from many nodes?
  ◦ It is hard to track which node has a problem when too many nodes exist
• But HDFS and Map/Reduce make it easy!
  ◦ Gather the system information of each node into HDFS
  ◦ Analyze the system information using Map/Reduce
  ◦ A kind of network management system, like HP's OpenView

Concept of the Project

• Tool that gives an overview of the nodes in the P2P network
  ◦ Still preserves the decentralized nature of P2P
  ◦ Can be run on any computer, from within the P2P network or outside of it
    ▪ So, the computer running the tool is not necessarily the "master"
  ◦ If the tool is not running, the P2P network still remains intact
• Still, one can control the P2P network from the tool
• The tool provides an interface to do both: overview and control
  ◦ Therefore, the user does not need to be an expert to work with a network system


System Architecture

[Diagram. Components shown: four peers (each labeled "p2p Local" with a "P2P app."); P2P Network]


System Architecture

[Diagram. Components shown: System Info Gatherer (Hadoop Master); four Hadoop Slave nodes; HDFS; four peers (each labeled "p2p Local" with a "P2P app."), each with a Sys Info Logger; P2P Network]


System Architecture

[Diagram. Components shown: System Info Gatherer (Hadoop Master); four Hadoop Slave nodes; HDFS; System Manager (Visualization); four peers (each labeled "p2p Local" with a "P2P app."), each with a Sys Info Logger; System Control Network; P2P Network; System Information]

Implementation – P2P Application

• Implemented a minimal P2P application to show how our tool works
  ◦ Shows how to control the application or system on each node using the visualization
  ◦ Has STOP/RESUME operations
• Functions
  ◦ Response to "QUERY"
    ▪ Show active/inactive (overview)
  ◦ Response to "CONTROL"
    ▪ Change node status based on the control argument (active/inactive)
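The QUERY and CONTROL behaviour listed on this slide can be sketched as a small message handler. This is only an illustration: the slides do not give the wire format, so the message syntax (`"QUERY"`, `"CONTROL active"`, `"CONTROL inactive"`), the reply strings, and the class name `MinimalPeer` are all assumptions, and no networking is shown.

```python
class MinimalPeer:
    """Toy peer that tracks only whether its application is active.

    Hypothetical sketch: the message syntax and replies are assumed,
    not taken from the original P2P application.
    """

    def __init__(self, node_id):
        self.node_id = node_id
        self.active = True  # assume a peer starts in the active state

    def status(self):
        return "active" if self.active else "inactive"

    def handle(self, message):
        """Return the reply for one incoming text message."""
        parts = message.split()
        if not parts:
            return "ERROR empty message"
        command, args = parts[0], parts[1:]
        if command == "QUERY":
            # Overview request: report the current state.
            return self.status()
        if command == "CONTROL" and len(args) == 1 and args[0] in ("active", "inactive"):
            # Control request: the argument selects the new state
            # (STOP maps to "inactive", RESUME to "active").
            self.active = args[0] == "active"
            return self.status()
        return "ERROR unsupported message"
```

For example, a peer that receives `"CONTROL inactive"` would answer later `"QUERY"` messages with `"inactive"` until it receives `"CONTROL active"`.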

Implementation - HDFS

• Hadoop for DFS & Map/Reduce framework
  ◦ We use the bug cluster
  ◦ Master: brood00
  ◦ Slaves: currently tested with 5 nodes (bug51 ~ bug55)
  ◦ Uses the local storage of each node
    ▪ Uses the "/tmp" directory because the home directory is not local storage but an NFS volume
  ◦ Network ports: HDFS (9000), JobTracker (9001), NameNode interface (50070), JobTracker interface (50030)


Implementation - System Analysis

System Information Logger (SIL)

• mr_syslog.py
  ◦ Implemented in Python
  ◦ Saves information in both local storage and HDFS
  ◦ Gathers information every 10 secs
  ◦ Creates logfile based on time
• Information of each node is saved with the following format
  ◦ < 20110501_2252_bug51.log >
  ◦ bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
  ◦ bug51 1304304724: mem(75.50), cpu(1.50), disk(10.00)
  ◦ bug51 1304304727: mem(75.51), cpu(0.40), disk(10.00)
  ◦ bug51 1304304729: mem(75.51), cpu(0.50), disk(10.00)
  ◦ bug51 1304304732: mem(75.50), cpu(0.50), disk(10.00)
  ◦ bug51 1304304734: mem(75.50), cpu(0.40), disk(10.00)
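As a rough illustration of the log format on this slide, the sketch below builds a file name and a log line of the same shape. It is not the original mr_syslog.py: the function names are hypothetical, the reading of the file name as `YYYYMMDD_HHMM_<node>.log` is inferred from the single example shown, and how the mem/cpu/disk values are sampled is left out entirely.

```python
from datetime import datetime


def log_file_name(node_id, started_at):
    """Build a name like 20110501_2252_bug51.log.

    Assumption: the prefix is the date and the hour/minute at which
    the file was created.
    """
    return f"{started_at.strftime('%Y%m%d_%H%M')}_{node_id}.log"


def format_log_line(node_id, timestamp, mem, cpu, disk):
    """Format one sample as '<node> <epoch seconds>: mem(..), cpu(..), disk(..)'.

    Values are written with two decimals, as in the slide's example.
    """
    return f"{node_id} {timestamp}: mem({mem:.2f}), cpu({cpu:.2f}), disk({disk:.2f})"
```

A logger loop would then call `format_log_line` once per sampling interval and append the result to both the local file and its HDFS copy.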

System Information Gatherer (SIG)

• Functions
  ◦ Find the current resource usage of each node using Map/Reduce
    ▪ Currently, it shows the maximum values per one-minute time slot
  ◦ Communication gateway between the nodes and the visualization tool
    ▪ Sends "QUERY" to each P2P application to check the status of each node
    ▪ Sends node status to the visualization tool: node ID, status (active/inactive), CPU usage, memory usage, disk storage
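The "maximum values per minute time slot" idea can be sketched in plain Python, outside Hadoop. This is an assumption-laden illustration: the slides do not say how slots are derived, so rounding the epoch timestamp down to a multiple of 60 is a guess, and the function names and the `[cpu, mem, disk]` ordering of each sample are chosen here for convenience.

```python
from collections import defaultdict


def minute_slot(timestamp):
    """Round an epoch timestamp (seconds) down to the start of its minute."""
    return timestamp - timestamp % 60


def max_per_minute(samples):
    """Return {(node_id, slot): [max_cpu, max_mem, max_disk]}.

    `samples` is an iterable of (node_id, timestamp, [cpu, mem, disk]).
    """
    grouped = defaultdict(list)
    for node_id, timestamp, usage in samples:
        grouped[(node_id, minute_slot(timestamp))].append(usage)
    # zip(*rows) regroups the rows by column, so max() runs per metric.
    return {key: [max(col) for col in zip(*rows)] for key, rows in grouped.items()}
```

With the sample log lines from the SIL slide, all six readings fall into the slot starting at 1304304720, so they would collapse into a single row per node.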

Map/Reduce

• Map:
  ◦ Input – each node log file
    ▪ Key: position of file
    ▪ Value: raw data, one line per key
  ◦ Output
    ▪ Key: node ID
    ▪ Value: set of system information (CPU/memory/storage usage)
    ▪ Eg: < bug51, [30.0, 29.0, 12.0] >
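A minimal sketch of the map step, written as a plain Python function rather than against any Hadoop API. It assumes the input lines follow the SIL log format shown earlier in the deck and that the output list is ordered CPU, memory, storage as on this slide; the regular expression and the name `map_line` are hypothetical.

```python
import re

# Matches lines such as: bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
LINE_RE = re.compile(
    r"^(\S+) (\d+): mem\(([\d.]+)\), cpu\(([\d.]+)\), disk\(([\d.]+)\)$"
)


def map_line(offset, line):
    """Map one (file position, raw line) pair to [(node_id, [cpu, mem, disk])].

    The file position is ignored; lines that do not match the log
    format produce no output.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return []
    node_id, _timestamp, mem, cpu, disk = match.groups()
    return [(node_id, [float(cpu), float(mem), float(disk)])]
```

Returning a list keeps the function close to the map contract, where one input record may emit zero or more key/value pairs.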

Map/Reduce

• Reduce:
  ◦ Input – from Map
    ▪ Key: node ID
    ▪ Value: set of set of system information
    ▪ Eg: < bug51, [ [30.0, 29.0, 12.0], [33.0, 40.0, 9.0], … ] >
  ◦ Output
    ▪ Key: node ID
    ▪ Value: maximum values for each piece of information
    ▪ Eg: < bug51, [33.0, 40.0, 12.0] >
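The reduce step on this slide amounts to an element-wise maximum over the lists collected for one node. The sketch below shows that in plain Python; the function name `reduce_node` is hypothetical and no Hadoop API is involved.

```python
def reduce_node(node_id, values):
    """Reduce all [cpu, mem, disk] lists for one node to their per-metric maxima.

    `values` must contain at least one list; zip(*values) regroups the
    lists by position so that max() is taken metric by metric.
    """
    return node_id, [max(column) for column in zip(*values)]
```

Applied to the two lists in the slide's example, `[30.0, 29.0, 12.0]` and `[33.0, 40.0, 9.0]`, this gives `[33.0, 40.0, 12.0]` for `bug51`, matching the output shown.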

Implementation - Visualization

• Written in Java
• Uses the Prefuse toolkit for a tabular visualization of the node status
• Only the right-click menu is needed to control a node
• Live communication with the nodes
  ◦ To query the node status from the SIG
  ◦ To send commands to the nodes in the P2P network in real time


Visualization

[Screenshot: initial view of all nodes]

[Screenshot: after stopping Bug53]

Demo

• System set-up and initialization (video file)
• Show namenode & jobtracker interface
• Show Map/Reduce jobs
• Show Visualization tool
  ◦ Changes of each status
  ◦ Control each P2P application
