
Quantitative Evaluation of Unstructured Peer-to-Peer Architectures

Fabrício Benevenuto, José Ismael Jr.

Jussara M. Almeida

Department of Computer Science, Federal University of Minas Gerais

Brazil

Motivation

• P2P systems are responsible for a large portion of Internet traffic

• First-generation unstructured P2P systems are decreasing in popularity due to poor scalability

– Ex: original Gnutella protocol (v. 0.4)

• New popular hybrid unstructured P2P systems

– Exploit the heterogeneity inherent to peers

• Super-peers: highly available and powerful peers

– Intuitively more scalable

• Ex: KaZaA, Gnutella 2

Goals

• Quantify the main performance benefits provided by each individual feature of super-peer architectures

• Provide insights to guide the design of future P2P systems

Outline

• Overview of unstructured P2P architectures

– Message flooding architecture

– Hybrid super-peer architecture

• Evaluation methodology

– Simulation environment

– Performance metrics

• Results

• Conclusions and future work

Overview of Unstructured Peer-to-Peer Architectures

• Message Flooding Architecture

– First generation: Gnutella 0.4

– Poor scalability due to network overload

• Super-Peer Architectures

– Exploit peer heterogeneity: Gnutella 2.0, KaZaA

• Super-peers: typically more powerful and available

– Intuitively better scalability due to several new features

Content Location in Message Flooding Architecture

Gnutella 0.4
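
The flooding search of Gnutella 0.4 can be illustrated with a minimal Python sketch. The overlay graph, the function name `flood_query`, and the TTL value are illustrative assumptions and are not taken from the authors' simulator. The sketch counts every forwarded copy of a query, including duplicates that the receiver drops, which is the source of the network overload mentioned earlier.

```python
from collections import deque

def flood_query(graph, origin, ttl):
    """Flood a query from `origin` for at most `ttl` hops.

    Returns (set of peers reached, number of query messages sent).
    """
    reached = {origin}
    messages = 0
    # Each entry: (peer holding the query, peer it came from, hops left).
    frontier = deque([(origin, None, ttl)])
    while frontier:
        peer, sender, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # TTL expired, stop forwarding
        for neighbor in graph[peer]:
            if neighbor == sender:
                continue  # do not send the query back where it came from
            messages += 1  # duplicates still cost a message
            if neighbor not in reached:
                reached.add(neighbor)
                frontier.append((neighbor, peer, hops_left - 1))
    return reached, messages

# Hypothetical 4-peer overlay, for illustration only.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}
reached, messages = flood_query(overlay, "a", ttl=2)
print(len(reached), "peers reached with", messages, "messages")
# 4 peers reached with 6 messages
```

Even in this tiny overlay, three peers are reached at the cost of six messages, because peers that already saw the query keep receiving copies.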

Content Location in Super-Peer Architecture

Gnutella 2.0

Features of Gnutella 2.0 Architecture

• Super-peer backbone speeds up search

– A super-peer that receives a query from a leaf, or initiates a new query, forwards it only to the super-peers directly connected to it (one hop away)

• Content-aware query routing mechanism

– A super-peer forwards a query only to those super-peers or leaves where there is a chance the file is stored

• Super-peers maintain local query hash tables

• User-controlled query retransmission

– User may restart a query from other super-peers, hoping to increase the number of hits and reduce download time

• Swarm download

– User downloads file pieces from multiple peers, expecting reduced download time (a feature in other systems as well)

How much performance benefit does each such feature provide over the original message-flooding based Gnutella 0.4 protocol?
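
The content-aware query routing feature can be sketched as follows. This is a simplified illustration under stated assumptions: the table is modeled as a Python set of hashed keywords, and the names `keyword_hash`, `build_table`, and `route` are hypothetical. The actual table layout and hash function of the Gnutella 2.0 specification are not reproduced here.

```python
import hashlib

def keyword_hash(keyword):
    """Map a keyword to a 32-bit integer (assumed hash, for illustration)."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def build_table(filenames):
    """Build a query hash table from the keywords of all shared file names."""
    table = set()
    for name in filenames:
        for word in name.split():
            table.add(keyword_hash(word))
    return table

def route(query, neighbor_tables):
    """Return the neighbors whose table contains every keyword of the query."""
    wanted = {keyword_hash(word) for word in query.split()}
    return [name for name, table in neighbor_tables.items() if wanted <= table]

# Hypothetical neighbors of one super-peer: two leaves and one super-peer.
tables = {
    "leaf1": build_table(["blue sky song"]),
    "leaf2": build_table(["holiday video"]),
    "superpeer2": build_table(["blue moon", "sky news"]),
}
targets = route("blue sky", tables)
print(targets)
```

In this example the query is not sent to `leaf2`. It is still sent to `superpeer2`, although no single file there matches both keywords: a table hit only means there is a chance the file is stored, so false positives are possible while messages to peers that cannot match are avoided.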

Evaluation Methodology

• Simulators

– Previous simulator: optimized message-flooding Gnutella with communities

• Communities exploit locality of interest among peers

• Content is searched first in a peer's community; if not found, the original message flooding mechanism is used

• Significant system load reductions [BCAA04]

– New simulator: super-peer Gnutella 2.0 protocol (specification)

– Both simulators model heterogeneous aspects found in real systems

• Performance Metrics

– System load: average number of messages processed by each node

– Query latency: time until download starts

– Query success rate: percentage of queries successfully answered

– Download time: average download time
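
The four metrics can be computed from a simulation log along the following lines. The record layout (`QueryRecord`) and all function names are assumptions made for this sketch; the log format of the authors' simulators is not described in the slides.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

@dataclass
class QueryRecord:
    issued_at: float                       # time the query was issued
    download_started_at: Optional[float]   # None if the query got no hit
    download_finished_at: Optional[float]  # None if the query got no hit

def system_load(messages_per_node):
    """Average number of messages processed by each node."""
    return mean(messages_per_node.values())

def query_success_rate(records):
    """Percentage of queries that were successfully answered."""
    answered = sum(1 for r in records if r.download_started_at is not None)
    return 100.0 * answered / len(records)

def query_latency(records):
    """Average time from issuing a query until its download starts."""
    return mean(r.download_started_at - r.issued_at
                for r in records if r.download_started_at is not None)

def download_time(records):
    """Average duration of the downloads that took place."""
    return mean(r.download_finished_at - r.download_started_at
                for r in records if r.download_finished_at is not None)

# Hypothetical log: three answered queries and one unanswered query.
log = [
    QueryRecord(0.0, 2.0, 12.0),
    QueryRecord(1.0, 4.0, 9.0),
    QueryRecord(5.0, None, None),
    QueryRecord(3.0, 4.0, 14.0),
]
print(system_load({"a": 10, "b": 30, "c": 20}))
print(query_success_rate(log), query_latency(log), download_time(log))
```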

Most Relevant Results

• Message flooding + peer community vs. Super-Peer backbone

• Content-aware query routing mechanism

• User-controlled query retransmission

• Swarm download

Message-Flooding + Community vs. Super-peer Backbone

[Figure: System Load]

• Super-peer: average number of messages processed by a peer drops by roughly 95%

• 50% reduction in latency: limited traffic over the backbone (paper)

• Shorter average download times (paper)

• Query success rate is the same for both architectures (~90%)

Content-Aware Query Routing

Further reductions in average system load:

Content-aware query routing: system load drops by a factor of 41

[Figure: System Load for four configurations: No Query Routing, 0 Query Retransmissions; Query Routing, 0 Query Retransmissions; No Query Routing, 2 Query Retransmissions; Query Routing, 2 Query Retransmissions]

User-Controlled Query Retransmission

• Query success rate: one or two retransmissions deliver most of the performance gains (96-98% success rate), with diminishing returns beyond that

• Average download time: significant reductions for small music files (the more popular workload)

• Query latency and system load increase linearly with the number of retransmissions (paper)

[Figures: Query Success Rate (%) and Reduction on Average Download Time (%) vs. # Retransmissions]

User-Controlled Query Retransmission

[Figure: System Load]

System load increases linearly with the number of retransmissions, mainly due to query and query hit messages, as expected

Swarm Download

Average Download Time (sec), by workload

# Simultaneous Downloads    Music    TV Show    Video
1                             277        811     2647
2                             263        885     2405
3                             235        936     2360
4                             224        975     2508
10                            174       1022     2393
All possible                  164       1016     2372

• Only reduces download time significantly for small files

• Higher probability of downloading from low-bandwidth or highly utilized peers as the number of simultaneous downloads increases

• Bottleneck especially critical for large files
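
The effect can be quantified from the table values with a short calculation. The sketch below assumes that each row of the table reads Video, TV Show, Music, and number of simultaneous downloads, in that order; that reading is an interpretation of the extracted slide, and the variable names are illustrative.

```python
# Average download time in seconds, keyed by workload.
# Values are read from the rows for 1 download and for "All possible".
single_source = {"Music": 277, "TV Show": 811, "Video": 2647}
all_sources = {"Music": 164, "TV Show": 1016, "Video": 2372}

def reduction_percent(before, after):
    """Relative reduction in download time; negative means it got slower."""
    return 100.0 * (before - after) / before

reductions = {
    workload: round(reduction_percent(single_source[workload],
                                      all_sources[workload]), 1)
    for workload in single_source
}
for workload, value in reductions.items():
    print(f"{workload}: {value}%")
# Music: 40.8%
# TV Show: -25.3%
# Video: 10.4%
```

Under this reading, swarm download cuts the download time of music files by about 41%, gives a modest gain for video, and makes TV show downloads about 25% slower.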

Conclusions and Future Work

• Conclusions

– The super-peer architecture by itself provides much better scalability than the optimized message flooding protocol

• 95% system load reduction with the same query success rate

– Content-aware query routing provides further load reductions

– One or two query retransmissions should be enough to provide an almost maximum query success rate and a 40% download time reduction for small files, while keeping latency and load low

– Swarm download may be detrimental to performance if download sources are not carefully selected

• Future Work

– Extend the performance evaluation to allow peers to dynamically join and leave the system

– Design new peer selection policies that exploit locality of interest (peer communities) and peer characteristics
