Quantitative Evaluation of Unstructured Peer-to-Peer Architectures
Fabrício Benevenuto, José Ismael Jr., Jussara M. Almeida
Department of Computer Science, Federal University of Minas Gerais, Brazil
Motivation
• P2P systems are responsible for a large portion of Internet traffic
• First-generation unstructured P2P systems are decreasing in popularity due to poor scalability
– Ex: original Gnutella protocol (v. 0.4)
• New popular hybrid unstructured P2P systems
– Exploit the heterogeneity inherent to peers
• Super-peers: highly available and powerful peers
– Intuitively more scalable
• Ex: KaZaA, Gnutella 2
Goals
• Quantify the main performance benefits provided by each individual feature of super-peer architectures
• Provide insights to guide the design of future P2P systems
Outline
• Overview of unstructured P2P architectures
– Message flooding architecture
– Hybrid super-peer architecture
• Evaluation methodology
– Simulation environment
– Performance metrics
• Results
• Conclusions and future work
Overview of Unstructured Peer-to-Peer Architectures
• Message Flooding Architecture
– First generation: Gnutella 0.4
– Poor scalability due to network overload
• Super-Peer Architectures
– Exploit peer heterogeneity: Gnutella 2.0, KaZaA
• Super-peers: typically more powerful and more available
– Intuitively better scalability due to several new features
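To illustrate why flooding overloads the network, here is a minimal sketch that counts the query messages generated by one TTL-limited flood. It is not the authors' simulator: the function name `flood_messages`, the adjacency-list representation, and the toy ring topology are all assumptions made for illustration.

```python
from collections import deque

def flood_messages(adjacency, source, ttl):
    """Count query messages sent when `source` floods a query with a given TTL.

    Each peer forwards the first copy it receives to every neighbor except
    the sender. Duplicate copies are counted as messages (they still consume
    bandwidth) but are not forwarded again.
    """
    seen = {source}
    queue = deque([(source, None, ttl)])
    messages = 0
    while queue:
        peer, sender, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward
        for neighbor in adjacency[peer]:
            if neighbor == sender:
                continue
            messages += 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, peer, remaining - 1))
    return messages

# Hypothetical 4-peer ring: 0 - 1 - 2 - 3 - 0
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

On denser topologies the count grows quickly with the TTL, because every reached peer repeats the query to all of its other neighbors.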
Content Location in Message Flooding Architecture
Gnutella 0.4
Content Location in Super-Peer Architecture
Gnutella 2.0
Features of Gnutella 2.0 Architecture
• Super-peer backbone speeds up search
– A super-peer that receives a query from a leaf, or initiates a new query, forwards it only to the super-peers directly connected to it (one hop away)
• Content-aware query routing mechanism
– A super-peer forwards a query only to the super-peers or leaves where the file may be stored
• Super-peers maintain local query hash tables
• User-controlled query retransmission
– User may restart a query from other super-peers, hoping to increase the number of hits and reduce download time
• Swarm download
– User downloads file pieces from multiple peers: expected to reduce download time (feature in other systems as well)
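The content-aware routing idea can be sketched as follows. This is an illustrative approximation only: the table size, the use of SHA-1, and the names `QueryHashTable` and `route_query` are assumptions and do not reproduce the actual Gnutella 2.0 hash-table format.

```python
import hashlib

TABLE_SLOTS = 1024  # hypothetical table size, chosen only for this sketch

def slot(keyword):
    """Map a keyword to a table slot (hash function is an assumption)."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % TABLE_SLOTS

class QueryHashTable:
    """Compact summary of the keywords a neighbor can answer."""

    def __init__(self):
        self.bits = bytearray(TABLE_SLOTS)

    def add(self, keyword):
        self.bits[slot(keyword)] = 1

    def may_contain(self, keyword):
        # False positives are possible (hash collisions); false negatives are not.
        return self.bits[slot(keyword)] == 1

def route_query(keywords, neighbor_tables):
    """Return only the neighbors whose table matches every query keyword."""
    return [name for name, table in neighbor_tables.items()
            if all(table.may_contain(k) for k in keywords)]

# Hypothetical neighbors of one super-peer
leaf_a = QueryHashTable()
for word in ("jazz", "live"):
    leaf_a.add(word)
tables = {"leaf_a": leaf_a, "empty_leaf": QueryHashTable()}
```

A neighbor with an empty table is never contacted, which is where the load reduction comes from; collisions can only cause extra forwards, never missed ones.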
How much performance benefit does each such feature provide over the
original message-flooding based Gnutella 0.4 protocol?
Evaluation Methodology
• Simulators
– Previous optimized message-flooding Gnutella with communities
• Communities exploit locality of interest among peers
• Content is searched first in a peer's community; if not found, the original message-flooding mechanism is used
• Significant system load reductions [BCAA04]
– New super-peer Gnutella 2.0 protocol (specification)
– Both simulators model heterogeneous aspects found in real systems
• Performance Metrics
– System load: average # messages processed by each node
– Query latency: time until download starts
– Query success rate: % queries successfully responded
– Download time: average download time
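As a rough illustration of how the four metrics above could be computed from simulation output, consider the sketch below. The record layout (`QueryRecord`) and the function `summarize` are hypothetical, and the sketch assumes a query counts as "successfully responded" when a download starts for it.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence

@dataclass
class QueryRecord:
    issued_at: float                               # time the query was issued
    download_started_at: Optional[float] = None    # None if the query got no hit
    download_finished_at: Optional[float] = None   # None if no download completed

def summarize(records: Sequence[QueryRecord], messages_per_node: Sequence[int]) -> dict:
    """Compute system load, query latency, success rate, and download time."""
    answered = [r for r in records if r.download_started_at is not None]
    finished = [r for r in answered if r.download_finished_at is not None]
    return {
        # average number of messages processed by each node
        "system_load": mean(messages_per_node),
        # time from issuing the query until the download starts
        "query_latency": (mean(r.download_started_at - r.issued_at for r in answered)
                          if answered else None),
        # percentage of queries that were answered
        "query_success_rate": (100.0 * len(answered) / len(records)
                               if records else None),
        # average duration of completed downloads
        "download_time": (mean(r.download_finished_at - r.download_started_at
                               for r in finished)
                          if finished else None),
    }

# Hypothetical log: one answered query, one unanswered query, two nodes
stats = summarize(
    [QueryRecord(0.0, 2.0, 12.0), QueryRecord(1.0)],
    messages_per_node=[10, 30],
)
```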
Most Relevant Results
• Message flooding + peer community vs. Super-Peer backbone
• Content-aware query routing mechanism
• User-controlled query retransmission
• Swarm download
Message-Flooding + Community vs. Super-peer Backbone
[Figure: system load]
• Super-peer: avg # msgs processed by a peer drops by roughly 95%
• 50% reduction in latency: limited traffic over the backbone (paper)
• Shorter average download times (paper)
• Query success rate is the same for both architectures (~90%)
Content-Aware Query Routing
Further reductions in average system load:
Content-aware query routing: system load drops by a factor of 41
[Figure: system load for four configurations: no query routing with 0 query retransmissions, query routing with 0 query retransmissions, no query routing with 2 query retransmissions, query routing with 2 query retransmissions]
User-Controlled Query Retransmission
• Query success rate: one or two retransmissions deliver most of the performance gains (96-98% success rate), with diminishing returns beyond that
• Average download time: significant reductions for small music files (the more popular workload)
• Query latency and system load increase linearly with # retransmissions (paper)
[Figures: query success rate (%) and reduction in average download time (%), each plotted against # retransmissions]
User-Controlled Query Retransmission
[Figure: system load vs. # retransmissions]
System load increases linearly with # retransmissions, mainly due to query and query-hit messages, as expected
Swarm Download
Average Download Time (sec), by workload
# Simultaneous Downloads    Music    TV Show    Video
1                           277      811        2647
2                           263      885        2405
3                           235      936        2360
4                           224      975        2508
All possible                164      1016       2372
10                          174      1022       2393
Only reduces download time significantly for small files
Higher probability of downloading from low bandwidth or highly utilized peers as number of simultaneous downloads increases
Bottleneck especially critical for large files
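The relative effect of swarm download can be checked with a little arithmetic on the average download times reported in this slide's table. The sketch below uses the values as read from that table (single download vs. "all possible" simultaneous downloads); the names `download_time_sec` and `percent_change` are introduced here only for illustration.

```python
# Average download time in seconds, as read from the slide's table:
# (1 simultaneous download, "all possible" simultaneous downloads)
download_time_sec = {
    "Music": (277, 164),
    "TV Show": (811, 1016),
    "Video": (2647, 2372),
}

def percent_change(baseline, value):
    """Relative change of `value` with respect to `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline

changes = {
    workload: round(percent_change(single, swarm), 1)
    for workload, (single, swarm) in download_time_sec.items()
}
# Negative values mean the swarm download finished faster on average.
```

For the music workload the change is about -41%, in line with the roughly 40% reduction cited for small files, while the TV show workload gets about 25% slower.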
Conclusions and Future Work
• Conclusions
– The super-peer architecture by itself provides much better scalability than the optimized message-flooding protocol
• 95% system load reduction with the same query success rate
– Content-aware query routing provides further load reductions
– One or two query retransmissions should be enough to provide an almost maximum query success rate and a 40% download time reduction for small files, while keeping latency and load at low levels
– Swarm download may be detrimental to performance if download sources are not carefully selected
• Future Work
– Extend the performance evaluation to allow peers to dynamically join and leave the system
– Design new peer selection policies that exploit locality of interest (peer communities) and peer characteristics