Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems

Presented by: Lin Wing Kai

Post on 20-Dec-2015


Page 1

Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems

Presented by: Lin Wing Kai

Page 2

Outline

-- Background
-- Design of Interest-based Locality
-- Simulation of Interest-based Locality
-- Enhancement of Interest-based Locality
-- Understanding the scheme

Page 3

Background

Three types of P2P systems:

-- Centralized P2P: Napster
-- Decentralized, unstructured: Gnutella
-- Decentralized, structured: Distributed Hash Table (DHT)

Page 4

Background

-- Peers connect to one another at random, and searching is done by flooding the query to neighbors.
-- Keyword search is supported.

Example of searching for an MP3 file in a Gnutella network. The query is flooded across the network.
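The flooding search can be sketched as a breadth-first traversal bounded by a time-to-live (TTL). This is only an illustration of the idea, not Gnutella's wire protocol: the `flood_search` function, the toy graph, and the shared-file sets are all hypothetical.

```python
from collections import deque


def flood_search(graph, files, start, filename, ttl):
    """Flood a query from `start`; return (peers holding the file, peers reached).

    graph: dict mapping each peer to a list of neighbor peers (assumed undirected).
    files: dict mapping a peer to the set of file names it shares.
    """
    seen = {start}
    hits = []
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer != start and filename in files.get(peer, set()):
            hits.append(peer)
        if remaining == 0:
            continue  # TTL expired: this peer does not forward the query
        for neighbor in graph[peer]:
            if neighbor not in seen:  # duplicate copies of the query are dropped
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, len(seen) - 1


# Hypothetical five-peer network; peers d and e share the file.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"d": {"song.mp3"}, "e": {"song.mp3"}}

hits, reached = flood_search(graph, files, "a", "song.mp3", ttl=2)
```

With a TTL of 2 the query from peer a reaches b, c and d but never e, which shows how the TTL trades search coverage against the number of peers that must process each query.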

Page 5

Background

DHT (Chord):

-- Given a key, Chord maps the key to the node responsible for it.
-- Each node needs to maintain O(log N) routing information.
-- Each query uses O(log N) messages.
-- Only key search is supported, i.e. searching by exact name.

A Chord ring with about 50 nodes.

The black lines point to adjacent nodes, while the red lines are “finger” pointers that allow a node to find a key in O(log N) time.
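The role of the finger pointers can be illustrated with a small simulation. The sketch below assumes an 8-bit identifier space and builds every finger table from global knowledge of the node set, which a real Chord deployment does not have (it uses join and stabilization protocols instead). The names `successor`, `build_fingers` and `lookup` are chosen for this example only.

```python
import random
from bisect import bisect_left

M = 8            # bits in the identifier space (assumption for this sketch)
RING = 1 << M    # identifiers live in [0, 2**M)


def successor(nodes, ident):
    """First node id >= ident on the ring, wrapping around (nodes is sorted)."""
    i = bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]


def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if inclusive_right."""
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    # The interval wraps past zero; a == b denotes the whole ring except a.
    return x > a or x < b or (inclusive_right and x == b)


def build_fingers(nodes):
    """finger[i] of node n is successor(n + 2**i); finger[0] is the next node."""
    return {n: [successor(nodes, n + (1 << i)) for i in range(M)] for n in nodes}


def lookup(fingers, start, key):
    """Route a lookup for `key` from node `start`; return (owner, hops)."""
    n, hops = start, 0
    while True:
        succ = fingers[n][0]
        if in_interval(key, n, succ, inclusive_right=True):
            return succ, hops
        # Forward to the closest finger that precedes the key.
        for f in reversed(fingers[n]):
            if in_interval(f, n, key):
                n = f
                hops += 1
                break
        else:
            return succ, hops  # defensive; not reached with consistent fingers


rng = random.Random(0)
nodes = sorted(rng.sample(range(RING), 50))   # about 50 nodes, as in the figure
fingers = build_fingers(nodes)
owner, hops = lookup(fingers, nodes[0], key=123)
```

Each forwarding step uses a strictly smaller finger index than the previous one, so in this simulation a lookup needs at most `M` hops. That is the mechanism behind the O(log N) message bound on the slide.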

Page 6

Outline

-- Background
-- Design of Interest-based Locality
-- Simulation of Interest-based Locality
-- Enhancement of Interest-based Locality
-- Understanding the scheme

Page 7

Interest-based Locality

Peers that have similar interests tend to share similar content.

Page 8

Architecture

-- Shortcuts are modular.
-- Shortcuts are performance-enhancement hints.

Page 9

Creation of shortcuts

-- The peer uses the underlying topology (e.g. Gnutella) for its first few searches.
-- One of the peers returned by a search is selected at random and added to the shortcut list.
-- Shortcuts are ordered by a metric, e.g. success rate or path latency.
-- Subsequent queries go through the shortcut list first.
-- If that fails, the lookup falls back to the underlying topology.
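The steps on this slide can be sketched as a thin layer on top of any underlying search function. This is a minimal illustration, not the authors' implementation. It assumes that the ranking metric is a simple success count, that the list is capped at a hypothetical `max_shortcuts`, and that the lowest-ranked entry is evicted when the list is full. The `Peer` class and its method names are invented for the example.

```python
import random


class Peer:
    """A peer that keeps a ranked shortcut list on top of an underlying search."""

    def __init__(self, name, content=(), max_shortcuts=10, seed=0):
        self.name = name
        self.content = set(content)
        self.max_shortcuts = max_shortcuts  # assumed cap, not taken from the slide
        self.shortcuts = []                 # other Peer objects, best-ranked first
        self.successes = {}                 # shortcut name -> successful lookups
        self._rng = random.Random(seed)

    def locate(self, item, underlying_search):
        """Return (peer holding `item` or None, how it was found)."""
        # Step 1: ask the shortcuts in ranked order.
        for peer in self.shortcuts:
            if item in peer.content:
                self.successes[peer.name] += 1
                self._rerank()
                return peer, "shortcut"
        # Step 2: fall back to the underlying topology (e.g. Gnutella flooding).
        holders = underlying_search(item)
        if not holders:
            return None, "miss"
        # Step 3: pick one responder at random and remember it as a shortcut.
        chosen = self._rng.choice(holders)
        self._add_shortcut(chosen)
        return chosen, "underlying"

    def _add_shortcut(self, peer):
        if peer.name in self.successes:
            return
        if len(self.shortcuts) >= self.max_shortcuts:
            evicted = self.shortcuts.pop()  # lowest-ranked entry sits at the end
            del self.successes[evicted.name]
        self.shortcuts.append(peer)
        self.successes[peer.name] = 0

    def _rerank(self):
        # Highest success count first; the sort is stable, so ties keep their order.
        self.shortcuts.sort(key=lambda p: self.successes[p.name], reverse=True)


a = Peer("a")
b = Peer("b", content={"x", "y"})
c = Peer("c", content={"z"})
peers = [a, b, c]


def underlying_search(item):
    # Stand-in for a flood: every peer other than a that holds the item.
    return [p for p in peers if p is not a and item in p.content]


first = a.locate("x", underlying_search)    # resolved by the underlying search
second = a.locate("y", underlying_search)   # resolved by the new shortcut to b
```

Because the shortcut layer only needs a function that returns the peers holding an item, the same class works unchanged whether the fallback is flooding or a DHT lookup, which is what makes the shortcuts modular.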

Page 10

Outline

-- Background
-- Design of Interest-based Locality
-- Simulation of Interest-based Locality
-- Enhancement of Interest-based Locality
-- Understanding the scheme

Page 11

Performance Evaluation

Performance metrics:

-- success rate
-- load characteristics (the number of query packets each peer in the system processes)
-- query scope (the fraction of peers involved in each query)
-- minimum reply path length
-- additional state kept in each node
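To make the metrics concrete, the sketch below computes four of them from a list of per-query records. The slide does not define the metrics precisely, so the definitions here are assumptions: success rate is taken to be the fraction of queries resolved through shortcuts, and load is averaged over all peers. The `QueryRecord` fields and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    found_via_shortcut: bool  # True if a shortcut resolved the query
    peers_contacted: int      # peers that had to process this query
    reply_path_hops: int      # hops to the closest peer that replied


def summarize(records, num_peers):
    """Return the assumed metric definitions as a dict."""
    total = len(records)
    contacted = sum(r.peers_contacted for r in records)
    return {
        # Fraction of queries answered by shortcuts (assumed definition).
        "success_rate": sum(r.found_via_shortcut for r in records) / total,
        # Average number of query packets each peer processed.
        "load_per_peer": contacted / num_peers,
        # Average fraction of the system touched by one query.
        "query_scope": contacted / (total * num_peers),
        # Average of the per-query minimum reply path lengths.
        "mean_path_length": sum(r.reply_path_hops for r in records) / total,
    }


# Hypothetical records: three shortcut hits and one flood over 100 peers.
records = [
    QueryRecord(True, 1, 1),
    QueryRecord(False, 100, 4),
    QueryRecord(True, 2, 1),
    QueryRecord(True, 1, 1),
]
metrics = summarize(records, num_peers=100)
```

In this made-up sample a single flooded query accounts for almost all of the load, which is why raising the success rate has such a large effect on load and scope.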

Page 12

Methodology – query workload

Traffic traces are created from real application traffic:

-- Boeing firewall proxies
-- Microsoft firewall proxies
-- Web traffic between CMU and the Internet, collected passively
-- Typical P2P traffic (KaZaA, Gnutella), collected passively

The simulation uses exact matching rather than keyword matching.

“song.mp3” and “my artist – song.mp3” are treated as different items.

Page 13

Methodology – Underlying peers topology

-- Based on the Gnutella connectivity graph from 2001, in which 95% of nodes are about 7 hops away.
-- The search TTL is set to 7.
-- For each kind of traffic (Boeing, Microsoft, etc.), 8 simulations are run, each one hour long.

Page 14

Methodology – Storage and replication modeling (web)

-- The first peer to request a web page is modeled as the first node containing that page.
-- Subsequent searches from other peers find the page at this peer and replicate it.

Figure: three nodes a, b and c, each ending up with a copy of a.html.

-- Node a is the first peer to search for a.html, so it is modeled as the first node containing a.html.
-- Node b retrieves a.html from node a.
-- Node c can retrieve a.html from node a or node b.
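The replication model for web traces can be sketched in a few lines. The function name `replay_web_trace` and the `(peer, url)` trace format are assumptions made for this example. The rule it encodes is the one from the slide: the first requester becomes the first holder, and every later requester gains a replica.

```python
def replay_web_trace(requests):
    """Replay (peer, url) requests in trace order.

    Returns (holders, sources): holders maps each url to the peers holding a
    copy, and sources lists, for each request, the peers the requester could
    have fetched the page from at that moment.
    """
    holders = {}
    sources = []
    for peer, url in requests:
        current = holders.setdefault(url, [])
        sources.append([p for p in current if p != peer])
        if peer not in current:
            # The first requester seeds the page; later ones replicate it.
            current.append(peer)
    return holders, sources


trace = [("a", "a.html"), ("b", "a.html"), ("c", "a.html")]
holders, sources = replay_web_trace(trace)
```

For the three-node example in the figure, node a has no source (it is modeled as the origin), node b can fetch only from a, and node c can fetch from either a or b.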

Page 15

Methodology – Storage and replication modeling (P2P)

In the collected traffic trace, if a file is downloaded at time t0, the file must also have been available for download before t0.

However, if a file is never downloaded during the sampled trace, there is no information to indicate that the file exists.

Figure: timeline running from simulation start (tS) through t = t0 to simulation end (tE). The file is downloaded at t0.

Page 16

Simulation Results – success rate

Page 17

Simulation Results – load, scope and path length

-- Query load for Boeing and Microsoft traffic:

-- Query scope for the shortcut scheme is about 0.3%, whereas in Gnutella it is about 100%.

-- Average path length of the traces:

Page 18

Outline

-- Background
-- Design of Interest-based Locality
-- Simulation of Interest-based Locality
-- Enhancement of Interest-based Locality
-- Understanding the scheme

Page 19

Increase Number of Shortcuts

7 to 12% performance gain

Diminishing returns

Add all shortcuts at a time, with no limit on the size of the shortcut list

Add k shortcuts at a time; only 100 shortcuts are used.

Page 20

Using Shortcuts’ Shortcuts

Idea:

Also add the shortcuts of a peer's shortcuts

Performance gain of 7% on average

Page 21

Outline

-- Background
-- Design of Interest-based Locality
-- Simulation of Interest-based Locality
-- Enhancement of Interest-based Locality
-- Understanding the scheme

Page 22

Interest-based Structures

When the shortcuts are viewed as an undirected graph:

-- In the first 10 minutes, there are many connected components, each containing only a few peers.
-- At the end of the simulation, there are few connected components, each containing several hundred peers. Each component is well connected.
-- The clustering coefficient is about 0.6 to 0.7, which is higher than that of the Web graph.
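A clustering coefficient can be computed directly from an adjacency structure. The slide does not say which definition was used. The sketch below assumes the average local clustering coefficient, in which nodes with fewer than two neighbors contribute zero. The function name and the toy graph are invented for the example.

```python
from itertools import combinations


def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph.

    adj: dict mapping each node to the set of its neighbors (must be symmetric).
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # fewer than two neighbors: contributes 0 by convention
        # Count the pairs of neighbors that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Toy shortcut graph: a triangle a-b-c with one extra peer d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
coefficient = clustering_coefficient(adj)
```

A value near 1 means that a peer's shortcuts tend to be shortcuts of each other, which is the well-connected cluster structure the slide describes.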

Page 23

Web Objects Locality

A web page contains several web objects, so locality should exist among these objects.

There is a performance drop of 10% when web objects are retrieved rather than whole web pages.

The performance is regained when all the shortcuts are exhausted.

Page 24

Locality Across Publishers

Same-publisher shortcuts exhibit low interest-based locality; a peer may actually be interested in content from different publishers.

Same-publisher shortcuts are shortcuts that were originally created while accessing content from the same publisher as the current request.

Page 25

Sensitivity of Shortcuts

Run interest-based shortcuts over a DHT (Chord) instead of Gnutella.

Query load is reduced by a factor of 2 to 4.

Query scope is reduced from 7/N to 1.5/N, where N is the number of peers.

Page 26

Conclusion

-- Interest-based shortcuts are modular performance-enhancement hints layered over an existing P2P topology.
-- The simulations show that shortcuts improve search efficiency.
-- Shortcuts form clusters within a P2P topology, and the clusters are well connected.