Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing

Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing. Presenter: Chunyuan Liao, March 6, 2002. Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph, et al. Computer Science Division, University of California, Berkeley


DESCRIPTION

Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing. Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph, et al. Computer Science Division, University of California, Berkeley. Presenter: Chunyuan Liao, March 6, 2002.

TRANSCRIPT

Page 1: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing

Presenter: Chunyuan Liao

March 6, 2002

Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph, et al.

Computer Science Division

University of California, Berkeley

Page 2: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Outline
Challenges
System overview
Operations, related issues & solutions
  • Route
  • Locate
  • Publish
  • Insert
  • Delete
  • Move
Evaluation & Conclusion
Implementation
Summary & Comments

Page 3: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Project background
Driving force: Ubiquitous Computing
OceanStore: a data utility infrastructure
Goals:
  – Based on the current untrusted infrastructure
  – Achieve nomadic data
      Anytime, anywhere
      Highly scalable, reliable and fault-tolerant
Basic issues:
  – Data location
  – Routing

Page 4: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Challenges
How to achieve naming, location and routing in a complex & chaotic computing environment
Dynamic nature
  – Mobile and replicated data & services
  – Complex interaction between components, even in motion
Traditional approaches
  – Fail to address the extremely dynamic nature

Page 5: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Tapestry: An infrastructure for fault-tolerant wide-area location and routing
An overlay location & routing infrastructure built on top of IP
Features
  – Highly scalable: decentralized, point-to-point, self-organizing
  – Highly fault-tolerant: redundancy, adaptation
  – Good locality
  – Content-based routing & location
  – Highly durable

Page 6: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Basic Model of Tapestry
Originated in the Plaxton scheme
Basic components:
  – Nodes: servers, routers, clients
  – Objects: data or services
  – Links: point-to-point links

Page 7: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Operations in Tapestry
Naming
Routing
Object Location
Publishing Objects
Inserting/Deleting Objects
Mobile Objects

Page 8: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Tapestry - Naming
Node ID / Object ID
  – A fixed-length bit string (4 bits per level), e.g. 84F8, 9098
  – Global
  – Randomly generated
  – Location-independent
  – Evenly distributed
  – Not unique (shared by replicas)
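A minimal sketch of the naming scheme on this slide, assuming 4-digit hexadecimal IDs like the examples 84F8 and 9098. The names `random_id` and `levels` are hypothetical helpers for illustration, not part of Tapestry.

```python
import secrets

DIGIT_BITS = 4   # from the slide: 4 bits (one hex digit) per level
NUM_DIGITS = 4   # assumption: 4-digit IDs, as in the examples 84F8 and 9098


def random_id(num_digits: int = NUM_DIGITS) -> str:
    """Return a random, location-independent ID as an uppercase hex string."""
    value = secrets.randbits(DIGIT_BITS * num_digits)
    return format(value, f"0{num_digits}X")


def levels(node_id: str) -> list[str]:
    """Digits in routing order: the last digit is level 1, since matching is by suffix."""
    return list(reversed(node_id))


print(random_id())      # some 4-digit hex ID; the value differs on every run
print(levels("84F8"))   # ['8', 'F', '4', '8']
```

Because the ID is drawn uniformly at random, it carries no location information and IDs spread evenly over the namespace, which is what the later routing slides rely on.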

Page 9: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Routing: Rules
Suffix matching (similar to Plaxton)
  – Incremental routing, digit by digit
  – Maximum hops: log_b(N)
[Figure: a message to 4598 routed by suffix matching; node labels: 6789, B4F8, 9098, 7598, 4598, B437]
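The suffix-matching rule can be sketched as follows. This is an illustration only: it picks the next hop from a global node list, whereas a real Tapestry node consults its own neighbor map. The source node B437 is an assumption, since the transcript does not say where the message starts.

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Number of trailing digits that two IDs have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def route(source: str, dest: str, nodes: list[str]) -> list[str]:
    """Return the hop list; every hop matches at least one more trailing digit of dest."""
    path = [source]
    current = source
    while current != dest:
        need = shared_suffix_len(current, dest) + 1
        candidates = [n for n in nodes if shared_suffix_len(n, dest) >= need]
        if not candidates:
            raise LookupError(f"no node extends the match beyond {current}")
        # Take the smallest step forward, mimicking digit-by-digit resolution.
        current = min(candidates, key=lambda n: (shared_suffix_len(n, dest), n))
        path.append(current)
    return path


# Node labels taken from the slide's figure.
NODES = ["6789", "B4F8", "9098", "7598", "4598", "B437"]
print(route("B437", "4598", NODES))
# ['B437', 'B4F8', '9098', '7598', '4598']
```

Each hop fixes one more digit (8, 98, 598, 4598), so a 4-digit hexadecimal ID is resolved in at most 4 hops, matching the log_b(N) bound with b = 16 and N = 16^4.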

Page 10: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Routing: Neighbor maps
• A table with b*log_b(N) entries
• The i-th level neighbors share (i-1) suffix digits with the node
• Entry(i, j): pointer to the neighbor whose ID ends in digit j followed by the (i-1)-digit shared suffix
• Secondary neighbors
• Back pointers: create bi-directional links
[Figure: example node 0642]

Page 11: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Routing: Fault tolerance
Detect server/link failure
  – TCP timeout (ping)
  – Periodic “heartbeat” messages along back pointers
Resist faults
  – Secondary neighbors
Recover
  – Probing messages
  – Second chance
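One way to read the three steps above is sketched below. It assumes that each neighbor-map entry keeps a primary neighbor plus secondaries, and that "second chance" means a failed neighbor is only marked as suspected and kept until a later probe decides its fate. The `Entry` class and the `is_alive` callback are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Entry:
    neighbors: list[str]                               # primary first, then secondaries
    suspected: set[str] = field(default_factory=set)   # failed, but not yet removed

    def next_hop(self, is_alive: Callable[[str], bool]) -> Optional[str]:
        """Return the first usable neighbor, falling back to secondaries."""
        for node in self.neighbors:
            if node in self.suspected:
                continue
            if is_alive(node):
                return node
            self.suspected.add(node)   # second chance: mark it, do not delete it
        return None

    def probe(self, is_alive: Callable[[str], bool]) -> None:
        """Probing: neighbors that answer again become usable once more."""
        self.suspected = {n for n in self.suspected if not is_alive(n)}


# Hypothetical liveness table standing in for ping/heartbeat results.
alive = {"9098": False, "1098": True}
entry = Entry(["9098", "1098"])
print(entry.next_hop(alive.get))   # 1098: the primary is down, the secondary takes over
alive["9098"] = True
entry.probe(alive.get)
print(entry.next_hop(alive.get))   # 9098: the primary recovered and is used again
```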

Page 12: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Locating: basic procedure
4-phase locating
  – Map the Object ID to a “virtual” Node ID
  – Route the request to that node
  – Arrive at the surrogate, or “root”, for the object
  – Get directed to the server

[Figure: Client: B4F8; Server: B346; nodes 1234, 8724, F734, B234, 6234 <O:1234,S:B346>; Surrogate Routing]

Page 13: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Locating: Surrogate Routing (1)
Given clients at different places, how do they all find the same “root”?
  – Plaxton
      1. Find the nodes with the maximum matching suffix (stop at the empty entry in the neighbor map)
      2. Order them using global knowledge
      3. Choose the first one
  – Tapestry
      1. Go further than Plaxton (choose an alternate entry)
      2. Stop at a neighbor map where there is only one non-empty entry, pointing to node R
      3. R is the root
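The Tapestry variant can be sketched as below. Two assumptions are made that the slide does not state: the "alternate entry" is taken to be the next non-empty digit in increasing order (wrapping around), and a global node list replaces the per-node neighbor maps. The node set is hypothetical.

```python
DIGITS = "0123456789ABCDEF"


def surrogate_root(object_id: str, nodes: list[str]) -> str:
    """Resolve object_id digit by digit from the right; when no node has the
    wanted digit, deterministically fall through to the next digit that exists."""
    if not nodes:
        raise ValueError("empty network")
    suffix = ""
    for level in range(1, len(object_id) + 1):
        remaining = [n for n in nodes if n.endswith(suffix)]
        if len(remaining) == 1:
            return remaining[0]           # only one non-empty entry left: the root
        start = DIGITS.index(object_id[-level])
        for step in range(len(DIGITS)):
            digit = DIGITS[(start + step) % len(DIGITS)]
            if any(n.endswith(digit + suffix) for n in nodes):
                suffix = digit + suffix   # assumed tie-break: next existing digit
                break
    return next(n for n in nodes if n.endswith(suffix))


# Hypothetical node set; no node is named 12345, so a surrogate must be found.
nodes = ["F3145", "E1145", "B7645", "B3945", "B3467"]
print(surrogate_root("12345", nodes))                   # B7645
print(surrogate_root("12345", list(reversed(nodes))))   # B7645 again
```

The choice depends only on the object ID and on which entries exist, not on where the request starts, so every client that sees the same entries converges on the same root without global ordering.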

Page 14: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Locating: Surrogate Routing (2)
[Figure: surrogate routing for object 12345; node labels: F3145, E1145, 51145 <O:12345, S:B3467>, B7645, B3467, B3945, 92145, B1145]

Assumptions:
  1. Every node is reachable (ensures the same “patterns”)
  2. IDs are evenly distributed (ensures fewer and fewer nodes in the mapping table)
Conclusions:
  1. The root can always be found
  2. The expected number of surrogate routing hops is 2

Page 15: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Publishing
Similar to locating
  1. The server sends a message, pretending to locate the object
  2. Find the surrogate node that acts as the “root” for the object
  3. Save the related information there, such as <O,S>
[Figure: Server: B4F8; nodes 1234, 8724, F734, B234, 6234 <O:1234,S:B4F8>; Surrogate Routing]

Page 16: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Locating/Publishing: Fault tolerance & locality
Multiple “roots” (better than Plaxton)
  – Map the object ID to several “roots”
  – Publish/locate can be executed simultaneously
Cache the 2-tuple <O,S>
  – Clients can get the <O,S> on the way to the root
  – Intermediate nodes can receive multiple <O,S> for the same object; the nearest one is chosen

Page 17: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Insert a new node: basic procedure
1. Get a Node ID
2. Begin with a “gateway node” G
3. Pretend to route to itself
4. Establish a nearly optimal neighbor map during the “pseudo routing” by copying entries and choosing the nearest ones
5. Go back and notify neighbors
[Figure: Gateway node: B4F8; New node: 1234; nodes 8724, F734, B234, 6234; Surrogate Routing]

Page 18: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Delete a node
The simplest operation
Explicitly notify the neighbors via back pointers
Use soft state
  – Stop sending “heartbeat” messages and republish messages

Page 19: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Maintain System Consistency
Components in a Tapestry node
  – Neighbor map
  – Back pointers
  – Object-location pointers: <Object, Node>
  – Hotspot monitor: <Object, Node, Freq>
  – Object store
Maintaining correct status
  – Soft state
  – Proactive explicit update

Page 20: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Soft state
Advantages
  – Easy to implement
  – Suited to slowly changing systems
Disadvantages
  – Tradeoff between bandwidth overhead and level of consistency
  – Not suited to fast-changing systems
  – Example: republishing for a single server can take 1400 MB (!) in a single interval
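A minimal soft-state table, to make the tradeoff concrete: entries vanish unless they are refreshed within a time-to-live. The class name, the `ttl` parameter and the explicit `now` argument are assumptions chosen for illustration.

```python
class SoftStateTable:
    """Location pointers that expire unless the server republishes them."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._expires = {}   # (object_id, server_id) -> expiry time

    def refresh(self, object_id: str, server_id: str, now: float) -> None:
        """Called for every (re)publish message; extends the entry's lifetime."""
        self._expires[(object_id, server_id)] = now + self.ttl

    def servers(self, object_id: str, now: float) -> list[str]:
        """Drop expired entries, then list the servers still known for the object."""
        self._expires = {k: t for k, t in self._expires.items() if t > now}
        return sorted(s for (o, s) in self._expires if o == object_id)


table = SoftStateTable(ttl=30)
table.refresh("1234", "B4F8", now=0)
print(table.servers("1234", now=10))   # ['B4F8']
print(table.servers("1234", now=30))   # []: the server stopped republishing
```

A shorter `ttl` removes stale pointers sooner but forces every server to republish every object more often, which is the bandwidth cost named on this slide.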

Page 21: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Proactive explicit update (PEU)
Proactive explicit updates
  – Epoch number: sequence number of the update rounds
  – Expanded 3-tuple: <Obj. ID, Server ID, LastHopID>
Soft state: the fallback mechanism

Page 22: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

PEU: Node Mobility
[Figure: nodes Root, C, D, E, F, A, B]
Move Object 123 from A to B
Republishing (123, B)
Deleting (123, A) with “LastHopID”
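A sketch of the move under stated assumptions: each node keeps one pointer per object as (server, last hop), the new server republishes toward the root, and the first node that still holds the old pointer follows the LastHopID chain backwards to delete stale entries. The two paths are hypothetical, since the figure's topology is not recoverable from the transcript.

```python
def publish(store: dict, object_id: str, server_id: str, path: list[str]) -> None:
    """Deposit (server, last hop) at every node from the server to the root."""
    last_hop = None
    for node in path:
        store.setdefault(node, {})[object_id] = (server_id, last_hop)
        last_hop = node


def move(store: dict, object_id: str, old_server: str, new_path: list[str]) -> None:
    """Republish from new_path[0]; where the new path meets the old one,
    walk the LastHopID chain backwards and delete the stale pointers."""
    new_server = new_path[0]
    last_hop = None
    old_path_cleaned = False
    for node in new_path:
        entry = store.get(node, {}).get(object_id)
        if entry is not None and entry[0] == old_server and not old_path_cleaned:
            back = entry[1]
            while back is not None:
                back = store[back].pop(object_id)[1]
            old_path_cleaned = True
        store.setdefault(node, {})[object_id] = (new_server, last_hop)
        last_hop = node


store = {}
publish(store, "123", "A", ["A", "E", "C", "Root"])   # hypothetical old path
move(store, "123", "A", ["B", "F", "C", "Root"])      # hypothetical new path

print(store["C"]["123"])       # ('B', 'F'): the junction now points toward B
print("123" in store["E"])     # False: stale pointer below the junction is gone
```

Only the nodes below the junction need a delete message; the junction and every node above it are simply overwritten by the republish.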

Page 23: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

PEU: Recover location pointers
[Figure: nodes Root, A, B, C, D, E, F and Server; labels: Exiting Notification, Reconstruction(O,S,B), Deleting Old Data]

Page 24: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Introspective Optimization: Adapting to the changing environment
Load balance
  1. Periodic pings by a refresher thread
  2. Update neighbor pointers
Hotspot
  1. Find the source of the heavy traffic, the “hotspot”
  2. Publish the desired data near the hotspot
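The hotspot step can be sketched with the <Object, Node, Freq> tuples listed among the node components earlier in the deck. The threshold and the class name are assumptions; the sketch only identifies hotspots and leaves the actual re-publishing out.

```python
from collections import Counter


class HotspotMonitor:
    """Counts requests per (object, source node), i.e. <Object, Node, Freq>."""

    def __init__(self, threshold: int):
        self.threshold = threshold   # assumed cut-off for "heavy traffic"
        self.freq = Counter()

    def record(self, object_id: str, source_node: str) -> None:
        self.freq[(object_id, source_node)] += 1

    def hotspots(self) -> list[tuple[str, str]]:
        """(object, source) pairs whose request count reached the threshold."""
        return sorted(k for k, count in self.freq.items() if count >= self.threshold)


monitor = HotspotMonitor(threshold=3)
for _ in range(3):
    monitor.record("1234", "B4F8")
monitor.record("1234", "6789")
print(monitor.hotspots())   # [('1234', 'B4F8')]
```

Each reported pair names an object and the node generating the traffic, which is where a copy of the data would then be published.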

Page 25: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Evaluation
Gain
  – Good locality
  – Low location latency
  – High stability
  – High fault tolerance
Cost
  – Bandwidth overhead linear in the number of replicas

Page 26: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Implementation
Packet-level simulators are finished in C
Used to support other applications
  – such as OceanStore
  – Bayeux, an application-level multicast protocol
Future work
  – Security issues
  – Mobile-IP-like functionality

Page 27: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Summary
Urgent need for a new location/routing scheme
Features of Tapestry
  – Location-independent naming
  – Integration of location and routing
  – Content-based routing
  – Support for the dynamic environment: inserting/deleting/moving nodes and objects

Page 28: Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing

Comments and Questions
Paradox or discrepancy?
  The underlying IP has bad scalability, so how can Tapestry achieve high scalability?
  Just for demo!
What’s the relation between IP and Tapestry?
  Tapestry doesn’t intend to replace IP; it just tries to establish a higher-level locating & routing infrastructure to support content-based operations.
How can we achieve the same goal without IP?