
INRIA Rhône-Alpes - Planète project 1

A host-based multicast (HBM) solution for group communications

[email protected] http://www.inrialpes.fr/planete/people/roca

Planète Project

Vincent Roca - INRIA Rhône-Alpes - planète project - 2

Part 1:

Motivations for overlay multicast...

An alternative group communication service

Vincent Roca - INRIA Rhône-Alpes - planète project - 3

Multicast routing not available everywhere!

[Diot00] “Deployment issues for the IP multicast service and architecture”

many deployment issues for ISPs:
- offering wide-area multicast is technically complex
- routing protocols still under development/research
- many functionalities are still not available (e.g. security)
- brings router migration problems
- requires complex management
- when is multicast more interesting than unicast?
- what billing model?

chicken-and-egg problem?

Vincent Roca - INRIA Rhône-Alpes - planète project - 4

Using a unicast/multicast reflector
A reflector offers multicast connectivity to hosts restricted to unicast transmissions
- can be rather simple… (two sockets + a few lines of C)

[Figure: a reflector host, attached to the multicast backbone, relays traffic over unicast connections to hosts in unicast-only sites.]
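The "rather simple" claim above can be illustrated with a small sketch. A real reflector would bind two UDP sockets (one joined to the multicast group, one facing unicast clients); the sketch below models only the fan-out decision, so it can be exercised without a network, and the class and method names are illustrative, not from the talk.

```python
class Reflector:
    """Fan-out logic of a unicast/multicast reflector (sockets omitted)."""

    def __init__(self, mcast_group):
        self.mcast_group = mcast_group   # (addr, port) of the multicast group
        self.clients = set()             # unicast-only hosts, learned on arrival

    def on_unicast_packet(self, data, sender):
        """A unicast client sent `data`: relay to the group and other clients."""
        self.clients.add(sender)
        return [(data, self.mcast_group)] + \
               [(data, c) for c in sorted(self.clients) if c != sender]

    def on_multicast_packet(self, data):
        """A packet arrived from the multicast side: relay to every client."""
        return [(data, c) for c in sorted(self.clients)]
```

The actual sockets would add little: one `recvfrom()` loop feeding these two methods, and one `sendto()` per returned destination.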

Vincent Roca - INRIA Rhône-Alpes - planète project - 5

Using a unicast/multicast reflector... (cont’)
Pros:
- simple
- available

Cons:
- requires a manual setup
- how many reflectors are needed?
- where to place them?
- limited and uncontrolled efficiency (depends on the placement of reflectors, on the number of unicast clients, etc.)

Overlay Multicast (OM) tries to solve these problems...

Vincent Roca - INRIA Rhône-Alpes - planète project - 6

Part 2:

Overlay Multicast (OM) specificities...

Vincent Roca - INRIA Rhône-Alpes - planète project - 7

The goals of Overlay Multicast (OM)
Definition:
Create automatically and easily a group communication service, using efficiently the underlying unicast/multicast routing services

Many different names…
- Overlay Multicast
- End Host Multicast
- Application-level Multicast
- Host Based Multicast
- etc.

[Figure: hosts in Site 1, Site 2 and Site 3 interconnected by point-to-point connections at the overlay level.]

Vincent Roca - INRIA Rhône-Alpes - planète project - 8

OM is not multicast

An OM node can be:
- a host (general assumption)
- a (dedicated) server within the site
- a (dedicated) server within the ISP

... but no assumption is made on routers (unlike multicast, where the routers are supposed to implement a given routing protocol)

Vincent Roca - INRIA Rhône-Alpes - planète project - 9

OM is not multicast... (cont’)
OM is an overlay:
- the physical topology is hidden at the OM level
- create a complete virtual graph with all nodes/distances between them
- several possible metrics (unidirectional delay, RTT, number of hops)

[Figure: from physical topology to complete virtual graph — five nodes N1..N5 spread over Sites 1-3 with weighted physical links; in the virtual graph every node knows its distance to every other node (e.g. N1 sees N2: 17, N3: 2, N4: 28, N5: 27).]
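Turning measured pairwise distances into the complete virtual graph of the figure can be sketched with all-pairs shortest paths. The sketch below uses Floyd-Warshall and assumes a symmetric metric such as RTT; the node names and weights are illustrative (loosely echoing the figure), not the slide's exact topology.

```python
import itertools

def complete_virtual_graph(nodes, links):
    """Floyd-Warshall: distance between every pair of overlay nodes,
    given direct measurements on a subset of the pairs."""
    d = {(a, b): float("inf") for a in nodes for b in nodes}
    for a in nodes:
        d[(a, a)] = 0
    for (a, b), w in links.items():
        d[(a, b)] = min(d[(a, b)], w)
        d[(b, a)] = min(d[(b, a)], w)   # symmetric metric (e.g. RTT)
    # relax through every intermediate node k (k is the outer loop)
    for k, a, b in itertools.product(nodes, repeat=3):
        if d[(a, k)] + d[(k, b)] < d[(a, b)]:
            d[(a, b)] = d[(a, k)] + d[(k, b)]
    return d

nodes = ["N1", "N2", "N3"]
links = {("N1", "N2"): 17, ("N1", "N3"): 2, ("N2", "N3"): 20}
g = complete_virtual_graph(nodes, links)
# N2 reaches N3 more cheaply through N1 (17 + 2 = 19) than directly (20)
```

In HBM the members measure these distances themselves and report them, so the RP only has to assemble the matrix, not probe the network.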

Vincent Roca - INRIA Rhône-Alpes - planète project - 10

OM is not multicast... (cont’)
Membership knowledge

In traditional multicast, knowledge is distributed/incomplete:
- routers only know that an interface leads to a receiver

In OM, group members are known...
- either by a Rendez-vous Point (e.g. HBM)
- or by the source
- or by everybody (distributed scheme)

[Figure: membership knowledge with HBM — source N1 and receivers N2, N3 each know the full member list of group G plus their neighbors in the topology.]

[Figure: membership knowledge with a per-source tree — receivers N2 and N3 join; each router only records the interface (if0, if1) on which to transmit for group G; the local router knows the existence of at least one local receiver.]
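The contrast above — full membership knowledge at one point versus per-interface state in routers — is easy to make concrete. A minimal sketch of the RP side of an RP-based scheme like HBM follows; the class and method names are illustrative, not from the HBM specification.

```python
class RendezvousPoint:
    """RP-side membership state: unlike a multicast router, the RP
    sees the complete member list of every group it serves."""

    def __init__(self):
        self.members = {}                 # group -> set of member ids

    def join(self, group, member):
        self.members.setdefault(group, set()).add(member)

    def leave(self, group, member):
        self.members.get(group, set()).discard(member)

    def group_members(self, group):
        """Complete membership knowledge, available in one place."""
        return sorted(self.members.get(group, set()))
```

This complete view is what lets the RP compute a globally optimized topology, at the cost of the scalability and reliability concerns discussed later.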

Vincent Roca - INRIA Rhône-Alpes - planète project - 11

OM is not multicast... (cont’)
reliability is more limited:
- a node is less reliable than routers/links
- if OM is implemented in a library, the application may be stopped/crash…
- redundancy, adaptation and fast failure discovery/tree update are required

tree setup is entirely under control:
- the OM topology can be tailored
- e.g. several OM topologies are possible depending on requirements
- e.g. a specific tunnel can be set up on a lossy, congested path

Vincent Roca - INRIA Rhône-Alpes - planète project - 12

Using OM for unicast/multicast integration...

Pros:
- automatic setup
- more efficient than reflectors
- dynamic adaptation to network conditions
- more security

Cons:
- can turn out to be rather complex
- hosts may be unstable (much more than routers/links!)
- OM is neither as efficient nor as scalable as native multicast routing

two key points: robustness and efficiency

Vincent Roca - INRIA Rhône-Alpes - planète project - 13

Part 3:

Our HBM (Host Based Multicast) proposal

Vincent Roca - INRIA Rhône-Alpes - planète project - 14

Sketch of the protocol
HBM in a nutshell:
- creates a self-organizing overlay that periodically self-improves
- it is an RP-based protocol
- the RP calculates the shared virtual topology
- the RP has complete knowledge of group membership/communication costs
- members periodically evaluate distances between them and inform their RP
- data flows on the virtual topology (no RP implication)

distinguish:
- core members (CM) can be transit nodes
- non-core members (nonCM) are always leaves

Vincent Roca - INRIA Rhône-Alpes - planète project - 15

Sketch of the protocol... (cont’)
OK, that’s not scalable...
- other OM solutions are not scalable either
- the only solution is native multicast routing... except if you use DM protocols, or MSDP, or any other non-scalable piece of protocol
- anyway, a single HBM node can serve many local participants using native local multicast!

OK, reliability greatly depends on the RP reliability...
- not an issue if the RP is collocated with the primary source
- you can also set up secondary RPs (like secondary DNS/NIS/... servers)

Vincent Roca - INRIA Rhône-Alpes - planète project - 16

Sketch of the protocol... (cont’)
...but this is simple:
- limited coherency problems (everything is centralized)
- limited burden on the hosts (an asset in case of PDAs)

...and it creates a “not too bad” distribution topology:
- optimal with respect to known distances at that time
- topology regularly updated (periodic update, depending on the group size/stability/etc., or triggered by some event)

Vincent Roca - INRIA Rhône-Alpes - planète project - 17

A robust group communication service...
adding redundancy:
- see paper for an algorithm that adds some redundant virtual links (RVL) so that proba_partitioned_topology_after_a_failure < threshold

fast failure discovery:
- depends on the virtual topology
- with a tree, use ACK aggregation
- with a ring topology, 1 or 2 nodes should receive each packet twice
- with RVL, after a failure a node will receive traffic on a redundant link only

[Figure: on a bi-directional ring, the source transmits each packet in both directions; 1 or 2 opposite hosts should receive 2 copies.]
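The RVL criterion above rests on testing whether a given failure partitions the topology. The slides do not give the algorithm (it is "in the paper"), so the sketch below shows only the elementary partition test on an adjacency-list graph; node names and topologies are illustrative.

```python
def partitions_on_failure(adj, failed):
    """True if removing node `failed` disconnects the surviving nodes."""
    nodes = [n for n in adj if n != failed]
    if len(nodes) <= 1:
        return False
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:                          # DFS that avoids the failed node
        for m in adj[stack.pop()]:
            if m != failed and m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) != len(nodes)

# a chain is split by the failure of its middle node; a ring is not
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
ring  = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
```

An RVL-adding algorithm would repeat such tests (weighted by node failure probabilities) and insert extra virtual links until the partition probability falls below the threshold.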

Vincent Roca - INRIA Rhône-Alpes - planète project - 18

A robust group comm. service... (cont’)
adding adaptation:
- unstable nodes should be leaves rather than transit nodes...
- each node has a capability:
  Cap(node) = f(user_desires, node_stability, group_req)
  [0; alpha[    → disconnected
  [alpha; beta] → leaf_only
  ]beta; 1]     → transit_possible
- this enables adaptation...

[Figure: the reliable host serves as a transit node; unreliable hosts are attached as leaves.]
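The capability intervals above map directly to a role-assignment function. The thresholds and the shape of f are not given in the talk, so the values below are purely illustrative.

```python
# Hypothetical thresholds in [0, 1]; the talk only defines the intervals,
# not alpha and beta themselves.
ALPHA, BETA = 0.2, 0.7

def role(capability):
    """Map a node's capability Cap(node) in [0, 1] to its overlay role."""
    if capability < ALPHA:
        return "disconnected"      # [0; alpha[
    if capability <= BETA:
        return "leaf_only"         # [alpha; beta]
    return "transit_possible"      # ]beta; 1]

roles = [role(c) for c in (0.1, 0.5, 0.9)]
```

The RP can then restrict transit positions in the topology to nodes whose role is `transit_possible`, pushing unstable nodes to the leaves.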

Vincent Roca - INRIA Rhône-Alpes - planète project - 19

A robust group comm. service... (cont)

Several topologies are possible:
tree: medium fault-tolerance
- optimal perf. requires a per-source tree
- minimum global cost tree possible

ring: 1-fault tolerance if bi-directional
- perf. does not depend on the source position
- balanced load on all links

star: good fault-tolerance except for the core
- source must be the core for optimal perf.
- very high traffic load close to the core

Vincent Roca - INRIA Rhône-Alpes - planète project - 20

A robust group comm. service... (cont)
sun: balance between ring and star topologies
- good fault tolerance if unreliable hosts are moved to the end of sun beams

[Figure: a sun topology — reliable hosts form the central ring, unreliable hosts hang at the end of the beams.]

Vincent Roca - INRIA Rhône-Alpes - planète project - 21

Some simulation results
impact of a single node failure…

Vincent Roca - INRIA Rhône-Alpes - planète project - 22

Some simulation results… (cont’)
comparison of the average delay

Vincent Roca - INRIA Rhône-Alpes - planète project - 23

Part 4:

Conclusions

Vincent Roca - INRIA Rhône-Alpes - planète project - 24

Conclusions
discussion on issues raised by the creation and management of an OM topology

a centralized solution where group membership is known by an RP has many advantages:
- simple
- efficient
- takes advantage of known node features

adding redundancy is important as OM is intrinsically fragile

Vincent Roca - INRIA Rhône-Alpes - planète project - 25

Additional hidden slides

Vincent Roca - INRIA Rhône-Alpes - planète project - 26

Using OM for unicast/multicast integration

build a group interconnection topology between participants

use unicast or multicast where available/efficient

OM can include multicast areas for improved:
- scalability (all the nodes are collapsed)
- efficiency (avoids several point-to-point connections)

[Figure: hosts in Site 1, Site 2 and Site 3 interconnected by point-to-point connections at the overlay level.]

Vincent Roca - INRIA Rhône-Alpes - planète project - 27

OM is not multicast... (cont’)

OM can only rely on end-hosts...
- easy deployment, flexible
- but not very efficient with bandwidth-limited sites

OM can include dedicated servers at each site...
- a server is certainly more stable than hosts
- no processing power problems

or even dedicated servers within ISPs...
- certainly the most efficient solution from a networking point of view
- end-hosts are in “leaf-only” mode, so packets cross the ISP/site link only once

Vincent Roca - INRIA Rhône-Alpes - planète project - 28

Sketch of the protocol

everything is under the control of the central RP:
- knows CM and nonCM
- knows distances between them (several metrics)
- is responsible for the distribution topology calculation and dissemination

CM periodically evaluate distances between them and inform the RP

likewise, nonCM evaluate their distances to (a subset of) CM and inform the RP

Vincent Roca - INRIA Rhône-Alpes - planète project - 29

Sketch of the protocol... (cont’)
an example:
(1) CM1, CM2, CM3, nonCM4 and nonCM5 evaluate inter-node distances
(2) they send this info to the RP
(3) the RP calculates the new distribution topology
(4) the RP distributes the new topology information (either point-to-point or along the new tree)

[Figure: each CM reports its distances to the other CMs and learns its neighbors in the topology; each nonCM reports its distances to CM1, CM2 and CM3 and learns its single neighbor; the result is a core tree among the CMs, with each nonCM grafted on the core tree (e.g. at the closest CM).]
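Step (3) above can be sketched end to end. The slides say only "core tree" plus "nonCM graft on the core tree (e.g. closest CM)", so the sketch below picks one plausible reading: a minimum spanning tree over the CMs (a "minimum global cost tree"), with each nonCM attached as a leaf to its closest CM. Function name, node names and distances are illustrative, and `dist` is assumed to cover every needed pair.

```python
def rp_topology(cm, non_cm, dist):
    """dist[(a, b)] = reported distance (symmetric); returns overlay edges."""
    d = lambda a, b: dist.get((a, b), dist.get((b, a)))
    edges = []
    in_tree = {cm[0]}                      # Prim's MST over the core members
    while len(in_tree) < len(cm):
        a, b = min(((a, b) for a in in_tree for b in cm if b not in in_tree),
                   key=lambda e: d(*e))
        edges.append((a, b))
        in_tree.add(b)
    for n in non_cm:                       # each nonCM grafts on its closest CM
        edges.append((min(cm, key=lambda c: d(c, n)), n))
    return edges

dist = {("CM1", "CM2"): 17, ("CM1", "CM3"): 2, ("CM2", "CM3"): 20,
        ("CM1", "nonCM4"): 9, ("CM2", "nonCM4"): 4, ("CM3", "nonCM4"): 8,
        ("CM1", "nonCM5"): 6, ("CM2", "nonCM5"): 12, ("CM3", "nonCM5"): 3}
topo = rp_topology(["CM1", "CM2", "CM3"], ["nonCM4", "nonCM5"], dist)
```

In step (4) the RP would then send each node only its own neighbor set from `topo`, either point-to-point or along the new tree itself.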