Distributed Partial Information Management (DPIM) for Survivable Networks
Distributed Partial Information Management (DPIM) for Survivable Networks. Dahai Xu. Content: Basic Concepts of Protection & Restoration; Previous Work on Shared Path Protection; Proposed DPIM Schemes (what partial info to maintain and how?).
1
Distributed Partial Information Management (DPIM) for Survivable Networks
Dahai Xu
2
Content
  Basic Concepts of Protection & Restoration
  Previous Work on Shared Path Protection
  Proposed DPIM Schemes
    What partial info to maintain, and how?
    How is a connection routed under distributed control and with partial info?
    How is distributed signaling done, and how is bandwidth (BW) allocated/deallocated?
  A heuristic based on Potential Backup Cost
3
Protection
  Path Protection
  Link Protection
  Advantages & Disadvantages
4
Path Protection
  Use more than one path to guarantee that the data is sent successfully
  Dedicated Path Protection
  Shared Path Protection
5
Dedicated Path Protection
  1+1 Protection
  Point-to-Point Protection & Mesh Network Protection
6
1+1 Protection
7
Mesh Network Protection
8
Shared Path Protection
  1:1 Protection
  1:N Protection
9
Link Protection
  Use an alternate path if the link fails
  Dedicated Link Protection: not practical
  Shared Link Protection: practical
  It may fail when a node fails
10
Advantages & Disadvantages of Protection
  Simple
  Quick: does not require much extra processing time
  Usually can only recover from a single link fault
  Inefficient usage of resources
11
Restoration
  Path Restoration
    The route can be computed after the failure
  Link Restoration
    The path is discovered at the end nodes of the failed link
    More practical than path restoration
  Advantages & Disadvantages of Restoration
    Usually can recover from multiple element faults
    More efficient usage of resources
    Complex
    Slow: requires extra processing time to set up the path and reserve resources
12
Comparison between Protection & Restoration
Characteristic: Protection -- resources are reserved before the failure and may go unused; Restoration -- resources are reserved and used after the failure
Route: Protection -- predetermined; Restoration -- can be dynamically computed
Resource Efficiency: Protection -- Low; Restoration -- High
13
Comparison between Protection & Restoration (Cont'd)
Time used: Protection -- Short; Restoration -- Long
Reliability: Protection -- mainly for a single fault; Restoration -- can survive multiple faults
Implementation: Protection -- Simple; Restoration -- Complex
14
Offline Routing
  Arrange a set of traffic flows
  Integer Linear Programming (ILP) to get optimal results
  Heuristic Algorithms
    Relaxation of ILP
    Simulated Annealing: a stochastic hill-climbing heuristic search method that explores a larger area of the search space without being trapped in local optima
    Genetic Algorithm: evolves the current population of "good solutions" toward optimality by using carefully designed crossover and mutation operators
    Tabu search
15
Online Routing of Bandwidth Guaranteed
  Online routing of a bandwidth-guaranteed path with a simultaneous protection path
  Metrics
    Unlimited Link Capacity
      Bandwidth consumption
    Limited Link Capacity
      Connection drop/block probability
      Profit / Revenue
16
Assumption
  Two connections whose active paths are completely link-disjoint can share backup bandwidth (BBW).
  The objective of the algorithm is to exploit this BBW sharing, for example to reduce the total amount of bandwidth (TBW) consumed by the connections.
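The sharing assumption can be illustrated with a small sketch. It assumes a single-link-failure model and a hypothetical input format (a list of active links, backup links, and bandwidth per connection); the function name and link labels are illustrative only and do not come from the slides.

```python
from collections import defaultdict

def backup_bandwidth(connections):
    """Return {backup_link: BBW needed}, assuming one link fails at a time.

    connections: list of (active_links, backup_links, bandwidth) tuples.
    This input format is hypothetical, chosen only for the illustration.
    """
    # need[b][a] = total bandwidth rerouted onto backup link b if link a fails
    need = defaultdict(lambda: defaultdict(int))
    for active, backup, w in connections:
        for b in backup:
            for a in active:
                need[b][a] += w
    # Only one link fails at a time, so b must cover the worst single failure.
    return {b: max(per_a.values()) for b, per_a in need.items()}

conns = [
    (["A-B"], ["A-C", "C-B"], 5),  # connection 1
    (["D-E"], ["D-C", "C-B"], 3),  # connection 2, active path link-disjoint from 1
]
bbw = backup_bandwidth(conns)
print(bbw["C-B"])  # 5 rather than 5 + 3: both backups share link C-B
```

If the two active paths had a link in common, a failure of that link would activate both backups at once, and the shared backup link would need the sum of the two bandwidths.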
17
Information for Routing
  The amount of BBW sharing depends on the information available to the routing algorithm.
  Three important cases to be considered:
    No information on how existing connections are routed
    Complete per-flow/aggregate information
    Partial aggregate information
18
No Sharing (NS)
  Only the residual (available) bandwidth on each link is known
  Residual bandwidth = Link capacity - Reserved active bandwidth (ABW) - Reserved backup bandwidth (BBW)
  Can be obtained from OSPF extensions or IS-IS extensions
  Only the total used bandwidth is known (active + backup)
  BBW cannot be shared, so resources are wasted.
19
Sharing with Complete Information (SCI)
  The routes of the active and backup paths of all current connections are known.
  May have too much information to maintain: O(LQ), where L is the average path length and Q is the number of existing connections.
  Permits the best sharing and provides a performance upper bound
20
Partial Information for Routing
  Some aggregated information about each link is known
  Two schemes
    SPI (Sharing with Partial Information): centralized control, knows BBW and ABW on every link
    DPIM (Distributed Partial Information Management): distributed control, each ingress edge (source) node decides the routes
21
Notations (I)
22
Notations (II)
23
No Sharing (NS)
  Remove links with Re < w
  Determine two link-disjoint paths for active/backup
  Formulation:
    Standard network flow problem
    Each link has unit cost and unit capacity
    s supplies two units, d demands two units
    A minimum cost flow algorithm can be used
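The flow formulation above can be sketched as follows. This is a minimal illustration that assumes directed links already filtered by residual bandwidth; it uses successive shortest augmenting paths with Bellman-Ford as one possible minimum cost flow method, not necessarily the one used in the original work. The function name and example topology are hypothetical.

```python
def two_disjoint_paths(links, s, d):
    """Send two units of flow from s to d over unit-cost, unit-capacity links.

    Returns two link-disjoint paths (node lists) with minimum total hop
    count, or None if no such pair exists.
    """
    links, flow = set(links), set()
    nodes = {n for link in links for n in link} | {s, d}
    for _ in range(2):  # one augmenting path per unit of flow
        # Residual arcs: unused links cost +1, used links can be undone at -1.
        arcs = [(u, v, 1) for (u, v) in links - flow]
        arcs += [(v, u, -1) for (u, v) in flow]
        dist = {n: float("inf") for n in nodes}
        dist[s], prev = 0, {}
        for _round in range(len(nodes) - 1):  # Bellman-Ford copes with -1 arcs
            for u, v, c in arcs:
                if dist[u] + c < dist[v]:
                    dist[v], prev[v] = dist[u] + c, (u, c)
        if dist[d] == float("inf"):
            return None
        v = d
        while v != s:  # apply the augmenting path to the flow
            u, c = prev[v]
            if c == 1:
                flow.add((u, v))
            else:
                flow.discard((v, u))  # cancel flow on the reversed link
            v = u
    paths = []
    for _ in range(2):  # decompose the two units of flow into paths
        path, v = [s], s
        while v != d:
            link = next(l for l in flow if l[0] == v)
            flow.remove(link)
            v = link[1]
            path.append(v)
        paths.append(path)
    return paths

links = [("s", "a"), ("a", "b"), ("b", "d"),
         ("s", "x1"), ("x1", "x2"), ("x2", "b"),
         ("a", "y1"), ("y1", "y2"), ("y2", "d")]
paths = two_disjoint_paths(links, "s", "d")
print(sum(len(p) - 1 for p in paths))  # 8 hops in total
```

In this example the shortest single path s-a-b-d blocks any disjoint partner, so picking it first and then searching for a second path would fail; the flow formulation avoids that trap by rerouting around link a-b.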
24
Linear Programming for SCI (I)
For a new request (s, d, w), the least cost of using a on AP and b on BP
The cost of using e on BP: Eq. (1)
25
Linear Programming for SCI (II)
Objective
Constraints
26
SPI
In SCI, can be calculated from per-flow information. Per-flow information must be maintained, which is not scalable.
In SPI, is not known, only is known
Same objective and constraints as in SCI
Further improvement to be discussed in DPIM
27
Survivable Routing (SR)
  Distributed control with complete but aggregated information.
  Every edge node essentially maintains a matrix with one entry for every pair of links a and b
  Uses the active path first (APF) heuristic instead of the ILP formulation
    Remove links with Re < w (temporarily)
    Find a shortest path as AP
    Put back the temporarily removed links, remove the AP links, and calculate the backup cost using Eq. (1)
    Find a shortest (cheapest) path as BP
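The APF steps above can be sketched as follows. Eq. (1) is not reproduced in this transcript, so the backup cost rule in `backup_cost` is an assumption: the extra BBW a link would need to reserve, or infinity if that exceeds its residual bandwidth. The names `R`, `B`, and `S` and the data layout are hypothetical, and Dijkstra's algorithm stands in for whichever shortest path routine the original uses.

```python
import heapq

INF = float("inf")

def shortest_path(links, cost, s, d):
    """Dijkstra over directed links; cost(link) == INF excludes the link."""
    dist, prev, heap = {s: 0}, {}, [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)
        if u == d:
            break
        if du > dist.get(u, INF):
            continue
        for (x, y) in links:
            c = cost((x, y)) if x == u else INF
            if c != INF and du + c < dist.get(y, INF):
                dist[y], prev[y] = du + c, u
                heapq.heappush(heap, (du + c, y))
    if d not in dist:
        return None
    path, v = [], d
    while v != s:
        path.append((prev[v], v))
        v = prev[v]
    return path[::-1]

def apf(links, R, B, S, s, d, w):
    """R[e]: residual bandwidth, B[e]: reserved BBW,
    S[(a, b)]: BBW needed on b if a fails (all names are assumptions)."""
    # Step 1: AP is a min-hop path over links with enough residual bandwidth.
    ap = shortest_path(links, lambda e: 1 if R[e] >= w else INF, s, d)
    if ap is None:
        return None

    # Step 2: put links back, drop AP links, price the rest as backup links.
    def backup_cost(b):
        if b in ap:
            return INF
        # Assumed cost rule: extra BBW needed on b, worst case over AP links.
        extra = max(0, max(S.get((a, b), 0) + w - B[b] for a in ap))
        return extra if extra <= R[b] else INF

    bp = shortest_path(links, backup_cost, s, d)
    return (ap, bp) if bp is not None else None

links = [("s", "d"), ("s", "m"), ("m", "d"), ("s", "n"), ("n", "d")]
R = {e: 10 for e in links}
B = {e: 0 for e in links}
B[("s", "m")] = B[("m", "d")] = 4  # BBW already reserved for other connections
ap, bp = apf(links, R, B, {}, "s", "d", 3)
print(ap, bp)
```

In the example, the route through m already holds 4 units of BBW that protect other links, so a new backup of 3 units can be carried there at no extra cost, and that route is preferred over the route through n.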
28
Successive SR (SSR)
  After the maintained information is updated as a result of setting up a new connection, some existing BPs may change (in route and in the amount of additional BBW reserved)
  Such changes may in turn trigger changes to other existing BPs until an equilibrium state is reached
  Achieves better BBW sharing, but with high signaling and control overhead
29
RAFT
  RAFT: Resource Aggregation for Fault Tolerance
  Each node maintains a fault management table (FMT), which lists the AP or BP flows on each link e. The FMT must be updated each time a request initiates or terminates
  AP and BP routes are node-disjoint, found by first using a shortest path algorithm
  A request is accepted only if the required bandwidth is available on all the links of its AP and BP; otherwise it is rejected.
30
Doshi's
  Each node maintains a link capacity control table (LCCT) for each local link
  Source nodes use a Content-lock mechanism to avoid deadlock among multiple demands.
  BP route search: distributed breadth-first search (BFS) over a residual network
  During BFS, a node first queries the residual spare capacity in the LCCT and uses a link only if it has sufficient capacity
  If a route is found, the source node stores it as the restoration route for the demand.
  If no BP route is found, the capacity optimization procedure is activated, which changes previous BP routes
31
Su's
  Each node maintains "bucket"-based link state (equivalent to )
  The amount of link state is proportional to the number of failures/links, not the number of light paths
  AP and BP are optimized separately. APs are assumed to use minimum-hop paths; BPs are optimized to reduce the wavelength redundancy
  The "width" of link l with respect to a failure event k* is defined as the normalized difference between the maximum bucket height and the bucket corresponding to link failure k*, which indicates the sharing capacity of the link.
32
Su's (Cont'd)
  The Bellman-Ford algorithm is used to identify the widest path between the end nodes of the protected link, that is, the path that offers the most sharing.
  If there is more than one such candidate path, the one that traverses the least number of links with width 0 is selected
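The widest-path search can be sketched with a Bellman-Ford-style relaxation in which the "distance" of a path is the smallest link width along it. This is a minimal sketch under assumptions: the widths are given directly as numbers, the function and variable names are hypothetical, and the tie-break on the number of width-0 links is omitted.

```python
def widest_path(links, width, s, d):
    """Return (path, bottleneck width) of the widest s->d path.

    links: directed (u, v) pairs; width[(u, v)]: sharing capacity of the link.
    """
    nodes = {n for link in links for n in link} | {s, d}
    best = {n: float("-inf") for n in nodes}  # best bottleneck found so far
    best[s] = float("inf")
    prev = {}
    for _ in range(len(nodes) - 1):
        changed = False
        for (u, v) in links:
            cand = min(best[u], width[(u, v)])  # bottleneck when reaching v via u
            if cand > best[v]:
                best[v], prev[v] = cand, u
                changed = True
        if not changed:
            break
    if d not in prev:
        return None, float("-inf")
    path, v = [d], d
    while v != s:
        v = prev[v]
        path.append(v)
    return path[::-1], best[d]

links = [("s", "a"), ("a", "d"), ("s", "b"), ("b", "d")]
width = {("s", "a"): 3, ("a", "d"): 2, ("s", "b"): 1, ("b", "d"): 5}
path, w = widest_path(links, width, "s", "d")
print(path, w)  # ['s', 'a', 'd'] 2
```

The route through b contains the single widest link, but its bottleneck is 1, so the route through a, with bottleneck 2, is the widest path.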
33
DPIM-SAM
  Distributed Partial Information Management
  Each edge node maintains (and exchanges) non-local information: for each link e. (O(E) information)
  Each node also maintains profiles of ABW and BBW for each local link e. (O(E) information)
34
Path Determination
  This estimated BBW may not be minimal
  Use ILP or APF to find AP and BP
  DPIM-M-A: APF with Minimal BBW Allocation
35
Distributed Signaling
  Minimal BBW Allocation
  Maintaining Partial Information on AP and BP
    Send an AP set-up packet containing the BP to the nodes along the AP; each node having an outgoing link e in the AP updates
    Similar way to update
36
Minimal BBW allocation
37
Connection Release
  Cannot be done efficiently in SPI
  AP tear-down and BBW deallocation.
  Update PBe and release BW.
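How a per-link profile makes both minimal allocation and release possible can be sketched as follows. The slides do not spell out the profile's contents here, so this sketch assumes that the node owning a backup link records, for each active link a, how much bandwidth would be rerouted onto the backup link if a failed. The class and method names are hypothetical.

```python
class BackupLinkProfile:
    """Local BBW profile for one link, kept by the node that owns the link."""

    def __init__(self):
        self.need = {}  # need[a] = bandwidth this link must carry if link a fails
        self.bbw = 0    # BBW currently reserved on this link

    def setup(self, ap_links, w):
        """BP set-up: record the new connection, return the extra BBW reserved."""
        for a in ap_links:
            self.need[a] = self.need.get(a, 0) + w
        old, self.bbw = self.bbw, max(self.need.values(), default=0)
        return self.bbw - old

    def release(self, ap_links, w):
        """BP tear-down: remove the connection, return the BBW freed."""
        for a in ap_links:
            self.need[a] -= w
            if self.need[a] == 0:
                del self.need[a]
        old, self.bbw = self.bbw, max(self.need.values(), default=0)
        return old - self.bbw

profile = BackupLinkProfile()
first = profile.setup(["A-B"], 5)    # reserves 5
second = profile.setup(["D-E"], 3)   # reserves nothing extra: shared
freed = profile.release(["A-B"], 5)  # frees 2, leaving 3 for the D-E backup
print(first, second, freed)  # 5 0 2
```

With only the aggregate BBW on the link, as in SPI, the release step could not tell how much of the 5 reserved units is still needed, which is consistent with the remark that release cannot be done efficiently in SPI.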
38
Network Topology
39
Performance Evaluation
  Traffic Types
    Incremental traffic (an established connection lasts forever)
    Dynamic traffic (with connection durations)
  Performance Metrics
    Unlimited Link Capacity
      Bandwidth Saving (Ratio): upper bound 50%
    Limited Link Capacity
      Connection drop/block probability
      Total Earning (Ratio): Earning Rate matrix (independent of traffic load)
40
Simulation Results
  Average Bandwidth Saving Ratio
  Total Earning Ratio
41
Active Path First with Potential Backup Cost (APF-PBC)
  Challenges
    Integer Linear Programming (ILP) based approaches are notoriously time-consuming
    They guarantee minimal allocation of TBW for each request, but do not guarantee an optimal result over all requests.
    Active path first (APF) can only achieve sub-optimal results:
      It does not consider the potential cost along the BP when selecting the AP
42
Main idea of APF-PBC
  Also uses Active Path First
  In selecting the Active Path, each capable link a will be assigned a cost
  We use as the potential backup cost (and try to minimize TBW).
  Intuition: PBC increases with w and
  Can be applied to SCI and DPIM-SAM (which determine backup cost and BP differently)
43
Potential Backup Cost - Derivation
is derived based on the statistical analysis of experimental data (SCI-ILP for the 15-node network, infinite link capacity)
Challenge: we do not know which link b will be used to back up link a, let alone Bb and
Solution: guess the (weighted average) value of Bb (call it x) and (call it s)
44
Derivation based on statistical analysis of Bb
  Distribution of Bb/M
  (w,s,M) is the expected value of a(w) when s is fixed.
  Guess the distribution of and calculate the weighted average value of (w,s,M) over all s to obtain a(w)
45
Distribution of Bb/M
46
Graph of (w,s,M) & approximation
Integral (curves) from adaptive Lobatto quadrature
Approximation (line-fitting Y=c1X+c2)
47
Cumulative distribution function of
48
Graph of
49
Approximation of a(w)
  Distribution of
  Effect of constants c and on performance of APF-PBC
50
Distribution of
51
Effect of constants c and on performance of APF-PBC
52
Bandwidth consumed after 500 demands
53
Total earning after 500 demands
54
Simulation Results - PBC
  Average Bandwidth Saving Ratio
  Total Earning Ratio
55
Summary
  On-line shared path protection (needs to be extended to other schemes)
  The amount of information (complete/partial) affects BBW sharing
  May use ILP or APF-based heuristics
  Proposed a DPIM scheme for distributed, partial / aggregated information management (including signaling for path set-up/tear-down)
  Proposed a potential cost heuristic, which runs faster and performs better than ILP
56
Summary II
  Have also extended to cases with unprotected (UP) and pre-emptable (PE) connections
    UP: uses just one path, similar to an AP (i.e., no BP); affected if (and only if) the path breaks.
    PE: unprotected, and may be affected even if a failure does not break its path
    A PE may use the existing BPs/BBW to carry low-priority traffic in fault-free situations
    A PE is similar, but not identical, to a BP: it can share BBW with other BPs, but cannot share with other PEs
  The idea of potential cost can also be applied to solving other joint optimization problems with heuristics
57
Reference