SafeGuard: Safe Forwarding during Route Changes
Ang Li†, Xiaowei Yang†, and David Wetherall‡
†Duke University ‡UW/Intel Research
Real-time Applications Require High Network Availability
• Even short periods of packet loss can degrade the service
  ▫ Video pixelization
  ▫ Poor voice quality
  ▫ Slow gaming experience
Network Changes Lead to Massive Packet Losses
• No valid route: packet loss!
• TTL expiration, link congestion: packet loss!
• Re-convergence takes hundreds of milliseconds to seconds
Network Changes Happen Frequently
• Planned events
  ▫ Maintenance, policy change, traffic engineering
• Unplanned events
  ▫ Fiber cut, router bug, configuration error
• Sprint: the median time between link failures is only 3 minutes [Iannaccone02]
Problem
• How to reduce forwarding disruption after network changes happen?
  ▫ Intra-domain routing
• SafeGuard goals
  ▫ Simple
    - No new routing convergence protocols
    - Routers can update in parallel and independently
  ▫ Effective
    - Minimize the disruption period to the failure detection time
  ▫ Efficient
    - Suitable for hardware implementation
Comparison with Related Work
Mechanism                                      Simple  Effective  Efficient
Convergence-free Routing [Lakshminarayanan07]  ×       √          ×
Consensus Routing [John08]                     ×       √          √
Ordered FIB Update [Francois07]                ×       ×          √
Fast Rerouting [Shand08]                       √       ×          √
SafeGuard                                      √       √          √
Key Idea
• Insight: a path's cost encodes much valuable information in a concise form
• Use cost as a safeguard
  1. Packets carry path costs to detect network changes
  2. Routers use path costs to identify safe alternative paths
[Figure: an IP packet carrying Src, Cost, and Dst fields]
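Overview
[Figure: packets destined to KA carry cost labels as they cross the network; packet labels shown: "SV KA 639", "SV KA 3161", "SV KA 2795". Loop-free!]
Forwarding Table:
Dst  Nhop  Cost
KA   SV    4456
…    …     …
Forwarding Table:
Dst  Nhop  Cost
KA   DV    1934
…    …     …
Alternative Paths DB:
Dst  Nhop  Cost
KA   LA    3161
…    …     …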
Challenges
• How to encode the costs such that alternative paths can be uniquely identified?
  ▫ Loops may occur if wrong paths are chosen
• How to obtain the alternative paths prior to network changes?
• How to forward packets efficiently?
[Figure: four-node topology (A, B, C, D) with all link costs equal to 1; a packet carrying cost 2 is forwarded along the wrong path. Loop!]
Enhance the Costs with Random Noises
• Append a fixed-length noise to each link cost
• An enhanced path cost is the sum of enhanced link costs
• Regular costs and noises are added separately
• Different paths will have different enhanced path costs with high probability in practical scenarios
[Figure: an enhanced link cost is the regular link cost (…0110111000) followed by a 10-bit random noise (1001101011)]
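
The cost arithmetic described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names EnhancedCost, enhance, and path_cost are hypothetical, and wrapping the noise sum at 10 bits is an assumption (the slide only says that regular costs and noises are added separately).

```python
import random
from dataclasses import dataclass

NOISE_BITS = 10  # noise width shown on the slide


@dataclass(frozen=True)
class EnhancedCost:
    cost: int   # regular cost
    noise: int  # random noise

    def __add__(self, other):
        # Regular costs and noises are added separately.
        # Assumption: the noise sum wraps around at 2**NOISE_BITS.
        return EnhancedCost(self.cost + other.cost,
                            (self.noise + other.noise) % (1 << NOISE_BITS))


def enhance(link_cost, rng=random):
    """Attach a fresh random noise to a regular link cost."""
    return EnhancedCost(link_cost, rng.randrange(1 << NOISE_BITS))


def path_cost(link_costs):
    """Enhanced path cost = sum of the enhanced link costs on the path."""
    total = EnhancedCost(0, 0)
    for link in link_costs:
        total = total + link
    return total


# Two equal-cost paths with the link values used later in the deck:
via_b = path_cost([EnhancedCost(1, 3), EnhancedCost(1, 4)])
via_c = path_cost([EnhancedCost(1, 7), EnhancedCost(1, 8)])
```

Both paths have regular cost 2, but their noises differ (7 versus 15), so the enhanced costs tell them apart.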
Encode Costs in IP Packets
• 32-bit label to encode an enhanced path cost
• An extra escort bit to denote whether the packet is under protection
• Potential places to store the cost label and the escort bit
  ▫ MPLS label
  ▫ IP Option
  ▫ Overload unused header fields
[Figure: an IP packet (src, dst) carrying a 32-bit cost label that includes the 10-bit random noise (…0110111000 | 1001101011), plus an escort bit set to 0]
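
One possible bit layout for the 32-bit label is sketched below. The slide gives only the label width and the noise width, so placing the regular cost in the upper 22 bits and the noise in the lower 10 bits is an assumption, and pack_label / unpack_label are hypothetical helper names. The escort bit is kept outside the label here.

```python
LABEL_BITS = 32
NOISE_BITS = 10
COST_BITS = LABEL_BITS - NOISE_BITS  # 22 bits left for the regular cost


def pack_label(cost, noise):
    """Pack (cost, noise) into one 32-bit integer: cost in the high bits."""
    if not 0 <= cost < (1 << COST_BITS):
        raise ValueError("regular cost does not fit in 22 bits")
    if not 0 <= noise < (1 << NOISE_BITS):
        raise ValueError("noise does not fit in 10 bits")
    return (cost << NOISE_BITS) | noise


def unpack_label(label):
    """Split a 32-bit label back into (cost, noise)."""
    return label >> NOISE_BITS, label & ((1 << NOISE_BITS) - 1)


label = pack_label(2, 7)   # 2 * 1024 + 7
```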
Pre-compute Alternative Paths
• Compute the shortest path after removing each single component
  ▫ Single component: a link, node, or SRLG
• Stored in the Alternative Path Database (APD)
  ▫ A mapping table from (dst, enhanced cost) to nexthop
• Update APD after each topology change
  ▫ Background computation
[Figure: four-node topology (A, B, C, D) with enhanced link costs (1,3), (1,4), (1,7), (1,8), and an APD entry reading "C, (2,7), B"]
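
The pre-computation can be sketched as one shortest-path run per removed component. The sketch below handles single link failures only (not nodes or SRLGs), breaks ties among equal-cost paths arbitrarily, and wraps noise sums at 10 bits; all three are simplifying assumptions. The function names are hypothetical, and the assignment of the figure's four link costs to specific links is also an assumption.

```python
import heapq

NOISE_MOD = 1 << 10  # assumption: 10-bit noise sums wrap around


def shortest_paths(links, src):
    """Dijkstra on regular cost. Returns {dst: (cost, noise, first_hop)}."""
    adj = {}
    for (u, v), (cost, noise) in links.items():
        adj.setdefault(u, []).append((v, cost, noise))
        adj.setdefault(v, []).append((u, cost, noise))
    done = {}
    heap = [(0, 0, src, "")]  # (cost, noise so far, node, first hop)
    while heap:
        cost, noise, node, hop = heapq.heappop(heap)
        if node in done:
            continue
        done[node] = (cost, noise % NOISE_MOD, hop)
        for nxt, c, n in adj.get(node, []):
            if nxt not in done:
                heapq.heappush(heap, (cost + c, noise + n, nxt, hop or nxt))
    del done[src]
    return done


def build_apd(links, router):
    """Map (dst, (cost, noise)) -> nexthop for every single link failure."""
    default = shortest_paths(links, router)
    apd = {}
    for failed in links:
        remaining = {l: w for l, w in links.items() if l != failed}
        for dst, (cost, noise, hop) in shortest_paths(remaining, router).items():
            if default.get(dst) != (cost, noise, hop):
                apd[(dst, (cost, noise))] = hop
    return apd


# Undirected links with (regular cost, noise), values taken from the figure.
links = {("A", "B"): (1, 3), ("B", "D"): (1, 4),
         ("A", "C"): (1, 7), ("C", "D"): (1, 8)}
apd_at_a = build_apd(links, "A")
```

With this topology, router A ends up with three entries: reach D via C at cost (2, 15), reach B via C at cost (3, 19), and reach C via B at cost (3, 15).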
Add Costs to the Forwarding Table
• Add the enhanced shortest path cost for each destination
  ▫ Obtained from the normal shortest path computation with minimal overhead
• Also add the enhanced shortest path cost from each of the nexthops to the destination
  ▫ Used to update the outgoing packet costs
Destination  Nexthops  Path cost  Nexthop costs
D            B, C      (2, 7)     B: (1, 4), C: (1, 8)
…            …         …          …
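
As a worked check of the table row for destination D, the sketch below derives the enhanced cost of the path through each nexthop from the nexthop cost plus the local link cost. The local link costs (1, 3) toward B and (1, 7) toward C are assumptions read off the earlier figure, via_costs is a hypothetical helper, and the 10-bit noise wrap is also an assumption.

```python
NOISE_MOD = 1 << 10  # assumption: noise sums wrap at 10 bits


def via_costs(link_costs, nexthop_costs):
    """Enhanced cost of the path through each nexthop: link + nexthop cost."""
    result = {}
    for hop, (cost, noise) in nexthop_costs.items():
        link_cost, link_noise = link_costs[hop]
        result[hop] = (link_cost + cost, (link_noise + noise) % NOISE_MOD)
    return result


# Row for destination D as stored at the router in the table above.
nexthop_costs = {"B": (1, 4), "C": (1, 8)}
# Assumed enhanced costs of the router's own links to B and C.
link_costs = {"B": (1, 3), "C": (1, 7)}

paths_to_d = via_costs(link_costs, nexthop_costs)
```

Under these assumptions the path through B has enhanced cost (2, 7), which matches the "Path cost" column, while the equal-cost path through C has (2, 15). When a packet leaves toward C, its label is rewritten to C's own cost (1, 8).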
Forwarding Algorithm
• Forward by comparing both cost and destination
• Two modes of forwarding
  ▫ Normal mode (escort bit == 0)
    - Packets can be forwarded along any of the ECMPs
  ▫ Escort mode (escort bit == 1)
    - Packets are forwarded along the path uniquely identified by the packet cost
Normal Mode Forwarding
• Each router n_i only compares its own regular cost n_i.cost with the packet's regular cost pkt.cost
1. n_i.cost == pkt.cost
    - Forward to any default nexthop n_{i+1}
    - Update the packet cost using n_{i+1}'s enhanced cost
[Figure: n_i forwards an incoming packet (S, (cost, noise), D, escort bit 0) to n_{i+1}]
2. Higher Local Cost
• n_i.cost > pkt.cost
• Router is aware of a failure/unaware of a restoration
  ▫ Default shortest paths are still safe
    - Forward to any default nexthop n_{i+1}
    - Update the packet cost using n_{i+1}'s cost
    - Turn on the escort bit
[Figure: n_i forwards an incoming packet (S, (cost, noise), D, escort bit 0) to n_{i+1}]
3. Lower Local Cost
• n_i.cost < pkt.cost
• Router is unaware of a failure/aware of a restoration
  ▫ Default shortest path is no longer safe
    - Look up an alternative nexthop n'_{i+1} from the APD using the full packet cost (pkt.cost, pkt.noise)
    - Forward to the alternative nexthop
    - Turn on the escort bit
[Figure: n_i forwards an incoming packet (S, (cost, noise), D, escort bit 0) to the alternative nexthop n'_{i+1}]
Escort Mode Forwarding
• Always try to find a path with the exact enhanced packet cost
  ▫ Default shortest path
  ▫ Alternative path through APD lookup
• If not found, drop the packet
  ▫ To prevent loops in case multiple concurrent failures happen
[Figure: n_i forwards an incoming packet (S, (cost, noise), D, escort bit 1) to n'_{i+1}]
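
The three normal-mode cases and the escort mode can be put together in one forwarding function. This is a sketch under stated assumptions, not the authors' code: all names are hypothetical; "any default nexthop" is simplified to the first one; the label written after an APD hit is assumed to be the packet label minus the local link cost; a normal-mode packet with no APD match is assumed to be dropped; and noise arithmetic is assumed to wrap at 10 bits.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

NOISE_MOD = 1 << 10
Cost = Tuple[int, int]  # (regular cost, noise)


def add(a: Cost, b: Cost) -> Cost:
    return (a[0] + b[0], (a[1] + b[1]) % NOISE_MOD)


def sub(a: Cost, b: Cost) -> Cost:
    return (a[0] - b[0], (a[1] - b[1]) % NOISE_MOD)


@dataclass
class Packet:
    dst: str
    label: Cost
    escort: bool = False


@dataclass
class FibEntry:
    path_cost: Cost                 # enhanced shortest path cost to dst
    nexthop_costs: Dict[str, Cost]  # enhanced cost from each nexthop to dst


@dataclass
class Router:
    links: Dict[str, Cost]          # neighbor -> enhanced link cost
    fib: Dict[str, FibEntry]
    apd: Dict[Tuple[str, Cost], str] = field(default_factory=dict)


def forward(r: Router, pkt: Packet) -> Optional[str]:
    """Return the chosen nexthop (updating pkt in place), or None to drop."""
    entry = r.fib.get(pkt.dst)
    if entry is None:
        return None
    if not pkt.escort and entry.path_cost[0] >= pkt.label[0]:
        # Cases 1 and 2: default paths are safe; escort only if local cost is higher.
        hop, rest = next(iter(entry.nexthop_costs.items()))
        pkt.escort = entry.path_cost[0] > pkt.label[0]
        pkt.label = rest
        return hop
    if pkt.escort:
        # Escort mode: a default path is usable only on an exact enhanced-cost match.
        for hop, rest in entry.nexthop_costs.items():
            if add(r.links[hop], rest) == pkt.label:
                pkt.label = rest
                return hop
    # Case 3, or escort mode without a default match: exact APD lookup.
    hop = r.apd.get((pkt.dst, pkt.label))
    if hop is None:
        return None  # no path with this exact cost: drop to prevent loops
    pkt.label = sub(pkt.label, r.links[hop])
    pkt.escort = True
    return hop


# Router A from the four-node example; the APD entry is made up for illustration.
router_a = Router(
    links={"B": (1, 3), "C": (1, 7)},
    fib={"D": FibEntry((2, 7), {"B": (1, 4), "C": (1, 8)})},
    apd={("D", (4, 30)): "C"},
)
```

A packet to D labeled (2, 7) leaves via B in normal mode with label (1, 4), while an escorted packet labeled (2, 15) is pinned to the path through C.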
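Loop-free Forwarding with ECMPs
[Figure: four-node topology (A, B, C, D) with enhanced link costs (1,3), (1,4), (1,7), (1,8); a packet carrying enhanced cost (2,7) is forwarded toward its destination without looping. Loop-free!]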
SafeGuard Properties
• When the network is steady, packets can reach their destinations through any of the ECMPs
• After one network element changes its status, a packet can still reach its destination
  ▫ Not considering failure detection time
  ▫ Assuming enhanced costs are distinct
• A packet will not be trapped in a loop without being discarded
Evaluation
• Router performance
  ▫ A prototype using NetFPGA and Quagga
    - Forwarding overhead is only 48 ns
    - Practical memory and computational overhead
  ▫ Suitable for hardware implementation
  ▫ Details are in the paper
• Network performance
  ▫ Event-driven simulations under realistic settings
  ▫ Comparison with vanilla IP forwarding and a state-of-the-art IP fast restoration technique
SafeGuard Forwarding is Loop-Free
[Figure: two CDF panels (single link failure; two links failure) on the Sprint topology from Rocketfuel. x-axis: Flow Amplifying Factor (0 to 60); y-axis: Cumulative Fraction (0 to 1). Curves: SafeGuard + OSPF, IP Fast Restoration, Vanilla IP + OSPF.]
SafeGuard Minimizes Disruption
[Figure: packet loss rate over time for a single link failure. x-axis: Time (s), 0 to 2.0; y-axis: Packet Loss Rate (0 to 1). Markers: failure happens; failure detected (~200 ms). Curves: SafeGuard + OSPF, IP Fast Restoration, Vanilla IP + OSPF.]
SafeGuard Does Not Delay Convergence
[Figure: bar chart of Convergence Time (s), 0 to 2, for five events: link down, link up, node down, node up, 2 links down. Series: SafeGuard/Vanilla IP + OSPF, IP Fast Restoration.]
Conclusion
• SafeGuard
  ▫ Minimizes disruption after network changes
  ▫ Does not modify routing convergence
• Use costs as path hints
  ▫ Detect network changes
  ▫ Identify alternative paths
• Simple, effective, and efficient
Noise Collision
• Suppose c paths have the same normal cost and each noise has k bits; the collision probability is the birthday probability
  ▫ Try a different noise if a collision exists
Collision probability for c = 5:
k (bits)  Collision
10        0.0097
16        0.00015
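
The table values can be reproduced with the standard birthday computation. The sketch below assumes the c noises are independent and uniform over 2^k values; collision_probability is a hypothetical helper name.

```python
def collision_probability(c, k):
    """Probability that at least two of c uniform k-bit noises coincide."""
    n = 1 << k
    p_all_distinct = 1.0
    for i in range(c):
        p_all_distinct *= (n - i) / n
    return 1.0 - p_all_distinct


p10 = collision_probability(5, 10)  # about 0.0097
p16 = collision_probability(5, 16)  # about 0.00015
```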
Simulation Parameters
Simulation Settings
• Realistic topologies and link costs
  ▫ Real and inferred topologies from Rocketfuel
  ▫ A random topology with asymmetrical link costs
• Practical OSPF configuration
  ▫ Achieve fast convergence (sub-second)
• Comparisons
  ▫ No protection: OSPF + vanilla IP forwarding
  ▫ State-of-the-art: Ordered FIB update (oFIB) + NotVia
Practical Memory and Computation Overhead
• Protect all single link and node failures
• On average |APD| < 3 * |FIB|
• APD computation time < 100 ms
Topology  # of FIB Entries  # of APD Entries  APD Computation Time (ms)
Abilene   11                17.3              0.165
Sprint    315               777               79.4
Random    100               276.1             6.2
SafeGuard Forwarding is Loop-Free
100 tests with the Sprint topology; loop durations are in ms
Protocol  Update Type        # of Tests Containing Loops  Total # of Micro-loops  Avg   Max   Min   Stddev
OSPF      Link Failure       19                           81                      12.5  44.6  0.32  15.4
OSPF      Node Failure       17                           125                     11.5  26.3  0.10  26.7
OSPF      Link Up            4                            7                       11.9  40.7  0.80  21.8
OSPF      Node Up            11                           20                      6.32  24.8  0.19  6.45
OSPF      Two Link Failures  38                           144                     9.0   39.7  0.39  11.2
oFIB      Two Link Failures  36                           138                     8.8   41.2  0.18  10.8
SafeGuard Does Not Increase Convergence Time
[Figure: bar chart of Convergence Time (s), 0 to 2, for five events: link down, link up, node down, node up, 2 links down. Series: SafeGuard + OSPF, IP Fast Restoration.]