Towards Efficient Large-Scale VPN Monitoring and Diagnosis under Operational Constraints

Yao Zhao, Zhaosheng Zhu, Yan Chen, Northwestern University

Dan Pei, Jia Wang, AT&T Labs - Research



Outline

Motivation

Problem Definition

Monitor Setup
• Single-round monitoring
• Multi-round monitoring

Evaluation

Related Works

Conclusion

Motivation

[Figure: a VPN backbone with provider edge (PE) routers; customer edge (CE) routers attach the customer sites VPN1 Site1, VPN1 Site2, VPN2 Site1, and VPN2 Site2.]

Motivation

VPN performance monitoring
• Reliability
• Quality of service (SLA)

Approaches
• Passive measurements: SNMP-based monitoring
  – Fixed poll rate
  – Difficult to measure end-to-end path-level features (e.g. delay, bandwidth)
• Active measurements
  – Operational constraints, e.g. monitor, link, and path constraints

Problem Definition

[Figure: a VPN backbone with PE routers; CE routers attach the customer sites VPN1 Site1, VPN1 Site2, VPN2 Site1, and VPN2 Site2.]

Goal: continuously monitor and diagnose VPN performance under operational constraints
• Each monitor can measure ≤ c paths
• Each replier can reply on ≤ r paths
• Each link is on ≤ b measured paths
• Traffic isolation between VPNs

Challenges with operational constraints
• Optimization problem → constraint satisfaction problem
• All paths measured simultaneously?
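The three counting constraints above can be checked mechanically. The following is a minimal sketch, assuming each measured path is given as a hypothetical tuple `(monitor, replier, links)`; the function name `check_constraints` and the router and link labels are illustrative and do not come from the slides. Traffic isolation between VPNs is not modeled here.

```python
from collections import Counter


def check_constraints(paths, c, r, b):
    """Return a list of violations for a set of simultaneously measured paths.

    paths: list of (monitor, replier, links) tuples (assumed representation).
    c: max paths per monitor, r: max paths per replier, b: max paths per link.
    """
    per_monitor = Counter(monitor for monitor, _, _ in paths)
    per_replier = Counter(replier for _, replier, _ in paths)
    per_link = Counter(link for _, _, links in paths for link in links)

    violations = []
    violations += [f"monitor {m}: {n} > {c}" for m, n in per_monitor.items() if n > c]
    violations += [f"replier {x}: {n} > {r}" for x, n in per_replier.items() if n > r]
    violations += [f"link {l}: {n} > {b}" for l, n in per_link.items() if n > b]
    return violations


# Two paths from the same monitor share the first link.
paths = [
    ("PE1", "PE2", ["PE1-P1", "P1-PE2"]),
    ("PE1", "PE3", ["PE1-P1", "P1-PE3"]),
]
print(check_constraints(paths, c=2, r=1, b=1))
```

With `b=1` the shared link is reported as overloaded, which illustrates why measuring all paths simultaneously may not be feasible.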

VScope System Architecture

Two phases:
• VScope Setup
• VScope Operation: monitoring + diagnosis

Provides a smooth tradeoff between measurement frequency and monitor deployment and management costs

Two Phases

Monitor setup phase
• From certain monitor candidates, how to select a minimal number of monitors, which in the measurement phase can measure a selected set of paths that covers all links in the network under the given measurement constraints?
• NP-hard even without considering constraints

Monitoring and fault diagnosis phase
• When faulty paths are discovered in the path monitoring phase, how to quickly select some paths under the operational constraints to be further measured so that the faulty link(s) can be accurately identified?



Monitoring Strategies

[Figure: two timelines over time t for paths P1 to P6, one panel labeled "Single-Round Monitoring" and one labeled "Multi-Round Monitoring", with round markers "Round 1" and "Round 2".]

Multi-Round Monitoring

Pros
• Relax tight constraints
• Reduce number of monitors

Cons
• Lower monitoring frequency

Monitor Selection Algorithm
• Consider R rounds of back-to-back measurements
• Step 1: convert the multi-round monitor selection problem to a single-round problem and solve the single-round monitor selection problem
  – Relax monitor & link bandwidth constraints by a factor of R
• Step 2: schedule the paths measured in R rounds
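Step 1 can be pictured as a simple scaling of the per-round limits. This is a minimal sketch under stated assumptions: the helper name `relax_for_rounds`, the link identifiers, and the example limits are hypothetical, and the replier limit is left unscaled because the slide names only the monitor and link bandwidth constraints.

```python
def relax_for_rounds(monitor_limit, link_limits, rounds):
    """Scale the per-round monitor and link limits by the number of rounds R."""
    if rounds < 1:
        raise ValueError("rounds must be >= 1")
    relaxed_links = {link: limit * rounds for link, limit in link_limits.items()}
    return monitor_limit * rounds, relaxed_links


# With R = 3, a monitor that may measure 12 paths per round may own 36 paths
# in total; the scheduling step must then spread them over the 3 rounds.
c_relaxed, b_relaxed = relax_for_rounds(12, {"e1": 9, "e2": 62500}, rounds=3)
print(c_relaxed, b_relaxed)
```

The relaxed problem is then handed to the single-round solver, and Step 2 assigns each selected path to one of the R rounds.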

Single-Round Monitor Selection

Monitor Selection Problem
• Related to the Minimum Set Cover problem
• NP-hard without constraints [Bejerano, Infocom03]

Pure Greedy Algorithm
• Simple and locally optimized

Greedy-Assisted Integer Linear Programming (ILP) based algorithm
• Linear programming is good at dealing with constraints
• ILP is NP-hard
• Need to relax ILP to LP

Pure Greedy Algorithm

Two-level nested Minimum Set Cover problem and Maximum Coverage problem
• Iteratively select a candidate router as a new monitor that can measure paths covering the maximum number of links still uncovered before the selection
• Computing the maximum gain of adding a router as a monitor is a variant of the Maximum Coverage problem (also NP-hard)
  – Iteratively select a path of the router that
    • will not violate the link bandwidth constraints, and
    • covers the maximum number of links still uncovered before the selection
  – Until the number of selected paths reaches the monitor's constraint
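The two nested greedy loops described above can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: paths are assumed to be tuples of link identifiers, every link is assumed to share one budget `b`, the replier constraint is omitted for brevity, and all function and variable names are hypothetical.

```python
def best_paths_for(router, paths_from, uncovered, link_load, b, c):
    """Inner greedy: pick up to c paths of `router` without exceeding budget b."""
    chosen, gained = [], set()
    load = dict(link_load)  # trial copy; the caller's loads are not modified
    while len(chosen) < c:
        remaining = uncovered - gained
        feasible = [p for p in paths_from[router]
                    if p not in chosen and all(load.get(l, 0) < b for l in p)]
        best = max(feasible, key=lambda p: len(set(p) & remaining), default=None)
        if best is None or not set(best) & remaining:
            break  # no feasible path adds coverage
        chosen.append(best)
        gained |= set(best) & remaining
        for l in best:
            load[l] = load.get(l, 0) + 1
    return chosen, gained


def greedy_select(paths_from, links, b, c):
    """Outer greedy: repeatedly add the router with the largest coverage gain."""
    uncovered, link_load, monitors = set(links), {}, {}
    while uncovered:
        options = {r: best_paths_for(r, paths_from, uncovered, link_load, b, c)
                   for r in paths_from if r not in monitors}
        if not options:
            break
        router = max(options, key=lambda r: len(options[r][1]))
        chosen, gained = options[router]
        if not gained:
            break  # the remaining links cannot be covered under the constraints
        monitors[router] = chosen
        uncovered -= gained
        for p in chosen:
            for l in p:
                link_load[l] = link_load.get(l, 0) + 1
    return monitors, uncovered


paths_from = {"A": [("ab", "bc"), ("ab",)], "C": [("bc", "cd")], "D": [("cd",)]}
monitors, left = greedy_select(paths_from, ["ab", "bc", "cd"], b=1, c=2)
print(monitors, left)
```

In the toy input, router C is skipped in the second iteration because its only path would push link `bc` over its budget, so D is selected instead.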

Integer Linear Programming

• Minimize number of monitors
• A path is monitored iff the source router is selected as monitor
• Monitor constraint
• Replier constraint
• A link is covered if at least one path containing the link is selected
• Link bandwidth constraint

It is NP-hard!
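The equations that these labels annotated are not present in the transcript. One plausible formulation consistent with the labels is sketched below; the symbols are assumptions introduced for illustration: $y_i \in \{0,1\}$ marks router $i$ as a monitor, $x_p \in \{0,1\}$ marks path $p$ as measured, $s(p)$ and $d(p)$ are the source and replier of path $p$, and $c$, $r$, $b_\ell$ are the monitor, replier, and per-link limits. The coupling constraint is written in the "only if" direction, since a selected monitor need not measure all of its paths.

```latex
\begin{align*}
\min \quad & \sum_{i} y_i
  && \text{(number of monitors)} \\
\text{s.t.} \quad
  & x_p \le y_{s(p)} && \forall p
  && \text{(path measured only if its source is a monitor)} \\
  & \sum_{p \,:\, s(p) = i} x_p \le c && \forall i
  && \text{(monitor constraint)} \\
  & \sum_{p \,:\, d(p) = j} x_p \le r && \forall j
  && \text{(replier constraint)} \\
  & \sum_{p \,\ni\, \ell} x_p \ge 1 && \forall \ell
  && \text{(every link covered)} \\
  & \sum_{p \,\ni\, \ell} x_p \le b_\ell && \forall \ell
  && \text{(link bandwidth constraint)} \\
  & x_p,\, y_i \in \{0, 1\}
\end{align*}
```

Replacing the last line with $0 \le x_p, y_i \le 1$ gives the linear programming relaxation.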

Relaxation with Random Rounding

Relax the Integer Linear Programming to Linear Programming
• Suppose the solution of the linear program is $x_i^*$, $y_i^*$
• Rounding rule:
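The rounding rule shown on the slide is not present in the transcript. As an illustration only, the standard randomized rounding scheme sets each 0/1 variable to 1 with probability equal to its fractional LP value; the sketch below assumes that scheme, and the function name and variable keys are hypothetical.

```python
import random


def randomized_round(fractional, rng=None):
    """Round each fractional value v in [0, 1] to 1 with probability v, else 0."""
    rng = rng or random.Random()
    return {name: 1 if rng.random() < value else 0
            for name, value in fractional.items()}


# Values of exactly 0.0 and 1.0 are always preserved; 0.5 is a coin flip.
lp_solution = {"y1": 1.0, "y2": 0.0, "y3": 0.5}
rounded = randomized_round(lp_solution, rng=random.Random(42))
print(rounded)
```

A rounded solution may leave some links uncovered or overshoot a limit, so it generally needs a repair step afterwards.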

Greedy Assisted Linear Programming

Use Linear Programming to select a set of monitors and corresponding measurement paths
• Not all links are covered

Use greedy algorithm to cover uncovered links
• Similar to the pure greedy algorithm
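The repair step can be sketched as a greedy pass over the links that the LP-derived selection leaves uncovered. This is a simplified illustration: paths are assumed to be tuples of link identifiers, the monitor, replier, and link limits are deliberately not re-checked, and `patch_uncovered` is a hypothetical name.

```python
def patch_uncovered(selected, candidates, links):
    """Greedily add candidate paths until no candidate covers a new link."""
    covered = {l for p in selected for l in p}
    uncovered = set(links) - covered
    pool = [p for p in candidates if p not in selected]
    added = []
    while uncovered:
        best = max(pool, key=lambda p: len(set(p) & uncovered), default=None)
        if best is None or not set(best) & uncovered:
            break  # nothing left in the pool helps
        added.append(best)
        pool.remove(best)
        uncovered -= set(best)
    return added, uncovered


selected = [("a", "b")]                      # paths kept after LP rounding
candidates = [("a", "b"), ("c",), ("c", "d"), ("e",)]
added, still_uncovered = patch_uncovered(selected, candidates, ["a", "b", "c", "d"])
print(added, still_uncovered)
```

A fuller version would reuse the feasibility checks of the pure greedy algorithm before accepting each path.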

Multi-Round Path Scheduling

NP-hard
• The minimum graph coloring problem can be reduced to the path scheduling problem

Three algorithms
• Random algorithm
  – Randomly schedule paths independently
  – Run the random algorithm multiple times and keep the best schedule
• Greedy algorithm
  – Minimize link utilization in every step
• LP-based randomization algorithm
  – ILP + relaxation and random rounding

Optimization metrics
• Maximum link violation degree (MLVD)
• Average link violation degree (ALVD)
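The random algorithm and the two metrics can be sketched together. The slides do not define the link violation degree, so the definition used here is an assumption: for each link, the worst relative overload across rounds, `max(0, load - budget) / budget`. MLVD and ALVD are then the maximum and mean of that value over all links. All names are hypothetical.

```python
import random


def violation_degrees(schedule, paths, budget, rounds):
    """Per-link violation degree under the assumed definition above."""
    load = [{} for _ in range(rounds)]
    for pid, rnd in schedule.items():
        for link in paths[pid]:
            load[rnd][link] = load[rnd].get(link, 0) + 1
    return {link: max(max(0, per_round.get(link, 0) - cap) / cap
                      for per_round in load)
            for link, cap in budget.items()}


def random_schedule(paths, budget, rounds, tries=100, seed=0):
    """Assign each path to a random round; keep the best of `tries` attempts."""
    rng = random.Random(seed)
    best, best_score = None, None
    for _ in range(tries):
        schedule = {pid: rng.randrange(rounds) for pid in paths}
        degrees = violation_degrees(schedule, paths, budget, rounds)
        score = (max(degrees.values()), sum(degrees.values()) / len(degrees))
        if best_score is None or score < best_score:  # (MLVD, ALVD) order
            best, best_score = schedule, score
    return best, best_score


paths = {"p1": ["e"], "p2": ["e"]}   # two paths sharing one link
budget = {"e": 1}                    # the link tolerates one path per round
schedule, (mlvd, alvd) = random_schedule(paths, budget, rounds=2)
print(schedule, mlvd, alvd)
```

Comparing schedules by the tuple `(MLVD, ALVD)` is one possible tie-breaking choice; the slides list both metrics without stating how they are combined.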


Evaluation

Topologies
• Synthetic topologies generated by BRITE
• Real topologies from a tier-1 ISP: one IP backbone topology (IP-EX), one VPN backbone topology (VB), and two VPN infrastructure topologies (V1-EX, V2-EX)
  – Scale from hundreds of nodes to hundreds of thousands of nodes
  – Heterogeneous real link bandwidths (1.54 Mbps to 10 Gbps)

Operational constraints
• From the ISP management team
  – E.g. percentage of link bandwidth allowed for probing: 1%

Evaluation metrics
• Percentage of monitors selected
• Maximum (average) link violation degree after scheduling
• Running speed

Experimental setup

Default configuration
• Monitor constraint = 12
• Replier constraint = 24
• Probing rate per path = 4 pkt/sec
• Measurement bandwidth consumed per path = 1.6 Kbps
• Link constraint = 1% × (link capacity)
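The default configuration implies a per-link limit on the number of simultaneously measured paths. The arithmetic below is a worked sketch that assumes 1 Kbps = 1,000 bit/s and uses the two extreme link capacities quoted for the real topologies (1.54 Mbps and 10 Gbps); the function names are hypothetical.

```python
def bits_per_probe(path_bps, pkts_per_sec):
    """Average probe size implied by the per-path rate and bandwidth."""
    return path_bps / pkts_per_sec


def max_paths_per_link(capacity_bps, percent, path_bps):
    """Whole number of probe paths that fit in `percent` of the link capacity."""
    return (capacity_bps * percent // 100) // path_bps


PATH_BPS = 1_600  # 1.6 Kbps per measured path

print(bits_per_probe(PATH_BPS, 4))                      # 400 bits, i.e. 50 bytes
print(max_paths_per_link(1_540_000, 1, PATH_BPS))       # slowest link
print(max_paths_per_link(10_000_000_000, 1, PATH_BPS))  # fastest link
```

Under these assumptions a 1.54 Mbps link can carry only 9 probe paths at once while a 10 Gbps link can carry 62,500, so the slowest links dominate the link constraint.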

Baseline Monitor Selection Results (VB Topology)

LP+Greedy selects fewer monitors.

[Figure: two panels, "Vary Link Constraint" and "Vary Monitor Constraint".]

Multi-Round Monitor Selection Results (V1-EX Topology)

More rounds require fewer monitors, with diminishing returns.

[Figure: two panels, "Vary Link Constraint" and "Vary Monitor Constraint".]

Multi-Round Monitor Selection Results (V1-EX Topology)

LP > Random > Greedy.

[Figure: two panels, "Link violation degree" and "Percentage of links with violation", each comparing Greedy, Random, and LP.]

Related Work

Path Selection
• The monitoring problem is not considered or is too simple
• Complex path selection goal (basis, SVD, Bayesian experimental design)

Monitoring Placement
• Active Monitoring Systems
  – Similar problem without operational constraints [Bejerano, Infocom03]
  – Robustness consideration
• Passive Monitoring Systems
  – SNMP polling
  – Traffic sampling

Conclusions

VScope for continuous monitoring & diagnosis
• Considers operational constraints
• Designs multi-round monitor selection algorithms
  – Single-round monitor selection
  – Monitoring path scheduling
• Evaluated with synthetic and real topologies
  – Our algorithms are efficient in minimizing the number of monitors with low constraint violation

Q & A?

Thanks!