Tools, Algorithms & System Implementation for End-user performance monitoring
Dario Rossi [email protected] http://www.enst.fr/~drossi
Agenda
• Tools, algorithms
• System implementation
• End-user performance monitoring
• Two perspectives:
– Background (all available from my webpage)
– Foreground (open for collaboration)
Background
Tools, Algorithms
• Classification (C4.5, SVM, ...)
• Regression (ARMA, SVR, ...)
• Statistical analysis (PCA, ANOVA, ...)
• Inference (Apriori, ...)
Applied to:
• Traffic analysis & classification
System implementation
• Tstat
– Passive flow-level sniffer, classifier, traffic analyzer
• ModelNet-TE
– Packet-level emulator with Traffic Engineering capabilities
• Demonstration software
– At Sigcomm, Sigmetrics, Infocom, Globecom
• All available from SOFTWARE and DEMO categories at http://www.enst.fr/~drossi
End-user performance monitoring
• Web
– Methodology to infer, from TCP traffic, whether a Web connection has been interrupted
• P2P-VoIP
– In-depth black-box study of Skype
• P2P-TV systems
– Assessment of peer selection strategies
• More at http://www.enst.fr/~drossi/index.php?n=Main.PublicationsByTopic
Example: traffic classification
• Deep Packet Inspection (DPI)
– Specific keyword, application syntax (e.g., GET, MAIL FROM:, BT)
• Stochastic Packet Inspection (KISS)
– Algorithm design: entropy of L7 header, Chi-square test
• Behavior analysis (Abacus)
– Algorithm design: contact “weights” CDF, Bhattacharyya distance
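The keyword-matching idea behind DPI can be illustrated with a minimal sketch. The signature table below is hypothetical: it only reuses the keywords shown on the slide, the labels are illustrative, and `BT` is the slide's shorthand rather than the actual BitTorrent handshake. Real DPI engines rely on far richer signatures.

```python
# Hypothetical keyword table (illustrative only, taken from the slide's examples).
SIGNATURES = {
    b"GET ": "HTTP",
    b"MAIL FROM:": "SMTP",
    b"BT": "BitTorrent (slide shorthand)",
}


def classify_payload(payload: bytes) -> str:
    """Return the label of the first keyword that prefixes the payload."""
    for keyword, label in SIGNATURES.items():
        if payload.startswith(keyword):
            return label
    return "unknown"


print(classify_payload(b"GET /index.html HTTP/1.1\r\n"))
print(classify_payload(b"\x17\x03\x01\x00"))
```

Such a classifier is deterministic and only needs the first packets of a flow, but it fails as soon as the payload is encrypted or obfuscated, which is what motivates the stochastic and behavioral approaches.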
KISS vs Abacus algorithms
[Figure: signatures of PPLive, TVAnts and SopCast. KISS: normalized χ² (first 14 header bytes). Abacus: pdf of packets per sender peer (5 sec intervals).]
http://www.enst.fr/~drossi/index.php?n=Software.ClassificationDemo
System implementation
[Figure: traffic classes observed at ISP1 ... ISP5: HTTP, YouTube, BitTorrent, BitTorrent UDP, eMule, Other UDP, Other TCP, ...]
Foreground
Interests
• Very high-speed implementation (>10Gbps)
– Monitoring & classification
• Federation of passive measurement points
– Increase statistical relevance of measurement
– Challenging per se
• New measures: Workload for CDN/ICN
• New algorithms: Bufferbloat inference
• New tools: Map-Reduce for traffic analysis
System implementation (1/3)
• Wire-speed classification engines
Submitted to IMC’12
System implementation (2/3)
[Figure: passive measurement points ISP1, ISP2, ..., ISPn]
• Federation of passive measurement points
– Aim: coalesce RRD data to increase statistical relevance
– Incentive model: gain access to the aggregated data
– Implementation
  • Star topology: the root R fetches ISP1…ISPn, aggregates them into ISP* and redispatches it
  • Chain: ISP2 aggregates ISP1 and ISP2, passes the result to ISP3 and so on; the chain ends at R, which adds its own data to ISP* and sends it back
  • P2P: structured vs unstructured? e.g., BitTorrent only to redispatch ISP*?
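A minimal sketch of the star and chain options, under the assumption that aggregation is a plain sum of per-class counters. The ISP names, traffic classes and values below are made up for illustration, and nothing here reflects how RRD data is actually merged.

```python
from collections import Counter
from functools import reduce

# Hypothetical per-ISP counters (bytes per traffic class); values are made up.
isp_data = {
    "ISP1": Counter({"HTTP": 120, "BitTorrent": 80}),
    "ISP2": Counter({"HTTP": 200, "eMule": 30}),
    "ISP3": Counter({"BitTorrent": 50, "eMule": 10}),
}
root_data = Counter({"HTTP": 40})


def star(root, isps):
    """Root R fetches every ISP's data and sums it into ISP*."""
    total = Counter(root)
    for counters in isps.values():
        total += counters
    return total


def chain(root, isps):
    """Each ISP adds its data to the running aggregate; R closes the chain."""
    running = reduce(lambda acc, c: acc + c, isps.values(), Counter())
    return running + root


isp_star = star(root_data, isp_data)
isp_chain = chain(root_data, isp_data)
assert isp_star == isp_chain  # both topologies yield the same ISP*
print(dict(isp_star))
```

With a commutative aggregate such as a sum, both topologies produce the same ISP*; they differ in who carries the load (the root in the star, every hop in the chain) and in how a single failure affects the result.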
System implementation (3/3)
• Exploitation of (new) active measurement points
– Compare results between PlanetLab & e.g., BOINC
– BOINC http://boinc.berkeley.edu/
  • Aim: collaborative/volunteer computing
  • Used by: more than 295,000 locations worldwide
  • Incentive to provide PCs: being in the top-100
  • Unexplored for network resources
End-user performance monitoring (1/2)
• Bufferbloat: large buffer size (≥128KB) + narrow bandwidth (≤1Mbps) = queueing delay (≥1 sec)
• Accurate passive method to measure the queue size of remote peers
• Integration in Dasu (BitTorrent plugin) to crowdsource ISP characterization?
Submitted to IMC’12
Bufferbloat!
TCP AIMD fills the buffer!
Nasty impact on interactive Web, VoIP, gaming traffic
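The arithmetic behind the slide's figures can be checked with a short worked example. It assumes 1 KB = 1024 bytes and 1 Mbps = 10^6 bit/s, and models the delay simply as the time needed to drain a full buffer at the bottleneck rate.

```python
def queueing_delay_s(buffer_bytes: int, bandwidth_bps: float) -> float:
    """Time to drain a full buffer at the bottleneck rate, in seconds."""
    return buffer_bytes * 8 / bandwidth_bps


# 128 KB buffer in front of a 1 Mbps uplink (assumed unit conventions above).
delay = queueing_delay_s(128 * 1024, 1_000_000)
print(f"{delay:.2f} s")  # prints "1.05 s"
```

A larger buffer or a narrower link only increases this figure, which is why the slide states the delay as a lower bound.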
End-user performance monitoring (2/2)
• Workload for CDN/ICN
– Goal: assess the relevance of in-network caching
– Need: a relevant large-scale workload
• Challenges
– Cannot use Tier-1 backbone traces
  • Current destination server IPs map to CDN nodes
– Cannot use DNS
  • Caching => at the root, malformed queries outnumber legitimate ones; frequencies are available at stub resolvers, but it is impossible to get contemporary logs from many (>1000) of them
– Cannot use HTTP
  • Not everything is tunneled in HTTP; still, would need the payload of a Tier-1 backbone, with a large snaplen to get the full URLs
– Solution? In progress (=none so far)
Backup slides
Traffic Classification Taxonomy
Approach | Subcategory | Granularity | Timeliness | Complexity | Comment
Payload Based | Deep Packet Inspection (DPI) [1,2] | Fine-grained, individual applications | Early (first few packets) | Access to packet payload of first few packets; moderate cost | Deterministic technique
Payload Based | Stochastic Packet Inspection, KISS [ToN’10] | Fine-grained, individual applications | Online (windows of 100s of packets) | Access to packet payload of several packets; high cost | Robust technique
Statistical Analysis | [4,5,6,7] | Coarse-grained, class of application | Late (after the flow end) | Access to flow-level information; lightweight cost | Post-mortem analysis
Statistical Analysis | [8,9] | Fine-grained, individual applications | Early (first 5 packets) | Access to first few packets; lightweight cost | On-the-fly classification
Behavioral Analysis | [10,11] | Coarse-grained, class of application | Late (after the flow end) | Lightweight | Post-mortem analysis
Behavioral Analysis | Abacus [ComNet’11] | Fine-grained, individual P2P applications | Online (1-5 second windows) | Lightweight | Online classification; limited to P2P
KISS: Stochastic packet inspection
Header syntax is fixed, binary alphabet
Example packets (payload bytes in hex):
Y1 pkt1 cb d2 ... 02 60
Y1 pkt2 cc d5 ... 02 08
Y2 pkt1 01 da ... 02 65
Y1 pkt3 cd c0 ... 02 d9
Y2 pkt2 02 c1 ... 02 5c
Y2 pkt3 03 dc ... 02 11
Y1 pkt4 ce cb ... 02 28
Y1 pkt5 cf d1 ... 02 8a
Y1 pkt6 d0 ca ... 02 3a
Y2 pkt4 04 c2 ... 02 b7
[Figure labels: counters; C||D = 3 bit; fixed; random; deterministic]
1) Extract the first N bytes of the payload from a window of W consecutive packets
2) Divide each byte into 2 chunks of 4 bits
3) Collect the frequency distribution O_i of the values assumed by each chunk
4) Compare the distribution to a uniform distribution E_i = W/2^4 with a χ²-like test, to measure the randomness of each chunk g:
$X_g = \sum_{i=0}^{2^4-1} \frac{(O_i^g - E_i^g)^2}{E_i^g} \sim \chi^2_{2^4-1}$, with $E_i^g = W/2^4$
KISS signature: [X_1, X_2, ..., X_2N] over W pkts
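The KISS steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and the toy window are hypothetical, every payload is assumed to hold at least `n_bytes` bytes, and the expected count is taken as W/2^4 (uniform over the 16 values of a 4-bit chunk).

```python
from collections import Counter


def kiss_signature(payloads, n_bytes):
    """Chi-square-like statistic for each 4-bit chunk over a window of payloads."""
    w = len(payloads)
    expected = w / 16  # uniform distribution: E_i = W / 2^4
    signature = []
    for pos in range(n_bytes):
        for shift in (4, 0):  # high nibble first, then low nibble
            observed = Counter((p[pos] >> shift) & 0x0F for p in payloads)
            x = sum((observed.get(v, 0) - expected) ** 2 / expected
                    for v in range(16))
            signature.append(x)
    return signature


# Toy window of W = 16 packets: byte 0 is constant, byte 1 spans all nibble values.
window = [bytes([0x02, (i << 4) | i]) for i in range(16)]
sig = kiss_signature(window, n_bytes=2)
print(sig)  # [240.0, 240.0, 0.0, 0.0]
```

A constant chunk reaches the maximum value W·(2^4 − 1) = 240, while a chunk that takes every value equally often scores 0, so the vector of 2N statistics captures which parts of the header are fixed, counter-like or random.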
Abacus: Behavioral signatures
Applications implement different activities (signaling, data chunks) and tuning (chunk size)
1) Count the number of packets/bytes received in a fixed time window ΔT
2) Count the number of hosts sending a given number of packets/bytes (exponential binning)
3) Normalize the packet-wise and byte-wise counts to obtain two probability mass functions
[Figure: host X receives traffic from peers Y1 ... Y5; histogram of frequency over exponential bins 2, 4, 8, 16, ...]
Example using packets: Distribution = [1, 1, 3, 0], Signature = [0.2, 0.2, 0.6, 0]
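The three Abacus steps can be sketched for the packet-wise case. The bin edges are an assumption (bin 0 holds counts up to 2, then (2,4], (4,8], (8,16], with larger counts clamped into the last bin), and the per-peer packet counts are made up so as to reproduce the slide's example distribution.

```python
from collections import Counter


def abacus_signature(sender_per_packet, n_bins=4):
    """Packet-wise signature for one time window.

    sender_per_packet: one sender identifier per packet received in the window.
    """
    pkts_per_host = Counter(sender_per_packet)        # step 1: packets per sender
    distribution = [0] * n_bins
    for count in pkts_per_host.values():              # step 2: exponential binning
        idx = max((count - 1).bit_length() - 1, 0)    # 1-2 -> 0, 3-4 -> 1, 5-8 -> 2, ...
        distribution[min(idx, n_bins - 1)] += 1
    total = sum(distribution)
    if total == 0:
        return distribution, [0.0] * n_bins
    signature = [c / total for c in distribution]     # step 3: normalize to a pmf
    return distribution, signature


# Hypothetical window: five peers sending 1, 3, 5, 6 and 8 packets to host X.
window = ["Y1"] * 1 + ["Y2"] * 3 + ["Y3"] * 5 + ["Y4"] * 6 + ["Y5"] * 8
distribution, signature = abacus_signature(window)
print(distribution, signature)  # [1, 1, 3, 0] [0.2, 0.2, 0.6, 0.0]
```

The byte-wise signature follows the same procedure with byte counts in place of packet counts; only per-peer counters are needed, never the payload.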
KISS vs Abacus signatures
[Figure: signatures of PPLive, TVAnts and SopCast. KISS: normalized χ² (first 14 header bytes). Abacus: pdf of packets per sender peer (5 sec intervals).]
Oops!
• Sorry, wrong key