Detection and Identification of Network Anomalies Using Sketch Subspaces. Xin Li, Fang Bian, Mark Crovella, Christophe Diot, Ramesh Govindan, Gianluca Iannaccone, and Anukool Lakhina. ACM Internet Measurement Conference 2006. Speaker: Chang Huan Wu, 2009/5/1.
![Page 1: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/1.jpg)
Detection and Identification of Network Anomalies Using
Sketch Subspaces
Xin Li, Fang Bian, Mark Crovella, Christophe Diot, Ramesh Govindan, Gianluca Iannaccone, and Anukool Lakhina
ACM Internet Measurement Conference 2006
Speaker: Chang Huan Wu
2009/5/1
![Page 2: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/2.jpg)
Outline
Introduction
Previous Approach
Defeat
Evaluation
Conclusions
![Page 3: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/3.jpg)
Introduction (1/3)
Unusual traffic patterns arise from network abuse as well as from legitimate activity
These traffic anomalies are often difficult to detect at a single link and require scrutiny of the entire network
![Page 4: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/4.jpg)
Introduction (2/3)
Characterizing “normal” traffic using an IP-flow representation is intractable
– High dimension
Reduce the dimension and identify anomalies
![Page 5: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/5.jpg)
Introduction (3/3)
Previous work aggregates NetFlow data into origin-destination (OD) flows
This work modifies that approach to increase the detection rate while reducing false alarms, and to identify the IP flows responsible for the anomaly
[Figure: network of Points of Presence (PoPs) connected by links]
![Page 6: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/6.jpg)
Previous Approach
Reference: Anukool Lakhina, Mark Crovella, Christophe Diot, "Mining Anomalies Using Traffic Feature Distributions," in ACM SIGCOMM 2005
![Page 7: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/7.jpg)
Volume vs. Traffic Feature Distribution
Volume-based detection schemes have been successful in isolating large traffic changes
– But a large class of anomalies do NOT cause detectable disruptions in traffic volume
Using traffic feature distributions
– Augments volume-based anomaly detection
– Traffic distributions can reveal valuable information about the structure of anomalies
![Page 8: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/8.jpg)
Port scan anomaly viewed in terms of traffic volume and in terms of entropy
Port scan dwarfed in volume metrics…
…but stands out in feature entropy
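As a rough illustration of why a port scan can be dwarfed in volume yet stand out in entropy, the sketch below computes the sample entropy of a destination-port histogram before and after a scan is added. The packet counts, port numbers, and the `sample_entropy` helper are made up for illustration; they are not taken from the slides or the paper.

```python
import math
from collections import Counter

def sample_entropy(counts):
    """Sample entropy (in bits) of a histogram given as {value: count}."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)

# Hypothetical baseline: 10,000 packets concentrated on a few destination ports.
normal = Counter({80: 6000, 443: 3000, 53: 1000})

# Hypothetical port scan: one extra packet to each of 500 distinct ports.
scan = Counter({port: 1 for port in range(10000, 10500)})
with_scan = normal + scan

volume_change = sum(with_scan.values()) / sum(normal.values()) - 1
print(f"volume change:  {volume_change:.1%}")
print(f"entropy before: {sample_entropy(normal):.2f} bits")
print(f"entropy after:  {sample_entropy(with_scan):.2f} bits")
```

In this toy setup the scan adds only 5% to the packet count, while the destination-port distribution becomes far more dispersed, so its entropy rises noticeably.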
![Page 9: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/9.jpg)
Traffic Feature Distributions
Anomalies can be detected and distinguished by inspecting traffic features:
– 4-tuple: SrcIP, SrcPort, DstIP, DstPort
![Page 10: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/10.jpg)
Entropy-based scheme
In the volume-based scheme, the number of packets or bytes per time slot was the variable.
In the entropy-based scheme, the entropy of every traffic feature in every time slot is the variable.
This gives us a three-way data matrix H.
– H(t, p, k) denotes the entropy of traffic feature k of OD flow p at time t.
To apply the subspace method, we need to unfold it into a single-way representation.
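For reference, the entropy in the cited SIGCOMM 2005 work is the sample entropy of a feature's empirical histogram. The symbols $n_i$, $N$, and $S$ below are chosen here for illustration and do not appear on the slides: feature value $i$ is observed $n_i$ times, out of $S$ observations in total, across $N$ distinct values.

$$H(X) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}, \qquad S = \sum_{i=1}^{N} n_i$$

The value ranges from 0, when all observations share one value, to $\log_2 N$, when the observations are spread evenly over the $N$ values.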
![Page 11: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/11.jpg)
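Subspace Decomposition
Normal subspace, $\mathcal{S}$: first k principal components
Anomalous subspace, $\tilde{\mathcal{S}}$: remaining principal components
Then, decompose traffic on all links by projecting onto $\mathcal{S}$ and $\tilde{\mathcal{S}}$ to obtain:
$$\mathbf{y} = \hat{\mathbf{y}} + \tilde{\mathbf{y}}$$
where $\mathbf{y}$ is the traffic vector at a particular point in time, $\hat{\mathbf{y}}$ is the normal traffic vector, and $\tilde{\mathbf{y}}$ is the residual traffic vector.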
![Page 12: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/12.jpg)
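Geometric illustration
[Figure: traffic vector y plotted against traffic on link 1 and traffic on link 2]
In general, anomalous traffic results in a large value of the norm of the residual traffic vector, $\|\tilde{\mathbf{y}}\|$
Use $\|\tilde{\mathbf{y}}\|$ to identify whether the traffic is anomalous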
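The decomposition can be sketched with NumPy as follows. This is a minimal illustration under assumed conditions, not the authors' implementation: the data is synthetic, the function name `subspace_residual` is hypothetical, and the choice k = 2 is made only because the synthetic data has two underlying trends.

```python
import numpy as np

def subspace_residual(Y, k):
    """Split each row of Y (time bins x links) into normal and residual parts.

    Returns (Y_hat, Y_tilde) with Y_centered = Y_hat + Y_tilde, where Y_hat is
    the projection onto the top-k principal components.
    """
    Yc = Y - Y.mean(axis=0)                 # center each column
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    P = Vt[:k].T                            # basis of the normal subspace
    Y_hat = Yc @ P @ P.T                    # modeled (normal) traffic
    Y_tilde = Yc - Y_hat                    # residual traffic
    return Y_hat, Y_tilde

rng = np.random.default_rng(0)
# Hypothetical data: 200 time bins, 6 links driven by 2 shared trends plus noise.
trends = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
Y = trends @ mixing + 0.05 * rng.normal(size=(200, 6))
Y[120] += np.array([0, 0, 3.0, 0, 0, 0])   # inject an anomaly on one link at t=120

Y_hat, Y_tilde = subspace_residual(Y, k=2)
spe = np.sum(Y_tilde ** 2, axis=1)          # squared residual norm per time bin
print("time bin with largest residual:", int(np.argmax(spe)))
```

A time bin whose squared residual norm exceeds a chosen threshold would be flagged as anomalous; how to pick that threshold is left open here.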
![Page 13: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/13.jpg)
Multiway Subspace Method (multi-way to single-way)
Decompose H into a single-way matrix
Now apply the usual subspace decomposition (PCA)
– Every row of the matrix will be decomposed into a normal part and a residual part
[Figure: the per-feature matrices H(SrcIP), H(SrcPort), H(DstIP), and H(DstPort), each of size # timebins × # od-pairs, are placed side by side to form a single matrix]
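The unfolding step can be sketched in NumPy as below. The array sizes are hypothetical and the axis order (time bins, OD pairs, features) is an assumption that follows the H(t, p, k) indexing; the function name `unfold` is illustrative.

```python
import numpy as np

FEATURES = ["SrcIP", "SrcPort", "DstIP", "DstPort"]

def unfold(H):
    """Unfold a (time bins, OD pairs, features) array into (time bins, features * OD pairs).

    The per-feature matrices are placed side by side, one block of columns per feature.
    """
    t, p, k = H.shape
    # Move the feature axis ahead of the OD-pair axis, then flatten both into columns.
    return H.transpose(0, 2, 1).reshape(t, k * p)

# Hypothetical sizes: 5 time bins, 3 OD pairs, 4 features.
H = np.arange(5 * 3 * 4, dtype=float).reshape(5, 3, len(FEATURES))
H_flat = unfold(H)
print(H_flat.shape)
```

With these sizes, columns 0 to 2 of `H_flat` hold the SrcIP entropies of the three OD pairs, columns 3 to 5 the SrcPort entropies, and so on. Each row can then be passed to an ordinary PCA-based subspace decomposition.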
![Page 14: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/14.jpg)
Defeat (1/2)
Use random aggregations of IP flows (sketches)
Put each IP flow through different hash functions (h1, h2, …)
[Figure: at each router (R1, R2, …) and for each feature (here SrcIP), hash functions h1 to h5 each map flows into s buckets; for h1, the bucket entropies from R1 and R2 at times t1, t2, …, tn form the rows of a matrix]
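A sketch of this kind can be illustrated as follows. Several details are assumptions made for the example rather than taken from the slides: the flow key is a hypothetical 4-tuple, the hash functions are derived by salting SHA-256 with a seed, and the packet records are synthetic.

```python
import hashlib
import math
from collections import Counter, defaultdict

def bucket_of(key, seed, s):
    """Map a flow key to one of s buckets; the seed selects the hash function."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % s

def sample_entropy(counts):
    """Sample entropy (in bits) of a histogram given as {value: count}."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def sketch_entropies(packets, m, s, feature="src_ip"):
    """Return an m x s table: entropy of `feature` in each bucket of each hash function.

    packets: list of dicts with a "key" (flow identifier) and feature fields.
    """
    table = []
    for seed in range(m):
        hists = defaultdict(Counter)          # bucket -> histogram of feature values
        for pkt in packets:
            hists[bucket_of(pkt["key"], seed, s)][pkt[feature]] += 1
        table.append([sample_entropy(hists[b]) if b in hists else 0.0 for b in range(s)])
    return table

# Synthetic packets for one time bin: 7 sources, 5 source ports, 3 destinations.
packets = [
    {"key": (f"10.0.0.{i % 7}", 1024 + i % 5, f"10.0.1.{i % 3}", 80),
     "src_ip": f"10.0.0.{i % 7}"}
    for i in range(1000)
]
table = sketch_entropies(packets, m=5, s=8)
print(len(table), "hash functions x", len(table[0]), "buckets")
```

Repeating this for every time bin gives, for each hash function, a time series of s bucket entropies in place of the OD-flow entropies used by the previous approach.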
![Page 15: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/15.jpg)
Defeat (2/2)
Apply the multiway subspace method to each hash function
Across all m hash functions, count how many flag the traffic as anomalous
– Voting approach
[Figure: for hash function h1, the entropy matrices (rows t1, t2, …, tn) for SrcIP, SrcPort, DstIP, and DstPort are placed side by side]
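The voting step can be sketched as below. The alarm table is invented for the example, and the `vote` function and its `min_votes` parameter are illustrative names, not part of the paper.

```python
def vote(alarms, min_votes):
    """alarms: list of m boolean lists, one per hash function, indexed by time bin.

    Returns the time bins flagged by at least min_votes hash functions.
    """
    n_bins = len(alarms[0])
    votes = [sum(per_hash[t] for per_hash in alarms) for t in range(n_bins)]
    return [t for t, v in enumerate(votes) if v >= min_votes]

# Hypothetical alarms from m = 5 hash functions over 6 time bins.
alarms = [
    [False, False, True,  False, False, True ],
    [False, False, True,  False, False, False],
    [False, True,  True,  False, False, False],
    [False, False, True,  False, False, False],
    [False, False, False, False, False, False],
]
m = len(alarms)
print(vote(alarms, min_votes=m - 2))
```

Here only time bin 2 collects at least m − 2 = 3 votes, so the isolated alarms at bins 1 and 5 are discarded as likely false alarms.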
![Page 16: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/16.jpg)
Identify Anomalies
In each hash function that raised an alarm, find the element (bucket) identified as anomalous
The intersection of the key sets over all hash functions that raised alarms identifies the keys of the IP flows that caused the anomaly (with high likelihood)
[Figure: at time t1, the entropies of h1, h2, h3, and h4, each over s buckets]
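The intersection step can be sketched as follows. The bucket contents and flow names are invented, and the data layout (a mapping from hash function to bucket to key set) is an assumption made for the example.

```python
def identify_keys(anomalous_buckets, bucket_keys):
    """Intersect the key sets of the anomalous buckets across alarming hash functions.

    anomalous_buckets: {hash_id: bucket index flagged as anomalous}
    bucket_keys: {hash_id: {bucket index: set of flow keys hashed into it}}
    """
    key_sets = [bucket_keys[h][b] for h, b in anomalous_buckets.items()]
    return set.intersection(*key_sets) if key_sets else set()

# Hypothetical example with three hash functions that raised an alarm.
bucket_keys = {
    "h1": {0: {"flowA", "flowB"}, 1: {"flowC", "flowD"}},
    "h2": {0: {"flowA", "flowC"}, 1: {"flowB", "flowD"}},
    "h3": {0: {"flowB", "flowC"}, 1: {"flowA", "flowD"}},
}
anomalous_buckets = {"h1": 1, "h2": 0, "h3": 0}
print(identify_keys(anomalous_buckets, bucket_keys))
```

Because each hash function groups the flows differently, only `flowC` falls in the anomalous bucket of all three, which is why the intersection narrows quickly as more hash functions agree.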
![Page 17: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/17.jpg)
Evaluation (1/2)
![Page 18: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/18.jpg)
Evaluation (2/2)
5 or 6 hash functions are enough
If m is the number of hash functions, m−2 or more votes may be enough
![Page 19: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/19.jpg)
Conclusion
Uses multiple random traffic projections to robustly detect anomalies
Higher detection rate and fewer false alarms
Able to automatically infer the IP flows responsible for an anomaly
![Page 20: Detection and Identification of Network Anomalies Using Sketch Subspaces](https://reader036.vdocuments.mx/reader036/viewer/2022062301/56815e6b550346895dcce81a/html5/thumbnails/20.jpg)
Comments
Can only handle offline data
Can other fields in the packet header be used for anomaly detection?