Rule-based Anomaly Detection on IP Flows
Nick Duffield, Patrick Haffner, Balachander Krishnamurthy (AT&T), Haakon Ringberg (Princeton Univ.)
INFOCOM’09
2009/4/9 Speaker: Li-Ming Chen 2
Snort
Snort is a powerful, flexible open source NIDS
Rule-based Anomaly Detection on Packets
A Snort rule:
alert udp $EXTERNAL_NET any -> $HOME_NET 1434 (msg:"MS-SQL version ove…"; dsize:>100; content:"|04|"; …)
  Rule header: rule action (alert), protocol (udp), source IP & port, direction (->), destination IP & port
  Detail of rule: message text (msg), packet size (dsize), patterns in the packet's payload (content)
Challenges in deploying Snort over a large network (e.g., a Tier-1 ISP)
Deploy at the edge:
  The network scale is huge, which raises deployment issues
Deploy at the core:
  Link capacity is high, which raises performance issues
  Hundreds of rules may need to be evaluated concurrently for each packet
Idea: Rules for IP Flows!
Is it possible to construct rules at the flow level that accurately reproduce the action of packet-level rules?
  e.g., an alert should be raised for a flow if some packets of this flow trigger packet-level rules
Why?
  IP flows are easy to obtain: ISPs already collect flow statistics ubiquitously (e.g., NetFlow)
  More scalable
Think about Rules for IP Flows… (1/2)
If a packet-level rule looks like:
alert udp $EXTERNAL_NET any -> $HOME_NET 1434 (msg:"MS-SQL version ove…"; dsize:>100; content:"|04|"; …)
At the flow level, maybe we can do:
  Alert on UDP flows that come from $EXTERNAL_NET to $HOME_NET at port 1434 with a mean packet size larger than 100
  Yes, we ignore the content!
  Although we don't know the exact packet sizes, we can measure the mean packet size of each flow
  What is the detection accuracy?
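The flow-level approximation of the MS-SQL rule can be sketched in Python as below. This is only an illustration, not the authors' implementation: the `FlowRecord` fields, the `HOME_NET` address range, and the treatment of `$EXTERNAL_NET` as "anything outside `HOME_NET`" are assumptions made for the example. Note also that Snort's `dsize` tests payload size, whereas a flow record typically counts whole-packet bytes, so the threshold is only an approximation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical address range standing in for $HOME_NET (assumption).
HOME_NET = ip_network("10.0.0.0/8")


@dataclass
class FlowRecord:
    # Hypothetical minimal flow record; real NetFlow records carry more fields.
    protocol: str
    src_ip: str
    dst_ip: str
    dst_port: int
    packets: int
    byte_count: int


def matches_flow_rule(flow: FlowRecord) -> bool:
    """Flow-level stand-in for the packet rule; the payload content is ignored."""
    if flow.packets == 0:
        return False
    mean_packet_size = flow.byte_count / flow.packets
    return (
        flow.protocol == "udp"
        and ip_address(flow.src_ip) not in HOME_NET  # assumed meaning of $EXTERNAL_NET
        and ip_address(flow.dst_ip) in HOME_NET
        and flow.dst_port == 1434
        and mean_packet_size > 100
    )
```

A hand-written rule like this cannot see the `content` predicate, which is why its accuracy is an open question.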
Think about Rules for IP Flows… (2/2)
What if the packet-level rule is:
alert icmp any any -> any any (msg:"ICMP Dest. Unreachable Comm. Administratively Prohibited"; icode:13; itype:3; …)
At the flow level, what can we do?
  An ICMP destination unreachable message is generated by the host or its inbound gateway to inform the client that the destination is unreachable for some reason
  e.g., every packet sent to IP address A will trigger this event
Can we LEARN this kind of event?
Motivation & Goal
For a NIDS, inspecting every packet would be ideal, but it is impractical
  Signature-based NIDS has scale and performance problems
Goal: develop an architecture that can translate many existing packet signatures so that they operate effectively on IP flows instead
  Premise: flow statistics are compact and are collected within most ISPs' networks
Build Flow Rules via Learning
The authors use machine learning (ML) approaches to learn the association between flow features and packet payload
Problem:
  Flows aggregate packet header information but lose payload information
  Flow rules: is there a loss of accuracy?
  Does ML mitigate the impact of losing payload information?
Outline
Motivation & Goal
Packet Rule Classification
Packet Rules → Flow Rules
Dataset & Evaluation Methodology
Experimental Results
Real Deployment Issues
Conclusion & My Comments
Packet Rule Classification (1/3): Why classify packet rules?
Not all packet rules can be effectively learned…
  Use a taxonomy of packet rules to understand their impact, and
  to evaluate the performance of the proposed ML method
For example:
  Which rules can the ML method learn perfectly?
  Which rules is the ML method likely to learn very well?
  For which rules does the accuracy of the ML method vary with the nature of the rule?
Packet Rule Classification (2/3): What kinds of predicates are in a packet rule?
A packet rule consists of 3 sets of predicates:
  FH (flow header): packet fields exactly reported in the flow record
  PP (packet payload): content signature
  MI (meta information): other packet header information that is reported either inexactly or not at all in the flow record
Example:
alert udp $EXTERNAL_NET any -> $HOME_NET 1434 (msg:"MS-SQL version ove…"; dsize:>100; content:"|04|"; …)
  FH: udp, $EXTERNAL_NET, any, $HOME_NET, 1434
  MI: dsize:>100
  PP: content:"|04|"
Packet Rule Classification (3/3): How to classify packet rules?
Partition packet rules into disjoint classes, based on the types of predicates present in the rule:
  Payload rules: include at least one PP predicate
  Header rules: comprise only FH predicates
  Meta-information rules: all other rules (no PP, do have MI, may include FH)
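The partition can be written down as a small Python function. This is a sketch under the assumption that each predicate of a rule has already been labeled "FH", "MI", or "PP"; the function name and the class-name strings are chosen for illustration only.

```python
def classify_rule(predicate_types):
    """Assign a packet rule to one of three disjoint classes.

    predicate_types: iterable of labels, each "FH", "MI", or "PP"
    (assumed to have been produced by some earlier labeling step).
    """
    types = set(predicate_types)
    unknown = types - {"FH", "MI", "PP"}
    if unknown:
        raise ValueError(f"unknown predicate types: {sorted(unknown)}")
    if "PP" in types:
        return "payload"           # at least one payload predicate
    if "MI" in types:
        return "meta-information"  # no PP, has MI, may include FH
    return "header"                # only FH predicates
```

For example, the MS-SQL rule (five FH predicates, one MI, one PP) falls in the payload class because the PP check takes priority.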
Rules in Practice
Snort rules:
  A Boolean formula composed of predicates that check for specific values of various fields present in the IP header, transport header, and payload (FH, MI & PP)
Features used to construct flow rules in this paper:
  Src. port, Dst. port, Src. IP address, Dst. IP address, #packets, #bytes, mean packet size, duration, mean packet interarrival time, TCP flags, protocol, ToS.
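To make the aggregate features concrete, the sketch below derives several of them from the packets of one flow. The `Packet` type and the dictionary keys are hypothetical names for this example, and the TCP flags are combined with a bitwise OR, which is how flow records commonly report them (an assumption here, consistent with flags being reported inexactly).

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # Hypothetical per-packet view; only the fields needed for the features.
    timestamp: float  # seconds
    size: int         # bytes
    tcp_flags: int = 0


def flow_features(packets):
    """Aggregate per-packet data into flow-level features."""
    if not packets:
        raise ValueError("a flow needs at least one packet")
    pkts = sorted(packets, key=lambda p: p.timestamp)
    n = len(pkts)
    total_bytes = sum(p.size for p in pkts)
    duration = pkts[-1].timestamp - pkts[0].timestamp
    flags = 0
    for p in pkts:
        flags |= p.tcp_flags  # per-packet flag combinations are lost here
    return {
        "packets": n,
        "bytes": total_bytes,
        "mean_packet_size": total_bytes / n,
        "duration": duration,
        "mean_interarrival": duration / (n - 1) if n > 1 else 0.0,
        "tcp_flags": flags,
    }
```

The aggregation shows why per-packet predicates such as `dsize` or `flags` become meta information: only the mean size and the union of flags survive.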
Packet Rules → Flow Rules
[Figure: processing pipeline]
  Packets → Snort → Snort alerts
  Packets → IP flows (e.g., NetFlow)
  Build training data: associate each packet alert with the corresponding flow
  Training data → ML method → flow rules
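The "build training data" step can be illustrated as a join between alerts and flows. The sketch assumes that both have already been reduced to a shared flow key (for example the 5-tuple) and ignores timing; the function name and the tuple layouts are hypothetical.

```python
def build_training_data(flows, alerts, rule_id):
    """Label each flow with +1 if it triggered the given Snort rule, else -1.

    flows:  list of (flow_key, features) pairs
    alerts: list of (flow_key, rule_id) pairs, one per Snort alert
    Both key formats are assumptions for this sketch.
    """
    triggered = {key for key, sid in alerts if sid == rule_id}
    return [(features, 1 if key in triggered else -1) for key, features in flows]
```

Calling this once per Snort rule yields one binary training set per rule, since a flow may be associated with several alerts.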
Packet Rules → Flow Rules (detailed)
[Figure: Snort → Snort alerts → build training data → ML method → flow rules]
For each Snort rule:
  The training data are pairs (xi, yi): flow i has flow features xi, and yi ∈ {–1, 1} indicates whether flow i triggered this Snort rule
  We can then run the ML algorithm by minimizing the classification error over the pairs (xi, yi)
  Give each feature a weight, and learn these weights to minimize the training error
  Assign each Snort rule a score
Learning Flow Rules
Note that:
  A single packet may raise multiple Snort alerts
  Individual flows can be associated with many Snort alerts
Machine learning algorithms:
  AdaBoost is chosen as the candidate algorithm because:
    The actual number of features is large
    AdaBoost uses an incremental greedy training procedure that only adds the features needed for finer discrimination
    Good generalization (compared with SVM)
    Low level of noise in the training data
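The greedy, feature-at-a-time behavior of AdaBoost can be seen in a minimal pure-Python sketch with single-feature threshold tests (decision stumps) as weak learners. This is textbook discrete AdaBoost, not the authors' implementation; the choice of stumps, the function names, and the default number of rounds are assumptions for the example.

```python
import math


def stump_predict(x, j, thr, sign):
    # Weak learner: vote `sign` if feature j exceeds thr, otherwise -sign.
    return sign if x[j] > thr else -sign


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, j, thr, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def adaboost(X, y, rounds=10):
    """Greedily add one weighted stump per round; labels must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        raw_err, j, thr, sign = train_stump(X, y, w)
        err = min(max(raw_err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, sign))
        if raw_err == 0:
            break  # a single stump already separates the data
        # Increase the weight of misclassified flows, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, j, thr, sign))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, x):
    """Confidence score: weighted vote of all stumps (positive means alert)."""
    return sum(alpha * stump_predict(x, j, thr, sign)
               for alpha, j, thr, sign in model)
```

Each round selects exactly one feature and threshold, so features that never help discrimination are never added to the model.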
Dataset (during Aug ~ Sep 2005)
29 days (4 weeks)
  Total: >10^6 flows, >5 TBytes
  Average rate: 2 MBytes/sec
  Average: 14.5 pkt/flow
  55% of flows comprised 1 pkt!
For machine learning:
  Week 1: training
  Week 2: training & testing
  Week 3 & 4: testing
[Figure: packets (all) are captured on an OC-3 link at a border router; IP flows come from unsampled NetFlow]
Dataset (learning performance)
[Table: number of flows (10^6) per week, for normal flows and anomalous flows; Neg: True Negative, Pos: True Positive]
The number of unique examples is small (this speeds up training)
Further speedup: remove deterministic features to reduce the amount of training data
  1) Remove flows whose source is part of the local network
  2) Each Snort rule applies to a single protocol only, so train per protocol (TCP, UDP, ICMP)
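One way a small number of unique examples can speed up training is to collapse identical (features, label) pairs and keep a count for each, so that the learner sees every distinct example once with a weight. Whether the authors did exactly this is not stated on the slide; the sketch below is an assumption-laden illustration with hypothetical names.

```python
from collections import Counter


def unique_examples(examples):
    """Collapse duplicate (features, label) pairs into weighted examples.

    examples: list of (features, label) with features as a list of numbers.
    Returns a list of (features, label, count) in first-seen order.
    """
    counts = Counter((tuple(x), y) for x, y in examples)
    return [(list(x), y, c) for (x, y), c in counts.items()]
```

The counts can then be used as initial example weights in a boosting procedure instead of uniform weights.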
Evaluation Criteria
A detection is a Boolean action (T or F?)
  For each rule, the classifier gives a confidence score at test time
  A threshold is required to determine T or F
Use precision and recall as evaluation criteria
  Precision = TP_k/(TP_k + FP_k)
  Average Precision: a value closer to 1 is better!
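Average precision removes the need to pick a single threshold. The sketch below uses one common definition, which may differ in detail from the paper's: rank flows by decreasing confidence score, compute the precision TP_k/(TP_k + FP_k) among the top k flows at every rank k that holds a true alert, and average those values.

```python
def average_precision(scores, labels):
    """Average of precision-at-k over the ranks k of the positive examples.

    scores: classifier confidence per flow; labels: +1 (alert) or -1.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precisions = []
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precisions.append(true_positives / k)  # TP_k / (TP_k + FP_k)
    return sum(precisions) / len(precisions) if precisions else 0.0
```

A classifier that ranks every alerting flow above every normal flow reaches the maximum value of 1.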
Evaluation Methodology
Focus on the 21 most triggered rules over wk 1 & 2 (see the appendix)
Compare the AP (average precision) for:
  1) Baseline behavior
     Training on one full week and testing on the subsequent week
     E.g., wk1-2 means training on wk 1 and testing on wk 2
  2) Data drift
     Determine how often re-training should be applied (e.g., wk1-3)
  3) Sampling of negative examples
     Normal flows are the majority
     Does reducing the number of normal flows keep accuracy while reducing training time?
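Sampling of negative examples can be sketched as follows: keep every positive (alerting) flow and retain each negative (normal) flow with a fixed probability. The function name, the `keep_fraction` parameter, and the seeded generator are assumptions for this example; the slides do not state the sampling rate or scheme the authors used.

```python
import random


def sample_negatives(examples, keep_fraction, seed=0):
    """Keep all positives; keep each negative with probability keep_fraction.

    examples: list of (features, label) with label +1 or -1.
    """
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    return [(x, y) for x, y in examples
            if y == 1 or rng.random() < keep_fraction]
```

Because normal flows dominate the data, even a modest `keep_fraction` shrinks the training set substantially while leaving the rare positive examples untouched.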
[Table: the Snort alerts studied (numbered 1 to 21), showing the complexity of a unique flow; slide annotations mark flag, size, and ICMP content predicates. See the appendix for alert details.]
[Figure: results per rule (rules 1 to 21), grouped into Header, Meta-Info, and Payload rules]
Payload rules show great variability
Data drift:
  2-week drift is acceptable
  3-week drift causes a loss of performance, especially for Meta-Info & Payload rules
[Figure: results per rule (rules 1 to 21), grouped into Header, Meta-Info, and Payload rules]
Sampling of negative (normal) examples:
  Measurable loss in performance
  But training is 6x faster
What features are more important than others?
[Figure: effect on each rule when a feature is removed during detection]
Payload rules are hard to reproduce in a flow setting
  Some rules have several predicates (that could be learned)
Architecture
Other issues:
  Can rules learned from one site be used for other sites?
  Some flow features (e.g., duration) are link/network dependent…
Other issues
Computational efficiency:
  Initial correlation of flows and Snort alarms
  AdaBoost parameter setup and learning time
  Run-time classification
Conclusion
My Comments
Appendix – 21 Snort Rules used in this paper
From snort-rules-version
Header (1/2)
1) alert icmp any any -> any any (msg:"ICMP Destination Unreachable Communication Administratively Prohibited"; icode:13; itype:3; classtype:misc-activity; sid:485; rev:4;)
2) alert icmp any any -> any any (msg:"ICMP Destination Unreachable Communication with Destination Host is Administratively Prohibited"; icode:10; itype:3; classtype:misc-activity; sid:486; rev:4;)
Header (2/2)
3) alert icmp $EXTERNAL_NET any -> $HOME_NET any (msg:"ICMP Source Quench"; icode:0; itype:4; classtype:bad-unknown; sid:477; rev:2;)
Meta-Information (1/3)
4) alert icmp $EXTERNAL_NET any -> $HOME_NET any (msg:"ICMP webtrends scanner"; icode:0; itype:8; content:"|00 00 00 00|EEEEEEEEEEEE"; reference:arachnids,307; classtype:attempted-recon; sid:476; rev:4;)
5) alert tcp $EXTERNAL_NET any -> $HOME_NET any (msg:"BAD-TRAFFIC data in TCP SYN packet"; flow:stateless; dsize:>6; flags:S,12; reference:url,www.cert.org/incident_notes/IN-99-07.html; classtype:misc-activity; sid:526; rev:11;)
Meta-Information (2/3)
6) alert icmp $EXTERNAL_NET any -> $HOME_NET any (msg:"ICMP Large ICMP Packet"; dsize:>800; reference:arachnids,246; classtype:bad-unknown; sid:499; rev:4;)
7) alert icmp $EXTERNAL_NET any -> $HOME_NET any (msg:"ICMP PING NMAP"; dsize:0; itype:8; reference:arachnids,162; classtype:attempted-recon; sid:469; rev:3;)
Meta-Information (3/3)
8) alert tcp $EXTERNAL_NET any -> $HOME_NET any (msg:"SCAN FIN"; flow:stateless; flags:F,12; reference:arachnids,27; classtype:attempted-recon; sid:621; rev:7;)
9) 111 || 8 || spp_stream4: FIN Stealth Scan gid: 111 Snort Pre-processor, 4th stream pre-processor alert id: 8
Payload (1/6)
10) alert udp $EXTERNAL_NET any -> $HOME_NET 1434 (msg:"MS-SQL version overflow attempt"; flowbits:isnotset,ms_sql_seen_dns; dsize:>100; content:"|04|"; depth:1; reference:bugtraq,5310; reference:cve,2002-0649; reference:nessus,10674; classtype:misc-activity; sid:2050; rev:8;)
11) alert tcp $AIM_SERVERS any -> $HOME_NET any (msg:"CHAT AIM receive message"; flow:to_client; content:"*|02|"; depth:2; content:"|00 04 00 07|"; depth:4; offset:6; classtype:policy-violation; sid:1633; rev:6;)
Payload (2/6)
12) 2376 || EXPLOIT ISAKMP first payload certificate request length overflow attempt || bugtraq,9582 || cve,2004-0040
13) 483 || ICMP PING CyberKit 2.2 Windows || arachnids,154
14) 480 || ICMP PING speedera
Payload (3/6)
Payload (4/6)
Payload (5/6)
Payload (6/6)