Information Integration and Assurance Laboratory
IEE594 Presentation
Xiangyang Li
Dept. of Industrial Engineering
Arizona State University Box 875906, Tempe, AZ 85287-5906, USA
Oct. 2000
Current People
• Director: Dr. Nong Ye
• Students
  Master's: Syed Masum Emran
  Ph.D.: Qiang Chen, Xiangyang Li, Mingming Xu, Dawei Zhang, Yebin Zhang
Current Research
• Information security: intrusion detection technology study
• Supply chain (with the Business School): enterprise modeling and simulation
Intrusion Detection Technology
Application of Decision Tree Classifier
Intrusion Detection - Defensive System
• Security policy: what should we protect?
• Prevention: how can we prevent an intrusion?
• Detection: if there is an intrusion, how can we detect it?
• Response/Recovery: if we detect an intrusion, how can we respond? How can we recover the system from the damage?
Intrusion Detection - Methods
• Norm-based approach
  – Statistical techniques (SPC): build up a norm profile with statistical methods
  – Specification-based techniques (ANN, BN, ...): build up a norm profile with rules and logical specifications
• Signature-based approach (DT, clustering, ...): recognize pre-defined intrusion signatures in system activities
Problem Definition (1)
• Intrusion detection
  – Normality profile method
  – Signature recognition method: the decision tree technique can be used to build the signatures of normal activities and attacks automatically. Each path of the tree corresponds to a signature.
  – Each leaf represents an IW value and corresponds to a specific state of the system.
Problem Definition (2)
• A BSM audit event from Solaris (event 217):
auid -2
euid 0
egid 0
ruid 0
rgid 0
pid 96
sid 0
RemoteIP 0.0.0.0
time 897047263
error_message 91
process_error 0
retval 0
attack 0
• Target variable
  – Label: 0 = normal activity, 1 = attack
  – IW (Intrusion Warning): a value between 0 and 1
• Predictor variables
  – Only the event-type information is used (284 event types in Solaris 2.7)
• Data sets
  – Training data set
  – Testing data set
Problem Definition (3)
• Decision tree algorithms
  – GINI and CHAID (AnswerTree, SPSS Inc.)
• Analysis of testing results
  – Comparison of the mean, max, and min IW values between normal and attack events.
  – ROC (Receiver Operating Characteristic) curves of hit rate against false alarm rate, based on the predicted IW values and the true Label values.
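The hit and false alarm rates behind these ROC curves can be sketched as follows; this is a minimal pure-Python illustration (function and variable names are ours, not from the slides):

```python
def roc_points(iw_values, labels, thresholds):
    """For each IW threshold, compute (false alarm rate, hit rate).

    An event is flagged as an attack when its predicted IW value meets
    the threshold.  Hit rate = flagged attacks / all attacks;
    false alarm rate = flagged normal events / all normal events.
    """
    n_attack = sum(labels)
    n_normal = len(labels) - n_attack
    points = []
    for th in thresholds:
        hits = sum(1 for iw, y in zip(iw_values, labels) if y == 1 and iw >= th)
        false_alarms = sum(1 for iw, y in zip(iw_values, labels) if y == 0 and iw >= th)
        points.append((false_alarms / n_normal, hits / n_attack))
    return points

points = roc_points([0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0], [0.2, 0.5, 0.8])
# points == [(0.5, 1.0), (0.0, 1.0), (0.0, 0.5)]
```

Sweeping the threshold over the range of IW values traces out one ROC curve per classifier.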
Single-event Decision Tree Classifier
• Single-event classifier
  – Label -> target variable
  – Event type -> the only predictor variable
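With event type as the only predictor, a decision tree simply partitions the event types, and each leaf's IW can be read off as the attack fraction of the training events it covers. A minimal pure-Python stand-in (not the AnswerTree CHAID/GINI implementations used in the study) might look like:

```python
from collections import defaultdict

def train_single_event_iw(event_types, labels):
    """One-predictor 'tree' as a lookup table.

    Each event type is treated as its own leaf; the leaf's IW value is
    the fraction of training events of that type labeled as attacks
    (Label = 1).  A real CHAID/GINI tree may merge types into shared
    leaves, but the leaf IW is computed the same way.
    """
    counts = defaultdict(lambda: [0, 0])  # type -> [n_events, n_attacks]
    for et, y in zip(event_types, labels):
        counts[et][0] += 1
        counts[et][1] += y
    return {et: n_att / n for et, (n, n_att) in counts.items()}

iw = train_single_event_iw([217, 217, 23, 23, 23], [1, 0, 0, 0, 0])
# iw[217] == 0.5, iw[23] == 0.0
```

At test time, each audit event is scored by looking up the IW of its event type.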
Result Analysis (1)

Statistics for the single-event classifier (CHAID):

IW value   Min    Max    Mean    Std. dev.
Normal     0.00   1.00   0.209   0.135
Attack     0.00   1.00   0.368   0.255
Result Analysis (2)

[Figure: ROC curve for the single-event classifier (CHAID); hit rate vs. false alarm rate, both axes from 0 to 1.]
EWMA Vectors

We use one variable to represent each event type, giving 284 variables for the 284 event types (49 of which appear in our sample data set). These variables are used as the predictor variables. Each variable is updated for each audit event as:

  X_i(t) = 1·λ + (1 − λ)·X_i(t − 1), if the audit event at time t belongs to the i-th event type
  X_i(t) = 0·λ + (1 − λ)·X_i(t − 1), if the audit event at time t is different from the i-th event type
  X_i(0) = 0, λ = 0.3
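The update rule above can be sketched in a few lines of Python (variable names are ours):

```python
def update_ewma_vector(x, event_type, lam=0.3):
    """One EWMA update step over the per-event-type vector.

    x maps event type i -> current value X_i(t-1).  The observed event
    decays every entry by (1 - lam) and adds lam to the entry of the
    observed type:  X_i(t) = lam * [i == event_type] + (1 - lam) * X_i(t-1).
    """
    return {i: lam * (1 if i == event_type else 0) + (1 - lam) * v
            for i, v in x.items()}

x = {217: 0.0, 23: 0.0}
x = update_ewma_vector(x, 217)   # x[217] == 0.3, x[23] == 0.0
x = update_ewma_vector(x, 23)    # x[217] decays to ~0.21, x[23] rises to 0.3
```

Each audit event thus yields one 284-dimensional EWMA vector, which becomes a training or testing record for the decision tree.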
Result Analysis (3)

Statistics for the single-event classifier (CHAID):

IW value   Min    Max    Mean    Std. dev.
Normal     0.00   1.00   0.209   0.135
Attack     0.00   1.00   0.368   0.255

Statistics for the EWMA vector classifier (CHAID):

IW value   Min    Max    Mean    Std. dev.
Normal     0.00   1.00   0.046   0.210
Attack     0.00   1.00   0.881   0.324
Result Analysis (4)

[Figure: ROC curves for the EWMA vector classifiers (GINI and CHAID); hit rate vs. false alarm rate, both axes from 0 to 1.]
Moving Window

[Figure: an observation window (size = 4 events) slides in the moving direction along the event stream E2 E3 E7 E6 E3 E4 E16 E2.]

Each window position is transferred into one record of the new data:

  variables: {E1 ... E2 E3 E4 E5 E6 E7 ... E284}
  values:    {...  0  1  1  0  1  1  ...}
“Existence” and “Count” Classifiers
• “Existence”: in the transferred data set, variable i records whether event type i occurs in the current moving window.
• “Count”: in the transferred data set, variable i records how many times event type i appears in the current moving window. We use this variant in the moving-window classifiers on event types.
• Truncation: remove the part of the transferred data that includes both normal and attack events.
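The window-to-record transfer for both variants can be sketched as follows (a minimal pure-Python illustration; names are ours, and absent event types are left implicit rather than stored as explicit zeros):

```python
from collections import Counter, deque

def window_records(events, size=4):
    """Slide a window of `size` events along the stream.

    For each full window, emit a (count, existence) pair of dicts keyed
    by event type: count[i] = occurrences of type i in the window,
    existence[i] = 1 if type i occurs at all.  Types absent from a dict
    have an implicit value of 0.
    """
    window = deque(maxlen=size)   # oldest event drops out automatically
    records = []
    for e in events:
        window.append(e)
        if len(window) == size:
            count = dict(Counter(window))
            existence = {i: 1 for i in count}
            records.append((count, existence))
    return records

stream = ["E2", "E3", "E7", "E6", "E3", "E4", "E16", "E2"]
count, existence = window_records(stream)[1]
# second window covers E3 E7 E6 E3: count["E3"] == 2, existence["E3"] == 1
```

Each record then becomes one row of the transferred data set fed to the decision tree.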
Result Analysis (5)

Statistics for the moving window classifier (CHAID):

IW value   Min    Max    Mean    Std. dev.
Normal     0.00   1.00   0.080   0.153
Attack     0.00   1.00   0.790   0.384
Result Analysis (6)

[Figure: ROC curve for the moving window classifier (CHAID); hit rate vs. false alarm rate, both axes from 0 to 1.]
Layered Classifiers

[Figure: two-level architecture. Lower level: a single-event classifier takes the audit data and outputs classified states (state IDs) together with an IW value. Upper level: a state-ID classifier takes the classified states and outputs the final IW.]
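The composition of the two levels might be sketched as below, assuming the lower level labels each audit event with a state ID (e.g. the leaf of a single-event tree) and the upper level scores windows of state IDs; all names and the toy lower/upper levels are ours, not from the slides:

```python
def layered_iw(events, state_of_event, iw_of_state_window, size=4):
    """Two-level sketch: events -> state IDs -> windowed IW values.

    Lower level: state_of_event maps each audit event to a state ID.
    Upper level: iw_of_state_window maps each window of state IDs to an
    IW value.  One IW value is emitted per full window.
    """
    states = [state_of_event(e) for e in events]
    return [iw_of_state_window(tuple(states[k:k + size]))
            for k in range(len(states) - size + 1)]

# toy classifiers for illustration only
state = {"login": 0, "exec": 1, "su": 2}.get
iw = lambda w: 0.9 if w.count(2) >= 2 else 0.1  # repeated 'su' -> suspicious
out = layered_iw(["login", "su", "su", "exec", "login", "login"], state, iw)
# out == [0.9, 0.9, 0.1]
```

In the study, both levels are decision trees; here they are stand-in lookup functions to show only the data flow.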
Result Analysis (7)

[Figure: ROC curves for the state-ID classifiers (CHAID), “Count” vs. “Existence”; hit rate vs. false alarm rate, both axes from 0 to 1.]
Result Analysis (8)

[Figure: ROC curves for the “count” state-ID classifiers, CHAID vs. GINI; hit rate (axis from 0.8 to 1) vs. false alarm rate (axis from 0 to 1).]
Result Analysis (9)

[Figure: comparison of the decision tree classifiers (CHAID, “Count”): ROC curves for the single-event, EWMA vector, moving window, and state-ID classifiers; hit rate vs. false alarm rate, both axes from 0 to 1.]
Conclusions and Problems

Conclusions
• Decision tree classifiers (DTCs) show promising performance in the intrusion detection application.
• The performance of a DTC depends on its design, i.e., the choice of predictor variables and the target variable.
• Different decision tree algorithms affect the results.

Problems
• Computational feasibility
  – Incremental training ability (ITI)
  – Scalable/parallel/database-resident training (ScalParC)
  – Bagging and boosting?
END
• Other work: http://iia.eas.asu.edu/