Intelligent Database Systems Lab
國立雲林科技大學National Yunlin University of Science and Technology
Anomaly intrusion detection by clustering transactional audit streams in a host computer
Nam Hun Park , Sang Hyun Oh, Won Suk Lee
InS, Vol.180, 2010, pp. 2375–2389.
Presenter : Wei-Shen Tai
2010/4/14
N.Y.U.S.T.
I. M.
Outline
Introduction
Related works
Clustering transactional user activities
Anomaly detection of clusters on transactional features
Experimental results
Conclusion
Comments
Motivation
Most anomaly intrusion detection approaches consider only the static behavior of a user in the audit data set.
In a real-time environment, the current activities of a user should be processed as soon as possible so that they are reflected in anomaly detection.
Objective
Propose a grid-based clustering algorithm for an audit data stream.
It detects anomaly intrusions on continuous transactional audit streams based on partitioned grids.
A transactional data stream and initial cell
A transaction in an audit data stream
Contains a set of activities (logs) performed in sequence by a user.
Each transaction has a number of data values, and the current data stream Dt has a number of transactions.
Initial cell g
For each feature, the range of an initial cell g becomes the united intervals of
Grid-based clusters
Distribution statistics of an initial cell g
t = 5, 100 transactions in Dt: the support of g is ct = 20
t = 5, 250 transactions in Dt: t_avg = {50, 20, 30, 50, 20, …}
t = 5, 250 transactions in Dt: Tg is the number of data in this range of T
Split of initial cells
When a new data element et is generated, the distribution statistics of the cell g are updated.
When the current support (ct) of the cell g is greater than the split support threshold, two intermediate cells g11 and g22 are created as the children of the initial cell.
The children of an initial cell are split in the same way until their support is less than the split support threshold.
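The split rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Cell` class, its field names, and the choice to split at the running mean are all assumptions made for the example. Only the trigger (support greater than the split support threshold creates two children) comes from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    """One grid-cell on a single feature, covering [low, high). Hypothetical layout."""
    low: float
    high: float
    count: int = 0        # support c_t: number of elements counted in this cell
    total: float = 0.0    # running sum, kept only to derive the mean
    children: List["Cell"] = field(default_factory=list)

    def mean(self) -> float:
        if self.count == 0:
            return (self.low + self.high) / 2
        return self.total / self.count

    def insert(self, value: float, split_threshold: int) -> None:
        # If the cell was already split, route the element to the matching child.
        for child in self.children:
            if child.low <= value < child.high:
                child.insert(value, split_threshold)
                return
        # Otherwise update this cell's distribution statistics.
        self.count += 1
        self.total += value
        # Split once the support exceeds the threshold.
        # Assumption: the split point is the running mean of the cell.
        if not self.children and self.count > split_threshold:
            mid = self.mean()
            if self.low < mid < self.high:
                self.children = [Cell(self.low, mid), Cell(mid, self.high)]
```

Children start with empty statistics because no data elements are stored; only elements arriving after the split are counted in them.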
Dividing grid-cells on distributions
Three ways to partition a dense grid-cell: μ-partition, σ-partition, and hybrid-partition.
Hybrid-partition: if dev > deve, pick μ-partition; otherwise, pick σ-partition.
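The hybrid selection rule can be written as a small Python sketch. Only the comparison (dev > deve picks μ-partition, otherwise σ-partition) is taken from the slide. What dev and deve measure is not stated here, so both are passed in as plain numbers. The split points used by `mu_partition` and `sigma_partition` are assumptions for illustration: a cut at the mean, and a band of one standard deviation around the mean.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def mu_partition(low: float, high: float, mu: float) -> List[Interval]:
    # Assumption: mu-partition cuts the cell into two at the mean.
    return [(low, mu), (mu, high)]


def sigma_partition(low: float, high: float, mu: float, sigma: float) -> List[Interval]:
    # Assumption: sigma-partition isolates the band [mu - sigma, mu + sigma),
    # clipped to the cell's own range; empty pieces are dropped.
    a = max(low, mu - sigma)
    b = min(high, mu + sigma)
    cuts = [low, a, b, high]
    return [(x, y) for x, y in zip(cuts, cuts[1:]) if x < y]


def hybrid_partition(low: float, high: float, mu: float, sigma: float,
                     dev: float, dev_e: float) -> List[Interval]:
    # Rule from the slide: dev > dev_e selects mu-partition, else sigma-partition.
    if dev > dev_e:
        return mu_partition(low, high, mu)
    return sigma_partition(low, high, mu, sigma)
```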
Cluster properties
A cluster C containing a set of v adjacent dense unit grid-cells
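Grouping adjacent dense cells into clusters on one feature might look like the following sketch. The tuple layout `(low, high, support)` and the `min_support` density threshold are assumptions for the example; the slide only states that a cluster is a set of adjacent dense unit grid-cells.

```python
from typing import List, Tuple

GridCell = Tuple[float, float, float]  # (low, high, support), hypothetical layout


def find_clusters(cells: List[GridCell], min_support: float) -> List[List[GridCell]]:
    """Group runs of adjacent dense cells. Cells must be sorted by lower bound."""
    clusters: List[List[GridCell]] = []
    current: List[GridCell] = []
    for cell in cells:
        low, _high, support = cell
        dense = support >= min_support          # assumed density test
        adjacent = not current or current[-1][1] == low
        if dense and adjacent:
            current.append(cell)
        else:
            # A sparse cell or a gap closes the cluster built so far.
            if current:
                clusters.append(current)
            current = [cell] if dense else []
    if current:
        clusters.append(current)
    return clusters
```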
Decaying weights on activities
Forgetting factor: employed to diminish the effects of old patterns.
A decay-base b determines the amount of weight reduction per decay-unit and is greater than 1. A decay-base-life w is defined as the number of decay-units.
A new transaction is generated in the current data stream
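A minimal Python sketch of the decay mechanism follows. The slide gives only the decay-base b (greater than 1) and the decay-base-life w. The formula d = b^(-1/w), under which a weight falls to 1/b after w decay-units, is an assumption borrowed from the usual decay-rate formulation for data streams, and the function names are hypothetical.

```python
def decay_rate(b: float, w: float) -> float:
    # Assumed form: after w decay-units a weight of 1 shrinks to 1/b.
    if b <= 1 or w <= 0:
        raise ValueError("expected b > 1 and w > 0")
    return b ** (-1.0 / w)


def decayed_support(prev_support: float, d: float) -> float:
    # Assumed update: discount the old support by d, then count the
    # newly generated transaction with full weight 1.
    return prev_support * d + 1.0
```

With b = 2 and w = 4, the rate is about 0.84 per decay-unit, so a transaction seen four decay-units ago contributes half the weight of a new one.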
Profiling method
Internal summary contains the properties of each cluster.
External summary represents the statistics of noise data objects, i.e., the data objects outside all clusters.
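The two summaries can be pictured with a small Python sketch. The fields kept here (per-cluster count and sum, noise count and ratio) are hypothetical placeholders, since the slide does not list the exact cluster properties. The sketch also processes a batch of values for brevity, whereas the method itself works on a stream.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Profile:
    internal: List[Dict]   # one entry per cluster (hypothetical fields)
    external: Dict         # statistics of noise objects outside all clusters


def build_profile(values: Sequence[float],
                  clusters: List[Tuple[float, float]]) -> Profile:
    internal = [{"range": c, "count": 0, "sum": 0.0} for c in clusters]
    noise = 0
    for v in values:
        for entry in internal:
            low, high = entry["range"]
            if low <= v < high:
                entry["count"] += 1
                entry["sum"] += v
                break
        else:
            # The value fell outside every cluster: count it as noise.
            noise += 1
    total = len(values)
    external = {"count": noise, "ratio": noise / total if total else 0.0}
    return Profile(internal, external)
```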
Anomaly detection method
Internal distance, ratio
Normalizing factor γ is a user-defined parameter that can control the effect of an internal difference.
External distance, ratio
Experimental results
Conclusion
An anomaly detection method based on a grid-based clustering algorithm is proposed.
For each feature, clusters can be effectively found without physically maintaining any data elements of an audit data stream.
A user's new activities are continuously reflected in the ongoing clustering results and in the profile of the user at the same time.
Comments
Advantage
The proposed method provides a solution for anomaly intrusion detection. It seems plausible to apply this method to detect anomalous activities in other fields.
Drawback
A cold start problem will occur when there is no manual supervision. That is, the system cannot distinguish normal clusters from abnormal clusters in the beginning.
Application
Dynamic data clustering for continuous data streams.