10/26/09, Wilfrid Laurier University

Temporal Relationship Among Clusters for Data Streams

Margaret H. Dunham, Michael Hahsler, Doug Raiford

Students: Yu Meng, Donya Quick, Jie Huang, Charlie Isaksson, Mallik Kotamarti

CSE Department

Southern Methodist University

Dallas, Texas 75275

[email protected]

This material is based upon work supported by the National Science Foundation under Grant No IIS-0948893.

Objectives/Outline

• Introduction
• Background
• TRAC-DS
• TRAC-DS Applications
• Conclusions/Future Work

Traditional clustering of data streams ignores one of the most salient features of streams: ordering.

Objectives/Outline

• Introduction
  • Stream Data
  • Motivation
• Background
• TRAC-DS
• TRAC-DS Applications
• Conclusions/Future Work

Stream Data

A growing number of applications generate streams of data:
• Computer network monitoring data
• Call detail records in telecommunications
• Highway transportation traffic data
• Online web purchase log records
• Sensor network data
• Stock exchange, transactions in retail chains, ATM operations in banks, credit card transactions

Clustering techniques play a key role in modeling and analyzing this data.

Stream Data Format

Events arrive in a stream. At any time, t, we can view the state of the problem as represented by a vector of n numeric values:

Vt = <S1t, S2t, ..., Snt>

        V1     V2     …     Vq
S1      S11    S12    …     S1q
S2      S21    S22    …     S2q
…       …      …      …     …
Sn      Sn1    Sn2    …     Snq

(Columns V1 … Vq are ordered by time.)

Data Stream Modeling

• Single pass: each record is examined at most once
• Bounded storage: limited memory for storing the synopsis
• Real-time: per-record processing time must be low
• Summarization (synopsis) of data
• Use the data, NOT a SAMPLE
• Temporal and spatial
• Dynamic
• Continuous (infinite stream)
• Learn
• Forget
• Sublinear growth rate - Clustering

Traditional Clustering

TRAC-DS

Motivation

Temporal ordering is a major feature of stream data.

Many stream applications depend on this ordering:
• Prediction of future values
• Anomaly (rare event) detection
• Concept drift

Objectives/Outline

• Introduction
• Background
  • Clustering Stream Data
  • Extensible Markov Model - EMM
• TRAC-DS
• TRAC-DS Applications
• Conclusions/Future Work

Stream Clustering Requirements

• Dynamic updating of the clusters
• Identify outliers
• Barbara [2]:
  • compactness
  • fast incremental processing

Stream Clustering Algorithms

• LOCALSEARCH [4]
  • Partitions the stream into segments
  • Clusters each segment individually by solving the k-medians problem
  • Iteratively reclusters the resulting centers
• CluStream [1]
  • Micro-clusters represented by summary statistics
  • Micro-clusters are handled online
  • Micro-clusters merged offline
• MONIC [13]
  • Evolution of clusters over time
  • Cluster transitions over time

MM

A first-order Markov chain is a finite or countably infinite sequence of events {E1, E2, …} over discrete time points, where Pij = P(Ej | Ei), and at any time the future behavior of the process depends solely on the current state.

A Markov Model (MM) is a graph with m vertices or states, S, and directed arcs, A, such that:
• S = {N1, N2, …, Nm}
• A = {Lij | i ∈ {1, 2, …, m}, j ∈ {1, 2, …, m}}
• Each arc Lij = <Ni, Nj> is labeled with a transition probability Pij = P(Nj | Ni).
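
As a rough illustration of the definition above, the transition probabilities Pij can be estimated from an observed state sequence by counting consecutive pairs. This is only a sketch: the function name `estimate_transitions` and the example sequence are hypothetical and do not come from the slides.

```python
from collections import defaultdict


def estimate_transitions(states):
    """Estimate P(Nj | Ni) by counting consecutive state pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    probs = {}
    for cur, row in counts.items():
        total = sum(row.values())  # all observed departures from `cur`
        probs[cur] = {nxt: c / total for nxt, c in row.items()}
    return probs


# Hypothetical state sequence, for illustration only.
sequence = ["N1", "N2", "N1", "N1", "N3", "N1"]
P = estimate_transitions(sequence)
```

Each row of `P` sums to one, because every count is divided by the number of observed departures from that state.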

Extensible Markov Model (EMM)

• Time-varying discrete first-order Markov model
• Nodes are clusters of real-world states.
• Learning continues during the application phase.
• Learning:
  • Transition probabilities between states (clusters)
  • State labels (cluster summary)
• States are modified as clusters are …

EMM for TRAC-DS Modeling

<18,10,3,3,1,0,0>
<17,10,2,3,1,0,0>
<16,9,2,3,1,0,0>
<14,8,2,3,1,0,0>
<14,8,2,3,0,0,0>
<18,10,3,3,1,1,0>

[Figure: successive EMM graphs over states N1, N2 and N3, with arcs labeled by transition probabilities such as 1/3, 2/3, 1/2 and 1/1.]

Objectives/Outline

• Introduction
• Background
• TRAC-DS
  • Definition
  • Relationship to Traditional Clustering
  • Operations
• TRAC-DS Applications
• Conclusions/Future Work

TRAC-DS NOTE

TRAC-DS is not:
• Another stream clustering algorithm

TRAC-DS is:
• A new way of looking at clustering
• Built on top of an existing clustering algorithm

TRAC-DS may be used with any stream clustering algorithm.

TRAC-DS Overview

Data Stream Clustering

At each point in time a data stream clustering ζ is a partitioning of D', the data seen thus far.

Instead of the whole partitions C1, C2,..., Ck only synopses Cc1,Cc2,...,Cck are available and k is allowed to change over time.

The summaries Cci with i =1, 2,...,k typically contain information about the size, distribution and location of the data points in Ci.

TRAC-DS Definition

Given a data stream clustering ζ, a temporal relationship among clusters (TRAC-DS) overlays ζ with an EMM M in such a way that the following are satisfied:

(1) There is a one-to-one correspondence between the clusters in ζ and the states S in M.

(2) A transition aij in the EMM M represents the probability that, given a data point in cluster i, the next data point in the data stream will belong to cluster j, with i, j = 1, 2, ..., k.

(3) The EMM M is created online together with the data stream clustering.

Clustering Operations

A clustering operation is a function q : ζ × x → ζ which is used by the data stream clustering algorithm to update the clustering ζ given some additional information x, which is either a new data point or other information (e.g., the number of the cluster to be deleted to simplify the clustering).

TRAC-DS Operations

A TRAC-DS operation is a function r : M × sc × y → M × sc that updates the temporal relationship among clusters represented by the EMM M with states S, given a current state sc ∈ S and additional information y, and returns an updated EMM and possibly a new current state.

In order to be able to dynamically update the EMM M we need to store a transition count matrix C. The count cij in C contains the number of times we observed a new point being assigned by the clustering algorithm to cluster i followed by a point being assigned to cluster j.
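
A minimal sketch of how the transition count matrix C and the current state sc might be maintained as the clustering algorithm reports assignments. The class name `TracDS`, its method names, and the dictionary-of-dictionaries layout are assumptions made for illustration; the slides do not prescribe a data structure.

```python
class TracDS:
    """Hypothetical TRAC-DS layer: transition counts between cluster ids."""

    def __init__(self):
        self.counts = {}     # counts[i][j] holds c_ij
        self.current = None  # current state s_c (None before the first point)

    def new_cluster(self, cluster_id):
        """Create a state for a new cluster if it does not exist yet."""
        self.counts.setdefault(cluster_id, {})

    def assign_point(self, cluster_id):
        """Record that the newest point was assigned to `cluster_id`."""
        self.new_cluster(cluster_id)  # assumption: unseen ids get a state
        if self.current is not None:
            row = self.counts[self.current]
            row[cluster_id] = row.get(cluster_id, 0) + 1
        self.current = cluster_id

    def transition_probability(self, i, j):
        """Estimate a_ij as c_ij divided by the row total of state i."""
        row = self.counts.get(i, {})
        total = sum(row.values())
        return row.get(j, 0) / total if total else 0.0


# Hypothetical sequence of cluster assignments from any stream clusterer.
model = TracDS()
for cluster_id in [1, 2, 1, 2, 3, 1]:
    model.assign_point(cluster_id)
```

Storing counts rather than probabilities keeps each update constant-time; the probabilities are derived only when they are needed.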

Stream Clustering Operations *

• q_assign_point(ζ, x): Assigns the new data point x to an existing cluster.
• q_new_cluster(ζ, x): Creates a new cluster.
• q_remove_cluster(ζ, x): Removes a cluster. Here x is the cluster, i, to be removed. In this case the associated summary Cci is removed from ζ and k is decremented by one.
• q_merge_clusters(ζ, x): Merges two clusters.
• q_fade_clusters(ζ, x): Fades the cluster structure.
• q_split_clusters(ζ, x): Splits a cluster.

* Inspired by MONIC [13]

TRAC-DS Operations

• r_assign_point(M, sc, y): Assigns the new data point to the state representing an existing cluster.
• r_new_cluster(M, sc, y): Creates a state for a new cluster.
• r_remove_cluster(M, sc, y): Removes a state.
• r_merge_clusters(M, sc, y): Merges two states.
• r_fade_clusters(M, sc, y): Fades the transition probabilities using an exponential decay f(t) = 2^(−λt).
• r_split_clusters(M, sc, y): Splits states.

Y clustering operations.
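
The fade, remove and merge operations listed above can be sketched on a transition count matrix as follows. This is an illustrative reading, not the authors' implementation: the function names, the dictionary layout, the choice to fade the stored counts by 2^(−λt), and the choice to merge by summing rows and columns are all assumptions.

```python
def fade(counts, lam, t=1.0):
    """Scale every count by the decay factor 2 ** (-lam * t)."""
    factor = 2.0 ** (-lam * t)
    return {i: {j: c * factor for j, c in row.items()}
            for i, row in counts.items()}


def remove_state(counts, s):
    """Drop state s together with all arcs into and out of it."""
    return {i: {j: c for j, c in row.items() if j != s}
            for i, row in counts.items() if i != s}


def merge_states(counts, a, b, merged):
    """Merge states a and b into `merged` by summing their arcs."""
    def relabel(s):
        return merged if s in (a, b) else s

    out = {}
    for i, row in counts.items():
        target = out.setdefault(relabel(i), {})
        for j, c in row.items():
            target[relabel(j)] = target.get(relabel(j), 0) + c
    return out


# Hypothetical count matrix: counts[i][j] is the number of i -> j transitions.
counts = {1: {2: 4, 3: 2}, 2: {1: 3}, 3: {1: 1, 2: 1}}
faded = fade(counts, lam=1.0)            # every count is halved
pruned = remove_state(counts, 3)         # state 3 and its arcs disappear
merged = merge_states(counts, 2, 3, 2)   # 3 -> 2 becomes a self-loop on 2
```

All three functions return a new matrix and leave the input unchanged, which keeps the example easy to follow at the cost of some copying.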

TRAC-DS Example

TRAC-DS Advantages

• Dynamic
• Flexible
  • Use any clustering algorithm
  • Supports any clustering operations
• Scalable
• Merges clustering and Markov modeling

Objectives/Outline

• Introduction
• Background
• TRAC-DS
• TRAC-DS Applications
  • Anomaly Detection
  • Bioinformatics
• Conclusions/Future Work

What is Anomaly in Stream Data?

• Rare, anomalous, surprising
• Out of the ordinary
• Not outlier detection
  • No knowledge of data distribution
  • Data is not static
  • Must take temporal and spatial values into account
  • May be interested in a sequence of events
• Ex: Snow in upstate New York is not an anomaly; snow in upstate New York in June is rare
• Rare events may change over time

TRAC-DS Approach to Detect Anomalies

• By learning what is normal, the model can predict what is not.
• Normal is based on likelihood of occurrence.
• Use TRAC-DS to build clusters and behavior between clusters.
• We view a rare event as:
  • An unusual event
  • A transition between event states which does not frequently occur
• Continue learning

Determining Rare

Occurrence Frequency (OFi) of an EMM state Si is the normalized count of the state:

$$OF_i = \frac{n_i}{\sum_i n_i}$$

Normalized Transition Probability (NTPmn), from one state, Sm, to another, Sn, is a normalized transition count:

$$NTP_{m,n} = \frac{C_{m,n}}{\sum_i n_i}$$
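
The two measures can be computed directly from stored counts. The sketch below assumes that the state counts and the transition counts are kept in plain dictionaries and that both measures are normalized by the total of the state counts; the function names and the example numbers are hypothetical.

```python
def occurrence_frequency(state_counts):
    """OF of each state: its count divided by the sum of all state counts."""
    total = sum(state_counts.values())
    return {s: n / total for s, n in state_counts.items()}


def normalized_transition_probability(transition_counts, state_counts):
    """NTP of each arc: its count divided by the sum of all state counts."""
    total = sum(state_counts.values())
    return {arc: c / total for arc, c in transition_counts.items()}


# Hypothetical counts, for illustration only.
state_counts = {"S1": 6, "S2": 3, "S3": 1}
transition_counts = {("S1", "S2"): 3, ("S2", "S3"): 1}

of = occurrence_frequency(state_counts)
ntp = normalized_transition_probability(transition_counts, state_counts)
```

A state or transition whose value falls below a chosen threshold could then be flagged as rare; where to set that threshold depends on the application.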

Datasets/Anomalies

• MnDot (Minnesota Department of Transportation)
  • Automobile accident
• Ouse and Serwent (river flow data from England)
  • Flood
  • Drought
• KDD Cup 1999 & 2000 (http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html)
  • Intrusion
• Cisco VoIP (VoIP traffic data obtained at Cisco)
  • Unusual phone call

EMM Sublinear Growth

Servent Data

TRAC-DS River Prediction

[Figure: water level (m), 0 to 8, against the input time series (points 1 to 659), comparing RLF prediction, EMM prediction, and observed values.]

TRAC-DS Rare Event Detection

[Figure: Minnesota DOT traffic data, weekdays vs. weekend.]

Detected unusual weekend traffic pattern

TRAC-DS Intrusion Detection

• DARPA 1999/2000 synthetic dataset, MIT Lincoln Lab
• The DARPA 1999 dataset, which is free of attacks for two weeks (1st week and 3rd week), is used as training data.
• The DARPA 2000 dataset, which contains DDoS attacks, is used as test data.

TRAC-DS Intrusion Detection

DARPA 1999 and 2000

Threshold   Detection Rate   False Positive Rate
0.9         6%               94%
0.8         20%              80%
0.7         50%              50%
0.6         100%             0%

Table 8. EMM detection and false positive rates.

TRAC-DS & Bioinformatics

• Analysis of DNA/RNA sequences
• Applications:
  • Classification
  • Differentiation
• 16S RNA
  • 1542 nt rRNA
  • Highly conserved across species
• miRNA
  • Short (20-25 nt) sequence of noncoding RNA
  • Known since 1993 but significance not widely appreciated until 2001
  • Impact / prevent translation of mRNA

First – Convert Sequence to NSV

acgtgcacgtaactgattccggaaccaaatgtgcccacgtcga

Moving Window

            A   C   G   T
Pos 0-8     2   3   3   1
Pos 1-9     1   3   3   2
…
Pos 34-42   2   4   2   1
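
The moving-window conversion can be reproduced in a few lines. The sketch assumes a window of 9 positions that advances one base at a time, which is what the table rows (Pos 0-8, Pos 1-9, ..., Pos 34-42) suggest; the function name `sequence_to_nsvs` is hypothetical.

```python
def sequence_to_nsvs(seq, window=9, alphabet="acgt"):
    """Count each base in every length-`window` slice of the sequence."""
    seq = seq.lower()
    return [[seq[i:i + window].count(base) for base in alphabet]
            for i in range(len(seq) - window + 1)]


seq = "acgtgcacgtaactgattccggaaccaaatgtgcccacgtcga"
nsvs = sequence_to_nsvs(seq)

print(nsvs[0])   # counts of A, C, G, T for positions 0-8
print(nsvs[1])   # counts for positions 1-9
print(nsvs[-1])  # counts for positions 34-42
```

Each resulting vector is one stream event, so a sequence of 43 bases yields 35 vectors to feed into the clustering step.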

Next – Apply TRAC-DS

TRAC-DS Prediction with miRNA

• Positive data model
• Cutoff probability = 0.3
• False positive rate = 0%
• True positive rate = 66%

Test results could be improved by meta classifiers combining multiple positive and negative classifiers together.

Profile EMMs

• Examples of three different Profile EMMs constructed for 16S data from 3 different bacteria families

Profile EMMs for Organism Classification

16S Classification Accuracy

Classification accuracy using different scoring metrics on 16S rRNA data from NCBI.

We learned 31 classification models (at the phylogenetic class level) from 98 organisms and tested with 23 randomly chosen organisms.

The Profile EMM approach was able to achieve classification accuracy of more than 90% after tuning the resolution settings.

TRAC-DS and Bioinformatics

• Efficient
  • Alignment-free sequence analysis
  • Clustering reduces size of model
• Flexible
  • Any sequence
  • Applicability to metagenomics
• Scoring based on similarity between EMMs or between an EMM and an input sequence
• Applications
  • Classification
  • Differentiation

Objectives/Outline

• Introduction
• Background
• TRAC-DS
• TRAC-DS Applications
• Conclusions/Future Work

TRAC-DS Ongoing/Future

• Create online tool suite
• Improve TRAC algorithms:
  • Aging
  • Delete state
  • Merge states
  • Split states
• Apply to
  • Image recognition
  • Bioinformatics
• Build Profile EMM database of NCBI 16S bacteria data
• Perform classification using metagenomic data collected from Yellowstone National Park

Bibliography

1) C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A framework for clustering evolving data streams. Proceedings of the International Conference on Very Large Data Bases (VLDB), pp 81-92, 2003.

2) D. Barbara, “Requirements for clustering data streams,” SIGKDD Explorations, Vol 3, No 2, pp 23-27, 2002.

3) Margaret H. Dunham, Donya Quick, Yuhang Wang, Monnie McGee, Jim Waddle, “Visualization of DNA/RNA Structure using Temporal CGRs,” Proceedings of the IEEE 6th Symposium on Bioinformatics & Bioengineering (BIBE06), October 16-18, 2006, Washington D.C., pp 171-178.

4) S. Guha, A. Meyerson, N. Mishra, R. Motwani, and L. O'Callaghan, “Clustering data streams: Theory and practice,” IEEE Transactions on Knowledge and Data Engineering, Vol 15, No 3, pp 515-528, 2003.

5) Michael Hahsler and Margaret H. Dunham, “TRACDS: Temporal Relationship Among Clusters for Data Streams,” October 2009, submitted to SIAM International Conference on Data Mining.

6) Jie Huang, Yu Meng, and Margaret H. Dunham, “Extensible Markov Model,” Proceedings IEEE ICDM Conference, November 2004, pp 371-374.

7) Charlie Isaksson, Yu Meng, and Margaret H. Dunham, “Risk Leveling of Network Traffic Anomalies,” International Journal of Computer Science and Network Security, Vol 6, No 6, June 2006, pp 258-265.

8) Charlie Isaksson and Margaret H. Dunham, “A Comparative Study of Outlier Detection,” July 2009, Proceedings of the IEEE MLDM Conference, pp 440-453.

9) Mallik Kotamarti, Douglas W. Raiford, M. L. Raymer, and Margaret H. Dunham, “A Data Mining Approach to Predicting Phylum for Microbial Organisms Using Genome-Wide Sequence Data,” Proceedings of the IEEE Ninth International Conference on Bioinformatics and Bioengineering, pp 161-167, June 22-24 2009.

10) Yu Meng and Margaret H. Dunham, “Efficient Mining of Emerging Events in a Dynamic Spatiotemporal,” Proceedings of the IEEE PAKDD Conference, April 2006, Singapore. (Also in Lecture Notes in Computer Science, Vol 3918, 2006, Springer Berlin/Heidelberg, pp 750-754.)

11) Yu Meng and Margaret H. Dunham, “Mining Developing Trends of Dynamic Spatiotemporal Data Streams,” Journal of Computers, Vol 1, No 3, June 2006, pp 43-50.

12) MIT Lincoln Laboratory.: DARPA Intrusion Detection Evaluation. http://www.ll.mit.edu/mission/communications/ist/corpora/ideval/index.html, (2008)

13) M. Spiliopoulou, I. Ntoutsi, Y. Theodoridis, and R. Schult. MONIC: Modeling and monitoring cluster transitions. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Philadelphia, PA, USA, pages 706–711, 2006.