dot plots for time series analysis

Dot Plots For Time Series Analysis. Dragomir Yankov, Eamonn Keogh, Stefano Lonardi (Dept. of Computer Science & Eng., University of California Riverside); Ada Waichee Fu (Dept. of Computer Science & Eng., The Chinese University of Hong Kong)

Upload: soyala

Post on 11-Feb-2016


TRANSCRIPT

Page 1: Dot Plots For Time Series Analysis

Dot Plots For Time Series Analysis

Dragomir Yankov, Eamonn Keogh, Stefano Lonardi
Dept. of Computer Science & Eng.
University of California Riverside

Ada Waichee Fu
Dept. of Computer Science & Eng.
The Chinese University of Hong Kong

Page 2: Dot Plots For Time Series Analysis

Sequence analysis with dot plots

[Figure: dot plot of short nucleotide sequences; residual axis labels: tagtaa, t g t a g]

• Introduced by Gibbs & McIntyre (1970)

• Observed patterns
– Matches (homologies)
– Reverses
– Gaps (differences or mutations)
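A minimal sketch of the classic symbolic dot plot described above: a dot is placed at (i, j) whenever the i-th symbol of one sequence equals the j-th symbol of the other. The two strings are taken from the residual axis labels of the slide, and treating them as the two plotted sequences is an assumption. The function names `dot_plot` and `render` are illustrative only.

```python
def dot_plot(s1, s2):
    """Return a matrix with True at (i, j) iff s1[i] == s2[j]."""
    return [[a == b for b in s2] for a in s1]


def render(matrix, s1, s2):
    """Draw the matrix as text: '*' marks a match, '.' a mismatch."""
    lines = ["  " + " ".join(s2)]
    for ch, row in zip(s1, matrix):
        lines.append(ch + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Assumption: these two strings are the sequences on the slide's axes.
seq1 = "tagtaa"
seq2 = "tgtag"
plot = dot_plot(seq1, seq2)
print(render(plot, seq1, seq2))
```

Runs of dots along a diagonal correspond to matching substrings (here "gta" in both sequences); breaks in a diagonal correspond to gaps.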

Page 3: Dot Plots For Time Series Analysis

Dot Plots For Time Series Analysis

• Problem statement: How can we meaningfully adapt dot plot (DP) analysis to real-valued data?

• The DP method would ideally be:
– Robust to noise
– Invariant to value and time shifts
– Invariant to a certain amount of time warping
– Efficiently computable

Page 4: Dot Plots For Time Series Analysis

Related work

Recurrence plots (Eckmann et al., 1987)

- Provide an intuitive 2D view of multidimensional dynamical systems

- The matrix is computed with the Heaviside function H, where r(x_i) is the threshold radius around point x_i:

$$M_{ij} = H\big(r(x_i) - \lVert x_i - x_j \rVert\big), \quad i, j = 1, \ldots, n$$

Problem with recurrence plots: matches are local (point) based rather than subsequence based
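The recurrence matrix can be sketched in a few lines. This is an illustration under stated assumptions, not the original authors' code: the radius is passed as a function so it can depend on x_i, H(0) is taken to be 1, and a simple time-delay embedding (`embed`, a hypothetical helper) is used to turn a scalar series into points.

```python
import math


def heaviside(v):
    # Assumption: H(0) = 1, so every point recurs with itself.
    return 1 if v >= 0 else 0


def embed(series, dim, delay=1):
    """Time-delay embedding of a scalar series into dim-dimensional points."""
    span = (dim - 1) * delay
    return [tuple(series[t + k * delay] for k in range(dim))
            for t in range(len(series) - span)]


def recurrence_matrix(points, radius):
    """M[i][j] = H(radius(x_i) - ||x_i - x_j||) with the Euclidean norm."""
    n = len(points)
    return [[heaviside(radius(points[i]) - math.dist(points[i], points[j]))
             for j in range(n)]
            for i in range(n)]


# A sine wave with period 8: points one period apart should recur.
series = [math.sin(2 * math.pi * t / 8) for t in range(32)]
points = embed(series, dim=2)
M = recurrence_matrix(points, radius=lambda x: 0.1)  # constant radius (assumption)
```

Each entry compares two individual points, which is the locality the slide identifies as the limitation of recurrence plots.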

Page 5: Dot Plots For Time Series Analysis

The proposed solution

• Reducing the dot plot procedure to the motif finding problem
• Applying the Random Projection algorithm for finding motifs in time series data (Chiu et al., 2003)
• Presegmenting the series to achieve time warping invariance

It satisfies the initial requirements of robustness to outliers and invariance to time and value shifts.

Page 6: Dot Plots For Time Series Analysis

Dot plots and motif finding

• Def: match, trivial match, motif
- If D(P, Q) <= R, we say that Q is a match of P
- If D(P, Q) <= R and D(P, Q1) <= R, we say that Q1 is a trivial match of P
- A non-trivial match is a motif

• Def: Time series dot plot – a plot that contains a point at position (i, j) iff TS1(i) and TS2(j) represent the same motif
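To make the definition concrete, here is a brute-force sketch that places a point at (i, j) when the subsequences starting at TS1(i) and TS2(j) are within distance R. Several choices are assumptions for illustration: D is the Euclidean distance, subsequences are z-normalized to obtain value-shift invariance, and trivial matches are not filtered out. This quadratic comparison is a baseline, not the Random Projection approach the slides propose.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a flat subsequence maps to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(v - mean) / std for v in seq]


def subsequences(ts, length):
    """All z-normalized sliding-window subsequences of the given length."""
    return [znorm(ts[i:i + length]) for i in range(len(ts) - length + 1)]


def brute_force_dot_plot(ts1, ts2, length, R):
    """Set of (i, j) with D(TS1 subsequence at i, TS2 subsequence at j) <= R."""
    subs1 = subsequences(ts1, length)
    subs2 = subsequences(ts2, length)
    return {(i, j)
            for i, p in enumerate(subs1)
            for j, q in enumerate(subs2)
            if math.dist(p, q) <= R}


# The same bump appears in both series at different offsets and value levels.
ts1 = [0, 1, 2, 1, 0, 0, 0, 0, 0, 0]
ts2 = [5, 5, 5, 5, 5, 6, 7, 6, 5, 5]
dots = brute_force_dot_plot(ts1, ts2, length=5, R=0.5)
```

The bump at offset 0 in `ts1` matches the bump at offset 4 in `ts2` despite the shift in value level.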

Page 7: Dot Plots For Time Series Analysis

The Random Projection algorithm

• Based on PROJECTION (Buhler & Tompa 2002)

• Algorithm outline
– Split the TS into subsequences and symbolize them
– Separate the symbolic sequences into equivalence classes using PROJECTION
– Mark as motifs sequences from the same equivalence class

Page 8: Dot Plots For Time Series Analysis

Random Projection – symbolization

- Applies PAA (Piecewise Aggregate Approximation)

Input TS: $P = p_1 p_2 \ldots p_n$

PAA TS: $\bar{P} = \bar{p}_1 \bar{p}_2 \ldots \bar{p}_w$

$$\bar{p}_i = \frac{w}{n} \sum_{j=\frac{n}{w}(i-1)+1}^{\frac{n}{w} i} p_j$$

- Assigns letters to the PAA segments, using the Symbolic Aggregate Approximation (SAX) scheme
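The symbolization step can be sketched as follows. PAA averages n/w consecutive points per segment, and SAX maps each average to a letter using breakpoints that split the standard normal distribution into equiprobable regions. The function names, the three-letter alphabet, and the requirement that w divides n are simplifying assumptions made for this sketch.

```python
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Z-normalize the series; a constant series maps to all zeros."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(v - mu) / sigma for v in series]


def paa(series, w):
    """Piecewise Aggregate Approximation: w segment means."""
    n = len(series)
    if n % w != 0:
        raise ValueError("this sketch assumes w divides n")
    size = n // w
    return [mean(series[size * i: size * (i + 1)]) for i in range(w)]


def sax_breakpoints(alphabet_size):
    """Cut points giving equiprobable regions under N(0, 1)."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(series, w, alphabet="abc"):
    """Z-normalize, reduce with PAA, then assign one letter per segment."""
    cuts = sax_breakpoints(len(alphabet))
    return "".join(alphabet[sum(v > c for c in cuts)]
                   for v in paa(znorm(series), w))


series = [1, 1, 1, 1, 5, 5, 5, 5, 9, 9, 9, 9]
print(paa(series, 3))       # segment means of the raw series
print(sax_word(series, 3))  # one letter per PAA segment
```

Z-normalizing before PAA is what gives the symbolic words their invariance to value shifts.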

Page 9: Dot Plots For Time Series Analysis

Random Projection – motif finding

- The symbolic representations of the plotted time series are stored in tables

- d random dimensions are masked and the strings are divided into separate bins

Page 10: Dot Plots For Time Series Analysis

Random Projection – motif finding

- Updating the dot plot collision matrix

- The update is performed for m iterations
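The masking, binning, and collision-matrix update can be sketched together. In each of m iterations, d randomly chosen positions are masked, words that agree on the remaining positions fall into the same bin, and every pair (i, j) sharing a bin has its collision count incremented. The sparse dictionary storage, the fixed seed, and the final count threshold are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def random_projection_collisions(words1, words2, d, m, seed=0):
    """Sparse collision matrix {(i, j): count} between two lists of SAX words."""
    rng = random.Random(seed)
    word_len = len(words1[0])
    collisions = Counter()
    for _ in range(m):
        # Mask d random positions; the remaining ones form the bin key.
        masked = set(rng.sample(range(word_len), d))
        keep = [k for k in range(word_len) if k not in masked]
        bins = defaultdict(list)
        for i, word in enumerate(words1):
            bins["".join(word[k] for k in keep)].append(i)
        for j, word in enumerate(words2):
            key = "".join(word[k] for k in keep)
            for i in bins.get(key, ()):
                collisions[(i, j)] += 1
    return collisions


words1 = ["abca", "ccba", "abba"]
words2 = ["abca", "bbbb", "abaa"]
counts = random_projection_collisions(words1, words2, d=1, m=10)

# Hypothetical threshold: keep pairs that collided in at least half the iterations.
dots = {pos for pos, c in counts.items() if c >= 5}
```

Identical words collide in every iteration, words differing in one position collide only when that position is masked, and words differing in more than d positions never collide. Only pairs that collide at least once are stored, which keeps the matrix sparse.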

Page 11: Dot Plots For Time Series Analysis

Random Projection for streaming

• Complexity: space – O(|M|), time – O(m|M|)
– For practical data sets M is “very sparse”
– For time series data, small values of m (on the order of 10) generate highly descriptive plots

• Random Projection as an online algorithm
– Good time performance
– Updatability

Page 12: Dot Plots For Time Series Analysis

Experimental evaluation
Dot Plots for anomaly detection

Recurrent data with variable state length

- The anomaly is of the same type: A

- Small time warpings (shifts) are detected: B

- Larger time warpings are omitted: C

Page 13: Dot Plots For Time Series Analysis

Experimental evaluation
Dot Plots for anomaly detection

Recurrent data with fixed state length

Page 14: Dot Plots For Time Series Analysis

Experimental evaluation
Dot Plots for pattern detection

Stock market data

Page 15: Dot Plots For Time Series Analysis

Experimental evaluation
Dot Plots for pattern detection

Audio data

Page 16: Dot Plots For Time Series Analysis

Experimental evaluation
Dot Plots for pattern detection

Discrete data: for some tasks, obtaining a real-valued representation is beneficial

[Figure panels: MUMmer, Random Projection]

Page 17: Dot Plots For Time Series Analysis

Dynamic sliding window

• The fixed window does not perform well when:
– The size of the recurrent states varies
– We do not “guess” the size of the states correctly

• Solution: use time series segmentation heuristics and a dynamic sliding window
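The slides do not say which segmentation heuristic is used, so the following is only one possible illustration: a generic sliding-window piecewise-linear segmentation, in which a segment grows until the series deviates from the straight line through the segment's endpoints by more than `max_error`. The resulting segment boundaries could then serve as variable-length windows. All names and the error criterion are assumptions.

```python
def max_deviation(series, start, end):
    """Largest vertical distance between series[start..end] and the
    straight line joining its two endpoints."""
    span = end - start
    y0, y1 = series[start], series[end]
    return max(abs(series[k] - (y0 + (y1 - y0) * (k - start) / span))
               for k in range(start, end + 1))


def sliding_window_segments(series, max_error):
    """Greedy segmentation into (start, end) index pairs; adjacent
    segments share their boundary point."""
    segments = []
    n = len(series)
    start = 0
    while start < n - 1:
        end = start + 1
        # Extend the segment while a single line still fits well enough.
        while end + 1 < n and max_deviation(series, start, end + 1) <= max_error:
            end += 1
        segments.append((start, end))
        start = end
    return segments


# A rise, a plateau, and a fall.
series = [0, 1, 2, 3, 3, 3, 3, 2, 1, 0]
segments = sliding_window_segments(series, max_error=0.1)
print(segments)
```

Because window lengths follow the data, states of different duration each get their own window instead of being cut at a fixed, guessed size.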

Page 18: Dot Plots For Time Series Analysis

Dynamic sliding window
Comparison of the dynamic and fixed sliding windows

The dynamic sliding window preserves more information about the frequency variability

[Figure panels: Synthetic data set, Tide data set]

Page 19: Dot Plots For Time Series Analysis

Conclusion

• This work studies the problem of building dot plots for real-valued time series data
• It demonstrates its equivalence to the motif finding problem
• An efficient and robust approach for building the dot plots is introduced
• The performance of the tool is evaluated empirically on a number of data sets with different characteristics
• Finally, a dynamic sliding window technique is proposed, which improves the quality and the descriptiveness of the plots