1 dot plots for time series analysis dragomir yankov, eamonn keogh, stefano lonardi dept. of...
Post on 22-Dec-2015
Dot Plots For Time Series Analysis
Dragomir Yankov, Eamonn Keogh, Stefano Lonardi
Dept. of Computer Science & Eng.
University of California Riverside
Ada Waichee Fu
Dept. of Computer Science & Eng.
The Chinese University of Hong Kong
Sequence analysis with dot plots
[Figure: dot plot of the sequence "tagta" (vertical axis) against "atgtag" (horizontal axis)]
• Introduced by Gibbs & McIntyre (1970)
• Observed patterns
– Matches (homologies)
– Reverses
– Gaps (differences or mutations)
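A minimal sketch of a classic dot plot for discrete sequences, assuming the two example sequences are "tagta" and "atgtag" as read off the slide. The function names `dot_plot` and `render` are illustrative, not from the original work. A cell is marked whenever the two symbols are equal, so matches show up as diagonal runs.

```python
def dot_plot(seq_a, seq_b):
    """Return a 0/1 matrix with 1 at [i][j] when seq_a[i] == seq_b[j]."""
    return [[1 if a == b else 0 for b in seq_b] for a in seq_a]


def render(seq_a, seq_b):
    """Draw the matrix as text: '*' for a match, '.' otherwise."""
    matrix = dot_plot(seq_a, seq_b)
    lines = ["  " + " ".join(seq_b)]
    for symbol, row in zip(seq_a, matrix):
        cells = " ".join("*" if cell else "." for cell in row)
        lines.append(symbol + " " + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    # Sequences assumed from the slide's example.
    print(render("tagta", "atgtag"))
```

Exact symbol equality is what makes this definition unsuitable for real-valued data, where two samples are almost never identical.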
Dot Plots For Time Series Analysis
• Problem statement: How can we meaningfully adapt dot plot (DP) analysis to real-valued data?
• The DP method would ideally be:
– Robust to noise
– Invariant to value and time shifts
– Invariant to a certain amount of time warping
– Efficiently computable
Related work
Recurrence plots (Eckmann et al., 1987)
- Provide an intuitive 2D view of multidimensional dynamical systems
- The matrix is computed with the Heaviside function $H$:

$$M_{ij} = H\big(r(x_i) - \lVert x_i - x_j \rVert\big), \quad i, j = 1, \ldots, n$$

where $r(x_i)$ is the neighborhood radius around point $x_i$.

Problem with recurrence plots
Matches are local (point-based) rather than subsequence-based
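The recurrence matrix can be sketched in a few lines. This is an illustrative brute-force version, not the authors' code: the name `recurrence_matrix` and the list `radii` (holding one radius r(x_i) per point) are assumptions, and the Heaviside function is taken to be 1 at zero.

```python
import math


def recurrence_matrix(points, radii):
    """M[i][j] = H(radii[i] - ||x_i - x_j||), with H(0) taken as 1.

    points: list of equal-length coordinate tuples (the states x_i).
    radii:  one threshold radius per point (assumed to be given).
    """
    n = len(points)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            distance = math.dist(points[i], points[j])
            matrix[i][j] = 1 if distance <= radii[i] else 0
    return matrix


if __name__ == "__main__":
    states = [(0.0,), (0.1,), (1.0,)]
    for row in recurrence_matrix(states, [0.2, 0.2, 0.2]):
        print(row)
```

Each entry compares two individual points, which illustrates why matches in a recurrence plot are point-based rather than subsequence-based.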
The proposed solution
• Reducing the dot plot procedure to the motif finding problem
• Applying the Random Projection algorithm for finding motifs in time series data (Chiu et al., 2003)
• Presegmenting the series to achieve time warping invariance
The approach satisfies the initial requirements of robustness to outliers and invariance to time and value shifts
Dot plots and motif finding
• Def: match, trivial match, motif
- If D(P, Q) ≤ R, we say that Q is a match of P
- If D(P, Q) ≤ R and D(P, Q1) ≤ R, we say that Q1 is a trivial match of P
- A non-trivial match is a motif
• Def: Time series dot plot – a plot that contains a point at position (i,j) iff TS1(i) and TS2(j) represent the same motif
The Random Projection algorithm
• Based on PROJECTION (Buhler & Tompa 2002)
• Algorithm outline
– Split the TS into subsequences and symbolize them
– Separate the symbolic sequences into equivalence classes using PROJECTION
– Mark sequences from the same equivalence class as motifs
Random Projection – symbolization
- Applies PAA (Piecewise Aggregate Approximation)

Input TS: $P = p_1 p_2 \ldots p_n$
PAA TS: $\bar{P} = \bar{p}_1 \bar{p}_2 \ldots \bar{p}_w$

$$\bar{p}_i = \frac{w}{n} \sum_{j = \frac{n}{w}(i-1)+1}^{\frac{n}{w} i} p_j$$

- Assigns letters to the PAA segments, using the Symbolic Aggregate Approximation (SAX) scheme
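The symbolization step can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes n is a multiple of w, z-normalizes the subsequence first, and picks SAX breakpoints as equiprobable quantiles of the standard normal distribution. The function names are illustrative.

```python
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Shift to zero mean and scale to unit variance (flat series -> zeros)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(value - mu) / sigma for value in series]


def paa(series, w):
    """Average each of w equal-length segments (assumes len(series) % w == 0)."""
    n = len(series)
    if n % w != 0:
        raise ValueError("this sketch assumes n is a multiple of w")
    size = n // w
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(w)]


def sax_word(series, w, alphabet="abc"):
    """Map a subsequence to a word of w letters."""
    a = len(alphabet)
    # Breakpoints split the standard normal into a equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    word = []
    for value in paa(znormalize(series), w):
        index = sum(1 for b in breakpoints if value >= b)
        word.append(alphabet[index])
    return "".join(word)


if __name__ == "__main__":
    print(paa([1, 2, 3, 4, 5, 6], 3))
    print(sax_word(list(range(8)), 4, "abcd"))
```

The z-normalization is what gives the words their invariance to value shifts and scaling; the PAA averaging smooths out noise within each segment.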
Random Projection – motif finding
- The symbolic representations of the plotted time series are stored into tables
- d random dimensions are masked and the strings are divided into separate bins
Random Projection – motif finding
- Updating the dot plot collision matrix
- The update is performed for m iterations.
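The masking, binning, and collision-matrix update can be sketched like this. It is a hedged illustration, not the original code: `collision_matrix`, `project`, and the `seed` parameter are assumed names, the matrix is stored sparsely as a dictionary keyed by (i, j), and all words are assumed to have the same length.

```python
import random
from collections import defaultdict


def project(word, kept_positions):
    """Keep only the unmasked positions of a symbolic word."""
    return "".join(word[k] for k in kept_positions)


def collision_matrix(words_a, words_b, mask_size, iterations, seed=0):
    """Count how often word i of series A and word j of series B share a bin.

    In each iteration, mask_size random positions are masked and the
    words are binned by their remaining symbols.
    """
    rng = random.Random(seed)
    word_length = len(words_a[0])
    counts = defaultdict(int)  # sparse matrix: (i, j) -> number of collisions
    for _ in range(iterations):
        masked = set(rng.sample(range(word_length), mask_size))
        kept = [k for k in range(word_length) if k not in masked]
        bins = defaultdict(list)
        for i, word in enumerate(words_a):
            bins[project(word, kept)].append(i)
        for j, word in enumerate(words_b):
            for i in bins.get(project(word, kept), ()):
                counts[(i, j)] += 1
    return dict(counts)


if __name__ == "__main__":
    print(collision_matrix(["abca", "ccbb"], ["abca", "bbbb"], 1, 5))
```

Cells with high counts correspond to subsequence pairs that agree on most symbols, so thresholding the counts is one plausible way to decide which points to draw in the dot plot.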
Random Projection for streaming
• Complexity: space – O(|M|), time – O(m|M|)
– For practical data sets M is “very sparse”
– For time series data, small values of m (on the order of 10) generate highly descriptive plots
• Random Projection as an online algorithm
– Good time performance
– Updatability
Experimental evaluation
Recurrent data with variable state length
- The anomaly is of the same type: A
- Small time warpings (shifts) are detected: B
- Larger time warpings are omitted: C
Dot Plots for anomaly detection
Experimental evaluation
Recurrent data with fixed state length
Dot Plots for anomaly detection
Experimental evaluation
Dot Plots for pattern detection
Stock market data
Experimental evaluation
Dot Plots for pattern detection
Audio data
Experimental evaluation
Dot Plots for pattern detection
Discrete data: for some tasks, obtaining a real-valued representation is beneficial
MUMmer
Random Projection
Dynamic sliding window
• The fixed window does not perform well when:
– The size of the recurrent states varies
– We do not “guess” the size of the states correctly
• Solution: use time series segmentation heuristics and a dynamic sliding window
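The slides do not say which segmentation heuristic is used, so the following is only a hypothetical illustration of the idea: window boundaries are placed where the series crosses its mean upward, so that each window spans one recurrent state regardless of its length. Both function names and the mean-crossing rule are assumptions made for this sketch.

```python
def upward_mean_crossings(series):
    """Indices k where the series rises through its mean (hypothetical heuristic)."""
    mu = sum(series) / len(series)
    return [k for k in range(1, len(series)) if series[k - 1] < mu <= series[k]]


def dynamic_windows(series):
    """Cut the series into variable-length windows between consecutive crossings."""
    cuts = upward_mean_crossings(series)
    return [series[start:end] for start, end in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    data = [-1, 1, -1, -1, 1, -1, -1, -1, 1]
    for window in dynamic_windows(data):
        print(len(window), window)
```

Because PAA reduces every window to the same number of segments, windows of different lengths would still map to words of equal length, which is what allows a small amount of time warping to be absorbed.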
Dynamic sliding window
Comparison of the dynamic and fixed sliding windows
The dynamic sliding window preserves more information about the frequency variability
[Figure panels: synthetic data set; tide data set]
Conclusion
• This work studies the problem of building dot plots for real-valued time series data
• It demonstrates the equivalence of this problem to the motif finding problem
• It introduces an efficient and robust approach for building the dot plots
• The performance of the tool is evaluated empirically on a number of data sets with different characteristics
• Finally, a dynamic sliding window technique is proposed, which improves the quality and the descriptiveness of the plots