Post on 21-Dec-2015
![Page 1: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/1.jpg)
1
LOCATING PATTERNS IN DISCRETE TIME-SERIES
Kevin B. Pratt
Committee:
Eugene Fink
Dmitry Goldgof
Rafael Perez
![Page 2: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/2.jpg)
2
General problem
Retrieval of time-series similar to a given pattern.
![Page 3: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/3.jpg)
3
Example: Stock charts
Database of time-series
![Page 4: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/4.jpg)
4
Example: Stock charts
Database of time-series
Pattern
![Page 5: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/5.jpg)
5
Example: Stock charts
Database of time-series
Pattern
Retrieval results
![Page 6: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/6.jpg)
6
Example: Stock charts
Database of time-series
Pattern
Retrieval results (similarity scores): 0.92, 0.87, 0.86, 0.84
![Page 7: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/7.jpg)
7
Example: Electrocardiogram
Database of time-series
![Page 8: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/8.jpg)
8
Example: Electrocardiogram
Database of time-series
Pattern
![Page 9: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/9.jpg)
9
Example: Electrocardiogram
Database of time-series
Pattern
Retrieval results (similarity scores): 0.91, 0.87, 0.98, 1.0
![Page 10: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/10.jpg)
10
Outline
• Previous work
• Important points
• Indexing and retrieval
• Empirical results
• Conclusions
![Page 11: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/11.jpg)
11
Outline
• Previous work
• Important points
• Indexing and retrieval
• Empirical results
• Conclusions
Contributions}
![Page 12: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/12.jpg)
12
Criteria for retrieval methods
Gunopulos [2000]:
• Work for erratic time-series
• Accept any pattern
• Find inexact matches
• Work when some points are missing
• Work on streaming data
![Page 13: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/13.jpg)
13
Outline
• Previous work
• Important points
• Indexing and retrieval
• Empirical results
• Conclusions
![Page 14: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/14.jpg)
14
Previous work
• Feature choice
• Similarity metrics
• Indexing and retrieval
![Page 15: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/15.jpg)
15
Previous work: Feature choice
• Discrete Fourier transforms
• Alphabets
• Statistical features
• Subsets of points
![Page 16: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/16.jpg)
16
Previous work: Similarity metrics
• Euclidean distance
• Bounding rectangles
• Envelope count
• Aggregate similarity
![Page 17: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/17.jpg)
17
Previous work: Indexing and retrieval
Advanced techniques:
• B-trees
• R-trees
• KD-trees
• VP-trees
• Grids
Applied techniques:
• Linear search with compression
![Page 18: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/18.jpg)
18
Outline
• Previous work
• Important points
• Indexing and retrieval
• Empirical results
• Conclusions
![Page 19: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/19.jpg)
19
Important points
Choose “important” maxima and minima, and discard the other points.
![Page 20: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/20.jpg)
20
Important points
Choose “important” maxima and minima, and discard the other points.
Example:
Original series
![Page 22: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/22.jpg)
22
Important points
Choose “important” maxima and minima, and discard the other points.
Example:
Original series
Compressed series
![Page 23: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/23.jpg)
23
Definition of important points
Important minimum
![Page 24: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/24.jpg)
24
Definition of important points
Important minimum
• $a_m$ is the minimum among $a_i, \ldots, a_j$
![Page 25: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/25.jpg)
25
Definition of important points
Important minimum
• $a_m$ is the minimum among $a_i, \ldots, a_j$
• $a_i / a_m \ge R$ and $a_j / a_m \ge R$
![Page 26: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/26.jpg)
26
Definition of important points
Important minimum
• $a_m$ is the minimum among $a_i, \ldots, a_j$
• $a_i / a_m \ge R$ and $a_j / a_m \ge R$
• $R$ is a knob that determines the compression rate
![Page 27: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/27.jpg)
27
Definition of important points
Important maximum
• $a_m$ is the maximum among $a_i, \ldots, a_j$
• $a_m / a_i \ge R$ and $a_m / a_j \ge R$
• $R$ is a knob that determines the compression rate
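The two definitions can be checked directly against a series. The following is a minimal brute-force sketch, not the compression procedure from the talk. The function names and the sample series are invented for illustration, and the code assumes strictly positive values and R > 1.

```python
def _reaches(a, m, indices, is_min, R):
    """Walk away from a[m]; succeed once a value differs from a[m] by factor R."""
    for k in indices:
        if is_min:
            if a[k] < a[m]:
                return False          # a[m] would stop being the window minimum
            if a[k] / a[m] >= R:
                return True
        else:
            if a[k] > a[m]:
                return False          # a[m] would stop being the window maximum
            if a[m] / a[k] >= R:
                return True
    return False


def is_important_minimum(a, m, R):
    """True if some window a[i..j] around m satisfies the slide's minimum conditions."""
    left = range(m - 1, -1, -1)
    right = range(m + 1, len(a))
    return _reaches(a, m, left, True, R) and _reaches(a, m, right, True, R)


def is_important_maximum(a, m, R):
    """True if some window a[i..j] around m satisfies the slide's maximum conditions."""
    left = range(m - 1, -1, -1)
    right = range(m + 1, len(a))
    return _reaches(a, m, left, False, R) and _reaches(a, m, right, False, R)


# Hypothetical sample series, used only to exercise the definitions.
series = [1.0, 1.05, 0.98, 1.5, 1.45, 1.6, 1.0, 1.02, 0.9, 1.4]
```

With R = 1.2, index 5 (value 1.6) qualifies as an important maximum and index 8 (value 0.9) as an important minimum, while index 6 (value 1.0) does not, because the series drops below it before rising by a factor of R.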
![Page 28: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/28.jpg)
28
Compression example
Original series
![Page 29: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/29.jpg)
29
Compression example
Original series
Compressed series
![Page 32: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/32.jpg)
32
Compression algorithm
• Linear time
• Constant memory
• Accepts streaming data
For a series with n values, compression time is 0.0133 n milliseconds (300 MHz PC, Visual Basic 6.0).
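One way to get a single pass with constant memory over streaming input is a small state machine that tracks the current candidate extremum. The sketch below is an assumption about how such a pass could look, not the author's Visual Basic implementation. The name `important_points` is hypothetical, values are assumed strictly positive, and R is assumed greater than 1. Endpoint handling is also a simplification: the first reported point is confirmed only on its right side, and the last pending candidate is never reported.

```python
def important_points(values, R):
    """Yield (index, value) for important extrema in one pass with O(1) state."""
    stream = enumerate(values)
    try:
        first = next(stream)
    except StopIteration:
        return
    lo = hi = first      # running candidate minimum and maximum
    direction = 0        # 0: undecided, +1: tracking a maximum, -1: tracking a minimum
    for i, v in stream:
        if direction == 0:
            if v < lo[1]:
                lo = (i, v)
            if v > hi[1]:
                hi = (i, v)
            if v / lo[1] >= R:        # rose by factor R above the running minimum
                yield lo
                hi, direction = (i, v), +1
            elif hi[1] / v >= R:      # fell by factor R below the running maximum
                yield hi
                lo, direction = (i, v), -1
        elif direction == +1:
            if v > hi[1]:
                hi = (i, v)           # higher candidate maximum
            elif hi[1] / v >= R:
                yield hi              # maximum confirmed by a drop of factor R
                lo, direction = (i, v), -1
        else:
            if v < lo[1]:
                lo = (i, v)           # lower candidate minimum
            elif v / lo[1] >= R:
                yield lo              # minimum confirmed by a rise of factor R
                hi, direction = (i, v), +1


# Hypothetical sample series, used only to exercise the sketch.
sample = [1.0, 1.05, 0.98, 1.5, 1.45, 1.6, 1.0, 1.02, 0.9, 1.4]
compressed = list(important_points(sample, 1.2))
```

Because the function is a generator over any iterable, it can consume values as they arrive. A larger R confirms fewer extrema and so compresses more.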
![Page 33: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/33.jpg)
33
Outline
• Previous work
• Important points
• Indexing and retrieval
• Empirical results
• Conclusions
![Page 34: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/34.jpg)
34
Retrieval
Retrieval of time-series similar to a given pattern.
Intuition:
• Find a prominent feature in the pattern
• Find candidate segments with a similar feature
• Compare similarity of candidates to the pattern
![Page 35: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/35.jpg)
35
Example: Stock charts
Database of time-series
![Page 37: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/37.jpg)
37
Example: Stock charts
Database of time-series
Pattern
![Page 40: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/40.jpg)
40
Example: Stock charts
Database of time-series
Pattern
Retrieval results (similarity scores): 0.92, 0.87, 0.86, 0.84
![Page 41: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/41.jpg)
41
Algorithm
• Identify the prominent leg in the pattern
• Retrieve similar legs from the database
• Identify corresponding candidate segments
• For each candidate segment, compute its similarity to the pattern
• Output the candidates whose similarity is above the threshold
![Page 42: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/42.jpg)
42
Important details
• Use compressed pattern and compressed sequences in the retrieval process
• The prominent feature is the leg having the greatest ratio of right end to left end
• All legs in the database are indexed by their prominence, using a binary search tree
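The leg index and the candidate lookup can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: a sorted list searched with `bisect` stands in for the binary search tree, prominence is taken literally as the ratio of a leg's right end to its left end, and the tolerance factor `c` (legs within a factor `c` of the pattern's prominence count as similar) is a hypothetical parameter. All function names and sample data are invented.

```python
import bisect


def legs(points):
    """Consecutive important points (index, value) form the legs of a series."""
    return list(zip(points, points[1:]))


def prominence(leg):
    """Ratio of the leg's right-end value to its left-end value."""
    (_, left), (_, right) = leg
    return right / left


def build_index(database):
    """Sort every leg of every compressed series by prominence."""
    entries = sorted(
        (prominence(leg), name, k)
        for name, points in database.items()
        for k, leg in enumerate(legs(points))
    )
    keys = [p for p, _, _ in entries]
    return keys, entries


def candidate_legs(index, pattern_points, c):
    """Return indexed legs whose prominence is within a factor c of the pattern's."""
    keys, entries = index
    target = max(prominence(leg) for leg in legs(pattern_points))
    lo = bisect.bisect_left(keys, target / c)
    hi = bisect.bisect_right(keys, target * c)
    return entries[lo:hi]


# Hypothetical compressed series and pattern, as lists of (index, value).
database = {
    "A": [(0, 1.0), (3, 2.0), (5, 1.0), (9, 1.5)],
    "B": [(0, 4.0), (2, 2.0), (6, 4.4)],
}
pattern = [(0, 10.0), (2, 21.0), (4, 15.0)]
found = candidate_legs(build_index(database), pattern, 1.1)
```

Each returned entry names a series and a leg position, which is enough to locate the candidate segment around that leg before computing its similarity to the pattern. Widening `c` returns more candidates at the cost of more similarity computations.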
![Page 43: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/43.jpg)
43
Alternative versions
• Different prominence definitions
• Different similarity metrics
The end-point ratio prominence usually gives the best empirical results.
![Page 44: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/44.jpg)
44
Extended legs
Similar sequence
![Page 45: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/45.jpg)
45
Indexing on extended legs
• Advantage: More accurate retrieval
• Disadvantage: Larger index, more memory
If a compressed sequence has n legs:
• Worst case: $n^2/2$ extended legs
• Average case: on the order of $n \lg n$ extended legs
![Page 46: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/46.jpg)
46
Outline
• Previous work
• Important points
• Indexing and retrieval
• Empirical results
• Conclusions
![Page 48: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/48.jpg)
48
Data sets
• Stock charts: 60,000 points
• Air and sea temperatures: 445,000 points
• Wind speeds: 79,000 points
• Electroencephalograms: 17,000 points
• Electrocardiograms: 2,000 points
![Page 49: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/49.jpg)
49
Patterns
Compressed patterns with 4 to 27 legs
Examples:
![Page 50: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/50.jpg)
50
Retrieval time
Retrieval time: $0.07 \cdot m \cdot k$ milliseconds
• $m$ is the number of legs in the pattern
• $k$ is the number of candidates
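As a rough illustration of how the reported timings scale, the snippet below plugs sample sizes into the retrieval formula above and into the compression figure of 0.0133 n milliseconds reported earlier in the talk. The function names and the sample values of m and k are invented. The constants were measured on a 300 MHz PC, so the results indicate scaling only, not present-day run times.

```python
def compression_ms(n):
    """Reported compression time for a series with n values, in milliseconds."""
    return 0.0133 * n


def retrieval_ms(m, k):
    """Reported retrieval time for a pattern with m legs and k candidates, in ms."""
    return 0.07 * m * k


# Hypothetical query: a 10-leg pattern compared against 500 candidates.
query_time = retrieval_ms(10, 500)     # about 350 ms
# Compressing a 60,000-point data set at the reported rate.
compress_time = compression_ms(60_000) # about 798 ms
```

Retrieval cost grows with the product of pattern size and candidate count, so the number of candidates is the quantity that trades speed against accuracy.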
![Page 51: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/51.jpg)
51
Retrieval accuracy: Stock charts
(Plots not reproduced; panel labels:)
• C = 3: 20% candidates
• C = 2: 10% candidates
• C = 1.5: 5% candidates
• C = 1.1: 1% candidates
![Page 52: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/52.jpg)
52
Retrieval accuracy: Wind speeds
(Plots not reproduced; panel labels:)
• C = 1.5: 20% candidates
• C = 1.2: 10% candidates
• C = 1.1: 5% candidates
![Page 53: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/53.jpg)
53
Retrieval candidate quality
Found matches among ten best:

| Data set | 5% candidates | 10% candidates | 20% candidates |
|---|---|---|---|
| Stock charts (5,400 legs) | 4 | 4 | 7 |
| Air and sea temperatures (5,500 legs) | 4 | 5 | 6 |
| Wind speeds (10,500 legs) | 3 | 7 | 9 |
![Page 54: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/54.jpg)
54
Outline
• Previous work
• Important points
• Indexing and retrieval
• Empirical results
• Conclusions
![Page 55: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/55.jpg)
55
Criteria for retrieval methods
Gunopulos [2000]:
• Work for erratic time-series
• Accept any pattern
• Find inexact matches
• Work when some points are missing
• Work on streaming data
![Page 57: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/57.jpg)
57
Criteria for retrieval methods
Gunopulos [2000]:
• Work for erratic time-series
• Accept any pattern
• Find inexact matches
• Work when some points are missing
• Work on streaming data
~
![Page 60: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/60.jpg)
60
Criteria for retrieval methods
Gunopulos [2000]:
• Work for erratic time-series
• Accept any pattern
• Find inexact matches
• Work when some points are missing
• Work on streaming data
~~
![Page 61: 1. 2 General problem Retrieval of time-series similar to a given pattern](https://reader030.vdocuments.mx/reader030/viewer/2022032521/56649d565503460f94a3427d/html5/thumbnails/61.jpg)
61
Main results
Compression
• Fast compression procedure
• Preserves similarity
Retrieval
• Works with compressed data
• Controlled trade-off between speed and accuracy