Time Series I · people.dsv.su.se/~panagiotis/DAMI2014/timeseries1.pdf · 2014-12-07


Time Series I

Syllabus

Nov 4        Introduction to data mining
Nov 5        Association Rules
Nov 10, 14   Clustering and Data Representation
Nov 17       Exercise session 1 (Homework 1 due)
Nov 19       Classification
Nov 24, 26   Similarity Matching and Model Evaluation
Dec 1        Exercise session 2 (Homework 2 due)
Dec 3        Combining Models
Dec 8, 10    Time Series Analysis
Dec 15       Exercise session 3 (Homework 3 due)
Dec 17       Ranking
Jan 13       Review
Jan 14       EXAM
Feb 23       Re-EXAM

Why deal with sequential data?

• Because all data is sequential :-)
• All data items arrive in the data store in some order
• Examples
  – transaction data
  – documents and words
• In some (or many) cases the order does not matter
• In many cases the order is of interest

Time-series data: example

[Figure: financial time series]

Questions

• What is a time series?
• How do we compare time series data?
• What is the structure of time series data?
• Can we represent this structure compactly and accurately?

Time Series

• A sequence of observations:
  – X = (x1, x2, x3, x4, …, xn)
• Each xi is a real number
  – e.g., (2.0, 2.4, 4.8, 5.6, 6.3, 5.6, 4.4, 4.5, 5.8, 7.5)

[Figure: plot of the example series; axes: time, value]

Time Series Databases

• A time series is an ordered set of real numbers, representing the measurements of a real variable at equal time intervals
  – Stock prices
  – Volume of sales over time
  – Daily temperature readings
  – ECG data
• A time series database is a large collection of time series

Time Series Similarity

• Given two time series
  X = (x1, x2, …, xn)
  Y = (y1, y2, …, yn)
• Define and compute D(X, Y)
• Or better…

Time Series Similarity Search

• Given a time series database and a query X
• Find the best match of X in the database

[Figure: query X compared against each series Y in the database using D(X, Y); the 1-NN is returned]

• Why is that useful?

Examples

• Find companies with similar stock prices over a time interval
• Find products with similar sales cycles
• Cluster users with similar credit card utilization
• Find similar subsequences in DNA sequences
• Find scenes in video streams

Types of queries

• whole match vs subsequence match
• range query vs nearest neighbor query

[Figure: three stock-price series, each plotting $price against day (1 to 365)]

distance function: by expert (e.g., Euclidean distance)

Problems

• Define the similarity (or distance) function
• Find an efficient algorithm to retrieve similar time series from a database
  – (Faster than sequential scan)

The similarity function depends on the application

Metric Distances

• What properties should a distance function have to allow (easy) indexing?
  – D(A,B) = D(B,A)             Symmetry
  – D(A,A) = 0                  Constancy of Self-Similarity
  – D(A,B) ≥ 0                  Positivity
  – D(A,B) ≤ D(A,C) + D(B,C)    Triangle Inequality
• Sometimes the distance function that best fits an application is not a metric
• Then indexing becomes interesting and challenging

Euclidean Distance

• Each time series: a point in the n-dim space
• Euclidean distance
  – pair-wise point distance

X = x1, x2, …, xn
Y = y1, y2, …, yn

$L_2 = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$

[Figure: two series shown as points v1 and v2]

Euclidean model

Euclidean distance between two time series Q = {q1, q2, …, qn} and X = {x1, x2, …, xn}:

$D(Q,X) \equiv \sqrt{\sum_{i=1}^{n} (q_i - x_i)^2}$

[Figure: a query Q (n datapoints) compared against each series in the database (n datapoints each)]

Distance   Rank
0.98       4
0.07       1
0.21       2
0.43       3
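The ranking above can be reproduced with a plain linear scan. The following is a minimal sketch, assuming the series are Python lists of equal length; the names `euclidean` and `rank_by_distance` and the sample data are illustrative and do not come from the slides.

```python
import math


def euclidean(q, x):
    """Euclidean distance between two equal-length series."""
    if len(q) != len(x):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((qi - xi) ** 2 for qi, xi in zip(q, x)))


def rank_by_distance(query, database):
    """Return (distance, index) pairs sorted from best to worst match."""
    return sorted((euclidean(query, x), i) for i, x in enumerate(database))


# Hypothetical toy data, only for illustration.
query = [2.0, 2.4, 4.8, 5.6]
database = [
    [2.0, 2.4, 4.8, 5.6],
    [3.0, 2.4, 4.8, 5.6],
    [2.0, 2.4, 4.8, 9.6],
]
ranking = rank_by_distance(query, database)
```

The first entry of `ranking` is the 1-NN; each distance costs O(n), so the scan costs O(n) per database series.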

Advantages (Euclidean distance)

• Easy to compute: O(n)
• Allows scalable solutions to other problems, such as
  – indexing
  – clustering
  – etc...

Disadvantages (Euclidean distance)

• Query and target lengths should be equal!
• Cannot tolerate noise:
  – Time shifts
  – Sequences out of phase
  – Scaling in the y-axis

Limitations of Euclidean Distance

• Euclidean distance: sequences are aligned "one to one".

$D(Q,X) \equiv \sqrt{\sum_{i=1}^{n} (q_i - x_i)^2}$

• "Warped" time axis: nonlinear alignments are possible.

[Figure: sequences Q and C aligned one to one (Euclidean) and with a warped time axis]

DTW: Dynamic time warping (1/2)

• Each cell c = (i, j) is a pair of indices whose corresponding values are compared, $(x_i - q_j)^2$, and the result is included in the sum for the distance.
• Euclidean path:
  – i = j always.
  – Ignores off-diagonal cells.

[Figure: warping matrix of X against Q; along the diagonal, cell (1, 1) holds $(x_1 - q_1)^2$ and cell (2, 2) holds $(x_2 - q_2)^2 + (x_1 - q_1)^2$]

DTW: Dynamic time warping (2/2)

• DTW allows any path.
• Examine all paths:
  – cell (i, j) is reached from (i-1, j), (i-1, j-1), or (i, j-1)
• Standard dynamic programming to fill in the table.
• The top-right cell contains the final result.

[Figure: warping matrix of X against Q; off-diagonal steps either shrink X / stretch Q or stretch X / shrink Q]

Computation

• DTW is computed by dynamic programming
• Given two sequences
  – Q = {q1, q2, …, qN}
  – X = {x1, x2, …, xM}

$D_{dtw}(Q,X) = f(N,M)$

$f(i,j) = |q_i - x_j| + \min \begin{cases} f(i, j-1) & \text{q-stretch} \\ f(i-1, j) & \text{x-stretch} \\ f(i-1, j-1) & \text{no stretch} \end{cases}$
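The recurrence f(i, j) can be filled in row by row. Below is a minimal sketch, assuming the per-cell cost is the absolute difference between q_i and x_j and that the table is padded with an extra row and column of infinities; the function name `dtw` is illustrative.

```python
def dtw(q, x):
    """DTW distance between q (length N) and x (length M) by dynamic programming."""
    n, m = len(q), len(x)
    inf = float("inf")
    # f[i][j]: cost of the best warping path aligning q[:i] with x[:j].
    # Row 0 and column 0 are padding; only f[0][0] is reachable.
    f = [[inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(q[i - 1] - x[j - 1])
            f[i][j] = cost + min(
                f[i][j - 1],      # q-stretch: q_i is also matched to x_(j-1)
                f[i - 1][j],      # x-stretch: x_j is also matched to q_(i-1)
                f[i - 1][j - 1],  # no stretch
            )
    return f[n][m]
```

Unlike the Euclidean distance, this accepts sequences of different lengths, for example `dtw([1, 2, 3], [1, 2, 2, 3])`.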

Properties of DTW

• Warping path W:
  – set of grid cells in the time warping matrix
• DTW finds the optimum warping path W (the best alignment):
  – the path with the smallest matching score

Properties of a DTW legal path

I. Boundary conditions
   W1 = (1, 1) and WK = (n, m)
   • Paths start at the bottom left cell and end at the top right cell
II. Continuity
   Given Wk = (a, b), then Wk-1 = (c, d), where a-c ≤ 1, b-d ≤ 1
   • There is always a point of the path in each row and column of the matrix
III. Monotonicity
   Given Wk = (a, b), then Wk-1 = (c, d), where a-c ≥ 0, b-d ≥ 0
   • Paths always go from left to right and from bottom to top

[Figure: warping matrix of X against Y with the optimum warping path W]
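The three path properties can be checked mechanically on a path recovered by backtracking through the DTW table. This is a sketch under stated assumptions: the cost is the absolute difference, cells are 1-based (row, column) pairs, and a step that stays in the same cell is rejected (an extra assumption, since the slides only bound each step by 0 and 1). The names `dtw_path` and `is_legal_path` are illustrative.

```python
def dtw_path(q, x):
    """Return (cost, path) where path lists 1-based cells from (1, 1) to (n, m)."""
    n, m = len(q), len(x)
    inf = float("inf")
    f = [[inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            f[i][j] = abs(q[i - 1] - x[j - 1]) + min(
                f[i][j - 1], f[i - 1][j], f[i - 1][j - 1]
            )
    # Backtrack from the final cell, always moving to the cheapest predecessor.
    i, j = n, m
    path = [(i, j)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (f[i - 1][j - 1], i - 1, j - 1),
            (f[i - 1][j], i - 1, j),
            (f[i][j - 1], i, j - 1),
        )
        path.append((i, j))
    path.reverse()
    return f[n][m], path


def is_legal_path(path, n, m):
    """Check boundary conditions, continuity, and monotonicity."""
    if path[0] != (1, 1) or path[-1] != (n, m):
        return False  # boundary conditions
    for (c, d), (a, b) in zip(path, path[1:]):
        # continuity: each index grows by at most 1
        # monotonicity: no index ever decreases
        if (a - c, b - d) not in {(0, 1), (1, 0), (1, 1)}:
            return False
    return True
```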

Advantages (DTW)

• Query and target need not be of equal length :-)
• Can tolerate noise:
  – time shifts
  – sequences out of phase
  – scaling in the y-axis

Disadvantages (DTW)

• Computational complexity: O(nm)
• May not be able to handle some types of noise...
• DTW is not a metric (the triangle inequality does not hold)
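A small counterexample makes the last point concrete. The sketch below assumes DTW with absolute-difference cost (the helper `dtw` is illustrative) and three hand-picked toy sequences; it is one instance where the triangle inequality fails, not a characterization of when it fails.

```python
def dtw(q, x):
    """DTW with absolute-difference cost (illustrative helper)."""
    n, m = len(q), len(x)
    inf = float("inf")
    f = [[inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            f[i][j] = abs(q[i - 1] - x[j - 1]) + min(
                f[i][j - 1], f[i - 1][j], f[i - 1][j - 1]
            )
    return f[n][m]


# Hypothetical toy sequences chosen by hand.
a, b, c = [0], [1], [1, 1, 1]
d_ab = dtw(a, b)  # the single pair (0, 1) costs 1
d_bc = dtw(b, c)  # the value 1 is matched three times at cost 0
d_ac = dtw(a, c)  # the value 0 is matched three times at cost 1 each
violates = d_ac > d_ab + d_bc
```

Here D(A, C) = 3 while D(A, B) + D(B, C) = 1, so indexing methods that rely on the triangle inequality cannot be applied to DTW directly.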

Global Constraints

• Slightly speed up the calculations and prevent pathological warpings
• A global constraint limits the indices of the warping path:
  $w_k = (i, j)_k$ such that $j - r \le i \le j + r$
• Where r is a term defining the allowed range of warping for a given point in a sequence

[Figure: Sakoe-Chiba Band and Itakura Parallelogram, with the width r marked]

Complexity of DTW

• Basic implementation = $O(n^2)$ where n is the length of the sequences
  – will have to solve the problem for each (i, j) pair
• If a warping window is specified, then O(nr)
  – only solve for the (i, j) pairs where |i – j| ≤ r
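The warping-window variant only visits cells inside the band. The following is a minimal sketch, assuming a Sakoe-Chiba band of half-width r and absolute-difference cost; for simplicity it still allocates the full table, so only the running time (not the memory) drops to O(nr). The name `dtw_band` is illustrative.

```python
def dtw_band(q, x, r):
    """DTW restricted to cells (i, j) with |i - j| <= r."""
    n, m = len(q), len(x)
    inf = float("inf")
    f = [[inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0
    for i in range(1, n + 1):
        # Only columns inside the band are computed; the rest stay infinite.
        for j in range(max(1, i - r), min(m, i + r) + 1):
            f[i][j] = abs(q[i - 1] - x[j - 1]) + min(
                f[i][j - 1], f[i - 1][j], f[i - 1][j - 1]
            )
    # Infinite when the band cannot connect (1, 1) to (n, m).
    return f[n][m]
```

With r = 0 and equal lengths the path is forced onto the diagonal (one-to-one alignment); a larger r admits more warping at a higher cost per comparison.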

Longest Common Subsequence Measures
(Allowing for Gaps in Sequences)

[Figure: two sequences aligned with a gap skipped]

Longest Common Subsequence (LCSS)

[Figure: two sequences matched by LCSS; the matching regions are labeled "match" and the majority of the noise is ignored]

Advantages of LCSS:
A. Outlying values not matched
B. Distance/similarity distorted less

Disadvantages of DTW:
A. All points are matched
B. Outliers can distort distance
C. One-to-many mapping

LCSS is more resilient to noise than DTW.

Longest Common Subsequence

Similar dynamic programming solution as DTW, but now we measure similarity, not distance.

Can also be expressed as a distance.
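One way to sketch the LCSS dynamic program for real-valued series is shown below. The slides do not define when two values "match", so the threshold `eps` is an assumption, as is the normalization by the shorter length used to turn the similarity into a distance; the names `lcss` and `lcss_distance` are illustrative.

```python
def lcss(q, x, eps):
    """Length of the longest common subsequence, where two values match
    if they differ by at most eps (assumed matching rule)."""
    n, m = len(q), len(x)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(q[i - 1] - x[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1  # match
            else:
                # skip one value of q or one value of x (a gap)
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_distance(q, x, eps):
    """Similarity expressed as a distance in [0, 1] (assumed normalization)."""
    return 1.0 - lcss(q, x, eps) / min(len(q), len(x))
```

An outlier such as the 100 in `[1, 2, 100, 3, 4, 5]` is simply skipped when matching against `[1, 2, 3, 4, 5]`, whereas DTW would have to pay for aligning it.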

Similarity Retrieval

• Range query
  – Find all time series X where $D(Q,X) \le \varepsilon$
• Nearest neighbor query
  – Find the k time series most similar to Q
• A method to answer the above queries:
  – Linear scan
• A better approach
  – GEMINI [next time]

Lower Bounding – NN search

We can speed up similarity search by using a lower bounding function
  – D: distance measure
  – LB: lower bounding function s.t.: LB(Q, X) ≤ D(Q, X)

Intuition
  – Try to use a cheap lower bounding calculation as often as possible
  – Do the expensive, full calculations only when absolutely necessary

1-NN Search Using LB

  Set best = ∞
  For each Xi:
      if LB(Xi, Q) < best
          if D(Xi, Q) < best
              best = D(Xi, Q)

We assume a database of time series: DB = {X1, X2, …, XN}
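The 1-NN loop translates almost line by line into code. The sketch below is illustrative only: it uses the Euclidean distance as D and a deliberately simple lower bound, the absolute difference of the first values (`lb_first`, a hypothetical helper chosen because it can never exceed the Euclidean distance). It also counts how many full distance computations were needed.

```python
import math


def euclidean(q, x):
    return math.sqrt(sum((qi - xi) ** 2 for qi, xi in zip(q, x)))


def lb_first(q, x):
    """Hypothetical cheap lower bound: one term of the Euclidean sum."""
    return abs(q[0] - x[0])


def nn_search(query, database, dist, lb):
    """Return (index of the 1-NN, its distance, number of full computations)."""
    best, best_idx, full = math.inf, None, 0
    for i, x in enumerate(database):
        if lb(x, query) < best:      # cheap test first
            full += 1
            d = dist(x, query)       # expensive test only if needed
            if d < best:
                best, best_idx = d, i
    return best_idx, best, full


# Hypothetical toy data.
database = [[0, 0, 0], [5, 5, 5], [1, 1, 1], [9, 0, 0]]
result = nn_search([1, 1, 2], database, euclidean, lb_first)
```

On this toy database two of the four candidates are pruned by the lower bound alone.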

Lower Bounding – Range query

We can speed up similarity search by using a lower bounding function
  – D: distance measure
  – LB: lower bounding function s.t.: LB(Q, X) ≤ D(Q, X)

Intuition
  – Try to use a cheap lower bounding calculation as often as possible
  – Do the expensive, full calculations only when absolutely necessary

Range Query Using LB

  For each Xi:
      if LB(Xi, Q) ≤ ε
          if D(Xi, Q) ≤ ε
              report Xi

We assume a database of time series: DB = {X1, X2, …, XN}
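The range query follows the same filter-then-verify pattern. As a sketch under the same assumptions as for the 1-NN case (Euclidean distance as D, and the hypothetical first-value bound `lb_first` as LB):

```python
import math


def euclidean(q, x):
    return math.sqrt(sum((qi - xi) ** 2 for qi, xi in zip(q, x)))


def lb_first(q, x):
    """Hypothetical cheap lower bound: one term of the Euclidean sum."""
    return abs(q[0] - x[0])


def range_query(query, database, eps, dist, lb):
    """Return the indices of all series within distance eps of the query."""
    hits = []
    for i, x in enumerate(database):
        # `and` short-circuits: dist is evaluated only if the bound passes.
        if lb(x, query) <= eps and dist(x, query) <= eps:
            hits.append(i)
    return hits


# Hypothetical toy data.
database = [[0, 0, 0], [5, 5, 5], [1, 1, 1], [9, 0, 0]]
wide = range_query([1, 1, 2], database, 2.5, euclidean, lb_first)
narrow = range_query([1, 1, 2], database, 1.0, euclidean, lb_first)
```

Because LB never exceeds D, a series rejected by the first test cannot be within ε, so no true answer is lost.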

Problems

• How to define lower bounds for different distance measures?
• How to extract the features? How to define the feature space?
  – Fourier transform
  – Wavelets transform
  – Averages of segments (Histograms or APCA)
  – Chebyshev polynomials
  – .... your favorite curve approximation...

Some Lower Bounds on DTW

LB_Kim
• Each time series is represented by 4 features: <First, Last, Min, Max>
• LB_Kim = maximum squared difference of the corresponding features

LB_Yi
• LB_Yi = sum of the squared differences of the points of X that fall above max(Q) or below min(Q)

[Figure: X and Q with the four LB_Kim features marked; X and Q with the max(Q) and min(Q) lines used by LB_Yi]
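Both bounds are a few lines each. The sketch below follows the definitions on the slide and compares them with a DTW that sums squared differences without a final square root; that choice of DTW cost is an assumption made so that the units agree. The names `lb_kim`, `lb_yi`, and `dtw_sq` are illustrative.

```python
def lb_kim(q, x):
    """Maximum squared difference of the <First, Last, Min, Max> features."""
    feats_q = (q[0], q[-1], min(q), max(q))
    feats_x = (x[0], x[-1], min(x), max(x))
    return max((a - b) ** 2 for a, b in zip(feats_q, feats_x))


def lb_yi(q, x):
    """Sum of squared differences of the points of x outside [min(q), max(q)]."""
    hi, lo = max(q), min(q)
    above = sum((v - hi) ** 2 for v in x if v > hi)
    below = sum((v - lo) ** 2 for v in x if v < lo)
    return above + below


def dtw_sq(q, x):
    """DTW with squared-difference cost (assumed variant, no square root)."""
    n, m = len(q), len(x)
    inf = float("inf")
    f = [[inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            f[i][j] = (q[i - 1] - x[j - 1]) ** 2 + min(
                f[i][j - 1], f[i - 1][j], f[i - 1][j - 1]
            )
    return f[n][m]


# Hypothetical toy data.
q = [0, 1, 2, 1, 0]
x = [0, 3, 4, 1, -1]
```

On this pair LB_Yi is the tighter of the two bounds, since it accumulates every point of X that leaves the value range of Q.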

LB_Keogh [Keogh 2004]

$U_i = \max(q_{i-r} : q_{i+r})$
$L_i = \min(q_{i-r} : q_{i+r})$

[Figure: query Q enclosed by its upper envelope U and lower envelope L, shown for the Sakoe-Chiba Band and for the Itakura Parallelogram]

LB_Keogh

[Figure: X plotted against the envelope (U, L) of Q, for the Sakoe-Chiba Band and for the Itakura Parallelogram]

$LB\_Keogh(Q,X) = \sqrt{\sum_{i=1}^{n} \begin{cases} (x_i - U_i)^2 & \text{if } x_i > U_i \\ (x_i - L_i)^2 & \text{if } x_i < L_i \\ 0 & \text{otherwise} \end{cases}}$

$LB\_Keogh(Q,X) \le DTW(Q,X)$
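A sketch of the envelope and the bound for a Sakoe-Chiba band of half-width r follows. It assumes equal-length series, a square root over the sum (as in the Euclidean formulas earlier), and a banded DTW with squared cost and a final square root for the comparison; the names `envelope`, `lb_keogh`, and `dtw_band_sqrt` are illustrative.

```python
import math


def envelope(q, r):
    """Upper and lower envelopes: U_i = max(q[i-r..i+r]), L_i = min(q[i-r..i+r])."""
    n = len(q)
    upper = [max(q[max(0, i - r): i + r + 1]) for i in range(n)]
    lower = [min(q[max(0, i - r): i + r + 1]) for i in range(n)]
    return upper, lower


def lb_keogh(q, x, r):
    upper, lower = envelope(q, r)
    total = 0.0
    for xi, u, l in zip(x, upper, lower):
        if xi > u:
            total += (xi - u) ** 2
        elif xi < l:
            total += (xi - l) ** 2
        # points inside the envelope contribute 0
    return math.sqrt(total)


def dtw_band_sqrt(q, x, r):
    """Banded DTW with squared cost and a final square root (assumed variant)."""
    n, m = len(q), len(x)
    inf = float("inf")
    f = [[inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(m, i + r) + 1):
            f[i][j] = (q[i - 1] - x[j - 1]) ** 2 + min(
                f[i][j - 1], f[i - 1][j], f[i - 1][j - 1]
            )
    return math.sqrt(f[n][m])


# Hypothetical toy data.
q = [0, 1, 2, 1, 0]
x = [0, 3, 4, 1, -1]
```

The bound costs O(n) once the envelope is known, compared with O(nr) for the banded DTW it guards.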

Tightness of LB

$T = \frac{\text{Lower Bound Estimate of Dynamic Time Warp Distance}}{\text{True Dynamic Time Warp Distance}}$

0 ≤ T ≤ 1; the larger the better

[Figure: LB_Kim, LB_Yi, LB_Keogh (Sakoe-Chiba), and LB_Keogh (Itakura); each bound is proportional to the length of the gray lines used in the illustrations]

Lower Bounding

[Figure: a distance axis starting at the query Q, on which the true distances and lower bounds of S1 to S4 and the best so far (BSF) are marked step by step]

• We want to find the 1-NN to our query data series, Q.
• We compute the distance to the first data series in our dataset, D(S1,Q); this becomes the best so far (BSF).
• We compute the distance LB(S2,Q) and it is greater than the BSF; we can safely prune S2, since D(S2,Q) ≥ LB(S2,Q).
• We compute the distance LB(S3,Q) and it is smaller than the BSF; we have to compute D(S3,Q) ≥ LB(S3,Q), since it may still be smaller than the BSF.
• It turns out that D(S3,Q) ≥ BSF, so we can safely prune S3.
• We compute the distance LB(S4,Q) and it is smaller than the BSF; we have to compute D(S4,Q) ≥ LB(S4,Q), since it may still be smaller than the BSF.
• It turns out that D(S4,Q) < BSF, so S4 becomes the new BSF.
• S1 cannot be the 1-NN, because S4 is closer to Q.

How about subsequence matching?

• DTW is defined for full-sequence matching:
  – All points of the query sequence are matched to all points of the target sequence
• Subsequence matching:
  – The query is matched to a part (subsequence) of the target sequence

[Figure: a query sequence matched against part of a data stream]

Subsequence Matching

X: long sequence
Q: short sequence

What subsequence of X is the best match for Q?

J-Position Subsequence Match

X: long sequence
Q: short sequence

What subsequence of X is the best match for Q … such that the match ends at position j?

Naïve solution: run DTW on all possible subsequences of X that end at position j.

Too costly!

Why not 'naive'?

• Compute the time warping matrices starting from every database frame
  – Need O(n) matrices, O(nm) time per frame

[Figure: query Q (length m) against sequence X (length n, starting at x1); one matrix captures the optimal subsequence starting from t = tstart, spanning x_tstart to x_tend]

Key Idea

• Star-padding
  – Use only a single matrix (the naïve solution uses n matrices)
  – Prefix Q with '*', which always gives zero distance
  – Instead of Q = (q1, q2, …, qm), compute distances with Q':
    $Q' = (q_0, q_1, q_2, \ldots, q_m), \quad q_0 = (*)$
  – O(m) time and space (the naïve solution requires O(nm))

SPRING: dynamic programming

• Initialization
  – Insert a "dummy" state '*' at the beginning of the query
  – '*' matches every value in X with score 0
• Computation
  – Perform dynamic programming computation in a similar manner as standard DTW
  – Cell (i, j) is computed from (i-1, j), (i-1, j-1), and (i, j-1)
• For each (i, j):
  – compute the j-position subsequence match of the first i items of Q to X[s:j], i.e., Q[1:i] is matched with X[s:j]
• Top row: j-position subsequence match of Q for all j's
• Final answer: best among j-position matches
  – Look at answers stored at the top row of the table

[Figure: dynamic programming table with the database sequence X along the horizontal axis and the query Q along the vertical axis; the '*' row is filled with zeros]

Subsequence vs. full matching

[Figure: dynamic programming tables for subsequence matching (with the '*' row of zeros) and for full matching; the database sequence X (p1 … pi … pN) runs along one axis and the query Q (q1 … qj … qM) along the other]

Computational complexity

• Assume that the database is one very long sequence
  – Concatenate all sequences into one sequence
• O(|Q| * |X|)
  – But only two adjacent columns are needed at any time, so each new value of X costs O(|Q|) time and space

[Figure: dynamic programming table with the database sequence X along the horizontal axis and the query Q along the vertical axis; the '*' row is filled with zeros]
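The star-padded table can be sketched with just two columns, processing X one value at a time. The code below is an illustrative reading of the slides, not the published SPRING implementation: it assumes absolute-difference cost, reports only the best cost and its end position, and uses the hypothetical name `best_subsequence_match`.

```python
def best_subsequence_match(q, x):
    """Return (cost, end) of the subsequence of x that best matches q under DTW.

    `end` is the 1-based position j in x where the best match ends.
    """
    m = len(q)
    inf = float("inf")
    # prev is the column for j = 0: the '*' row is 0, everything else unreachable.
    prev = [0.0] + [inf] * m
    best_cost, best_end = inf, None
    for j, xj in enumerate(x, start=1):
        cur = [0.0] * (m + 1)  # cur[0] = 0: '*' matches every value at score 0
        for i in range(1, m + 1):
            cur[i] = abs(q[i - 1] - xj) + min(
                prev[i],      # from (i, j-1)
                cur[i - 1],   # from (i-1, j)
                prev[i - 1],  # from (i-1, j-1)
            )
        # cur[m] is the "top row" entry: the best match of all of q ending at j.
        if cur[m] < best_cost:
            best_cost, best_end = cur[m], j
        prev = cur
    return best_cost, best_end
```

Each incoming value touches one column of |Q| + 1 cells, which is why the method suits streams.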

STWM (Subsequence Time Warping Matrix)

• Problem of the star-padding: we lose the information about the starting frame of the match
• After the scan, "which is the optimal subsequence?"
• Elements of STWM
  – Distance value of each subsequence
  – Starting position
• Combination of star-padding and STWM
  – Efficiently identify the optimal subsequence in a stream fashion
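Carrying the starting position alongside each distance value can be sketched as follows. This is an illustrative reconstruction under stated assumptions (absolute-difference cost, ties broken toward the earlier start, 1-based inclusive positions, hypothetical name `best_subsequence_with_start`), not the published algorithm.

```python
def best_subsequence_with_start(q, x):
    """Return (cost, start, end) of the best DTW subsequence match of q in x."""
    m = len(q)
    inf = float("inf")
    prev_d = [0.0] + [inf] * m  # distances in column j-1
    prev_s = [0] * (m + 1)      # starting positions in column j-1
    best = (inf, None, None)
    for j, xj in enumerate(x, start=1):
        # A path that leaves the '*' row at column j starts at x_j.
        prev_s[0] = j
        cur_d = [0.0] * (m + 1)
        cur_s = [j] * (m + 1)
        for i in range(1, m + 1):
            # Pick the cheapest predecessor and inherit its starting position.
            d, s = min(
                (prev_d[i - 1], prev_s[i - 1]),
                (prev_d[i], prev_s[i]),
                (cur_d[i - 1], cur_s[i - 1]),
            )
            cur_d[i] = abs(q[i - 1] - xj) + d
            cur_s[i] = s
        if cur_d[m] < best[0]:
            best = (cur_d[m], cur_s[m], j)
        prev_d, prev_s = cur_d, cur_s
    return best
```

The extra array costs O(m) memory, so the per-value cost stays the same as for star-padding alone.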

Up next…

• Time series summarizations
  – Discrete Fourier Transform (DFT)
  – Piecewise Aggregate Approximation (PAA)
  – Symbolic Aggregate approXimation (SAX)
• Streams
  – Z-normalization
  – A fast algorithm for subsequence matching in streams
• Time series classification [briefly]
  – Lazy learners and Shapelets