fair use agreement

Post on 20-Feb-2016

Themis Palpanas 1VLDB - Aug 2004

Fair Use Agreement

This agreement covers the use of all slides on this CD-Rom, please read carefully.

• You may freely use these slides for teaching, if
  • You send me an email telling me the class number/ university in advance.
  • My name and email address appears on the first slide (if you are using all or most of the slides), or on each slide (if you are just taking a few slides).

• You may freely use these slides for a conference presentation, if
  • You send me an email telling me the conference name in advance.
  • My name appears on each slide you use.

• You may not use these slides for tutorials, or in a published work (tech report/ conference paper/ thesis/ journal etc). If you wish to do this, email me first, it is highly likely I will grant you permission.

(c) Eamonn Keogh, eamonn@cs.ucr.edu

Indexing Large Human-Motion Databases

Eamonn Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos
University of California, Riverside

Marc Cardle
University of Cambridge

Motion Capture

records motion data from live actors

used for data-driven animation

Motion Capture in Games Industry

Street NBA

Madden

Motion Capture in Movie Industry

Troy

Lord of the Rings

Motivation

motion capture data
• segmented in short sequences, stored in motion libraries
• composed to create long, realistic motion sequences

important to find similar sequences
• from pool of similar sequences, choose the most promising to continue the motion

Motivation

Dynamic Time Warping (DTW)
• considers only local adjustments in time, to match two time series
• however, sometimes global adjustments are required

DTW is being extensively used
• uniform scaling is complementary

combination of both techniques offers rich, high-quality result set

[Figure: two panels labeled DTW and Uniform Scaling]

Uniform Scaling

time series
• query, Q, length n
• candidate, C, length m (m>n)

stretch Q to length p (n ≤ p ≤ m): $Q^p$

$Q^p_j = Q_{\lceil j \cdot n / p \rceil}, \quad 1 \le j \le p$

scaling factor, sf = p/n
max scaling factor, sfmax = m/n

[Figure: two plots with x-axis 0 to 400; the first shows C and Q, the second shows Q and its stretched version Qp]
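The stretching rule above can be sketched in a few lines of Python. This is only an illustration: the function name `stretch` and the translation to 0-based list indexing are assumptions made here, not part of the slides.

```python
def stretch(q, p):
    """Uniformly stretch q (length n) to length p >= n.

    Follows Qp_j = Q_ceil(j*n/p) for 1 <= j <= p, translated to
    0-based list indexing.
    """
    n = len(q)
    if p < n:
        raise ValueError("p must be at least len(q)")
    # (j*n + p - 1) // p is ceil(j*n/p) in integer arithmetic;
    # subtracting 1 converts the 1-based index to a 0-based one.
    return [q[(j * n + p - 1) // p - 1] for j in range(1, p + 1)]


q = [1.0, 2.0, 3.0, 4.0]
print(stretch(q, 6))  # [1.0, 2.0, 2.0, 3.0, 4.0, 4.0]  (sf = 6/4 = 1.5)
```

With p = n the query is returned unchanged (sf = 1), and with p = m it covers the whole candidate (sf = sfmax).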

Problem Statement

given
• time series, Q
• database of candidate time series, {D}

find $\arg\min_p \{ dist(Q^p, \{D\}) \}$
• $dist(Q^p, \{D\})$ = Euclidean Distance between time series

challenges
• quickly solve the problem for two time series
• extend solution to scale-up to large time series databases

Outline

• Speeding Up Search
• Scaling Up To Large Databases
• Experimental Evaluation
• Related Work
• Conclusions

Best Uniform Scaling Match

brute force algorithm:
  for each time series in {D}
    for each sf, 1 ≤ sf ≤ sfmax
      compute distance between the two time series
  find the best overall match

time complexity: O(|D|(m-n))
extremely expensive!
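The brute force loop above might look as follows in Python. It is a minimal sketch under two assumptions that the slide does not state: each stretched query of length p is compared against the first p points of the candidate, and the names `stretch`, `euclidean` and `best_scaling_match` are illustrative.

```python
import math


def stretch(q, p):
    """Stretch q to length p using Qp_j = Q_ceil(j*n/p) (0-based lists)."""
    n = len(q)
    return [q[(j * n + p - 1) // p - 1] for j in range(1, p + 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_scaling_match(q, database):
    """Try every scaling p = n..m for every candidate.

    Returns (distance, candidate index, p) of the best match.
    Assumption: Qp is compared with the length-p prefix of the candidate.
    """
    best = (math.inf, None, None)
    for idx, c in enumerate(database):
        for p in range(len(q), len(c) + 1):
            d = euclidean(stretch(q, p), c[:p])
            if d < best[0]:
                best = (d, idx, p)
    return best


database = [[5, 5, 5, 5, 5, 5], [0, 0, 1, 1, 0, 0]]
print(best_scaling_match([0, 1, 0], database))  # (0.0, 1, 6)
```

Every candidate costs m-n+1 distance computations here, which is the cost the lower bound on the following slides is meant to avoid.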

Lower Bounding Uniform Scaling

lower bound distance between two time series, for any sf, 1 ≤ sf ≤ sfmax

desiderata:
• fast to compute
• tight bound

results in fast pruning of candidates that are guaranteed not to belong to the solution
• compute distance only for time series not pruned by lower bound

Lower Bounding Uniform Scaling

assume:
• candidate C, length 100 (m = 100)
• query Q, length 80 (n = 80)
• wish to find best match for any scaling of Q between 80-100

build envelopes, length 80:

$U_i = \max( C_{(i-1) \cdot m/n + 1}, \ldots, C_{i \cdot m/n} )$

$L_i = \min( C_{(i-1) \cdot m/n + 1}, \ldots, C_{i \cdot m/n} )$

[Figure: upper envelope U and lower envelope L around the candidate, x-axis 0 to 100]

compute lower bound:

$$LB\_Keogh(Q, C) = \sqrt{ \sum_{i=1}^{n} \begin{cases} (Q_i - U_i)^2 & \text{if } Q_i > U_i \\ (Q_i - L_i)^2 & \text{if } Q_i < L_i \\ 0 & \text{otherwise} \end{cases} }$$
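To make the envelope and the lower bound concrete, here is a hedged Python sketch. Two things in it are assumptions rather than the authors' implementation: the stretched query is compared with the length-p prefix of C, and the envelope window for position i covers C_i through C_⌊i·m/n⌋ (1-based). That window contains the range written on the slide and is chosen here because it is easy to verify that it bounds every scaling p between n and m; it may be looser than the original.

```python
import math


def stretch(q, p):
    """Stretch q to length p using Qp_j = Q_ceil(j*n/p) (0-based lists)."""
    n = len(q)
    return [q[(j * n + p - 1) // p - 1] for j in range(1, p + 1)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def envelopes(c, n):
    """Upper and lower envelopes of length n for a candidate c of length m >= n.

    Assumed window for 0-based position i: c[i .. floor((i+1)*m/n) - 1],
    i.e. every point of c that query point i can be aligned with
    for some stretch length p in [n, m].
    """
    m = len(c)
    upper, lower = [], []
    for i in range(n):
        window = c[i:((i + 1) * m) // n]
        upper.append(max(window))
        lower.append(min(window))
    return upper, lower


def lb_keogh(q, upper, lower):
    """Sum squared excursions of q outside [lower, upper], then take the root."""
    total = 0.0
    for qi, ui, li in zip(q, upper, lower):
        if qi > ui:
            total += (qi - ui) ** 2
        elif qi < li:
            total += (qi - li) ** 2
    return math.sqrt(total)


# Synthetic data with the slide's sizes: m = 100, n = 80.
c = [math.sin(0.1 * j) for j in range(100)]
q = [math.sin(0.125 * i) + 0.3 for i in range(80)]
upper, lower = envelopes(c, len(q))
bound = lb_keogh(q, upper, lower)
exact = min(euclidean(stretch(q, p), c[:p]) for p in range(len(q), len(c) + 1))
assert bound <= exact + 1e-9  # the bound never exceeds the best scaled distance
```

The bound needs one pass over n points, while the exact search evaluates m-n+1 scalings, so a candidate whose bound already exceeds the best distance found so far can be skipped without computing any of them.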

Envelope Indexing

dimensionality of envelopes is high (80 points in the example)

reduce dimensionality by approximating them
• Piecewise Constant Approximation (8 points in the example)

[Figure: envelopes U and L with their piecewise constant approximations, x-axis 0 to 100]

assume query Q, length 80
• approximated with 8 points

compute approximation of lower bound:

$$MINDIST(\hat{Q}, R) = \sqrt{ \sum_{i=1}^{N} \frac{n}{N} \begin{cases} (\hat{Q}_i - \hat{U}_i)^2 & \text{if } \hat{Q}_i > \hat{U}_i \\ (\hat{Q}_i - \hat{L}_i)^2 & \text{if } \hat{Q}_i < \hat{L}_i \\ 0 & \text{otherwise} \end{cases} }$$
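A small Python sketch of the reduced-dimensionality bound follows. The slides do not say how the envelopes are approximated, so the choices here are assumptions: the query is reduced by segment means, the upper envelope by segment maxima and the lower envelope by segment minima (which keeps the result a lower bound of the full-resolution envelope bound), and n is assumed to be divisible by N. The function takes the approximated envelopes directly instead of modelling R.

```python
import math


def paa_mean(series, N):
    """Piecewise constant approximation by segment means (len(series) % N == 0)."""
    seg = len(series) // N
    return [sum(series[k * seg:(k + 1) * seg]) / seg for k in range(N)]


def reduce_envelopes(upper, lower, N):
    """Assumed reduction: segment maxima for U, segment minima for L."""
    seg = len(upper) // N
    u_hat = [max(upper[k * seg:(k + 1) * seg]) for k in range(N)]
    l_hat = [min(lower[k * seg:(k + 1) * seg]) for k in range(N)]
    return u_hat, l_hat


def mindist(q_hat, u_hat, l_hat, n):
    """Lower bound computed on N reduced points, each weighted by n/N."""
    N = len(q_hat)
    total = 0.0
    for qi, ui, li in zip(q_hat, u_hat, l_hat):
        if qi > ui:
            total += (n / N) * (qi - ui) ** 2
        elif qi < li:
            total += (n / N) * (qi - li) ** 2
    return math.sqrt(total)


# Toy example: n = 8 points reduced to N = 2.
upper = [1.0] * 8
lower = [-1.0] * 8
q = [3.0, 3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0]
u_hat, l_hat = reduce_envelopes(upper, lower, 2)
print(mindist(paa_mean(q, 2), u_hat, l_hat, len(q)))  # 4.0
```

In the toy example the first segment lies 2 above the envelope for 4 points, so the bound is sqrt(4 · 2²) = 4, the same value the full 8-point computation would give.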

Algorithms for Secondary Storage

use a multidimensional index
• VA-file -> FastScan algorithm
• R-tree -> RtreeProbe algorithm

2-pass algorithms:
1. scan approximated envelopes, prune search space
2. find exact answer using original series
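The two-pass idea can be illustrated with a generic filter-and-refine loop. This sketch is neither FastScan nor RtreeProbe: it is an in-memory stand-in with hypothetical names, and the toy `mean_bound` (a one-segment piecewise constant approximation) merely plays the role of the approximated envelopes.

```python
import math


def two_pass_search(query, database, lower_bound, exact_distance):
    """Return (best distance, best index, number of exact distance calls)."""
    # Pass 1: compute the cheap bound for every candidate and sort by it.
    bounds = sorted((lower_bound(query, c), idx) for idx, c in enumerate(database))
    best_dist, best_idx, exact_calls = math.inf, None, 0
    # Pass 2: refine in order of increasing bound, stopping once no
    # remaining candidate can beat the best exact distance found so far.
    for bound, idx in bounds:
        if bound >= best_dist:
            break
        exact_calls += 1
        d = exact_distance(query, database[idx])
        if d < best_dist:
            best_dist, best_idx = d, idx
    return best_dist, best_idx, exact_calls


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_bound(a, b):
    # sqrt(n) * |mean(a) - mean(b)| never exceeds the Euclidean distance.
    return abs(sum(a) - sum(b)) / math.sqrt(len(a))


database = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [9.0, 9.0, 9.0, 9.0],
    [1.0, 2.0, 1.0, 2.0],
]
query = [1.0, 1.0, 1.0, 2.0]
print(two_pass_search(query, database, mean_bound, euclidean))  # (1.0, 1, 2)
```

Only 2 of the 4 candidates reach the exact distance function in this example; the answer stays exact as long as the first-pass value never overestimates the true distance.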

Datasets Used

motion capture data
• from 124 sensors placed on human actors

mixed bag
• time series coming from: medicine, manufacturing, environmental monitoring, economics, sensor data

experimented with time series databases of:
• size 5,000 – 80,000 time series
• length 64 – 1,024 points

Main Memory Experiments

assume database fits in memory

measure pruning power:
• fraction of times each approach calls distance function

our technique:
• 1 order of magnitude faster than CD-criterion
• 3 orders of magnitude faster than brute force

[Figure: bar chart comparing LB_Keogh, CD-criterion and brute force; value axis 0 to 0.25; tick labels 64, 128, 256 and 1.05, 1.10, 1.20]

Disk-Based Experiments

comparison of:
• brute force
• FastScan
• RtreeProbe

[Figure: bar chart of running time in seconds (0 to 25) for LinearScan, FastScan and RtreeProbe; tick labels 64, 128, 256 and 1.05, 1.10, 1.20]

Disk-Based Experiments

comparison of:
• FastScan
• RtreeProbe

[Figure: running time in seconds (0 to 80) for LinearScanLB, FastScan, RtreeBF and RtreeProbe; tick labels 64, 128, 256, 512, 1024]

Disk-Based Experiments

comparison of:
• FastScan
• RtreeProbe

[Figure: running time in seconds (0 to 0.18) for LinearScanLB, FastScan, RtreeBF and RtreeProbe; tick labels 5000, 10000, 20000, 40000, 80000]

Case Study

video

Related Work

Dynamic Time Warping (DTW)
• [Yi & Faloutsos’00][Keogh’02][Zhu & Shasha’03][Fung & Wong’03]

Longest Common SubSequence (LCSS)
• [Das et al.’97][Vlachos et al.’03]

uniform scaling
• [Argyros & Ermopoulos’03]

Conclusions

studied utility of uniform scaling similarity matching
• applications in: motion capture libraries, music retrieval, historical handwritten archives

introduced first lower bounding technique

proposed indexing method for bounding envelopes
• suitable for very large time series databases

experimentally evaluated efficiency of technique

demonstrated quality of results with real motion capture data
