Augmenting the Generalized Hough Transform to Enable the Mining of Petroglyphs

Qiang Zhu, Xiaoyue Wang, Eamonn Keogh, ¹Sang-Hee Lee
Dept. of Computer Science & Eng., ¹Dept. of Anthropology, University of California, Riverside


Post on 02-Feb-2016



TRANSCRIPT

Page 1: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Augmenting the Generalized Hough Transform to Enable the Mining of Petroglyphs

Qiang Zhu, Xiaoyue Wang, Eamonn Keogh, ¹Sang-Hee Lee

Dept. of Computer Science & Eng., ¹Dept. of Anthropology

University of California, Riverside

Page 2: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Outline

- Motivation
- Approach
- Evaluation
- Conclusion

Page 3: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Motivation (1) - applications

Petroglyphs are one of the earliest expressions of abstract thinking.

They provide a rich source of information about:
- climate change
- the existence of certain species
- patterns of human migration and interaction

Page 4: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Motivation (2) - difficulties

Progress in petroglyph research has been frustratingly slow:
- Because of their extraordinarily diverse and complex structure, most matching algorithms cannot capture the similarity of petroglyphs.
- Those that can, even in limited cases, do not scale to large collections.

Page 5: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Approach

- How to preprocess the raw data?
- How to define the distance measure?
- How to speed up?

Page 6: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Preprocessing(1)

With rare exceptions, petroglyphs do not lend themselves to automatic extraction with segmentation algorithms.

The border of this rock may be recognized as the edge of this petroglyph

Page 7: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

PetroAnnotator

Load the raw image into our human computation tool

Page 8: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

PetroAnnotator (cont.)

Draw an approximate boundary around the object, and then trace the shape

Page 9: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Preprocessing (2) - downsampling

(A) Two overlaid skeleton traces (340 by 250) of the same image of a Bighorn sheep. Less than 3.5% of the pixels from each image overlap.

(B) The same two images after downsampling (30 by 23). 75.6% of the pixels (denoted by black) are common to both.
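The effect of downsampling on overlap can be illustrated with a minimal sketch. The slides do not say how the traces were downsampled or how overlap was measured, so both choices here are assumptions: an output cell is set if any input pixel falling in it is set, and overlap is measured as intersection over union. The function names `downsample` and `overlap` are hypothetical.

```python
import numpy as np

def downsample(img, out_h, out_w):
    """Shrink a binary image: an output cell is set if any input pixel mapping to it is set."""
    img = np.asarray(img, dtype=bool)
    h, w = img.shape
    out = np.zeros((out_h, out_w), dtype=bool)
    rows, cols = np.nonzero(img)
    # Map every set pixel to its coarse cell (assumed scheme, not the authors').
    out[rows * out_h // h, cols * out_w // w] = True
    return out

def overlap(a, b):
    """Fraction of set pixels common to both images (intersection over union)."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

# Two traces of the same horizontal stroke, drawn two pixels apart.
a = np.zeros((250, 340), dtype=bool)
b = np.zeros((250, 340), dtype=bool)
a[100, 20:300] = True
b[102, 20:300] = True

small_a = downsample(a, 23, 30)
small_b = downsample(b, 23, 30)
print(overlap(a, b), overlap(small_a, small_b))  # 0.0 1.0
```

At full resolution the two strokes share no pixels; after downsampling they fall into the same coarse cells, which is the behaviour panels (A) and (B) describe.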

Page 10: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Distance Measure - why GHT?

- It makes essentially no assumptions about the data:
  - open/closed boundaries
  - connected/disconnected shapes
- It correctly captures similarity:
  - subjective/objective similarity on unlabeled/labeled datasets
- Its distance can be tightly lower bounded:
  - allowing for very efficient searches in large datasets

Page 11: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Classic GHT

GHT is a useful method for two dimensional arbitrary shape detection.

[Figure: query shape Q and candidate shape C]

Page 12: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

(1) Find the “star-pattern”

[Figure: star-pattern of Q around the reference point R]

Page 13: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

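(2) Superimpose & Accumulate

Accumulator A after superimposing the star-pattern on each edge point of C in turn:

After the 1st edge point:
0 1 0 0 0
0 0 0 0 0
1 1 1 0 0
0 0 0 0 0

After the 2nd edge point:
0 1 1 0 0
0 0 0 0 0
1 2 2 1 0
0 0 0 0 0

After the 3rd edge point:
0 1 1 1 0
0 0 0 0 0
1 2 3 2 1
0 0 0 0 0

After the 4th edge point:
0 1 1 1 0
0 0 1 0 0
1 2 3 2 1
0 1 1 1 0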

Page 14: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

(3) Find the “peak”

[Figure: Q aligned with C; reference points R and R’]

Accumulator A:
0 1 1 1 0
0 0 1 0 0
1 2 3 2 1
0 1 1 1 0

The peak value is 3.

Page 15: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

A Basic Distance Measure

Classic GHT doesn’t explicitly encode a similarity measure

We can simply define a GHT-based distance, the minimal unmatched edge points (MUE):

MUE = (number of edge points in Q) - (maximal matched edge points)
    = 4 - 3 = 1 (for our toy example)
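The three GHT steps and the MUE distance can be sketched as follows. This is a minimal illustration, not the authors' code: the accumulator is indexed by translation offset (equivalent to voting for a fixed reference point), and the toy shapes `Q` and `C` are an assumption, chosen so that the votes reproduce the accumulator values shown on the slides. The function names are hypothetical.

```python
import numpy as np

def ght_accumulator(Q, C):
    """Vote for every translation that maps an edge point of Q onto an edge point of C."""
    Q = np.asarray(Q, dtype=bool)
    C = np.asarray(C, dtype=bool)
    qh, qw = Q.shape
    ch, cw = C.shape
    # One cell per possible translation of Q relative to C.
    acc = np.zeros((ch + qh - 1, cw + qw - 1), dtype=int)
    for cy, cx in np.argwhere(C):          # superimpose on each edge point of C
        for qy, qx in np.argwhere(Q):      # one vote per vector of the star-pattern
            acc[cy - qy + qh - 1, cx - qx + qw - 1] += 1
    return acc

def mue(Q, C):
    """Minimal unmatched edge points: edge points in Q minus the accumulator peak."""
    return int(np.count_nonzero(Q)) - int(ght_accumulator(Q, C).max())

# Assumed toy shapes (4 edge points each), not taken from the slide drawings.
Q = [[1, 1, 1],
     [0, 0, 0],
     [0, 1, 0]]
C = [[1, 1, 1],
     [0, 1, 0]]

print(ght_accumulator(Q, C))
print(mue(Q, C))  # 1
```

The double loop makes the O(N_Q × N_C) voting cost visible, and the final `max` is the scan over the accumulator.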

Page 16: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

A New Cell Incrementation Strategy

- When can we obtain the value of a particular cell in the accumulator?
  - In the classic GHT, not until all incrementation has finished.
- Is it possible to obtain the values one cell at a time?
  - We need to check all positions that could increase the cell's value.

[Figure: Q and C, with the accumulator cell in question marked "?"]

Page 17: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Lower Bound

[Figure: Q and C with their column signatures]

SigQx = 2 2 4 2 2
SigCx = 0 0 3 2 2 2 3 0 0 0 0

For the alignment shown, column by column:
- In this column Q needs 2 pixels in C, and has 3
- In this column Q needs 2 pixels in C, and has 2
- In this column Q needs 4 pixels in C, and has only 2
- In this column Q needs 2 pixels in C, and has 2
- In this column Q needs 2 pixels in C, and has 3

Minimal missed points: 0 + 0 + 2 + 0 + 0 = 2
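The column-by-column count above can be turned into a lower bound by taking the minimum over horizontal shifts. The sketch below is an assumed reading of the slide rather than the authors' implementation: the signature values are as read from the transcript, `shift_lower_bound` is a hypothetical name, and the candidate signature is padded with zeros so that Q may also hang partly off the edge of C.

```python
def shift_lower_bound(sig_q, sig_c):
    """Fewest Q pixels that must stay unmatched, minimised over all horizontal shifts."""
    pad = [0] * (len(sig_q) - 1)           # allow partial overlaps (assumption)
    padded = pad + list(sig_c) + pad
    best = sum(sig_q)                      # worst case: nothing matches
    for s in range(len(padded) - len(sig_q) + 1):
        window = padded[s:s + len(sig_q)]
        # A column of Q with q pixels can match at most c pixels of C.
        missed = sum(max(0, q - c) for q, c in zip(sig_q, window))
        best = min(best, missed)
    return best

sig_qx = [2, 2, 4, 2, 2]
sig_cx = [0, 0, 3, 2, 2, 2, 3, 0, 0, 0, 0]
print(shift_lower_bound(sig_qx, sig_cx))  # 2
```

Each shift costs one pass over a one-dimensional signature, so the bound is far cheaper than filling the two-dimensional accumulator. The same function applies unchanged to row signatures.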

Page 18: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Time Complexity

Classic GHT: O(N_Q × N_C + S²)
- superimpose all query vectors onto all edge points in the candidate image

Lower bound GHT: O(S²)
- compare one-dimensional signatures
- further reduced by early abandoning and shifting order
- one to two orders of magnitude speed-up

Page 19: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

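Variants on the Basic Distance Measure

- Query-by-Content: D_nn(Q, C)
- Clustering: D_clustering(Q, C)
- Finding Motifs: D_motifs(Q, C)

[Equations garbled in the transcript. Recoverable structure: D_nn(Q, C) is defined piecewise from MUE(Q, C), N_Q and N_C, with one case under a condition on N_Q and N_C and another otherwise; D_clustering(Q, C) combines N_Q, N_C, D_nn(Q, C) and D_nn(C, Q); D_motifs(Q, C) combines N_Q, N_C, a division by 2, and MUE(Q, C).]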

Page 20: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Evaluation

We performed three sets of experiments:

- Evaluation of Utility - on unlabeled data
- Evaluation of Accuracy - on labeled data
- Evaluation of Scalability - on synthetic data

Page 21: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Evaluation of Utility (1)

Atlatls

Anthropomorphs

Bighorn Sheep

(1) Our GHT-based distance measure correctly groups all seven pairs

(2) The higher level structure of the dendrogram also correctly groups similar petroglyphs

A clustering of typical Southwestern USA petroglyphs

Page 22: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Evaluation of Utility (2)

[Figure: petroglyphs labeled a to h, with the labels SC and WY]

Page 23: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs
Page 24: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Evaluation of Utility (3)

Can our distance measure find meaningful motifs?
- 2,852 real petroglyphs
- 4,065,526 possible pairs
- 52 top motifs (0.00128%) selected by the motif cutoff

[Figure: horizontal axis "Motif Cutoff", 0 to 200]
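The pair count and the percentage quoted on this slide can be checked with a few lines of arithmetic; the variable names are illustrative only.

```python
from math import comb

n_petroglyphs = 2852
pairs = comb(n_petroglyphs, 2)        # unordered pairs of distinct petroglyphs
fraction_pct = 52 / pairs * 100       # share of pairs kept as top motifs

print(pairs)                   # 4065526
print(round(fraction_pct, 5))  # 0.00128
```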

Page 25: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Evaluation of Accuracy - datasets

NicIcon dataset
- 24,441 images
- 14 categories
- 33 volunteers
- 234×234 pixels
- WD/WI (writer-dependent/writer-independent) tests

Farsi digits dataset
- From 11,942 registration forms
- 60,000 digits for training
- 20,000 digits for testing
- 54×64 pixels (largest MBR)

[Figure: sample Farsi digits 0 to 9]

Page 26: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

(1) Test the Downsampling Size

[Figure: Error Rate (%) vs. Resolution (R×R) of Downsampled Images (NicIcon), 5 to 80; curves WI and WD]

[Figure: Error Rate (%) vs. Resolution (R×R) of Downsampled Images (Farsi), 5 to 30]

In both datasets, the error rate of one-nearest-neighbor test varies little once the resolution is greater than 10×10

Page 27: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

(2) Competitive accuracy

NicIcon dataset
- Error rate: 4.78% for WD, 8.46% for WI
- The dataset creators tested on the online data using three classifiers.
- Only one of them (DTWB) is more accurate, but it is slower.

Farsi digits dataset
- Error rate: 4.54%
- Borji et al. performed extensive empirical tests on this dataset.
- Of the twenty reported error rates, the mean was 8.69%.
- Only four beat our approach, and they require at least six parameters to be set.

Page 28: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Evaluation of Scalability - datasets

We made 8 synthetic petroglyph datasets:
- Based on 22 classic petroglyphs
- Duplicated by 10 volunteers on a tablet
- Applied a Random Polynomial Transformation
- Containing up to 1,280,000 objects

Page 29: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

(1) Querying by Content

- Leave-one-out one-nearest-neighbor test.
- The test was repeated 10 times on each dataset.

[Figure: Prune Rate (%) vs. Size of Synthetic Petroglyphs Datasets (10K to 1280K); curves: Max Prune Rate, Avg Prune Rate, Min Prune Rate]

[Figure: % of Brute Force Time vs. Size of Synthetic Petroglyphs Datasets (10K to 1280K)]
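How a lower bound produces a prune rate in a one-nearest-neighbor search can be sketched as below. Everything here is a simplified stand-in: shapes are sets of (row, column) points, the raw MUE is used as the distance instead of the normalized query-by-content variant, and the lower bound is the larger of the row and column signature bounds. All function names are hypothetical.

```python
from collections import Counter

def mue(q, c):
    """Minimal unmatched edge points of q over all translations onto c."""
    votes = Counter((cy - qy, cx - qx) for qy, qx in q for cy, cx in c)
    return len(q) - max(votes.values(), default=0)

def signature_bound(q, c, axis):
    """Lower bound on mue(q, c) from per-row (axis 0) or per-column (axis 1) counts."""
    sig_q = Counter(p[axis] for p in q)
    sig_c = Counter(p[axis] for p in c)
    if not sig_q or not sig_c:
        return len(q)
    shifts = range(min(sig_c) - max(sig_q), max(sig_c) - min(sig_q) + 1)
    return min(sum(max(0, n - sig_c[k + s]) for k, n in sig_q.items())
               for s in shifts)

def nearest_neighbor(query, database):
    """Return (index, distance, prune rate), skipping candidates the bound rules out."""
    best, best_idx, pruned = float("inf"), None, 0
    for i, cand in enumerate(database):
        lb = max(signature_bound(query, cand, 0), signature_bound(query, cand, 1))
        if lb >= best:            # cannot beat the best-so-far: skip the exact distance
            pruned += 1
            continue
        d = mue(query, cand)
        if d < best:
            best, best_idx = d, i
    return best_idx, best, pruned / len(database)

query = {(0, 0), (0, 1), (0, 2), (2, 1)}
database = [
    {(5, 5), (5, 6), (5, 7), (6, 6)},   # similar shape
    {(0, 0)},                           # single dot
    {(3, 4), (3, 5), (3, 6), (5, 5)},   # translated copy of the query
]
result = nearest_neighbor(query, database)
print(result)
```

On this toy database the single-dot candidate is discarded by the bound alone, so one of three exact distance computations is avoided.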

Page 30: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

(2) Finding Motifs

- A brute force algorithm requires time quadratic in the size of the dataset.
- By using the triangle inequality of our distance measure, we only need to calculate a tiny fraction of the exact distances.

Even for the smallest dataset:

- our algorithm is 712 times faster
- we can prune 99.84% of the calculations

[Figure: Speed Up (times), 0 to 120,000, vs. Size of Synthetic Petroglyphs Datasets (10K to 1280K)]
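The slides do not spell out how the triangle inequality is used, so the following is only one common way to organize such pruning, shown under stated assumptions. It finds the closest pair (the top motif) using distances to a single reference object: for any metric, d(a, b) ≥ |d(a, ref) - d(b, ref)|. Euclidean distance on 2-D points stands in for the motif distance, and `closest_pair` is a hypothetical name.

```python
from math import dist, inf

def closest_pair(items, distance):
    """Closest pair under a metric, pruning with distances to one reference item."""
    n = len(items)
    d_ref = [distance(x, items[0]) for x in items]   # items[0] is the reference
    order = sorted(range(n), key=lambda i: d_ref[i])
    best, best_pair, calls = inf, None, n
    for a in range(n):
        i = order[a]
        for b in range(a + 1, n):
            j = order[b]
            # Triangle inequality: d(i, j) >= d_ref[j] - d_ref[i]. The gap only
            # grows with b, so no later partner of i can beat the best-so-far.
            if d_ref[j] - d_ref[i] >= best:
                break
            d = distance(items[i], items[j])
            calls += 1
            if d < best:
                best, best_pair = d, (i, j)
    return best, best_pair, calls

points = [(0, 0), (10, 0), (3, 4), (3.5, 4), (20, 20), (-7, 1)]
best, pair, calls = closest_pair(points, dist)
print(best, pair, calls)
```

A brute-force search over these six points needs 15 distance computations; the pruned search needs fewer, and the saving grows with the size of the dataset.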

Page 31: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Conclusion

In this work we considered, for the first time, the problem of mining large collections of rock art. We:
- introduced a novel distance measure
- found an efficiently computable, tight lower bound to this measure
- enabled effective mining of large data archives

Page 32: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Thanks for listening!

All datasets and the code can be downloaded from: http://www.cs.ucr.edu/~qzhu/petro.html

Page 33: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Preprocessing

With rare exceptions, petroglyphs do not lend themselves to automatic extraction with segmentation algorithms.

Cracks in the rock are more “significant” than the actual edges

Page 34: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

Preprocessing - existing archives

There are several other rich sources of rock art data to be mined, e.g. sketches by anthropologists.

[Figure panels: From a scanned book; Downsampled; Binarized; Thinned]

Page 35: Augmenting the Generalized Hough Transform  to Enable the Mining of Petroglyphs

[Figure panels: By Hausdorff; By GHT]

Experiment testing the impact of noise: a single dot is randomly added.