
Post on 20-Dec-2015


Traditional Database Indexing Techniques for Video Database Indexing

Jianping Fan

Department of Computer Science

University of North Carolina at Charlotte

Charlotte, NC 28223

jfan@uncc.edu

http://www.cs.uncc.edu/~jfan

1. Why do we need indexing?

Library: 2,000,000 books

Query: find the book titled “Multimedia Systems, Standards, and Networks” without an index!

Too hard! 2,000,000 books to scan!

How can we do this more efficiently?

2. How does a library work?

a. Classify the books into several subjects:

Books in Library

Natural Sciences, Social Sciences

Dancing, Computer Science, Electrical Engineering

Computer Languages, Researches

Database, Multimedia

I get it! Too easy!

b. How do they get this good partition and management?

Taxonomy & Library Science!

Natural Sciences

Social Science

How can we do this for data and images?

3. Key Problems in Building an Index

What can you find from this map?

a. Partition

Partition the large-scale data set hierarchically into meaningful and manageable small regions!

b. Representation

Represent these regions with an efficient technique so that they can be accessed very fast!

A database is a set of tables! A map is similar to tables!

4. How to build an indexing structure for data?

a. Space partition approach:

Partition the space into regions according to some measure.

A space partition tree is attractive for GIS systems.
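One way to picture the space partition approach is a quadtree, which halves each axis whenever a region holds too many points. The sketch below is only an illustration under assumptions: the class name `QuadNode`, the leaf capacity of 2, and the depth limit are hypothetical choices, not taken from the slides.

```python
from dataclasses import dataclass, field

@dataclass
class QuadNode:
    # Region covered by this node: [x0, x1) x [y0, y1).
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int = 2       # hypothetical: max points per leaf before a split
    depth: int = 0
    max_depth: int = 8      # hypothetical guard against endless splitting
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, p):
        x, y = p
        if not (self.x0 <= x < self.x1 and self.y0 <= y < self.y1):
            return False                      # point lies outside this region
        if self.children:
            return any(c.insert(p) for c in self.children)
        self.points.append(p)
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()
        return True

    def _split(self):
        # Cut the region into four equal quadrants and push the points down.
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        boxes = [(self.x0, self.y0, mx, my), (mx, self.y0, self.x1, my),
                 (self.x0, my, mx, self.y1), (mx, my, self.x1, self.y1)]
        self.children = [QuadNode(*b, self.capacity, self.depth + 1, self.max_depth)
                         for b in boxes]
        pending, self.points = self.points, []
        for p in pending:
            self.insert(p)

    def query(self, qx0, qy0, qx1, qy1):
        # Skip every region that does not intersect the query window.
        if qx1 < self.x0 or qx0 >= self.x1 or qy1 < self.y0 or qy0 >= self.y1:
            return []
        hits = [(x, y) for x, y in self.points
                if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        for c in self.children:
            hits += c.query(qx0, qy0, qx1, qy1)
        return hits
```

The split positions depend only on the space, not on where the data lies, which is why skewed data can leave many regions empty and a few overloaded.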

Space partition may not work in some cases!

Partition the data based on its distribution: use clustering to partition the data set!

b. Data Partition via Clustering

K-means data clustering

(1) Select K centers to start (the dark points).

(2) Put each test point into the cluster of its most similar center:

$\min\{ D(\mathrm{test}, \mathrm{center}_i) \mid i \in [1, K] \}$

(3) Update the corresponding cluster center.
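The three K-means steps can be sketched in a few lines of plain Python. This is a minimal illustration, assuming Euclidean distance for D and caller-supplied starting centers; the function name `kmeans` and the iteration cap are hypothetical.

```python
import math

def kmeans(points, centers, iterations=10):
    """Plain K-means on tuples; `centers` holds the K starting centers (step 1)."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Step 2: put each point into the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            clusters[i].append(p)
        # Step 3: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:   # nothing moved: converged
            break
        centers = new_centers
    return centers, clusters
```

Each resulting cluster can then serve as one region of the index, so the partition follows the data distribution rather than a fixed grid.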

c. Representation of Data Partition Results:

(1) Rectangular box: (ID, x1, y1, x2, y2)

(2) Sphere: (ID, xc, yc, R) (SR-tree)
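Both representations can be computed directly from the points of a region. The sketch below assumes 2-D points; the helper names are hypothetical, and the sphere is centered on the centroid for simplicity, which is not necessarily the minimum enclosing sphere.

```python
import math

def bounding_box(region_id, points):
    """Return (ID, x1, y1, x2, y2): the smallest axis-aligned box around the points."""
    xs, ys = zip(*points)
    return (region_id, min(xs), min(ys), max(xs), max(ys))

def bounding_sphere(region_id, points):
    """Return (ID, xc, yc, R): a centroid-centered circle that covers all points."""
    xc = sum(x for x, _ in points) / len(points)
    yc = sum(y for _, y in points) / len(points)
    radius = max(math.dist((xc, yc), p) for p in points)
    return (region_id, xc, yc, radius)
```

A box needs four numbers per region in 2-D, a sphere only three, and the gap grows with the dimension (2d numbers versus d + 1).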

[Figure: a data set partitioned into regions A to N and the corresponding indexing tree. Root entries: A, B, C; lower-level groups: D E F G H, I J K, and L M N. The search path through the tree is marked.]

R-tree: Minimum Rectangular Box

[Figure: first partition. The data space is covered by the minimum rectangular boxes A, B, C, D, which become the entries of the root node.]

[Figure: second partition of A. Box A is divided into a, b, c, d, e, f, g, which become the children of entry A.]

Data partition approach:

[Figure: second partition of B into h, i, j, k.]

[Figure: second partition of C into l, m.]

Final indexing structure:

Root: A B C D

A: a b c d e f g

B: h i j k

C: l m
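The benefit of such a structure is that a query only descends into regions whose box intersects the query window. The sketch below shows this with a two-level index. It is a simplification, not a full R-tree (no insertion or node splitting), and all function names are hypothetical.

```python
def mbr(points):
    """Minimum rectangular box (x1, y1, x2, y2) around a list of 2-D points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))

def intersects(a, b):
    """True if the boxes a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def build_index(groups):
    """groups: {region name: [(x, y), ...]} -> root entries (name, box, points)."""
    return [(name, mbr(pts), pts) for name, pts in groups.items()]

def range_search(root, window):
    """Return the points inside `window` and the names of the regions visited."""
    hits, visited = [], []
    for name, box, pts in root:
        if not intersects(box, window):
            continue                      # prune the whole region unread
        visited.append(name)
        hits += [p for p in pts
                 if window[0] <= p[0] <= window[2] and window[1] <= p[1] <= window[3]]
    return hits, visited
```

When the boxes of two regions overlap, a window that falls in the overlap forces both regions to be visited, which is the cost that overlap-reducing variants try to limit.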

R-tree family

[Figure: root node with entries A, B, C; lower-level entries D, E, F, G, H.]

a. Overlap between A and C!

X-tree: Minimum Rectangular Box with fat nodes (super-nodes)

[Figure: X-tree with a root, normal directory nodes, super-nodes, and data nodes.]

SR-tree: Minimum Sphere

Grid file: can be treated as an extended Q-tree with multiple partitions along each attribute!

[Figure: grid file over the attributes age and salary; the grid cells map to buckets (primary buckets plus overflow buckets).]

Costs (N: number of buckets; M: number of overflow buckets; K: number of data entries per leaf node):

a. Equality query: 1 + M

b. Range query: N + N*M

c. Insert: 1 + M + 1

d. Delete: 1 + M + 1
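A grid file lookup can be sketched as follows: each attribute is cut at a list of split points, and the pair of interval numbers selects the bucket. This is a minimal in-memory illustration; the class name `GridFile`, the split values, and the sample records are hypothetical, and overflow buckets are not modeled (each bucket is an unbounded list).

```python
from bisect import bisect_right

class GridFile:
    def __init__(self, age_splits, salary_splits):
        self.age_splits = age_splits          # e.g. [30, 50] gives 3 age intervals
        self.salary_splits = salary_splits
        self.buckets = {}                     # (age cell, salary cell) -> records

    def cell(self, age, salary):
        """Map a point to its grid cell by locating it on each axis."""
        return (bisect_right(self.age_splits, age),
                bisect_right(self.salary_splits, salary))

    def insert(self, age, salary, record):
        self.buckets.setdefault(self.cell(age, salary), []).append((age, salary, record))

    def equal_query(self, age, salary):
        """Read exactly one bucket, then filter inside it."""
        bucket = self.buckets.get(self.cell(age, salary), [])
        return [r for a, s, r in bucket if a == age and s == salary]

    def range_query(self, age_lo, age_hi, sal_lo, sal_hi):
        """Read every bucket whose cell touches the query rectangle."""
        i0, j0 = self.cell(age_lo, sal_lo)
        i1, j1 = self.cell(age_hi, sal_hi)
        out = []
        for i in range(i0, i1 + 1):
            for j in range(j0, j1 + 1):
                out += [r for a, s, r in self.buckets.get((i, j), [])
                        if age_lo <= a <= age_hi and sal_lo <= s <= sal_hi]
        return out
```

An equality query touches one cell, while a range query may touch many, which matches the gap between the two cost formulas above.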

Data distribution information can be used to improve the performance of a grid file.

[Figure: grid file over age and salary.]

Dynamic Grid File

[Figure: dynamic grid file over age and salary; the cells map to buckets.]

[Figure: the directory doubles when 20* is inserted into the full Bucket A (an extendible-hashing style directory).]

Before doubling (GLOBAL DEPTH 2; directory 00, 01, 10, 11; all LOCAL DEPTH labels 2):

Bucket A: 32* 16*

Bucket B: 1* 5* 21* 13*

Bucket C: 10*

Bucket D: 15* 7* 19*

Bucket A2 (`split image' of Bucket A): 4* 12* 20*

After doubling (GLOBAL DEPTH 3; directory 000, 001, 010, 011, 100, 101, 110, 111):

Bucket A (LOCAL DEPTH 3): 32* 16*

Bucket B (LOCAL DEPTH 2): 1* 5* 21* 13*

Bucket C (LOCAL DEPTH 2): 10*

Bucket D (LOCAL DEPTH 2): 15* 7* 19*

Bucket A2 (`split image' of Bucket A, LOCAL DEPTH 3): 4* 12* 20*
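The directory figure can be replayed with a small extendible-hashing sketch. It assumes that the hash of a key is the key itself and that the directory slot is given by the last GLOBAL DEPTH bits, as the numbers in the figure suggest; the class names and the bucket capacity of 4 are hypothetical.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth          # LOCAL DEPTH
        self.keys = []

class ExtendibleHash:
    def __init__(self, capacity=4, global_depth=2):
        self.capacity = capacity
        self.global_depth = global_depth
        self.directory = [Bucket(global_depth) for _ in range(2 ** global_depth)]

    def _slot(self, key):
        # Directory slot = last `global_depth` bits of the key.
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        if bucket.depth == self.global_depth:
            # The full bucket already uses every directory bit: double the directory.
            self.directory += self.directory
            self.global_depth += 1
        # Split the full bucket; its split image takes the slots with the new bit set.
        bucket.depth += 1
        image = Bucket(bucket.depth)
        new_bit = 1 << (bucket.depth - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = image
        # Redistribute the old keys plus the new one.
        pending, bucket.keys = bucket.keys + [key], []
        for k in pending:
            self.insert(k)
```

Only the overflowing bucket is split. The other buckets keep LOCAL DEPTH 2 and are simply referenced by two directory slots each after the doubling.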


A database indexing structure is built for decision making and tries to make each decision as fast as possible!

Color = Green?
  yes: Size = Big?
    yes: watermelon
    no: Size = Medium?
      yes: apple
      no: grape
  no: Color = Yellow?
    yes: Shape = Round?
      yes: Size = Big?
        yes: grapefruit
        no: lemon
      no: banana
    no: Size = small?
      yes: Taste = sweet?
        yes: cherry
        no: grape
      no: apple

Decision Tree

How to obtain decisions for a database?

a. Obtain a set of labeled training data from the database.

b. Calculate the entropy impurity at node n, where $p_j$ is the fraction of training samples at n that belong to class j:

$i(n) = -\sum_j p_j \log_2 p_j$

c. The classifier is built by choosing, at each node, the test that gives the largest drop in impurity:

$\max \Delta i(n)$
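The entropy computation and the choice of a test can be sketched as below. The sketch assumes that step c means picking the yes/no test with the largest drop in entropy impurity, which is the usual reading but an assumption here; the function names and the sample rows used with them are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy impurity: -sum_j p_j * log2(p_j) over the classes present."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def impurity_drop(rows, labels, test):
    """Drop in entropy when the rows are split by the boolean function `test`."""
    yes = [l for r, l in zip(rows, labels) if test(r)]
    no = [l for r, l in zip(rows, labels) if not test(r)]
    if not yes or not no:
        return 0.0                  # the test separates nothing
    p_yes = len(yes) / len(labels)
    return entropy(labels) - p_yes * entropy(yes) - (1 - p_yes) * entropy(no)

def best_test(rows, labels, tests):
    """tests: {name: boolean function}; return the name with the largest drop."""
    return max(tests, key=lambda name: impurity_drop(rows, labels, tests[name]))
```

Applying `best_test` recursively to the yes and no subsets yields a tree of yes/no questions like the fruit example.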

KD-tree

By treating a query as a decision-making procedure, we can use decisions to build a more effective database index!

[Figure: kd-tree over a data table. The database root node tests "Salary > $75000?"; the yes branch and the no branch each test "Age > 60?", again with yes/no branches.]

At each internal node, only one attribute is used!

It is not balanced! Searches from different nodes may have different I/O costs!

Like the R-tree, it can support database indexing over multiple attributes!

It integrates decision making and database query!

Costs (N: number of tree levels; M: number of leaf nodes; K: number of data entries per leaf node). The internal nodes of the kd-tree at the same level are stored on the same page.

a. Equality query: N + M

b. Range query: N + M

c. Insert: N + M + 1

d. Delete: N + M + 1
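A kd-tree that tests one attribute per level can be sketched as follows. Note the assumption: this sketch splits at the median of each attribute, so the tree comes out balanced, unlike a tree built from fixed thresholds such as "Salary > $75000?". The names `KDNode`, `build`, and `range_search` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class KDNode:
    point: tuple                      # the record stored at this node
    axis: int                         # the single attribute tested at this node
    left: Optional["KDNode"] = None   # records with a smaller or equal value on `axis`
    right: Optional["KDNode"] = None  # records with a larger or equal value on `axis`

def build(points, depth=0):
    """Build a kd-tree, cycling through the attributes level by level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(points[mid], axis,
                  build(points[:mid], depth + 1),
                  build(points[mid + 1:], depth + 1))

def range_search(node, lo, hi, out=None):
    """Collect all points p with lo[k] <= p[k] <= hi[k] for every attribute k."""
    out = [] if out is None else out
    if node is None:
        return out
    if all(l <= v <= h for l, v, h in zip(lo, node.point, hi)):
        out.append(node.point)
    # Descend only into the sides that can still contain matches.
    if lo[node.axis] <= node.point[node.axis]:
        range_search(node.left, lo, hi, out)
    if node.point[node.axis] <= hi[node.axis]:
        range_search(node.right, lo, hi, out)
    return out
```

With records of the form (salary in thousands, age), a query such as "salary above 75 and age above 60" becomes one call to `range_search` with matching lower and upper bounds.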


5. Storage Management for High-Dimensional Indexing Structures

[Figure: clustered vs. unclustered index. In both cases the index entries of the index file direct the search to the data entries, which point to the data records in the data file.]

We want to put similar data on the same page or on neighboring pages!

It is very hard to sort multi-dimensional data!

Hilbert curve: map multi-dimensional data onto one dimension.

[Figure: Hilbert curve through a 4 x 4 grid (axis coordinates 00, 01, 10, 11); the cells are numbered 0 to 15 in the order the curve visits them.]

From multi-dimensional indexing to one-dimensional storage on disk!
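The mapping from a grid cell to its position on the Hilbert curve can be sketched with the standard bit-by-bit conversion. The function name `hilbert_index` is hypothetical, the grid side `n` is assumed to be a power of two, and the orientation of the curve may differ from the one drawn in the figure.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve of an n x n grid."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0      # which half along x at this level
        ry = 1 if y & s else 0      # which half along y at this level
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d
```

Sorting records by `hilbert_index` places cells that are adjacent on the curve next to each other on disk, so similar data tends to land on the same or neighboring pages.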

6. Video Database Indexing

Can these techniques be used for video database indexing?

a. Curse of dimensionality: overlap in high-dimensional space

b. Semantic gap: visual features ≠ semantic concepts

What should we do?

Shot-based visual representation (schema determination):

Video Sequence: Shot 1, ..., Shot i, ..., Shot n

Color: HSV color histogram, dominant color, …

Texture: edge histogram, wavelet coefficients, Tamura features, …

Motion: directional motion histogram, camera motion, …

Other features

Object-based visual representation (schema determination):

Video Sequence: Key Object 1, ..., Key Object i, ..., Key Object n

Color: HSV color histogram, dominant color, …

Texture: edge histogram, Tamura features, …

Shape: rectangular box, moments, …

Motion: trajectory, motion histogram, …

Other features

[Figure: regions A, B, and C overlap in high-dimensional space (curse of dimensionality).]

a. Concept Hierarchy

Objective: we should try to bridge the semantic gap in the video content partition procedure.

2000 Olympic Games

Sports: field, basketball, softball, soccer, volleyball

Teams (per sport): Team USA, Team Norway, Team Slovakia

For each team: Players, News, Game Actions

b. Semantic classification

[Figure: video contents in the database are described by visual features $[x_{i1}, x_{i2}, \ldots, x_{in}]$ and mapped to semantic clusters $C_j$. Weighted mapping?]

[Figure: hierarchical video database index, annotated with a logarithmic cost term in $N_i$ and $D_i$.]

Video in Database

Cluster 1, ..., Cluster i, ..., Cluster n

Subcluster 11, ..., Subcluster 1j, ..., Subcluster n1, ..., Subcluster nl

Subregion 11k, ..., Subregion nl1, ..., Subregion nlm

Object 1111, ..., Object nlm1

Disk for Cluster 1, ..., Disk for Cluster i, ..., Disk for Cluster n

7. Video Query with Indexing

query object

feature extraction

Cluster 1, ..., Cluster i, ..., Cluster n

Subcluster i1, ..., Subcluster ij, ..., Subcluster im

Subregion ij1, ..., Subregion ijl, ..., Subregion ijr

Object ijrm

Disk for cluster 1, ..., Disk for cluster i
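The query path through the hierarchy can be sketched as a greedy descent: at each level the query features are compared with the centers of the child nodes, and the search continues in the most similar one. Everything here is an assumption made for illustration: the dictionary layout, the Euclidean distance, and the names `make_node` and `search` are hypothetical, and a greedy descent is not guaranteed to find the globally nearest object.

```python
import math

def make_node(name, centroid, children=None, objects=None):
    """A cluster, subcluster, or subregion; leaves carry (object name, features) pairs."""
    return {"name": name, "centroid": centroid,
            "children": children or [], "objects": objects or []}

def search(node, query, path=None):
    """Follow the most similar child at each level; return (best object, path taken)."""
    path = (path or []) + [node["name"]]
    if not node["children"]:
        # Leaf level: compare the query with the stored objects themselves.
        best = min(node["objects"], key=lambda o: math.dist(o[1], query))
        return best[0], path
    nearest = min(node["children"], key=lambda c: math.dist(c["centroid"], query))
    return search(nearest, query, path)
```

Only one branch per level is read, so the number of comparisons grows with the depth of the hierarchy and the fan-out per node rather than with the total number of objects.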

Video Browsing

A* Search Algorithm

[Figure: the hierarchical index (Video in Database, clusters, subclusters, subregions, objects, one disk per cluster) traversed for browsing.]

Multimedia Database System Design

Access control & rights management

Query & Delivery

Delivery

Query Presentation

Query Processing

Visual Summarization

Indexing

Video Collections

MPEG Encoder

Indexing is very important!
