CS335 Principles of Multimedia Systems Content Based Media Retrieval Hao Jiang Computer Science Department Boston College Dec. 4, 2007

CS335 Principles of Multimedia Systems

Content Based Media Retrieval

Hao Jiang

Computer Science Department

Boston College

Dec. 4, 2007

Introduction

With the growth of multimedia content on the web, we need methods to find images, audio and video.

Text-only schemes have limitations:
– They require a lot of manual work.
– Descriptions are often inaccurate.

Content based media retrieval studies how to:
– find media data using classification and recognition methods based on “features” in audio, images and video.
– For example, we can use an exemplar (like a sample picture) to find other similar ones.

Applications:
– Digital libraries, personal image albums and audio folders, satellite image processing and medical applications.

An Image Retrieval Example (Viper)

The query input.

An Image Retrieval Example (Viper)

The query output.

User feedback.

Refined results. Better?

Another query for paintings.

Painting Search Result

The shortlist returned from the search.

Content Based Media Retrieval

The input
– A text description.
– One or more exemplar images, audio or video clips.
– A sketch, e.g., a dark background with an orange disk in the middle, used to search for sunset scenes.
– Or a combination of them.

The output
– A shortlist of images, or audio or video clips.
– The instances of the event you want to find in videos or audio.
– Can be structured into a web page or other documents.
– Usually allows user feedback to improve the result.

The basic task in content based media retrieval is comparing and searching multimedia data.

How Do We Evaluate the Performance?

Precision and Recall
– Precision = (# of relevant items retrieved) / (# of items retrieved)
– Recall = (# of relevant items retrieved) / (total # of relevant items in the dataset)

The procedure for drawing a Recall-Precision Curve:
– Compute the relevance score for each item in the database.
– Sort the list by score.
– Assume the sorted list looks like

r r r n n r r r n n …

(r = relevant, n = not relevant) and there are 6 relevant items in total in the database. Precision and recall are then computed at each position of the list.
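The procedure can be sketched in a few lines of Python. This is only an illustration under the assumptions of the slide (the ranked list r r r n n r r r n n and 6 relevant items in the database); the function name `recall_precision_points` is made up for this example.

```python
def recall_precision_points(ranked, total_relevant):
    """Return (recall, precision) measured at each relevant item of a ranked list.

    ranked: sequence of booleans, True where the item at that rank is relevant.
    total_relevant: number of relevant items in the whole database.
    """
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked, start=1):
        if is_relevant:
            hits += 1
            # recall = hits / all relevant items, precision = hits / items seen so far
            points.append((hits / total_relevant, hits / rank))
    return points


# The sorted list from the slide, with 6 relevant items in the database.
ranked = [c == "r" for c in "rrrnnrrrnn"]
for recall, precision in recall_precision_points(ranked, total_relevant=6):
    print(f"recall={recall:.3f}  precision={precision:.3f}")
```

Each printed pair is one point of the curve: precision stays at 1 for the first three items and drops once non-relevant items appear in the list.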

The Recall-Precision Curve

[Figure: recall-precision curve. Horizontal axis: Recall, with ticks at 1/6, 2/6, 3/6, 4/6, 5/6, 1. Vertical axis: Precision, up to 1.]

The shortlist is: r r r n n r r r n n …

Q: Why don't we just use a single value instead of a curve?

The “Best” Recall-Precision Curve

[Figure: the “best” recall-precision curve. Axes: Precision and Recall, each up to 1. Marked values: 1/(# of relevant items) and (# of relevant items)/(# of total items).]

Image Retrieval Methods

To find images in a database, we have to compare images quantitatively based on “features”.

We can compare the images as a whole using features like:
– Color, textures and their spatial layouts.

We can also segment images into regions and use similar features in object detection.

In some recent systems, people use salient features such as SIFT (Scale-Invariant Feature Transform)-like features, together with learning and pattern recognition methods.

Color Histogram Methods

Color-only schemes tend to find many unrelated images.

http://amazon.ece.utexas.edu/~qasim/qdialog_IMGDATA2_v1_Birds_Swans.html
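A minimal sketch of a color histogram comparison shows why color alone is weak. It assumes images are given as flat lists of (r, g, b) pixels and uses histogram intersection as the similarity measure, which is one common choice and not necessarily the one used in the system shown. The toy pixel data and function names are hypothetical.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over coarsely quantized (r, g, b) pixels in 0..255.

    Assumes bins_per_channel divides 256.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the two normalized histograms are identical."""
    return sum(min(p, h2.get(bin_id, 0.0)) for bin_id, p in h1.items())


# Toy "images" as flat pixel lists (hypothetical data).
sunset = [(250, 120, 20)] * 6 + [(10, 10, 30)] * 4
rearranged = [(10, 10, 30)] * 4 + [(245, 125, 25)] * 6  # same colors, other layout
forest = [(20, 140, 30)] * 10

h = color_histogram(sunset)
print(histogram_intersection(h, color_histogram(rearranged)))
print(histogram_intersection(h, color_histogram(forest)))
```

The histogram discards where the colors occur, so the rearranged image scores as high as the query itself even though its layout is different.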

Improve Color Histogram Methods

If we can separate the foreground from the background, the results improve.

Foreground

Background

Improve Color Histogram Methods

The spatial relations between color blobs also help to find the right object.

ColorBlob 2

ColorBlob 1

Finding Shapes

Finding similar shapes is a very useful tool in managing large numbers of images.

Chamfer matching is a standard method to compare the similarity of shapes.

The generalized Hough transform can also be used to find shapes in images.
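As a rough illustration of chamfer matching, the sketch below scores a template by the mean distance from each template edge point to its nearest image edge point, and tries a small set of translations. It is a brute-force version for clarity; practical implementations usually precompute a distance transform of the image edge map instead. The point sets and function names are hypothetical.

```python
import math


def chamfer_distance(template_points, image_points):
    """Mean distance from each template edge point to its nearest image edge point."""
    if not template_points or not image_points:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for tx, ty in template_points:
        total += min(math.hypot(tx - ix, ty - iy) for ix, iy in image_points)
    return total / len(template_points)


def best_translation(template_points, image_points, offsets):
    """Try each candidate (dx, dy) shift and return the one with the lowest cost."""
    def cost(offset):
        dx, dy = offset
        shifted = [(x + dx, y + dy) for x, y in template_points]
        return chamfer_distance(shifted, image_points)
    return min(offsets, key=cost)


# Hypothetical edge points: a small square, and the same square shifted by (3, 2).
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
image = [(x + 3, y + 2) for x, y in square]
offsets = [(dx, dy) for dx in range(5) for dy in range(5)]
print(best_translation(square, image, offsets))
```

A low chamfer distance means the template edges lie close to image edges at that position; note that this directed form does not penalize extra edges in the image.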

Shape Context

Shape context is another widely used feature in shape retrieval.

C_ij is the distance between shape contexts h_i and h_j.
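The sketch below illustrates one way to compute such a cost. A shape context is taken to be a log-polar histogram of the positions of the other shape points relative to a reference point, and the cost is the chi-squared statistic commonly used in the shape context literature. The bin counts, the normalization by the mean distance from the reference point, and all names are assumptions of this example rather than details given in the slides.

```python
import math

N_RADIAL, N_ANGULAR = 5, 12  # bin counts; assumed values for this sketch


def shape_context(points, index):
    """Normalized log-polar histogram of the other points around points[index]."""
    px, py = points[index]
    offsets = [(x - px, y - py) for i, (x, y) in enumerate(points) if i != index]
    radii = [math.hypot(dx, dy) for dx, dy in offsets]
    mean_r = sum(radii) / len(radii)  # assumes at least two distinct points
    hist = [0.0] * (N_RADIAL * N_ANGULAR)
    for (dx, dy), r in zip(offsets, radii):
        # Log-spaced radial bins with edges at 1/4, 1/2, 1 and 2 times mean_r.
        log_r = math.log2(max(r / mean_r, 1e-12))
        r_bin = min(max(math.floor(log_r) + 3, 0), N_RADIAL - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        a_bin = min(int(theta / (2 * math.pi) * N_ANGULAR), N_ANGULAR - 1)
        hist[r_bin * N_ANGULAR + a_bin] += 1.0 / len(offsets)
    return hist


def chi2_cost(h_i, h_j):
    """C_ij = 1/2 * sum_k (h_i(k) - h_j(k))^2 / (h_i(k) + h_j(k))."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h_i, h_j) if a + b > 0)


# Hypothetical L-shaped point set and a scaled, translated copy of it.
shape = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (0, 2)]
copy = [(2 * x + 5, 2 * y - 1) for x, y in shape]
print(chi2_cost(shape_context(shape, 0), shape_context(copy, 0)))   # same point
print(chi2_cost(shape_context(shape, 0), shape_context(shape, 3)))  # other point
```

Corresponding points on the two copies get nearly identical histograms and therefore a cost near zero, while different points on the same shape get a larger cost.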

Improve Matching Efficiency

Fast pruning in matching:
– Representative shape contexts
– Shapemes

Greg Mori, Serge Belongie, and Jitendra Malik, Shape Contexts Enable Efficient Retrieval of Similar Shapes, CVPR, 2001

Example Results

Representative shape contexts in shape matching

Current Trends and Challenges

We now show a more “recent” piece of work:

L. Fei-Fei, R. Fergus, and P. Perona. A Bayesian approach to unsupervised One-Shot learning of Object categories. ICCV 2003.

The goal is to detect whether an object appears in an image.

SIFT features are used. The good features are in fact learned from a small set of training images.

Motorbike results.

Retrieve Other Multimedia Data

Audio retrieval
– Find an audio clip in a large database.

Video retrieval
– Find a specific video clip.
– Find a video shot that contains a specific person or action.
– Browsing video …

Data Structures in Media Retrieval

In multimedia data retrieval we often need to find the “nearest neighbor” of the exemplar in the database.

We can abstract each media object as a feature vector. Our goal is to organize the database so that we can locate the most similar vector as quickly as possible.
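As a baseline, the simplest approach compares the query vector against every vector in the database. The sketch below assumes Euclidean distance and uses made-up two-dimensional feature vectors; real feature vectors would typically have many more dimensions.

```python
import math


def nearest_neighbor(database, query):
    """Return (index, distance) of the database vector closest to query."""
    best_index, best_dist = -1, math.inf
    for i, vec in enumerate(database):
        d = math.dist(vec, query)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


# Hypothetical feature vectors, one per media object.
database = [(0.1, 0.9), (0.8, 0.2), (0.5, 0.5)]
index, dist = nearest_neighbor(database, (0.75, 0.25))
print(index, round(dist, 3))
```

This linear scan costs one distance computation per stored object for every query, which is what better organization of the database tries to avoid.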

Q: Think of some data structures that help speed up the search.

K-d Tree

A 2D k-d tree

[Figure: a 2D k-d tree over the points a to f, shown as a partition of the plane and as the corresponding tree.]
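A k-d tree can be sketched as follows: each node splits the remaining points at the median along one axis, cycling through the axes with depth, and a nearest-neighbor query skips a subtree when the splitting plane is farther away than the best match found so far. The coordinates below are hypothetical (the positions of the points in the figure are not given), and the class and function names are made up for this example.

```python
import math


class Node:
    def __init__(self, point, axis, left=None, right=None):
        self.point, self.axis, self.left, self.right = point, axis, left, right


def build_kdtree(points, depth=0):
    """Recursively split on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build_kdtree(points[:mid], depth + 1),
                build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, query, best=None):
    """Return the stored point closest to query, pruning subtrees that cannot win."""
    if node is None:
        return best
    if best is None or math.dist(node.point, query) < math.dist(best, query):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the best so far.
    if abs(diff) < math.dist(best, query):
        best = nearest(far, query, best)
    return best


# Six hypothetical 2D feature vectors.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))
```

For low-dimensional data this pruning usually visits only a small part of the tree; in very high dimensions the benefit shrinks and the search can approach a linear scan.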

Summary

Content based multimedia retrieval is still not mature. Many problems still need to be solved.

There is no single method that solves all the problems.

We need better object detection and classification schemes.

Other related problems like multimedia data mining are also attracting more and more interest.