
Understanding the Semantics of Media

Lecture Notes on Video Search & Mining, Spring 2012

Presented by Jun Hee Yoo

Biointelligence Laboratory

School of Computer Science and Engineering

Seoul National University

http://bi.snu.ac.kr


© 2012, SNU CSE Biointelligence Lab., http://bi.snu.ac.kr 2

Problem Statement

Semantic Understanding

Some tools attempt to segment video at a higher level, but this level of analysis does not tell us much about the meaning represented in the media.


Approach

Segmentation Literature

Use LSI because it allows us to quantify the position of a portion of the document in a multi-dimensional semantic space.

Propose to summarize the text with LSI and analyze the signal with smooth Gaussians.

Semantic Retrieval Literature

Use mixtures of probability experts for semantic-audio retrieval (MPESAR), a more sophisticated model connecting words and media.


Analysis Tools

SVD

Reduces the dimensionality of a signal in a manner that is optimal in a least-squares sense.

It is used to reduce the dimensionality of both the audio and the image data of the video.

Color Space

Concatenate into 512 histogram bins.

Word Space

Uses latent semantic indexing (LSI) with SVD. Distance is measured by the angle between vectors.
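The word-space idea can be illustrated with a minimal sketch, assuming a small term-by-document count matrix. The function names `lsi_project` and `angle_between` and the toy counts are made up for illustration; they are not taken from the lecture.

```python
import numpy as np

def lsi_project(term_doc, k):
    """Project documents (columns of term_doc) into a k-dimensional space via truncated SVD."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular values; each column of the result is one document.
    return s[:k, None] * vt[:k, :]

def angle_between(a, b):
    """Angle in radians between two vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Toy term-by-document counts (rows: terms, columns: documents); values are made up.
counts = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])
docs = lsi_project(counts, k=2)
print(angle_between(docs[:, 0], docs[:, 1]))  # documents sharing terms: small angle
print(angle_between(docs[:, 0], docs[:, 2]))  # documents with no shared terms: large angle
```

Using the angle rather than the Euclidean distance makes the comparison insensitive to document length, since only the direction of each vector matters.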


Segmenting Video

Temporal Properties of Video

Color: It provides robust evidence for a shot change in a video signal. However, it cannot tell us the global structure of the video.

Random words from a transcript: The words indicate a lot about the overall structure of the story.
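To make the color evidence concrete, here is a minimal sketch of how a color histogram can signal a shot change. It assumes, as one plausible reading of the 512 histogram bins mentioned under Color Space, that the 3 most significant bits of each of the R, G and B channels are concatenated into a 9-bit bin index. The function names and the synthetic frames are hypothetical.

```python
import numpy as np

def color_histogram_512(frame):
    """Normalized 512-bin histogram of an RGB uint8 frame of shape (H, W, 3).

    Assumption: the top 3 bits of R, G and B are concatenated into one
    9-bit index (8 * 8 * 8 = 512 bins).
    """
    top = (frame >> 5).astype(np.int64)  # keep the top 3 bits: values 0..7
    index = (top[..., 0] << 6) | (top[..., 1] << 3) | top[..., 2]
    hist = np.bincount(index.ravel(), minlength=512).astype(float)
    return hist / hist.sum()

def histogram_change(frames):
    """L1 distance between the histograms of consecutive frames."""
    hists = np.array([color_histogram_512(f) for f in frames])
    return np.abs(np.diff(hists, axis=0)).sum(axis=1)

# Synthetic clip: three all-red frames followed by three all-blue frames.
red = np.zeros((4, 4, 3), dtype=np.uint8)
red[..., 0] = 255
blue = np.zeros((4, 4, 3), dtype=np.uint8)
blue[..., 2] = 255
frames = [red] * 3 + [blue] * 3

change = histogram_change(frames)
print(int(np.argmax(change)))  # index of the largest frame-to-frame change
```

A large frame-to-frame distance marks a likely cut, but the measure is purely local, which is why it says little about the global structure of the video.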


Segmenting Video

Test Material

CNN Headline News (30-minute TV show).

21st Century Jet (documentary).

Automatic speech recognition (ASR) is used to provide a transcript of the audio.


Segmenting Video

Scale Space

Convert the original signal into scale space.

In scale space, we analyze a signal with many different kernels.

With Low Pass Filter

Histogram
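A minimal sketch of a scale-space representation, assuming Gaussian low-pass kernels of increasing width applied to a one-dimensional signal. The kernel truncation at three standard deviations, the edge padding and the chosen sigmas are illustrative assumptions, not settings from the lecture.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian kernel truncated at 3 standard deviations."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def scale_space(signal, sigmas):
    """Return an array of shape (len(sigmas), len(signal)): one smoothed copy per scale."""
    signal = np.asarray(signal, dtype=float)
    rows = []
    for sigma in sigmas:
        kernel = gaussian_kernel(sigma)
        radius = len(kernel) // 2
        # Repeat the edge samples so the output keeps the input length.
        padded = np.pad(signal, radius, mode="edge")
        rows.append(np.convolve(padded, kernel, mode="valid"))
    return np.array(rows)

# A step signal: the edge stays in place but gets blurrier as sigma grows.
step = np.concatenate([np.zeros(50), np.ones(50)])
space = scale_space(step, sigmas=[1.0, 4.0, 16.0])
print(space.shape)
```

Small sigmas preserve fine detail such as individual shot changes, while large sigmas keep only slow changes, which is what allows a hierarchical view of the same signal.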


Segmenting Video

Combined Image and Audio Data

Combined color, words and scale space analysis. The result is a 20-dimensional vector function of time and scale.


Segmenting Video

Hierarchical Segmentation Results

Color and word autocorrelations for the Boeing 777 video


Segmenting Video

Hierarchical Segmentation Results

Grouping 4-8 sentences produces a larger semantic autocorrelation.
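The effect of grouping can be illustrated with a small sketch on synthetic data. It assumes each sentence is a noisy vector around a shared topic direction, and measures autocorrelation as the mean cosine similarity between neighbouring vectors. The functions `group_vectors` and `autocorrelation`, the noise level and the dimensions are all made up for illustration; the lecture's own measure may differ.

```python
import numpy as np

def group_vectors(vectors, group_size):
    """Sum consecutive runs of `group_size` vectors; leftover rows are dropped."""
    vectors = np.asarray(vectors, dtype=float)
    n_groups = len(vectors) // group_size
    trimmed = vectors[: n_groups * group_size]
    return trimmed.reshape(n_groups, group_size, -1).sum(axis=1)

def autocorrelation(vectors, lag):
    """Mean cosine similarity between vectors that are `lag` (>= 1) positions apart."""
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    return float(np.mean(np.sum(v[:-lag] * v[lag:], axis=1)))

# Synthetic "sentences": a fixed topic direction plus independent noise.
rng = np.random.default_rng(0)
topic = np.zeros(10)
topic[0] = 1.0
sentences = topic + rng.normal(scale=1.0, size=(400, 10))

print(autocorrelation(sentences, lag=1))                    # single sentences
print(autocorrelation(group_vectors(sentences, 8), lag=1))  # groups of 8 sentences
```

In this toy setting the per-sentence noise averages out within a group while the shared topic adds up, so grouped vectors resemble their neighbours more closely.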


Segmenting Video

Intermediate Results

A scale-space segmentation algorithm produced a boundary map showing the edges in the signal.
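One way to picture a boundary map is sketched below. It assumes a boundary is a local maximum in the size of the change between consecutive Gaussian-smoothed samples, evaluated separately at each scale. This is an illustrative stand-in, not the segmentation algorithm used in the lecture, and the names `smooth` and `boundary_map` are hypothetical.

```python
import numpy as np

def smooth(signal, sigma):
    """Gaussian-smooth each column of a (time, dims) array along the time axis."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(signal, ((radius, radius), (0, 0)), mode="edge")
    columns = [np.convolve(padded[:, d], kernel, mode="valid")
               for d in range(signal.shape[1])]
    return np.stack(columns, axis=1)

def boundary_map(signal, sigmas, threshold=0.0):
    """Boolean array of shape (len(sigmas), time - 1).

    Entry [s, t] is True where the change between smoothed samples t and t + 1
    is a local maximum along time and exceeds `threshold`.
    """
    signal = np.asarray(signal, dtype=float)
    rows = []
    for sigma in sigmas:
        step = np.linalg.norm(np.diff(smooth(signal, sigma), axis=0), axis=1)
        peak = np.zeros(step.shape, dtype=bool)
        peak[1:-1] = ((step[1:-1] > step[:-2])
                      & (step[1:-1] >= step[2:])
                      & (step[1:-1] > threshold))
        rows.append(peak)
    return np.array(rows)

# Two-dimensional toy signal that switches "topic" halfway through.
signal = np.vstack([np.tile([1.0, 0.0], (30, 1)), np.tile([0.0, 1.0], (30, 1))])
edges = boundary_map(signal, sigmas=[2.0, 4.0], threshold=1e-6)
print(np.flatnonzero(edges[0]), np.flatnonzero(edges[1]))
```

Edges that persist across many scales correspond to major boundaries, while edges that appear only at fine scales correspond to minor ones.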


Segmenting Video

Comparison with ground truth. Left: estimated result. Right: ground truth.


Segmenting Video

Shot Boundary Segmentation: uses a commercial product designed by YesVideo.


Segmenting Video

Manual Segmentation result


Semantic Retrieval

MPESAR process


Semantic Retrieval

Acoustic Signal processing chain

Acoustic to Semantic Lookup


Semantic Retrieval

Testing


Retrieval Results

Histogram of true label ranks based on likelihoods from audio-to-semantic tests

Histogram of true label ranks based on likelihoods from semantic-to-acoustic tests
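A histogram of true label ranks can be computed as in the following sketch. It assumes a matrix of likelihoods with one row per test item and one column per candidate label, where rank 1 means the true label received the highest likelihood. The function name and the scores are invented for illustration and are not results from the lecture.

```python
import numpy as np

def true_label_ranks(likelihoods, true_labels):
    """Rank (1 = best) of the true label for each test item.

    likelihoods: (n_items, n_labels) array, larger means more likely.
    true_labels: (n_items,) integer indices of the correct labels.
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    true_labels = np.asarray(true_labels)
    true_scores = likelihoods[np.arange(len(true_labels)), true_labels]
    # Count the labels scoring strictly higher than the true one, then add 1.
    return 1 + (likelihoods > true_scores[:, None]).sum(axis=1)

# Made-up likelihoods for three test items and three labels.
scores = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
])
ranks = true_label_ranks(scores, [0, 1, 0])
# hist[r - 1] is the number of items whose true label was ranked r.
hist = np.bincount(ranks, minlength=scores.shape[1] + 1)[1:]
print(ranks, hist)
```

The more mass the histogram has at low ranks, the more often the model places the correct label near the top of its list.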