Classifying Motion Picture Audio

Classifying Motion Picture Audio Eirik Gustavsen 07.06.07

Page 1: Classifying Motion Picture Audio

Classifying Motion Picture Audio

Eirik Gustavsen, 07.06.07

Page 2: Classifying Motion Picture Audio

Outline

• Motivation
• Thesis
• State of the Art
• Proposed system
• Experimental setup
• Results
• Future work
• Conclusion

Page 3: Classifying Motion Picture Audio

Motivation

• Most projects classify clear classes or classes with noise.

• Few clear boundaries in motion picture audio
• Subjective descriptions of movies
• Difficult to compare movie content

Page 4: Classifying Motion Picture Audio

Thesis

It is possible to automatically create a table of contents of a motion picture, based on its audio track only.

Page 5: Classifying Motion Picture Audio

Research questions

• Find the best LLDs (low level descriptors) for classifying motion picture audio

• Detect boundaries between audio classes within complex audio segments

• Automatically create a TOC based on the audio track only

Page 6: Classifying Motion Picture Audio

Pre-Processing

• 44100 Hz sample rate
• Mono
• 16 bits

30 ms windows (LW)
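The windowing step can be illustrated with a minimal sketch. It assumes non-overlapping windows and drops the incomplete tail; the slides do not say whether the windows overlap, and `frame_signal` is a hypothetical helper name, not taken from the thesis.

```python
def frame_signal(samples, sample_rate=44100, window_ms=30):
    """Split a mono signal into consecutive windows of window_ms milliseconds.

    Assumption: windows do not overlap and a trailing partial window is dropped.
    """
    # 30 ms at 44100 Hz is 1323 samples per window.
    window_len = sample_rate * window_ms // 1000
    n_frames = len(samples) // window_len
    return [samples[i * window_len:(i + 1) * window_len] for i in range(n_frames)]


# One second of silent 16-bit samples as placeholder data.
signal = [0] * 44100
frames = frame_signal(signal)
print(len(frames), len(frames[0]))
```

Each window then becomes one observation for the descriptor stage.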

Page 7: Classifying Motion Picture Audio

Low Level Descriptors

Time domain Frequency domain

Page 8: Classifying Motion Picture Audio

Low Level Descriptors

• Total of 23 low level descriptors

TIME DOMAIN

• Audio Power
• Audio Wave Form
• Root-Mean Square
• Short Time Energy
• Low Short Time Energy Ratio
• Zero-Crossing Rate
• High Zero-Crossing Rate Ratio
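A few of the time-domain descriptors above can be sketched as follows. The slides give no formulas, so these follow commonly used definitions and are assumptions: short time energy as mean squared amplitude, and the two ratio descriptors as the fraction of windows below 0.5 times the mean energy or above 1.5 times the mean zero-crossing rate. The 0.5 and 1.5 factors are conventional choices, not values confirmed by the source.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one window."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def short_time_energy(frame):
    """Mean squared amplitude of one window (assumed definition)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def low_energy_ratio(energies, factor=0.5):
    """Fraction of windows whose energy is below factor * mean energy."""
    mean = sum(energies) / len(energies)
    return sum(1 for e in energies if e < factor * mean) / len(energies)


def high_zcr_ratio(zcrs, factor=1.5):
    """Fraction of windows whose ZCR is above factor * mean ZCR."""
    mean = sum(zcrs) / len(zcrs)
    return sum(1 for z in zcrs if z > factor * mean) / len(zcrs)


frame = [1, -1, 1, -1]  # toy window that changes sign at every sample
print(rms(frame), short_time_energy(frame), zero_crossing_rate(frame))
```

The two ratio descriptors are computed over a group of windows (for example one second of audio), not over a single window.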

FREQUENCY DOMAIN

• Audio Spectrum Centroid
• Fundamental Frequency
• 10 Mel-Frequency Cepstral Coefficients
• Spectrum Flux
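Two of the frequency-domain descriptors can be sketched in plain Python. This is a simplification, not the thesis implementation: the centroid here is a linear-frequency, magnitude-weighted mean (the MPEG-7 Audio Spectrum Centroid is defined on a log-frequency power spectrum), flux is taken as the sum of squared bin differences between consecutive windows, and the DFT is the naive O(N^2) form for readability.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 (naive DFT, fine for short windows)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags, sample_rate, frame_len):
    """Magnitude-weighted mean frequency in Hz; 0.0 for an all-zero spectrum."""
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / frame_len
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def spectrum_flux(prev_mags, mags):
    """Sum of squared magnitude differences between two consecutive windows."""
    return sum((b - a) ** 2 for a, b in zip(prev_mags, mags))


# Toy window: a 1000 Hz sine sampled at 8000 Hz, exactly 10 periods long.
sample_rate, n = 8000, 80
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)]
mags = magnitude_spectrum(tone)
print(round(spectral_centroid(mags, sample_rate, n), 3))
```

For real 30 ms windows an FFT routine would replace the naive DFT.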

Page 9: Classifying Motion Picture Audio

Dimensionality reduction

Principal components analysis (PCA) is a technique used to reduce multidimensional data sets to lower dimensions for analysis.

f(1), f(2), f(3), f(4), f(5), ..., f(23) → PCA → d(1), d(2), d(3)
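The reduction from 23 descriptor values to 3 components can be sketched with NumPy. This is a generic PCA via singular value decomposition under stated assumptions, not the thesis code: `pca_reduce` is a hypothetical name and the input matrix is random placeholder data, one row per window.

```python
import numpy as np


def pca_reduce(features, n_components=3):
    """Project each row of `features` onto the top principal components."""
    x = np.asarray(features, dtype=float)
    x_centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(x_centered, full_matrices=False)
    return x_centered @ vt[:n_components].T


rng = np.random.default_rng(0)
lld_matrix = rng.normal(size=(200, 23))  # placeholder values, not real LLDs
reduced = pca_reduce(lld_matrix)
print(reduced.shape)
```

In practice the principal directions would be fitted on training data and reused to project new windows.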

Page 10: Classifying Motion Picture Audio

K Nearest Neighbors
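As a reminder of how k-nearest-neighbors classification works: a query point receives the majority label among its k closest training points. The sketch below assumes Euclidean distance, k = 3, and made-up 3-dimensional points standing in for PCA output; none of these choices are stated in the slides.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training points; labels reuse class names from the slides.
train_points = [
    (0.0, 0.1, 0.0), (0.1, 0.0, 0.1), (0.2, 0.1, 0.0),
    (5.0, 5.1, 4.9), (5.2, 4.8, 5.0), (4.9, 5.0, 5.1),
]
train_labels = ["speech"] * 3 + ["music"] * 3

print(knn_classify(train_points, train_labels, (0.1, 0.1, 0.1)))
```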

Page 11: Classifying Motion Picture Audio

Proposed system

Pre-processing → LLD → Norm → PCA → KNN → Post-processing → TOC generation
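The slides do not say what the "Norm" stage does. A plausible reading is that each descriptor is rescaled so that no single one dominates the distances used by PCA and KNN. The sketch below assumes per-column z-score normalization; min-max scaling would be an equally reasonable guess, and `zscore_columns` is a hypothetical name.

```python
import math


def zscore_columns(rows):
    """Rescale each column to zero mean and unit variance.

    Constant columns (zero standard deviation) are mapped to 0.0.
    """
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n)
        for col, m in zip(columns, means)
    ]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Two toy descriptors on very different scales, one row per window.
rows = [[1.0, 100.0], [2.0, 100.0], [3.0, 100.0]]
normalized = zscore_columns(rows)
print(normalized)
```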

Page 12: Classifying Motion Picture Audio

Classifying Audio

Speech

Noise (white)

Music

”Silence”

Mixed audio classes

Page 13: Classifying Motion Picture Audio

Class Boundary Detection

Page 14: Classifying Motion Picture Audio

Class Boundary Detection

Page 15: Classifying Motion Picture Audio

Class Boundary Detection
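One simple way to turn per-window class labels into boundaries is sketched below. It is an illustration under assumptions, not the method shown on these slides: labels are first smoothed with a sliding majority vote to suppress isolated misclassifications, then runs of identical labels are merged into (start, end, label) segments. The function names, the radius of 2 windows and the 30 ms window duration are all assumed.

```python
from collections import Counter


def smooth_labels(labels, radius=2):
    """Replace each label by the majority label in a window of +/- radius."""
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - radius):i + radius + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed


def find_segments(labels, window_s=0.03):
    """Merge runs of identical labels into (start_s, end_s, label) tuples."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the list or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start * window_s, i * window_s, labels[start]))
            start = i
    return segments


# Toy label sequence with one isolated "music" window inside speech.
labels = ["speech"] * 5 + ["music"] + ["speech"] * 4 + ["music"] * 6
smoothed = smooth_labels(labels)
for segment in find_segments(smoothed):
    print(segment)
```

Segment boundaries produced this way could also serve as entries in a table of contents.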

Page 16: Classifying Motion Picture Audio

Finding most suitable LLDs

Most Suitable:

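• ASC (Audio Spectrum Centroid)
• AWF (Audio Wave Form)
• RMS (Root-Mean Square)
• HZCRR (High Zero-Crossing Rate Ratio)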

Page 17: Classifying Motion Picture Audio

Sample Results

Music with low volume

Clear speech

Speech with background environmental sounds

Fading between music and speech

Speech with Background music

Jingle

”Some mistakes”

Page 18: Classifying Motion Picture Audio

Future Work

• To be done in this thesis
– Post processing
– TOC

• Open research questions for future work
– New motion picture audio classes
– Detecting sound objects
– Speech recognition

Page 19: Classifying Motion Picture Audio

Conclusion

• Pre-processing makes it possible to classify motion picture audio correctly

• Using the right combination of LLDs improves the classification result

Page 20: Classifying Motion Picture Audio

Questions?