lam: musical audio similarity

Post on 25-Feb-2016

LAM: Musical Audio Similarity

Michael Casey
Centre for Cognition, Computation and Culture
Department of Computing
Goldsmiths College, University of London

Overview

• Machine Music Understanding
• Features / Classes / Clusters
• Real-Time Audio Matching
– Feature Extraction
– Feature Similarity (Indexing / Retrieval)
– PD/MSP Tools
• Music Similarity Applications
– Sound object matching
– Texture matching

Sound Understanding

[Figure: Signal Processing and Sound Understanding]

Feature Extraction

[Figure: audio source divided into overlapping frames (frame 1, frame 2, frame 3) along a time axis marked at 10 ms, 20 ms, 30 ms and 40 ms]
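The framing step can be sketched as follows. This is a minimal illustration, not the code behind the slides: the 10 ms hop is read off the time axis, while the 30 ms frame length, the 16 kHz sample rate, and the name `frame_signal` are assumptions made for the example.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=30.0, hop_ms=10.0):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop_len
    # Row i holds samples [i * hop_len, i * hop_len + frame_len).
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return x[idx]

# Hypothetical input: one second of a ramp "signal" at an assumed 16 kHz.
sr = 16000
x = np.arange(sr, dtype=float)
frames = frame_signal(x, sr)
```

With these assumed values each frame shares two thirds of its samples with the next one, which is the overlap the diagram depicts.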

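Statistical Learning for Decision Making

$$P(c \mid x) = \frac{p(x \mid c)\, P(c)}{p(x)}$$

Here $x$ is a feature vector and $c$ is a class.

[Figure: partitioning of feature space by a decision boundary separating the classes Music and Speech]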
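A Bayes decision between two classes can be sketched as below. The one-dimensional Gaussian class models, their parameters, and the equal priors are all hypothetical values chosen for illustration; the slides do not say which density models were used.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def posterior(x, params, priors):
    """Return P(class | x) for every class using Bayes' rule."""
    joint = {c: gaussian_pdf(x, *params[c]) * priors[c] for c in params}
    evidence = sum(joint.values())  # p(x), the normalizing term
    return {c: joint[c] / evidence for c in joint}

# Hypothetical class models: (mean, variance) of a single feature.
params = {"music": (0.0, 1.0), "speech": (3.0, 1.0)}
priors = {"music": 0.5, "speech": 0.5}

def classify(x):
    """Pick the class with the largest posterior probability."""
    post = posterior(x, params, priors)
    return max(post, key=post.get)
```

With equal priors and variances the decision boundary sits halfway between the two means, so `classify` switches from "music" to "speech" at x = 1.5.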

MPEG-7 Audio Tools

• Audio
• Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
• Log Amplitude
• Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD)
• Hidden Markov Model (SoundModelDS)
• State Path (SoundModelStatePathD)

Use estimated state sequence as a feature
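The first stages of this chain (log-frequency spectrum, log amplitude, decorrelation with dimension reduction) might look like the sketch below. It is not the MPEG-7 reference implementation: the band count, the 62.5 Hz lower edge, the Hann window, and the use of PCA via SVD as the decorrelating transform are assumptions, and the function names are invented for the example.

```python
import numpy as np

def log_frequency_envelope(frames, sample_rate, n_bands=16, f_lo=62.5):
    """Sum each frame's power spectrum into logarithmically spaced bands."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    edges = np.geomspace(f_lo, sample_rate / 2.0, n_bands + 1)
    bands = np.zeros((frames.shape[0], n_bands))
    for k in range(n_bands):
        mask = (freqs >= edges[k]) & (freqs < edges[k + 1])
        bands[:, k] = power[:, mask].sum(axis=1)
    return bands

def to_decibels(bands, floor=1e-10):
    """Log-amplitude scaling, with a floor to avoid log(0)."""
    return 10.0 * np.log10(np.maximum(bands, floor))

def project(features, n_components=4):
    """Decorrelate and reduce dimension with PCA computed via SVD."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T
    return centered @ basis, basis

# Hypothetical input: 50 noise frames of 480 samples at an assumed 16 kHz.
rng = np.random.default_rng(0)
demo_frames = rng.standard_normal((50, 480))
envelope = log_frequency_envelope(demo_frames, 16000)
projected, basis = project(to_decibels(envelope))
```

The projected columns are mutually uncorrelated, which is the property a diagonal-covariance model in the next stage would rely on.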

MPEG-7 Audio Strings: Acoustic Lexicons

• Audio
• Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
• Log Amplitude
• Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD)
• Hidden Markov Model (SoundModelDS)
• State Path (SoundModelStatePathD)

SYMBOL STRING: ? 7 1 V 7 1 0 1 ...
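One way to obtain such a state path is Viterbi decoding of a trained hidden Markov model, followed by mapping each state index to a printable character. The sketch below assumes per-frame log-likelihoods are already available; the toy three-state model and the alphabet "abc" are hypothetical, and the slides do not specify the decoder or the symbol mapping used.

```python
import numpy as np

def viterbi(log_lik, log_trans, log_init):
    """Most likely state sequence given per-frame log-likelihoods (T x N)."""
    n_frames, n_states = log_lik.shape
    score = log_init + log_lik[0]
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans  # cand[i, j]: best path ending i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_lik[t]
    path = np.empty(n_frames, dtype=int)
    path[-1] = score.argmax()
    for t in range(n_frames - 1, 0, -1):  # trace the back-pointers
        path[t - 1] = back[t, path[t]]
    return path

def to_symbol_string(path, alphabet):
    """Map each state index to one character of the alphabet."""
    return "".join(alphabet[s] for s in path)

# Hypothetical toy model: 3 "sticky" states, 5 frames.
truth = [0, 0, 1, 1, 2]
lik = np.full((5, 3), 0.05)
lik[np.arange(5), truth] = 0.9
trans = np.full((3, 3), 0.1) + 0.7 * np.eye(3)
init = np.full(3, 1.0 / 3.0)
path = viterbi(np.log(lik), np.log(trans), np.log(init))
symbols = to_symbol_string(path, "abc")
```

A 40-state model would need a 40-character alphabet, which would explain the mix of digits, letters and punctuation in the example string.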

State Symbol Sequence (40 State Model)

? 7 1 V 7 1 0 1 ...

SoundModelStateHistogramD

[Figure: state index plotted against time, once in seconds and once in 0.01 s frames]
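A state histogram summarizes a state path by how often each state is visited. The sketch below normalizes the counts to relative frequencies, which is an assumption about the representation; the function name is invented for the example.

```python
import numpy as np

def state_histogram(path, n_states):
    """Relative frequency of each state index in a state path."""
    counts = np.bincount(np.asarray(path), minlength=n_states).astype(float)
    return counts / counts.sum()

# Hypothetical 5-frame path through a 4-state model.
hist = state_histogram([0, 0, 1, 1, 2], n_states=4)
```

Because the histogram discards ordering, two segments that visit the same states in a different order receive the same descriptor.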

Self-Similarity Matrix

$$d(a, b) = \cos^{-1}\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$

[Figure: S-Matrix comparing feature vectors a and b]
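A self-similarity matrix compares every feature vector of a recording with every other one. The sketch below uses the angle between vectors (the arccosine of the normalized dot product) as the pairwise measure; the function name and the tiny input are hypothetical.

```python
import numpy as np

def angular_distance_matrix(features, eps=1e-12):
    """Pairwise angle in radians between the rows of `features`."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)  # guard against zero rows
    cosine = np.clip(unit @ unit.T, -1.0, 1.0)  # clip rounding overshoot
    return np.arccos(cosine)

# Hypothetical input: three 2-D feature vectors, one per time step.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
s_matrix = angular_distance_matrix(feats)
```

Entry (i, j) is small when frames i and j are alike, so repeated material shows up as low-valued stripes parallel to the main diagonal.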

Efficient Storage / Retrieval

• Real-Time Access

• Large Databases

• Distributed Databases

PostgreSQL Database Representation of State Path “Strings” and Histograms

Similarity

• Compute distance between feature pairs
• Features == SoundModelStateHistogramD
• Similarity Metric
– dist(a,b) >= 0
– dist(a,b) == 0 iff a == b
– dist(a,b) + dist(b,c) >= dist(a,c)
• Vector Dot Product

$$d(a, b) = \cos^{-1}\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$
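The listed properties can be checked numerically for the angle between two histograms, and the same distance can rank a database against a query. The histograms below are made-up values, and `rank_by_distance` is a hypothetical helper, not part of any MPEG-7 tool.

```python
import numpy as np

def angular_distance(a, b):
    """Angle in radians between vectors a and b."""
    cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cosine, -1.0, 1.0)))

def rank_by_distance(query, database):
    """Indices of database rows ordered from nearest to farthest."""
    dists = [angular_distance(query, row) for row in database]
    return sorted(range(len(database)), key=dists.__getitem__)

# Hypothetical state histograms (each row sums to 1).
database = np.array([[0.7, 0.2, 0.1],
                     [0.1, 0.1, 0.8],
                     [0.3, 0.4, 0.3]])
query = np.array([0.6, 0.3, 0.1])
ranking = rank_by_distance(query, database)

# Triangle inequality for one triple of histograms.
a, b, c = database
triangle_ok = angular_distance(a, b) + angular_distance(b, c) >= angular_distance(a, c)
```

Strictly, the angle is zero for any two parallel vectors; for histograms normalized to sum to one, parallel vectors are identical, so the "iff a == b" condition holds.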

Similarity of Feature Trajectories

Dynamic Time Warping
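Dynamic time warping aligns two feature trajectories that unfold at different speeds and returns the cost of the best alignment. The sketch below is the textbook recurrence with a Euclidean local cost and unconstrained steps; the slides do not say which variant or constraints were used.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """DTW cost between two sequences of feature vectors (T x D arrays)."""
    a = np.asarray(seq_a, dtype=float)
    b = np.asarray(seq_b, dtype=float)
    # local[i, j]: Euclidean distance between frame i of a and frame j of b.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    n, m = local.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return float(acc[n, m])

# Hypothetical 1-D trajectories: `slow` is `fast` played at a slower rate.
fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0]]
other = [[0.0], [2.0], [2.0]]
d_same = dtw_distance(fast, slow)
d_other = dtw_distance(fast, other)
```

The time-stretched copy aligns at zero cost, while the trajectory with different content does not.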

Acousticon Strings

• Distance Metric
– String Edit Distance (Levenshtein)
• Scalable to Large Databases
– PostgreSQL Implementation
– Can use built-in Index Structures
• Scalable to Real-Time Implementation
– matching and audio streaming (< 20 ms)
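The Levenshtein distance between two state-symbol strings counts the fewest single-character insertions, deletions and substitutions needed to turn one into the other. Below is a plain-Python sketch of the standard dynamic-programming algorithm; it is illustrative only and says nothing about how the PostgreSQL implementation was written.

```python
def levenshtein(s, t):
    """Minimum number of edits (insert, delete, substitute) turning s into t."""
    previous = list(range(len(t) + 1))  # distances from "" to prefixes of t
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1,               # delete cs
                current[j - 1] + 1,            # insert ct
                previous[j - 1] + (cs != ct),  # substitute, or match for free
            ))
        previous = current
    return previous[-1]

# Hypothetical state strings differing by one dropped symbol.
d_equal = levenshtein("?71V7101", "?71V7101")
d_drop = levenshtein("?71V7101", "?71V701")
```

PostgreSQL's `fuzzystrmatch` extension provides a `levenshtein()` function; whether the system described here relied on it is not stated.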

Information Retrieval for Creativity

• Utilize an extant sound database for new material

• Take the structure of a music clip but replace the content.

• New interfaces for music creativity.

Audio Information Retrieval

• MPEG-7 Database: a pre-indexed collection of sounds
• Audio Query → Extract → Segment → Match (against the MPEG-7 Database) → Result List
– Extract: feature extraction from audio.
– Segment: partitioning of audio into chunks.
– Match: find similar chunks of audio.
– Result List: a sound or scene or list of sounds
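The segment and match stages might be sketched as follows, assuming the query and the database have already been reduced to state paths. Fixed-length, non-overlapping chunks and cosine matching of chunk histograms are assumptions made for the example; the function names are hypothetical.

```python
import numpy as np

def chunk_histograms(path, n_states, chunk_len):
    """State histogram for each consecutive, non-overlapping chunk of a path."""
    path = np.asarray(path)
    n_chunks = len(path) // chunk_len  # a trailing partial chunk is dropped
    hists = np.zeros((n_chunks, n_states))
    for k in range(n_chunks):
        chunk = path[k * chunk_len:(k + 1) * chunk_len]
        hists[k] = np.bincount(chunk, minlength=n_states) / chunk_len
    return hists

def match_chunks(query_hists, db_hists):
    """For each query chunk, index of the database chunk with the highest cosine."""
    q = query_hists / np.linalg.norm(query_hists, axis=1, keepdims=True)
    d = db_hists / np.linalg.norm(db_hists, axis=1, keepdims=True)
    return (q @ d.T).argmax(axis=1)

# Hypothetical state paths for a 3-state model, chunked every 4 frames.
db_path = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
query_path = [2, 2, 2, 1, 0, 0, 1, 0]
db_hists = chunk_histograms(db_path, n_states=3, chunk_len=4)
query_hists = chunk_histograms(query_path, n_states=3, chunk_len=4)
matches = match_chunks(query_hists, db_hists)
```

Each entry of `matches` names the database chunk that would stand in for the corresponding query chunk, keeping the query's structure while replacing its content.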

Musaics: Real-Time Matching
