The Extraction of Structure from a Musical Piece

Kasper.Souren@ircam.fr
http://www.ircam.fr/anasyn/souren/

Slides: recherche.ircam.fr/equipes/analyse-synthese/jjcaas/slides/Souren.pdf


Musical Structure

related to human perception

from the listener's standpoint rather than from the composer's standpoint


Finding structure

audio, no MIDI or symbolic information

audio descriptors

not (yet) limited to one style

looking for similarity and borders


Most significant spectrum variations

musical piece (ogg, wav, mp3)

=> raw audio (11025 Hz)

=> log FFT => power spectrogram

=> log FFT => spectrum band variation information

=> EOF principal components => most significant spectrum band variations


Most significant spectrum variations

spectrogram: 100 feature vectors per second

PC of "log FFT" of frames from every band

most significant spectrum band variations: about 1 feature vector per second

[Figure: spectrogram, axes time and frequency]
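The two "log FFT" stages can be sketched roughly as follows. This is only an illustration under assumptions: the frame length, hop size, Hann window and block length of 100 frames are hypothetical choices picked to match the rates on the slide (about 100 spectrogram frames per second at 11025 Hz, about 1 feature vector per second), and the function names are invented here.

```python
import numpy as np

def log_power_spectrogram(signal, frame_len=256, hop=110):
    """Log power spectrum of overlapping Hann-windowed frames.

    With an 11025 Hz signal, hop=110 gives roughly 100 frames per second.
    frame_len and hop are assumptions, not values from the slides.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)        # shape: (n_frames, frame_len // 2 + 1)

def band_variation(spec, block=100):
    """Second "log FFT": per band, log magnitude spectrum of each block of frames."""
    n_blocks = spec.shape[0] // block
    blocks = spec[:n_blocks * block].reshape(n_blocks, block, spec.shape[1])
    modulation = np.abs(np.fft.rfft(blocks, axis=1))   # FFT along time in each block
    return np.log(modulation + 1e-10).reshape(n_blocks, -1)

# Hypothetical test signal: three seconds of a 440 Hz sine at 11025 Hz.
sr = 11025
t = np.arange(3 * sr) / sr
signal = np.sin(2 * np.pi * 440 * t)
spec = log_power_spectrogram(signal)
features = band_variation(spec)
```

The resulting `features` rows are high-dimensional, which is what the dimension reduction on the next slide addresses.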


EOF based on SVD

Empirical Orthogonal Functions, based on Singular Value Decomposition

popular in climate research

type of Principal Component Analysis

useful for reducing the number of dimensions while explaining a large part of the variance
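A minimal sketch of EOF analysis through an SVD, assuming the feature vectors are stacked as rows of a matrix (one row per time step). The function name `eof_reduce`, the number of components and the synthetic data are illustrative assumptions.

```python
import numpy as np

def eof_reduce(features, n_components=3):
    """Project feature vectors onto their leading EOFs (principal axes)."""
    centered = features - features.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)                 # fraction of variance per EOF
    scores = u[:, :n_components] * s[:n_components]     # principal components over time
    return scores, vt[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
# Hypothetical data: 200 time steps, 50 dimensions, variance concentrated in 2 directions.
latent = rng.normal(size=(200, 2)) * [10.0, 5.0]
mixing = rng.normal(size=(2, 50))
features = latent @ mixing + 0.1 * rng.normal(size=(200, 50))
scores, eofs, explained = eof_reduce(features, n_components=2)
```

Here `explained` shows how much of the variance the retained components account for, which is the quantity to inspect when choosing how many dimensions to keep.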


Similarity matrix (J. Foote, 1999)

1) the audio descriptors are points in an N-dimensional space

2) calculate mutual distances: distance matrix

3) rescale: similarity matrix
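The three steps can be sketched as below. The slides name neither the distance measure nor the rescaling, so the Euclidean distance and the linear mapping to the range 0 to 1 are assumptions made for illustration.

```python
import numpy as np

def similarity_matrix(features):
    """Pairwise Euclidean distances, rescaled so that 1 = identical, 0 = farthest apart."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))            # distance matrix
    max_dist = dist.max()
    if max_dist == 0:                                   # all descriptors identical
        return np.ones_like(dist)
    return 1.0 - dist / max_dist                        # similarity matrix

# Three descriptors in a 2-dimensional space; the first and last coincide.
features = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
sim = similarity_matrix(features)
```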


Similarity matrix

[Figure: most significant spectrum variations over time and the resulting similarity matrix (time x time) for "Chardonnay Says" by Nood/Banana]


Finding similar parts

step 1: calculate lag matrix

[Figure: similarity matrix (time x time) and lag matrix (time x delay time)]
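A sketch of one common way to build a lag matrix, assuming rows index time and columns index delay time, so that column 0 is the diagonal of the similarity matrix. Filling entries that have no counterpart with zero is also an assumption.

```python
import numpy as np

def lag_matrix(sim):
    """lag[t, d] = sim[t, t - d]; entries with d > t have no counterpart and stay 0."""
    n = sim.shape[0]
    lag = np.zeros_like(sim)
    for d in range(n):
        # The d-th sub-diagonal of sim becomes column d of the lag matrix.
        lag[d:, d] = np.diagonal(sim, offset=-d)
    return lag

sim = np.arange(16.0).reshape(4, 4)   # hypothetical values, only to show the indexing
lag = lag_matrix(sim)
```

In this layout a repeated passage shows up as a run of high values down a single delay column.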


Finding similar parts

step 2: apply 2D FIR filter to blur

[Figure: lag matrix and blurred lag matrix (time x delay time)]
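The slides do not specify the filter coefficients. As a stand-in, the sketch below uses a separable box (moving-average) kernel, which is one simple 2D FIR filter. The kernel sizes are hypothetical, and the matrix is assumed to be larger than the kernel along both axes.

```python
import numpy as np

def blur(matrix, size_time=5, size_lag=3):
    """Separable 2D box FIR filter with zero padding at the edges.

    Axis 0 is time, axis 1 is delay time. Both sizes are assumed to be odd
    and smaller than the matrix dimensions.
    """
    k_time = np.ones(size_time) / size_time
    k_lag = np.ones(size_lag) / size_lag
    out = np.apply_along_axis(np.convolve, 0, matrix, k_time, mode="same")
    out = np.apply_along_axis(np.convolve, 1, out, k_lag, mode="same")
    return out

impulse = np.zeros((9, 9))
impulse[4, 4] = 1.0
blurred = blur(impulse)               # the impulse response is the 5 x 3 box kernel
```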


Finding similar parts

step 3: find vertical local maxima

[Figure: blurred lag matrix and its local maxima, with values taken from the non-blurred matrix (time x delay time)]
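A sketch of this step: maxima are located in the blurred matrix, but the values kept are those of the non-blurred matrix. Which axis counts as "vertical" depends on the plot orientation, which the transcript does not preserve. The default below compares neighbouring delays (axis 1), which is an assumption.

```python
import numpy as np

def local_maxima_mask(matrix, axis=1):
    """True where an entry is strictly greater than both neighbours along `axis`."""
    m = np.moveaxis(matrix, axis, 0)
    mask = np.zeros(m.shape, dtype=bool)
    mask[1:-1] = (m[1:-1] > m[:-2]) & (m[1:-1] > m[2:])
    return np.moveaxis(mask, 0, axis)

def maxima_values(lag, blurred, axis=1):
    """Keep non-blurred values at the positions of the blurred matrix's local maxima."""
    return np.where(local_maxima_mask(blurred, axis), lag, 0.0)

# Hypothetical 2 x 5 example (time x delay time).
blurred = np.array([[0.0, 1.0, 3.0, 1.0, 0.0],
                    [0.0, 2.0, 1.0, 2.0, 0.0]])
lag = np.full((2, 5), 7.0)
maxima = maxima_values(lag, blurred)
```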


Finding similar parts

step 4: post-processing

0) forget first column (diagonal of similarity matrix)

1) localize sufficiently long contiguous parts

2) remove overlaps

3) remove diagonal parts

[Figure: local maxima and the resulting similar parts]
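Items 0) and 1) of the list above can be sketched as a run-length search down each delay column. The minimum length and the `(start, end, delay)` output format are assumptions. Removing overlaps and diagonal parts is not covered here, since the slides do not say how it is done.

```python
import numpy as np

def similar_parts(maxima, min_len=4):
    """Return (start, end, delay) for each long enough run of non-zero entries.

    `maxima` is (time x delay time); `end` is exclusive. Column 0, the diagonal
    of the similarity matrix, is skipped.
    """
    parts = []
    for d in range(1, maxima.shape[1]):
        active = np.concatenate(([False], maxima[:, d] > 0, [False]))
        edges = np.flatnonzero(active[1:] != active[:-1])   # run starts and ends
        for start, end in zip(edges[::2], edges[1::2]):
            if end - start >= min_len:
                parts.append((int(start), int(end), d))
    return parts

# Hypothetical maxima matrix: one long run, one short run, and the trivial column 0.
maxima = np.zeros((10, 4))
maxima[:, 0] = 1.0
maxima[0:2, 1] = 1.0
maxima[2:7, 2] = 1.0
parts = similar_parts(maxima)
```

A part `(start, end, delay)` states that the passage from `start` to `end` resembles the passage `delay` steps earlier.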


Finding borders

step 1: convolution, kernels of different sizes

[Figure: similarity matrix and filtered matrices]


Finding borders

step 2: diagonals => columns

[Figure: filtered matrices and the diagonals of the filtered matrices (time x kernel size)]
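Steps 1 and 2 can be sketched together by evaluating each filtered matrix only on its diagonal and storing the result as one column per kernel size. The slides do not say which kernel is used. The checkerboard kernel below, which is often applied to similarity matrices for novelty detection, is an assumption, as are the kernel sizes and the zero padding.

```python
import numpy as np

def checkerboard(half):
    """(2*half x 2*half) kernel: +1 within the two segments, -1 across them."""
    sign = np.concatenate((-np.ones(half), np.ones(half)))
    return np.outer(sign, sign)

def novelty_columns(sim, half_sizes=(2, 4, 8)):
    """Column k holds the diagonal of `sim` filtered with the k-th kernel size."""
    n = sim.shape[0]
    out = np.zeros((n, len(half_sizes)))
    for k, half in enumerate(half_sizes):
        kernel = checkerboard(half)
        padded = np.pad(sim, half, mode="constant")     # zero padding distorts the edges
        for t in range(n):
            window = padded[t:t + 2 * half, t:t + 2 * half]
            out[t, k] = np.sum(window * kernel)
    return out

# Hypothetical piece with two homogeneous sections and a border at time step 8.
sim = np.zeros((16, 16))
sim[:8, :8] = 1.0
sim[8:, 8:] = 1.0
columns = novelty_columns(sim)
```

Small kernels respond to short-term changes and large kernels to changes between long sections, which is presumably why several sizes are used side by side.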


Finding borders

step 3: find local maxima in columns

[Figure: diagonals of filtered matrices and their local maxima, plotted over time]


Finding borders

step 4: post-processing

1) localize contiguous parts

2) sum their values

3) throw away positions with too low values

4) refine the positions using the spectrogram

[Figure, axis: time]
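Items 1) to 3) could look roughly like the sketch below. It assumes a (time x kernel size) matrix that is zero except at local maxima, takes "contiguous" to mean adjacent time steps, and uses a hypothetical threshold. Refining positions with the spectrogram is left out because the slides give no detail on it.

```python
import numpy as np

def borders(maxima, threshold):
    """Group maxima at adjacent time steps, sum each group, keep the strong ones.

    `maxima` must be non-negative. Returns (position, strength) pairs, where
    position is the strength-weighted mean time of the group.
    """
    per_time = maxima.sum(axis=1)                       # combine all kernel sizes
    active = np.concatenate(([False], per_time > 0, [False]))
    edges = np.flatnonzero(active[1:] != active[:-1])
    result = []
    for start, end in zip(edges[::2], edges[1::2]):
        weights = per_time[start:end]
        strength = float(weights.sum())
        if strength >= threshold:
            position = float(np.average(np.arange(start, end), weights=weights))
            result.append((position, strength))
    return result

# Hypothetical maxima: a strong group around time steps 8 and 9, a weak one at 15.
maxima = np.zeros((20, 3))
maxima[8, 0] = 8.0
maxima[8, 1] = 32.0
maxima[9, 2] = 40.0
maxima[15, 0] = 2.0
found = borders(maxima, threshold=10.0)
```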


Structural Information Theory

formal calculus for Gestalt laws

focus on visual patterns

experimented with Genetic Programming

problem: need for a much higher-level description in terms of musical objects, thus source separation, classification, ...


Framework for Audio Analysis

functionality interesting for audio and music research

integrating research could be fruitful

finding musical structure

audio signal separation

sound classification

...


Python

scripting language, interpreted

object-oriented

flexible, extensible, easy to embed

modular

free software (BSD style license)


FfAA modes

Scientific analysis environment

stand-alone application: QtFfAA

GUI + command line, object viewer, visualisation

Embeddable in free audio software

for audio editors and recorders

for music players, DJ tools


FfAA right now

versatile interface

MDI GUI (PyQt)

command line (IPython)

load and analyse sound files

database

visualisation

easily extensible
