automatic transcription of polyphonic piano music using a note masking technique


Post on 11-Jan-2016

Page 1: Automatic transcription of polyphonic piano music using a note masking technique

Automatic transcription of polyphonic piano music using a note masking technique

Mr Ronan Kelly and Dr Jacqueline Walker

Department of Electronic & Computer Engineering

University of Limerick


Page 2: Automatic transcription of polyphonic piano music using a note masking technique

Overview

• Music transcription

• Our approach

• Onset detection

• Algorithm

• Results

• Conclusions

Page 3: Automatic transcription of polyphonic piano music using a note masking technique

Music Transcription

• Complex cognitive task

Example: Top of the Pops!

• A challenging task for a computer, but one which pushes the boundaries of signal processing, pattern recognition, machine learning, etc.

Page 4: Automatic transcription of polyphonic piano music using a note masking technique

Monophonic Music Transcription

• A solved problem

– Sliding window-based analysis of melody line

– Steps:

  – Decimate (reduce data)

  – Onset detection

  – FFT or constant-Q transform

  – Note detection

Page 5: Automatic transcription of polyphonic piano music using a note masking technique

Polyphonic Music Transcription

• Multiple simultaneous notes

• In Western Tonal Music (WTM), notes played together almost inevitably share harmonics

• Impact of rhythms, held notes

• Possibility of multiple instruments
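The point about shared harmonics can be made concrete with a small calculation. The sketch below is only an illustration: it assumes ideal integer-multiple harmonics, uses the rounded fundamentals 262 Hz (C4) and 392 Hz (G4) that appear later in these slides, and treats the 1% matching tolerance and the function names as hypothetical choices.

```python
def harmonics(f0, count):
    """Return the first `count` ideal harmonic frequencies (Hz) of fundamental f0."""
    return [f0 * k for k in range(1, count + 1)]


def shared_harmonics(f0_a, f0_b, count=8, tolerance=0.01):
    """List pairs of harmonics of two notes that lie within a relative
    `tolerance` of each other (the tolerance is an assumed value)."""
    pairs = []
    for ha in harmonics(f0_a, count):
        for hb in harmonics(f0_b, count):
            if abs(ha - hb) <= tolerance * ha:
                pairs.append((round(ha), round(hb)))
    return pairs


# C4 (262 Hz) and G4 (392 Hz): a perfect fifth
overlaps = shared_harmonics(262.0, 392.0)
print(overlaps)  # [(786, 784), (1572, 1568)]
```

Every third harmonic of C4 nearly coincides with every second harmonic of G4, so a single spectral peak near 785 Hz cannot be attributed to one note without further information.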

Page 6: Automatic transcription of polyphonic piano music using a note masking technique

Approaches to Polyphonic Transcription

• Human audition based

– Martin Cooke's "Modelling Auditory Processing and Organisation", 1993

– Brown & Cooke, "Computational Auditory Scene Analysis", 1994

• Signal processing based

– Tanguiane, "Artificial Perception and Music Recognition", 1993

– Klapuri et al., since 1998

Page 7: Automatic transcription of polyphonic piano music using a note masking technique

Our Approach

• Onset Detection

• Note Window & FFT

• Masking Scheme Iteration

Page 8: Automatic transcription of polyphonic piano music using a note masking technique

Onset Detection

• NAE (Note Average Energy) onset detection [1]

[Figure 3: Energy e(t) (b), average energy a(t) (c) and note average energy NAE(t) (d) of the power envelope p(t) (a).]

$$NAE(t) = \frac{1}{t - t_n} \int_{t_n}^{t} p(\tau)\, d\tau, \qquad t_n < t < t_{n+1}$$

In practice, we search for local minima…

[1] Liu, R., Griffith, J., Walker, J. & Murphy, P., "Time Domain Note Average Energy Based Music Onset Detection", Proceedings of the Stockholm Music Acoustics Conference (SMAC 03), Stockholm, Sweden, August 6-9, 2003.
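As a rough illustration of the note average energy, the sketch below assumes that NAE(t) is the mean of the power envelope p(t) taken from the most recent onset t_n up to the current time t, and approximates the integral with a rectangle rule on a sampled envelope. The function name, its arguments and the sampling interval are hypothetical; this is not the implementation from the cited paper.

```python
def note_average_energy(power, onset_index, dt):
    """Running mean of the power envelope since the most recent onset.

    power       -- samples of the power envelope p(t)
    onset_index -- sample index of the most recent onset t_n (assumed known)
    dt          -- sampling interval in seconds
    Returns one NAE value for every sample after the onset.
    """
    nae = []
    integral = 0.0
    for i in range(onset_index + 1, len(power)):
        integral += power[i] * dt          # rectangle-rule integral of p from t_n to t
        elapsed = (i - onset_index) * dt   # t - t_n
        nae.append(integral / elapsed)
    return nae


# A constant envelope after the onset gives a constant NAE.
print(note_average_energy([0.0, 4.0, 4.0, 4.0], onset_index=0, dt=0.5))  # [4.0, 4.0, 4.0]
```

Because the average restarts at each onset, the value reflects the energy of the current note only, not of the notes before it.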

Page 9: Automatic transcription of polyphonic piano music using a note masking technique

Note Window

• FFT performed on the whole note

• Avoids start-of-note and end-of-note effects

• Gives greater robustness against noise
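The idea of one spectrum per whole note can be sketched as follows. This is a minimal illustration, assuming onsets are given as sample indices; the function names are hypothetical, and a direct DFT is used only to keep the example free of dependencies (an FFT routine would return the same magnitudes far more efficiently).

```python
import cmath
import math


def note_windows(samples, onsets):
    """Split `samples` into one window per note, using onset sample indices."""
    bounds = list(onsets) + [len(samples)]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:])]


def magnitude_spectrum(window):
    """Direct DFT magnitudes for bins 0..N/2 (O(N^2), for illustration only)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(window))) / n
        for k in range(n // 2 + 1)
    ]


# Two one-second "notes" sampled at 64 Hz: an 8 Hz tone followed by a 16 Hz tone.
rate = 64
samples = [math.sin(2 * math.pi * 8 * i / rate) for i in range(rate)]
samples += [math.sin(2 * math.pi * 16 * i / rate) for i in range(rate)]

spectra = [magnitude_spectrum(w) for w in note_windows(samples, [0, rate])]
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
print(peak_bins)  # [8, 16]
```

Bin k of a window of N samples corresponds to k × (sample rate) / N Hz, so longer notes give finer frequency resolution.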

Page 10: Automatic transcription of polyphonic piano music using a note masking technique

Algorithm for Masking Scheme - 1

1. FFT on note window

2. Find max peak in window

3. Remove peak from window; add to list

4. Continue until no peaks above threshold
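The peak-collection loop of this first stage can be sketched as below. The spectrum is represented as a hypothetical dictionary from frequency (Hz) to amplitude, and the threshold value is an assumption; in a real spectrum, "removing" a peak would also involve clearing the neighbouring bins, which this sketch ignores.

```python
def pick_peaks(spectrum, threshold):
    """Repeatedly take the largest remaining peak until none exceeds `threshold`.

    spectrum -- dict mapping frequency (Hz) to amplitude
    Returns a list of (frequency, amplitude) in order of decreasing amplitude.
    """
    remaining = dict(spectrum)  # work on a copy so the caller's data is untouched
    peaks = []
    while remaining:
        freq = max(remaining, key=remaining.get)
        if remaining[freq] <= threshold:
            break  # no peaks above threshold are left
        peaks.append((freq, remaining.pop(freq)))
    return peaks


# Amplitudes from the C4, E4, G4 example in these slides, plus one
# hypothetical low-level peak at 600 Hz to show the threshold at work.
spectrum = {262: 11.2, 330: 21.4, 392: 29.9, 523: 7.1, 600: 0.5}
peaks = pick_peaks(spectrum, threshold=5.0)
print(peaks)  # [(392, 29.9), (330, 21.4), (262, 11.2), (523, 7.1)]
```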

Page 11: Automatic transcription of polyphonic piano music using a note masking technique

Algorithm for Masking Scheme - 2

1. Apply mask to first (lowest) frequency in list

2. Adjust amplitudes of all affected frequencies by mask

3. Add frequency to note list; move to next frequency

4. Continue until list is empty
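A minimal sketch of this second stage follows. The slides do not state the exact rule used to adjust amplitudes, so the code assumes one plausible choice: subtract the mask fraction times the fundamental's amplitude from each matching peak and clamp at zero. The residual amplitudes therefore need not match the figures shown in the slides. The function name, the frequency tolerance and the residual threshold are all hypothetical.

```python
def apply_masks(peaks, masks, tolerance=2.0, threshold=1.0):
    """Walk the detected peaks from lowest frequency upwards and mask harmonics.

    peaks -- dict mapping detected frequency (Hz) to amplitude
    masks -- dict mapping a fundamental (Hz) to {harmonic frequency: fraction}
    Returns (list of fundamentals accepted as notes, residual amplitudes).
    """
    remaining = dict(peaks)
    notes = []
    for freq in sorted(remaining):              # lowest frequency first
        amp = remaining[freq]
        if amp < threshold:                     # already explained by earlier masks
            continue
        notes.append(freq)
        for harmonic, fraction in masks.get(freq, {}).items():
            for other in remaining:
                if other != freq and abs(other - harmonic) <= tolerance:
                    # Assumed adjustment rule: subtract the expected harmonic level.
                    remaining[other] = max(0.0, remaining[other] - fraction * amp)
    return notes, remaining


peaks = {262: 11.2, 330: 21.4, 392: 29.9, 523: 7.1}
# C4 mask fractions as given in the slides; masks for the other notes are omitted here.
masks = {262: {262: 1.00, 523: 0.72, 784: 0.41}}
notes, residual = apply_masks(peaks, masks)
print(notes)  # [262, 330, 392]
```

Starting from the lowest frequency matters: the lowest peak cannot be a harmonic of any other detected note, so it can safely be treated as a fundamental.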

Page 12: Automatic transcription of polyphonic piano music using a note masking technique

Masking Scheme - 1

C4, E4, G4 (262 Hz, 330 Hz, 392 Hz)

Max. peak amplitude = 29.9 @ 392 Hz (G4)

Next peak amplitude = 21.4 @ 330 Hz (E4)
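The mapping between the peak frequencies and note names quoted above can be checked with a short sketch. It assumes twelve-tone equal temperament with A4 = 440 Hz and MIDI-style octave numbering (C4 = middle C); the function name is hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(freq_hz, a4_hz=440.0):
    """Name of the equal-tempered note nearest to `freq_hz`."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))  # MIDI note number, A4 = 69
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


print([note_name(f) for f in (262, 330, 392)])  # ['C4', 'E4', 'G4']
```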

Page 13: Automatic transcription of polyphonic piano music using a note masking technique

Masking Scheme - 2

Detected frequency peaks (chart: amplitude vs. frequency in Hz)

Frequency (Hz)   Amplitude
262              11.2
330              21.4
392              29.9
523              7.1

Note mask (chart: amplitude vs. frequency in Hz)

Frequency (Hz)   Amplitude
260, 261, 262    100%
523, 524         72%
784, 785         41%

Page 14: Automatic transcription of polyphonic piano music using a note masking technique

Masking Scheme - 3

Masking action (chart: C4 mask overlaid on the detected values; amplitude vs. frequency in Hz)

After masking (chart: remaining detected values; amplitude vs. frequency in Hz)

Frequency (Hz)   Amplitude
330              21.4
392              29.9
523              3.1

Note played: C4

Page 15: Automatic transcription of polyphonic piano music using a note masking technique

Building a Note Mask - 1

A note is played with other notes and the significant frequency peaks and amplitudes recorded.

[Figure: recorded peaks, with harmonics of D4 in red and D4 harmonics in common in blue.]

Page 16: Automatic transcription of polyphonic piano music using a note masking technique

Building a Note Mask - 2

[Chart: D4 and C4. Amplitude vs. frequency (Hz), with ticks at 262, 523, 785, 1047, 1309, 1570, 1832; series: D4 values, C4 values.]

[Chart: D4 and A4. Amplitude vs. frequency (Hz), with ticks at 294, 587, 1174, 1469, 2056; series: D4 values, A4 values, D4 + A4 values.]

Page 17: Automatic transcription of polyphonic piano music using a note masking technique

Building a Note Mask - 3

Extract values unique to D4 and normalise to amplitude of highest peak:

Recorded note pairs: D4 + C4, D4 + E4, D4 + F4, D4 + G4, D4 + A4, D4 + B4

Frequency (Hz)   Normalised amplitudes
294              1, 1, 1, 1, 1, 1
587              0.70, 0.67, 0.76, 0.75, 0.84, 0.65
881              0.38, 0.37, 0.44, 0.44, 0.40
1175             0.11, 0.12
1468             0.17, 0.16, 0.15, 0.17, 0.14
1762             0.12, 0.11, 0.12
2056             0.27, 0.25, 0.28, 0.28, 0.30, 0.18

Rows with fewer than six values have no entry for some of the pairs.

Page 18: Automatic transcription of polyphonic piano music using a note masking technique

Building a Note Mask - 3

Average across samples:

D4 mask (chart: amplitude as % of fundamental frequency amplitude vs. frequency)

Frequency (Hz)   Amplitude
294              100%
587              72.69%
881              40.63%
1175             11.49%
1468             15.93%
1762             11.61%
2056             26.03%
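The normalise-then-average procedure for building a mask can be sketched as follows. The sketch assumes each recording is reduced to a dictionary of the harmonics unique to the target note, and that a harmonic missing from a recording is simply left out of that harmonic's average. The function names are hypothetical, and the example uses only the three rows of the D4 table that have values for all six pairs.

```python
def normalise(peaks):
    """Scale amplitudes so that the highest peak becomes 1.0."""
    top = max(peaks.values())
    return {freq: amp / top for freq, amp in peaks.items()}


def build_mask(samples):
    """Average normalised amplitudes per harmonic over the recordings that contain it."""
    totals, counts = {}, {}
    for sample in samples:
        for freq, value in sample.items():
            totals[freq] = totals.get(freq, 0.0) + value
            counts[freq] = counts.get(freq, 0) + 1
    return {freq: totals[freq] / counts[freq] for freq in sorted(totals)}


# Normalised D4 values at 587 Hz and 2056 Hz, one per recorded note pair.
row_587 = [0.70, 0.67, 0.76, 0.75, 0.84, 0.65]
row_2056 = [0.27, 0.25, 0.28, 0.28, 0.30, 0.18]
samples = [{294: 1.0, 587: a, 2056: b} for a, b in zip(row_587, row_2056)]

mask = build_mask(samples)
print({freq: round(value, 3) for freq, value in mask.items()})
```

Averaging the rounded table entries gives roughly 0.73 at 587 Hz and 0.26 at 2056 Hz, close to the 72.69% and 26.03% listed for the D4 mask; the small differences are consistent with the table values having been rounded.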

Page 19: Automatic transcription of polyphonic piano music using a note masking technique

Experimental Set-up

• Keyboard used: Technics KN800 PCM Keyboard

• Note range: C2 to B6

• Recording – direct using line-in

• Isolated chords and polyphonic music samples

Page 20: Automatic transcription of polyphonic piano music using a note masking technique

Results

How to define error?

Need to account for both missed notes (m) and spurious notes (x)

$$E\% = \frac{m + x}{n} \times 100\%$$

n is number of notes detected – not number of notes played
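The error measure is simple enough to check by hand, and the sketch below does so for the isolated-chord results reported in these slides. The function name is hypothetical.

```python
def total_error_percent(missed, spurious, detected):
    """Total error: missed plus spurious notes as a percentage of notes detected."""
    return 100.0 * (missed + spurious) / detected


# (missed, spurious, detected) for the three isolated-chord rows in the results
rows = [(18, 0, 225), (15, 5, 638), (69, 77, 1906)]
errors = [round(total_error_percent(m, x, n), 1) for m, x, n in rows]
print(errors)  # [8.0, 3.1, 7.7]
```

Dividing by the number of notes detected rather than the number played means that a transcription with many spurious notes is not penalised as heavily as the raw error count might suggest.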

Page 21: Automatic transcription of polyphonic piano music using a note masking technique

Results – Isolated Chords

                    Notes played   Notes detected   Missed notes   Spurious notes   Total error (%)
Chords, 5-8 notes   243            225              18             0                8.0
Chords, 3-4 notes   648            638              15             5                3.1
Chords              1898           1906             69             77               7.7

Page 22: Automatic transcription of polyphonic piano music using a note masking technique

Results – Polyphonic Music

                       Notes played   Notes detected   Missed notes   Spurious notes   Total error (%)
Danny Boy (slow)       87             94               7              14               22
Danny Boy (moderate)   91             98               8              15               23.5
Danny Boy (fast)       90             99               8              17               25

Page 23: Automatic transcription of polyphonic piano music using a note masking technique

Effect of Onset Detection

• Effective onset detection is crucial

• Two types of errors:

– Extra onset: less likely to cause a problem, but the note is divided up too finely

– Missing onset: note windows not placed 'correctly'

Page 24: Automatic transcription of polyphonic piano music using a note masking technique

Results with Onset Detection

                       Notes played   Notes detected   Missed notes   Spurious notes   Total error (%)
Danny Boy (slow)       87             120              10             43               44
Danny Boy (moderate)   91             120              17             28               44
Danny Boy (fast)       90             120              23             37               58

Page 25: Automatic transcription of polyphonic piano music using a note masking technique

Future Work

• Develop model for note combinations (polyphonic note masks)

• Use wider range of note combinations

• Develop an efficient approach to applying polyphonic note masks

• Improve note onset detection