Investigation of Pitch Detection Characteristics from Different Audio Context

Upload: asia-dikes

Post on 28-Mar-2015

TRANSCRIPT

Page 1: Investigation of Pitch Detection Characteristics from Different Audio Context

Investigation of Pitch Detection Characteristics from Different Audio Context

Page 2: Investigation of Pitch Detection Characteristics from Different Audio Context

Part 1: Introduction

Page 3: Investigation of Pitch Detection Characteristics from Different Audio Context

Pitch Detection Characteristics from Different Audio Context

Motivations: Testing pitch detection algorithms using imperfect audio material

• A musical note can itself be very complex.

• Much audio material is recorded under imperfect conditions, for example with interference from other musical instruments in an ensemble recording, or with noise.

• Existing source separation algorithms usually provide incomplete separation.

Testing and Evaluation Goals:

• Pitch Detection Performance Analysis using Synthesized Notes

• Pitch Detection Performance Analysis using Real Musical Notes

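Testing Framework:

• The source audio signal is analyzed with MIR Toolbox to give pitch detection result 1.

• A degradation (distortion, an interference note, or noise) is added to the source audio signal to form the combined audio signal (simulating imperfect audio), characterized by its SNR.

• The combined audio signal is analyzed with MIR Toolbox to give pitch detection result 2.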
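The "add" step of the testing framework can be sketched as follows. This is a minimal Python sketch and not the authors' code: it assumes SNR means the power ratio of the source signal to the added signal, in dB (the slides do not state the definition), and that the added signal is scaled to reach a target SNR. The names `power`, `snr_db`, and `mix_at_snr`, the 8 kHz sample rate, and the pure tones standing in for real notes are all hypothetical.

```python
import math

def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def snr_db(source, added):
    """SNR in dB, treating `source` as the signal and `added` as the degradation."""
    return 10.0 * math.log10(power(source) / power(added))

def mix_at_snr(source, added, target_snr_db):
    """Scale `added` so that the mix has the requested SNR, then add it to `source`.

    Returns the combined signal and the scaled degradation.
    """
    gain = math.sqrt(power(source) / (power(added) * 10.0 ** (target_snr_db / 10.0)))
    scaled = [gain * v for v in added]
    combined = [s + a for s, a in zip(source, scaled)]
    return combined, scaled

# Stand-in signals (assumed): pure tones instead of recorded notes.
fs = 8000  # sample rate in Hz, chosen for illustration only
n = 1600
source = [math.sin(2 * math.pi * 596.90 * k / fs) for k in range(n)]
added = [0.3 * math.sin(2 * math.pi * 1040.83 * k / fs) for k in range(n)]

combined, scaled = mix_at_snr(source, added, 3.5)
print(round(snr_db(source, scaled), 2))  # 3.5 by construction
```

The combined signal would then be passed to the pitch detector in the same way as the source signal, so that the two results can be compared.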

Page 4: Investigation of Pitch Detection Characteristics from Different Audio Context

Part 2: Pitch Detection Performance on Synthesized Notes

Page 5: Investigation of Pitch Detection Characteristics from Different Audio Context

• Synthesized tone of 440 Hz.

Input: 440 Hz. MIR Toolbox detected pitch: 440.5227 Hz.

[Figure: Spectrogram; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 2 × 10^4]

Synthesized Notes of Different Complexity

Page 6: Investigation of Pitch Detection Characteristics from Different Audio Context

• Synthesized tone of 440 Hz.

Input: 440 Hz. MIR Toolbox detected pitch: 440.5227 Hz.

[Figure: Spectrogram, zoomed; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 1500]

Synthesized Notes of Different Complexity

Page 7: Investigation of Pitch Detection Characteristics from Different Audio Context

• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to increase its complexity.

• AM index = 0.5, maximum frequency deviation = 10 Hz at f1
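A tone of this kind could be generated roughly as below. This is a sketch under stated assumptions, not the authors' synthesis code: the slides give only the AM index and the maximum frequency deviation at f1, so the number of partials, the 1/k amplitude roll-off, the 5 Hz modulation rate, the sinusoidal modulator, and the scaling of the deviation with partial number are all assumptions, and `synth_note` is a hypothetical name.

```python
import math

def synth_note(f1=440.0, n_partials=5, am_index=0.5, max_dev_hz=10.0,
               mod_rate_hz=5.0, duration=1.0, fs=44100):
    """Sum of harmonic partials, each with sinusoidal AM and FM.

    Partial k is centred on k * f1 and its frequency swings by up to
    k * max_dev_hz (assumed), so max_dev_hz is the deviation at f1.
    """
    n = int(round(duration * fs))
    out = [0.0] * n
    for k in range(1, n_partials + 1):
        # FM modulation index: peak deviation of partial k over the modulation rate.
        beta = k * max_dev_hz / mod_rate_hz
        for i in range(n):
            t = i / fs
            mod_phase = 2.0 * math.pi * mod_rate_hz * t
            envelope = 1.0 + am_index * math.sin(mod_phase)  # amplitude modulation
            # The derivative of this phase gives k*f1 + k*max_dev_hz*sin(mod_phase).
            phase = 2.0 * math.pi * k * f1 * t - beta * math.cos(mod_phase)
            out[i] += (envelope / k) * math.sin(phase)  # assumed 1/k roll-off
    return out

# Short example with the slide's parameters (AM index 0.5, 10 Hz deviation at f1).
note = synth_note(duration=0.2, fs=16000)
print(len(note))
```

Raising `max_dev_hz` from 10 to 40 reproduces the second, more strongly modulated setting described in the slides.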

Input: 440 Hz. MIR Toolbox detected pitch: 443.3253 Hz.

[Figure: Spectrogram; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 2 × 10^4]

Synthesized Notes of Different Complexity

Page 8: Investigation of Pitch Detection Characteristics from Different Audio Context

Input: 440 Hz. MIR Toolbox detected pitch: 443.3253 Hz.

[Figure: Spectrogram, zoomed; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 1500]

• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to increase its complexity.

• AM index = 0.5, maximum frequency deviation = 10 Hz at f1

Synthesized Notes of Different Complexity

Page 11: Investigation of Pitch Detection Characteristics from Different Audio Context

• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to increase its complexity.

• AM index = 0.5, maximum frequency deviation = 40 Hz at f1

Input: 440 Hz. MIR Toolbox detected pitch: 449.7704 Hz.

[Figure: Spectrogram; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 2 × 10^4]

Synthesized Notes of Different Complexity

Page 12: Investigation of Pitch Detection Characteristics from Different Audio Context

Input: 440 Hz. MIR Toolbox detected pitch: 449.7704 Hz.

• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to increase its complexity.

• AM index = 0.5, maximum frequency deviation = 40 Hz at f1

[Figure: Spectrogram, zoomed; x-axis: time (s), 0 to 0.9; y-axis: frequency (Hz), 0 to 1500]

Synthesized Notes of Different Complexity

Page 13: Investigation of Pitch Detection Characteristics from Different Audio Context

Part 3: Pitch Detection Performance on Real Musical Notes

Page 14: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Another Music Note

[Figures: Spectrogram of Target Signal, Spectrogram of Interference Signal, and Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

The source note and the interference note are added to form the combined note. SNR = 3.5 dB.

MIR Toolbox results:

• Source note: f0 = 596.90 Hz

• Interference note: f0 = 1040.83 Hz

• Combined note: f0 = 1034.02 Hz (wrong)
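The "right" and "wrong" labels can be made quantitative by measuring how far the detected f0 lies from the source note's f0 in cents. The slides do not state the criterion they used, so the 50-cent tolerance below (half a semitone) is an assumption, and `cents_error` and `is_correct` are hypothetical helper names. The frequencies are the values reported in the slides for the two interference cases.

```python
import math

def cents_error(detected_hz, reference_hz):
    """Signed pitch deviation in cents (100 cents = one equal-tempered semitone)."""
    return 1200.0 * math.log2(detected_hz / reference_hz)

def is_correct(detected_hz, reference_hz, tolerance_cents=50.0):
    """True if the detected pitch is within the (assumed) tolerance of the reference."""
    return abs(cents_error(detected_hz, reference_hz)) <= tolerance_cents

reference = 596.90  # f0 of the source note, from the slides
cases = [("SNR = 3.5 dB", 1034.02), ("SNR = 8.61 dB", 596.47)]
for label, detected in cases:
    print(label, round(cents_error(detected, reference), 1), is_correct(detected, reference))
```

With this measure, the 3.5 dB result is off by most of an octave and lands close to the interference note's f0, while the 8.61 dB result is within a couple of cents of the source note.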

Page 15: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Another Music Note

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

MIR Toolbox, combined note: f0 = 1034.02 Hz (wrong)

[Figures: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), shown for 0 to 2.5 × 10^4 Hz and for 0 to 7000 Hz; y-axis: magnitude, 0 to 700]

Page 16: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Another Music Note

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

MIR Toolbox, combined note: f0 = 1034.02 Hz (wrong)

[Figure: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), 0 to 7000; y-axis: magnitude, 0 to 700]

[Figure: FFT of sustained part, zoomed in, Combined Test Signal; x-axis: frequency (Hz), 1000 to 1070; y-axis: magnitude, 0 to 600]

Page 17: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Another Music Note

[Figures: Spectrogram of Target Signal, Spectrogram of Interference Signal, and Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

The source note and the interference note are added to form the combined note. SNR = 8.61 dB.

MIR Toolbox results:

• Source note: f0 = 596.90 Hz

• Interference note: f0 = 1040.83 Hz

• Combined note: f0 = 596.47 Hz (right)

Page 18: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Another Music Note

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

MIR Toolbox, combined note: f0 = 596.47 Hz (right)

[Figures: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), shown for 0 to 2.5 × 10^4 Hz and for 0 to 7000 Hz; y-axis: magnitude, 0 to 700]

Page 19: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Another Music Note

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

MIR Toolbox, combined note: f0 = 596.47 Hz (right)

[Figure: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), 0 to 7000; y-axis: magnitude, 0 to 700]

[Figure: FFT of sustained part, zoomed in, Combined Test Signal; x-axis: frequency (Hz), 560 to 630; y-axis: magnitude, 0 to 500]

Page 20: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Another Music Note

[Figures: Pitch detection change with SNR; x-axis: SNR (dB), -10 to 25; y-axis: detected pitch (Hz), shown on a 550 to 1050 Hz scale and on a 0 to 1200 Hz scale]
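A sweep of detected pitch against SNR can be imitated with a toy experiment. The sketch below is an illustration under heavy assumptions and does not reproduce the MIR Toolbox algorithm: the two notes are replaced by pure tones at the f0 values from the slides, the "detector" simply picks the strongest spectral peak on a 1 Hz grid, and the sample rate, signal length, and all function names are hypothetical. Because of these simplifications the switch between the two pitches happens near 0 dB here, which need not match where it occurs in the slides' plot.

```python
import math

FS = 8000  # sample rate in Hz (assumed)
N = 800    # 0.1 s of signal (assumed)

def tone(freq_hz):
    """Unit-amplitude sine tone standing in for a recorded note."""
    return [math.sin(2 * math.pi * freq_hz * k / FS) for k in range(N)]

def power(x):
    """Mean squared amplitude."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(source, interference, snr_db):
    """Scale the interference to the requested SNR (power ratio in dB) and add it."""
    gain = math.sqrt(power(source) / (power(interference) * 10 ** (snr_db / 10)))
    return [s + gain * i for s, i in zip(source, interference)]

def strongest_peak(x, f_lo=550, f_hi=1100, step=1):
    """Grid frequency (Hz) with the largest DFT magnitude; a crude stand-in detector."""
    best_f, best_mag = f_lo, -1.0
    for f in range(f_lo, f_hi + 1, step):
        w = 2 * math.pi * f / FS
        re = sum(v * math.cos(w * k) for k, v in enumerate(x))
        im = sum(v * math.sin(w * k) for k, v in enumerate(x))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_f, best_mag = f, mag
    return best_f

source = tone(596.90)         # source note f0 from the slides
interference = tone(1040.83)  # interference note f0 from the slides

# Detected pitch at a low and a high SNR.
detected = {snr: strongest_peak(mix_at_snr(source, interference, snr))
            for snr in (-10, 10)}
for snr, pitch in detected.items():
    print(snr, pitch)
```

Extending the tuple of SNR values gives a full curve of detected pitch against SNR, which is the shape of experiment summarized on this slide.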

Page 21: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Noise

[Figures: Spectrogram of Target Signal, Spectrogram of Interference Signal, and Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

The source note and noise are added to form the combined note. SNR = 3.29 dB.

MIR Toolbox results:

• Source note: f0 = 596.90 Hz

• Combined note: f0 = 596.65 Hz (right)

Page 22: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Noise

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

MIR Toolbox, combined note: f0 = 596.76 Hz (right)

[Figures: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), shown for 0 to 2.5 × 10^4 Hz and for 0 to 6000 Hz; y-axis: magnitude, 0 to 700]

Page 23: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Noise

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

MIR Toolbox, combined note: f0 = 596.76 Hz (right)

[Figure: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), 0 to 6000; y-axis: magnitude, 0 to 700]

[Figure: FFT of sustained part, zoomed in, Combined Test Signal; x-axis: frequency (Hz), 560 to 630; y-axis: magnitude, 0 to 500]

Page 24: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Noise

[Figures: Spectrogram of Target Signal, Spectrogram of Interference Signal, and Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

The source note and noise are added to form the combined note. SNR = -2.63 dB.

MIR Toolbox results:

• Source note: f0 = 596.90 Hz

• Combined note: f0 = 600.06 Hz (right)

Page 25: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Noise

MIR Toolbox, combined note: f0 = 600.06 Hz (right)

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

[Figures: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), shown for 0 to 2.5 × 10^4 Hz and for 0 to 6000 Hz; y-axis: magnitude, 0 to 700]

Page 26: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Noise

MIR Toolbox, combined note: f0 = 600.06 Hz (right)

[Figure: Spectrogram of Combined Test Signal; x-axis: time (s), 0 to 0.6; y-axis: frequency (Hz), 0 to 2 × 10^4]

[Figure: FFT of sustained part, Combined Test Signal; x-axis: frequency (Hz), 0 to 6000; y-axis: magnitude, 0 to 700]

[Figure: FFT of sustained part, zoomed in, Combined Test Signal; x-axis: frequency (Hz), 560 to 630; y-axis: magnitude, 0 to 500]

Page 27: Investigation of Pitch Detection Characteristics from Different Audio Context

Interference from Noise

[Figures: Pitch detection change with SNR (two plots); x-axis: SNR (dB), -25 to 20; y-axis: detected pitch (Hz), 0 to 1200]
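The noise case can be imitated in the same toy fashion. The sketch below is an assumption-laden illustration, not the authors' experiment: the note is a pure tone at the source f0 from the slides, the noise is white Gaussian noise from a seeded generator, the detector picks the strongest spectral peak on a 1 Hz grid rather than using MIR Toolbox, and the sample rate, length, and function names are hypothetical. It illustrates one possible reason noise is less harmful than a competing note at a similar SNR: the tone's energy is concentrated in a narrow band, while white noise spreads its energy over all frequencies.

```python
import math
import random

FS = 8000  # sample rate in Hz (assumed)
N = 800    # 0.1 s of signal (assumed)

def power(x):
    """Mean squared amplitude."""
    return sum(v * v for v in x) / len(x)

def add_noise_at_snr(source, snr_db, rng):
    """Add white Gaussian noise scaled so that the realised SNR equals snr_db.

    Returns the noisy signal and the scaled noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in source]
    gain = math.sqrt(power(source) / (power(noise) * 10 ** (snr_db / 10)))
    scaled = [gain * v for v in noise]
    return [s + v for s, v in zip(source, scaled)], scaled

def strongest_peak(x, f_lo=550, f_hi=1100, step=1):
    """Grid frequency (Hz) with the largest DFT magnitude; a crude stand-in detector."""
    best_f, best_mag = f_lo, -1.0
    for f in range(f_lo, f_hi + 1, step):
        w = 2 * math.pi * f / FS
        re = sum(v * math.cos(w * k) for k, v in enumerate(x))
        im = sum(v * math.sin(w * k) for k, v in enumerate(x))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_f, best_mag = f, mag
    return best_f

rng = random.Random(0)  # fixed seed so the run is repeatable
source = [math.sin(2 * math.pi * 596.90 * k / FS) for k in range(N)]

# SNR taken from the second noise case in the slides.
noisy, scaled_noise = add_noise_at_snr(source, -2.63, rng)
detected = strongest_peak(noisy)
print(detected)
```

Repeating the call over a range of SNR values, and over several seeds, would give a curve comparable in form to the plots on this slide.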

Page 28: Investigation of Pitch Detection Characteristics from Different Audio Context

Conclusions

Page 29: Investigation of Pitch Detection Characteristics from Different Audio Context

Conclusions

• We implemented a framework to validate the performance of pitch detection algorithms at different audio qualities.

• We tested the performance of the MIR Toolbox pitch detection algorithms using both synthesized and real musical notes.

• We investigated three factors that affect pitch detection performance: the complexity of the musical note, interference from a concurrent musical note, and noise.

Page 30: Investigation of Pitch Detection Characteristics from Different Audio Context

Q&A

Page 31: Investigation of Pitch Detection Characteristics from Different Audio Context

Thank you!