Investigation of Pitch Detection Characteristics from Different Audio Context
Part 1: Introduction
Motivations: Testing pitch detection algorithms using imperfect audio materials
• A musical note can itself be very complex.
• Much audio material is recorded under imperfect conditions, for example with noise or with interference from other musical instruments in an ensemble recording.
• Existing source separation algorithms usually provide incomplete separation.
Testing and Evaluation Goals:
• Pitch detection performance analysis using synthesized notes
• Pitch detection performance analysis using real musical notes
Testing Framework:
• Source audio signal → MIR Toolbox → pitch detection result 1
• Source audio signal + degradation (distortion, interference note, or noise) at a given SNR → combined audio signal (simulates imperfect audio) → MIR Toolbox → pitch detection result 2
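The "add" step of the framework can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not define SNR, so the sketch takes it to be the ratio of source power to interference power in dB, and the names `mix_at_snr` and `measure_snr_db` are hypothetical helpers, not part of MIR Toolbox.

```python
import numpy as np

def mix_at_snr(source, interference, snr_db):
    """Scale `interference` so that source/interference power equals `snr_db`, then add.

    Assumption: SNR is the mean-power ratio in dB over the whole excerpt.
    Returns the combined signal and the scaled interference.
    """
    source = np.asarray(source, dtype=float)
    interference = np.asarray(interference, dtype=float)
    n = min(len(source), len(interference))      # truncate to the common length
    source, interference = source[:n], interference[:n]
    p_source = np.mean(source ** 2)
    p_interf = np.mean(interference ** 2)
    if p_interf == 0:
        raise ValueError("interference is silent; SNR is undefined")
    # Gain that brings the interference to the power implied by the target SNR.
    gain = np.sqrt(p_source / (p_interf * 10 ** (snr_db / 10)))
    scaled = gain * interference
    return source + scaled, scaled

def measure_snr_db(source, interference):
    """SNR in dB as the ratio of mean powers."""
    return 10 * np.log10(np.mean(np.square(source)) / np.mean(np.square(interference)))

# Example with stand-in signals: a 440 Hz sine plus white noise at 3.5 dB.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000
src = np.sin(2 * np.pi * 440 * t)
noise = rng.standard_normal(8000)
mixed, scaled_noise = mix_at_snr(src, noise, 3.5)
```

Returning the scaled interference alongside the mix makes it easy to confirm the achieved SNR before passing the combined signal to the pitch detector.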
Part 2: Pitch Detection Performance on Synthesized Notes
Synthesized Notes of Different Complexity
• Synthesized tone of 440 Hz.
• True pitch: 440 Hz. MIR Toolbox result: 440.5227 Hz.
[Figure: spectrogram of the tone, frequency axis up to 2 × 10^4 Hz; axes: time (s), frequency (Hz)]
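The slides report only what MIR Toolbox returned; the detector's internals are not shown. As a rough stand-in for experimenting with the same kind of test, the sketch below estimates pitch from the highest autocorrelation peak. The function name `estimate_pitch_autocorr` and the search range `fmin`/`fmax` are assumptions made for this illustration, and the results should not be expected to match the MIR Toolbox figures.

```python
import numpy as np

def estimate_pitch_autocorr(x, sr, fmin=50.0, fmax=2000.0):
    """Estimate f0 (Hz) from the highest autocorrelation peak in [fmin, fmax]."""
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)
    n = len(x)
    # FFT-based autocorrelation, zero-padded to avoid circular wrap-around.
    nfft = 1 << (2 * n - 1).bit_length()
    spectrum = np.fft.rfft(x, nfft)
    acf = np.fft.irfft(spectrum * np.conj(spectrum), nfft)[:n]
    lag_min = max(1, int(sr / fmax))
    lag_max = min(int(sr / fmin), n - 2)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    # Parabolic interpolation around the peak for sub-sample lag resolution.
    a, b, c = acf[lag - 1], acf[lag], acf[lag + 1]
    denom = a - 2 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return sr / (lag + shift)

# Example: a 0.5 s pure sine at 440 Hz.
sr = 44100
t = np.arange(int(round(0.5 * sr))) / sr
tone = np.sin(2 * np.pi * 440 * t)
f0 = estimate_pitch_autocorr(tone, sr)
```

Even on a clean tone the estimate is not exactly 440 Hz, because the lag grid is discrete and the finite excerpt tapers the autocorrelation.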
[Figure: spectrogram of the same 440 Hz tone (MIR Toolbox result: 440.5227 Hz), zoomed to 0 to 1500 Hz; axes: time (s), frequency (Hz)]
Synthesized Notes of Different Complexity
• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to increase its complexity.
• AM index = 0.5, maximum frequency deviation = 10 Hz at f1.
• True pitch: 440 Hz. MIR Toolbox result: 443.3253 Hz.
[Figure: spectrogram of the modulated tone, frequency axis up to 2 × 10^4 Hz; axes: time (s), frequency (Hz)]
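A modulated test tone of this kind might be generated as sketched below. Only the 440 Hz pitch, the AM index, and the maximum frequency deviation at f1 come from the slides. The number of partials, the 1/k amplitude roll-off, the 5 Hz modulation rate, and the choice to scale the deviation with the partial number are all assumptions, and `synth_note` is a hypothetical name.

```python
import numpy as np

def synth_note(f0=440.0, duration=1.0, sr=44100, n_partials=5,
               am_index=0.5, fm_deviation=10.0, mod_rate=5.0):
    """Harmonic tone with sinusoidal AM and FM applied to every partial.

    fm_deviation is the peak frequency deviation (Hz) at the fundamental;
    partial k deviates by k * fm_deviation (assumption: vibrato-like scaling).
    """
    t = np.arange(int(round(duration * sr))) / sr
    note = np.zeros_like(t)
    for k in range(1, n_partials + 1):
        # Instantaneous frequency: k*f0 + k*fm_deviation*cos(2*pi*mod_rate*t).
        # The phase is 2*pi times the integral of that frequency.
        phase = (2 * np.pi * k * f0 * t
                 + (k * fm_deviation / mod_rate) * np.sin(2 * np.pi * mod_rate * t))
        envelope = 1.0 + am_index * np.sin(2 * np.pi * mod_rate * t)
        note += (1.0 / k) * envelope * np.sin(phase)   # assumed 1/k roll-off
    return note / np.max(np.abs(note))                 # normalize to peak 1

mild = synth_note(fm_deviation=10.0)     # the 10 Hz deviation case
strong = synth_note(fm_deviation=40.0)   # the 40 Hz deviation case
```

Setting `am_index=0` and `fm_deviation=0` recovers a plain harmonic tone, which gives a baseline to compare the modulated cases against.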
[Figure: spectrogram of the same modulated tone (AM index = 0.5, maximum frequency deviation = 10 Hz at f1; MIR Toolbox result: 443.3253 Hz), zoomed to 0 to 1500 Hz; axes: time (s), frequency (Hz)]
Synthesized Notes of Different Complexity
• Synthesized tone of 440 Hz, with amplitude modulation and frequency modulation added to each partial to increase its complexity.
• AM index = 0.5, maximum frequency deviation = 40 Hz at f1.
• True pitch: 440 Hz. MIR Toolbox result: 449.7704 Hz.
[Figure: spectrogram of the modulated tone, frequency axis up to 2 × 10^4 Hz; axes: time (s), frequency (Hz)]
[Figure: spectrogram of the same modulated tone (AM index = 0.5, maximum frequency deviation = 40 Hz at f1; MIR Toolbox result: 449.7704 Hz), zoomed to 0 to 1500 Hz; axes: time (s), frequency (Hz)]
Part 3: Pitch Detection Performance on Real Musical Notes
Interference from Another Music Note
• SNR = 3.5 dB
• Source note: MIR Toolbox f0 = 596.90 Hz
• Interference note: MIR Toolbox f0 = 1040.83 Hz
• Combined note (source note + interference note): MIR Toolbox f0 = 1034.02 Hz (wrong)
[Figures: spectrograms of the target signal, the interference signal, and the combined test signal; axes: time (s), frequency (Hz)]
Interference from Another Music Note
• Combined note: MIR Toolbox f0 = 1034.02 Hz (wrong)
[Figure: spectrogram of the combined test signal; axes: time (s), frequency (Hz)]
[Figures: FFT of the sustained part of the combined test signal, shown over the full range (0 to 2.5 × 10^4 Hz), over 0 to 7000 Hz, and zoomed in to 1000 to 1070 Hz; axes: frequency (Hz), magnitude]
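The "FFT of sustained part" views, including the zoomed band, could be reproduced along the following lines. The recordings themselves are not available here, so two pure sines at the f0 values reported on the slides serve as stand-ins. The 0.2 s to 0.5 s segment boundaries and the helper names are assumptions for this sketch.

```python
import numpy as np

def sustained_spectrum(x, sr, start_s, end_s):
    """Magnitude spectrum of x between start_s and end_s (seconds), Hann-windowed."""
    x = np.asarray(x, dtype=float)
    segment = x[int(round(start_s * sr)):int(round(end_s * sr))]
    magnitude = np.abs(np.fft.rfft(segment * np.hanning(len(segment))))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / sr)
    return freqs, magnitude

def strongest_peak(freqs, magnitude, f_lo, f_hi):
    """Frequency of the largest spectral bin inside [f_lo, f_hi] Hz."""
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(freqs[np.argmax(np.where(in_band, magnitude, -np.inf))])

# Stand-in signals: pure sines at the reported source and interference f0.
sr = 44100
t = np.arange(int(round(0.6 * sr))) / sr
combined = np.sin(2 * np.pi * 596.90 * t) + np.sin(2 * np.pi * 1040.83 * t)

freqs, magnitude = sustained_spectrum(combined, sr, 0.2, 0.5)
peak_high = strongest_peak(freqs, magnitude, 1000.0, 1070.0)  # zoom near the interference
peak_low = strongest_peak(freqs, magnitude, 560.0, 630.0)     # zoom near the source
```

With a 0.3 s segment the bin spacing is about 3.3 Hz, so a located peak can differ from the true partial frequency by up to half a bin.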
Interference from Another Music Note
• SNR = 8.61 dB
• Source note: MIR Toolbox f0 = 596.90 Hz
• Interference note: MIR Toolbox f0 = 1040.83 Hz
• Combined note (source note + interference note): MIR Toolbox f0 = 596.47 Hz (right)
[Figures: spectrograms of the target signal, the interference signal, and the combined test signal; axes: time (s), frequency (Hz)]
Interference from Another Music Note
• Combined note: MIR Toolbox f0 = 596.47 Hz (right)
[Figure: spectrogram of the combined test signal; axes: time (s), frequency (Hz)]
[Figures: FFT of the sustained part of the combined test signal, shown over the full range (0 to 2.5 × 10^4 Hz), over 0 to 7000 Hz, and zoomed in to 560 to 630 Hz; axes: frequency (Hz), magnitude]
Interference from Another Music Note
[Figures: pitch detection change with SNR, shown at two vertical scales (550 to 1050 Hz and 0 to 1200 Hz); axes: SNR (dB), from -10 to 25 dB, and detected pitch (Hz)]
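A detected-pitch-versus-SNR curve of this kind can be produced by repeating the mix-and-detect step over a range of SNR values. The sketch below is a simplified illustration. It uses pure sines at the reported f0 values instead of the recorded notes, takes SNR as the source-to-interference power ratio, and substitutes a strongest-spectral-peak detector for MIR Toolbox. All of these are assumptions, so the crossover point will not match the slides.

```python
import numpy as np

def dominant_frequency(x, sr):
    """Frequency (Hz) of the largest bin of the Hann-windowed magnitude spectrum."""
    magnitude = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return float(freqs[np.argmax(magnitude)])

def sweep_snr(source, interference, sr, snr_values_db):
    """Return (snr_db, detected_frequency) pairs for each requested SNR."""
    p_source = np.mean(source ** 2)
    p_interf = np.mean(interference ** 2)
    curve = []
    for snr_db in snr_values_db:
        # Scale the interference to the power implied by this SNR, then mix.
        gain = np.sqrt(p_source / (p_interf * 10 ** (snr_db / 10)))
        combined = source + gain * interference
        curve.append((float(snr_db), dominant_frequency(combined, sr)))
    return curve

# Stand-in notes: pure sines at the reported source and interference f0.
sr = 44100
t = np.arange(int(round(0.6 * sr))) / sr
source = np.sin(2 * np.pi * 596.90 * t)
interference = np.sin(2 * np.pi * 1040.83 * t)

curve = sweep_snr(source, interference, sr, np.arange(-10, 26, 5))
for snr_db, freq in curve:
    print(f"SNR {snr_db:6.1f} dB -> detected {freq:8.2f} Hz")
```

With these stand-ins the detector follows the interference at low SNR and the source at high SNR. Real notes spread their energy over several partials, so the switch can occur at a different SNR.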
Interference from Noise
• SNR = 3.29 dB
• Source note: MIR Toolbox f0 = 596.90 Hz
• Combined note (source note + noise): MIR Toolbox f0 = 596.65 Hz (right)
[Figures: spectrograms of the target signal, the interference (noise) signal, and the combined test signal; axes: time (s), frequency (Hz)]
Interference from Noise
• Combined note: MIR Toolbox f0 = 596.76 Hz (right)
[Figure: spectrogram of the combined test signal; axes: time (s), frequency (Hz)]
[Figures: FFT of the sustained part of the combined test signal, shown over the full range (0 to 2.5 × 10^4 Hz), over 0 to 6000 Hz, and zoomed in to 560 to 630 Hz; axes: frequency (Hz), magnitude]
Interference from Noise
• SNR = -2.63 dB
• Source note: MIR Toolbox f0 = 596.90 Hz
• Combined note (source note + noise): MIR Toolbox f0 = 600.06 Hz (right)
[Figures: spectrograms of the target signal and the interference (noise) signal; axes: time (s), frequency (Hz)]
Interference from Noise
• Combined note: MIR Toolbox f0 = 600.06 Hz (right)
[Figure: spectrogram of the combined test signal; axes: time (s), frequency (Hz)]
[Figures: FFT of the sustained part of the combined test signal, shown over the full range (0 to 2.5 × 10^4 Hz), over 0 to 6000 Hz, and zoomed in to 560 to 630 Hz; axes: frequency (Hz), magnitude]
Interference from Noise
[Figure: pitch detection change with SNR; axes: SNR (dB), from -25 to 20 dB, and detected pitch (Hz), from 0 to 1200 Hz]
Conclusions
• We implemented a framework to validate the performance of pitch detection algorithms at different audio qualities.
• We tested the performance of MIR Toolbox pitch detection using both synthesized and real musical notes.
• We investigated three factors that affect pitch detection performance: the complexity of the musical note, interference from a concurrent musical note, and noise.
Q&A
Thank you!