

An Investigation of Morphvox Pro Voice Changer

Camilla Barlow: BA Linguistics

www.manchester.ac.uk/q-step @UoMQStep

The project I undertook with MBFVS (Martin Barry Forensic Voice Services) involved a systematic investigation of the Morphvox Pro voice changer software, firstly to determine the effects that the different features of the software had on the voice, and then to see if it was possible to detect the use of the software in an output sound file.

Objectives Results

Aim: to determine what effect the three main controls in Morphvox (Pitch Shift, Timbre Shift and Timbre Strength) have on the voice.

Hypothesis: Pitch Shift will alter pitch; Timbre Shift (described in the user instructions as altering the size of the vocal tract) will change the formant values; and Timbre Strength will further alter the formant values.

Method: systematically 'morphing' a set of 7 vowels pre-recorded by a male speaker using different settings, then taking pitch and formant measurements to determine the effect on the voice.

Aim: to see whether speech modified by the Morphvox voice changer can be identified as such, rather than simply assumed to be the speech of a different individual from the speaker of the original sample.

Method: stop bursts, creaky voicing segments and vowel formants were all examined or measured for irregularities when compared to natural speech, using samples from four speakers, two male and two female. The female samples were modified to sound like male speech and the male samples to sound like female speech, using settings identified from the initial investigation.

Pitch Shift: the vowel set was modified using Pitch Shift values of -1, -0.5, +0.5 and +1, and measurements of fundamental frequency (f0), f1, f2 and f3 were taken for each vowel. The vowel formants remained unchanged throughout, whilst the pitch increased in the +0.5 and +1 recordings and decreased in the -0.5 and -1 recordings. A Pitch Shift of -1 lowered the pitch by one octave and +1 raised it by one octave, with -0.5 and +0.5 lowering and raising it by half an octave respectively.
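The octave relationship reported for Pitch Shift can be sketched numerically. The following is a minimal illustration, not Morphvox's own implementation: it assumes the setting acts as an exponent of 2 (so that 1.0 corresponds to one octave and 0.5 to six semitones), and the 100 Hz input and the function names are hypothetical, chosen only for the example.

```python
def pitch_ratio(setting: float) -> float:
    """Frequency ratio for a Pitch Shift setting.

    Assumption: the setting is measured in octaves, so the ratio is 2**setting.
    """
    return 2.0 ** setting


def shifted_f0(f0_hz: float, setting: float) -> float:
    """Expected fundamental frequency after applying the shift."""
    return f0_hz * pitch_ratio(setting)


if __name__ == "__main__":
    # 100 Hz is an illustrative starting pitch, not a measurement from the study.
    for setting in (-1, -0.5, 0.5, 1):
        semitones = 12 * setting
        print(f"{setting:+.1f}: {shifted_f0(100.0, setting):.1f} Hz "
              f"({semitones:+.0f} semitones)")
```

Under this assumption the formants are left untouched, which matches the observation that only the pitch measurements moved.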

Timbre Shift: the same procedure was repeated for the Timbre Shift control. As no clear pattern emerged using the original male vowels, a synthetic vowel was created with an f0 of 100 Hz and formants at 500, 1500, 2500 and 3500 Hz. It was then clear that a Timbre Shift setting of -1 halved the formant frequencies and a setting of +1 doubled them, as shown in Table 1.0.
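As a worked example of the halving and doubling described here, the sketch below computes the formant values one would expect for the synthetic vowel at the two extreme settings. These are values derived from the stated relationship, not the measurements from the poster's table, and the function and constant names are hypothetical. Intermediate settings are deliberately left unsupported because the text reports results only for -1 and +1.

```python
# Formants of the synthetic vowel described in the text, in Hz.
SYNTHETIC_FORMANTS_HZ = (500, 1500, 2500, 3500)

# Scale factors reported for the two extreme Timbre Shift settings.
OBSERVED_SCALE = {-1: 0.5, 1: 2.0}


def expected_formants(formants_hz, timbre_shift):
    """Scale every formant by the factor observed for this setting."""
    if timbre_shift not in OBSERVED_SCALE:
        raise ValueError("only the -1 and +1 settings are described in the text")
    scale = OBSERVED_SCALE[timbre_shift]
    return [f * scale for f in formants_hz]


if __name__ == "__main__":
    for setting in (-1, 1):
        print(setting, expected_formants(SYNTHETIC_FORMANTS_HZ, setting))
```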

Timbre Strength: the synthetic vowel was used again at Timbre Strength settings of 0, 33, 66 and 100, keeping Timbre Shift at 100. It transpired that this control mixes the original recording with the recording as morphed by Timbre Shift, so that both sets of formants are visible, with the balance between them determined by the Timbre Strength value. An example is shown in Table 2.0, with Timbre Strength at 33; red arrows mark the original formants and green arrows the morphed ones.
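The mixing behaviour described for Timbre Strength can be pictured as a weighted blend of two signals. The sketch below is only one plausible reading: it assumes a linear crossfade in which 0 gives the original signal alone and 100 gives the morphed signal alone. The text states that the two are mixed in a proportion set by Timbre Strength but does not give the exact mixing law, and the function name is hypothetical.

```python
def mix_by_strength(original, morphed, strength):
    """Blend two equal-length sample sequences.

    Assumption: a linear crossfade, with strength 0 -> original only
    and strength 100 -> morphed only.
    """
    if not 0 <= strength <= 100:
        raise ValueError("strength must be between 0 and 100")
    if len(original) != len(morphed):
        raise ValueError("signals must have the same length")
    weight = strength / 100.0
    return [(1.0 - weight) * o + weight * m for o, m in zip(original, morphed)]


if __name__ == "__main__":
    original = [1.0, 0.0, -1.0, 0.0]   # toy samples, not real audio
    morphed = [0.0, 1.0, 0.0, -1.0]
    for strength in (0, 33, 66, 100):
        print(strength, mix_by_strength(original, morphed, strength))
```

Because the output at any intermediate setting contains both signals, a spectrogram of it would show both sets of formants at once, which is the pattern reported for the setting of 33.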

Vowels

Main Investigation

Stop Bursts: when a female voice was converted to a male-sounding one, the stop bursts were either significantly reduced and distorted or lost completely, whilst when a male voice was changed to sound more like a female voice the stop bursts were always duplicated. Both effects are consequences of the PSOLA technique used by the software and, as they do not occur in natural speech, are a clear sign that this kind of program has been used. An example of a duplicated stop burst in the voiceless bilabial plosive [p], where a male voice has been converted to a female one, can be seen below with the original on the left.
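Why a PSOLA-style pitch change would duplicate or drop short events can be shown with a toy model of frame selection. This is a heavily simplified sketch of time-domain PSOLA, not the software's actual algorithm: it assumes evenly spaced analysis frames, keeps duration constant, and picks the nearest analysis frame for each output pitch mark. All names are hypothetical.

```python
import math
from collections import Counter


def synthesis_frame_map(n_frames, pitch_ratio):
    """Index of the analysis frame copied to each synthesis pitch mark.

    Analysis frames sit at times 0, 1, 2, ... (in original pitch periods).
    Synthesis marks are 1/pitch_ratio apart, so duration is preserved.
    """
    step = 1.0 / pitch_ratio
    mapping = []
    k = 0
    while k * step <= n_frames - 1:
        # Nearest analysis frame, rounding halves upwards.
        mapping.append(math.floor(k * step + 0.5))
        k += 1
    return mapping


def reused_and_dropped(n_frames, pitch_ratio):
    """Frames copied more than once, and frames never copied at all."""
    counts = Counter(synthesis_frame_map(n_frames, pitch_ratio))
    reused = sorted(i for i, c in counts.items() if c > 1)
    dropped = [i for i in range(n_frames) if i not in counts]
    return reused, dropped


if __name__ == "__main__":
    print("pitch raised one octave:", reused_and_dropped(4, 2.0))
    print("pitch lowered one octave:", reused_and_dropped(4, 0.5))
```

In this model, raising the pitch reuses frames, so a burst that falls inside a reused frame appears twice; lowering the pitch skips frames, so a burst inside a skipped frame disappears. That is consistent with the direction of the effects reported for the male-to-female and female-to-male conversions.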

Results

Creaky Voicing: whilst at first glance the waveform produced by the software appeared to be normal creaky voicing, the effect seen in the stop bursts persisted: some segments were duplicated, creating a waveform too regular to be natural creaky voicing.

Two of the original male and female recordings were each compared with the other's morphed counterpart to see whether the software could reliably produce a male voice from that of a female and vice versa. When the f1 values of the original female recording were compared with those of the morphed male recording for the most accurately measured vowels (a, , , i), a two-way independent ANOVA showed that the main effect of gender was significant, F(1,64) = 52.08, p < .001, partial η² = .449. The main effect of vowel was also significant, F(3,64) = 147.3, p < .001, partial η² = .873. Therefore, in the case of the four vowels examined, the f1 value of the original male voice was reliably morphed to be indistinguishable from that of a typical female. This was not true of all the vowels, but the discrepancy was suspected to be a measurement error rather than an error of the software.
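The effect sizes quoted with the ANOVA can be checked from the F statistics and degrees of freedom alone, using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the two reported tests; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Main effect of gender: F(1,64) = 52.08
    print(round(partial_eta_squared(52.08, 1, 64), 3))  # 0.449
    # Main effect of vowel: F(3,64) = 147.3
    print(round(partial_eta_squared(147.3, 3, 64), 3))  # 0.873
```

Both results agree with the reported values of .449 and .873.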

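[Table 1.0: formant frequencies of the synthetic vowel at Timbre Shift settings -1 and +1.]

[Table 2.0: the synthetic vowel at Timbre Strength 33, with original and morphed formants marked.]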

Conclusion

Whilst the Morphvox Pro voice changer software can reliably change the gender of a voice in terms of vowel formants and pitch, its use is detectable because it leaves irregularities in the waveform that would not occur in natural speech.