Imposing native speakers’ prosody on non-native speakers’ utterances:
Preliminary studies
Kyuchul Yoon
Spring 2006 NAELL
The Division of English
Kyungnam University
Contents
• Acquiring prosody in language learning
• Previous approaches
• A new tool
• Technical details
• Implications of the technique
• Preliminary plans for an experiment
Acquiring prosody in language learning
• One of the critical tasks in language learning
• Prosody as non-segmental features of speech
  1. phrase breaks
  2. intonation (F0) contour
  3. segmental durations
  4. intensity contour
Previous approaches
• Explicit teaching of prosodic features such as the intonation contours, segmental durations, etc.
• Audio aid: listen and repeat!
• Visual aid in computer software: Dr.Speaking® (F0 contour comparison between native speaker and non-native speaker)
A new tool
• A new kind of audio aid in the form of a non-native speaker’s utterance with the prosodic features of a native speaker’s utterance
• How this works
  1. Software presents a native speaker’s utterance
  2. A non-native speaker repeats the utterance
  3. Software records the non-native speaker’s utterance
  4. Software imposes the native speaker’s prosody onto the non-native speaker’s utterance
  5. Software presents the processed non-native utterance
Technical details
• Manipulation of
  1. segmental durations, including phrase breaks
  2. F0 contours
  3. intensity contours
• For 1 and 2: PSOLA (Pitch Synchronous OverLap and Add), developed by Moulines & Charpentier (1990) and implemented in Praat
• For 3: intensity swap in Praat
Technical details: PSOLA (Moulines & Charpentier, 1990)
[Figure: PSOLA illustration. Rows: original waveform; windowed waveform (windows 1 to 19); shortened waveform (windows 1, 4, 7, 10, 13, 16, 19); waveform with lower F0 (windows 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)]
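The window numbers in the PSOLA illustration can be reproduced with a small sketch. It only models which pitch-synchronous windows are reused and where they are placed; it does not window or overlap-add real samples. The function names `select_windows` and `synthesis_times` are hypothetical, and the assumption that the "lower F0" row keeps the original duration is an interpretation of the figure, not something the slides state.

```python
from fractions import Fraction

def select_windows(n_windows, duration_factor):
    """Return 1-based indices of the analysis windows to reuse so that the
    output holds roughly duration_factor times as many windows."""
    factor = Fraction(duration_factor)
    n_out = int((n_windows - 1) * factor) + 1
    # Output window j reuses the analysis window at position j / factor
    # (truncated), clamped to the last available window.
    return [min(int(j / factor), n_windows - 1) + 1 for j in range(n_out)]

def synthesis_times(n_selected, period, f0_factor):
    """Place the selected windows at a new spacing of period / f0_factor.
    Wider spacing (f0_factor < 1) yields a lower F0."""
    return [k * period / f0_factor for k in range(n_selected)]

# Shortening to one third of the duration keeps every third window.
print(select_windows(19, Fraction(1, 3)))  # [1, 4, 7, 10, 13, 16, 19]

# Lowering F0 by an octave (assumed): keep every other window and
# double the spacing, so the total duration is unchanged.
halved = select_windows(19, Fraction(1, 2))
print(halved)  # [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
print(synthesis_times(len(halved), period=0.005, f0_factor=0.5))
```

Exact fractions are used for the duration factor so that a ratio such as 1/3 does not suffer from floating-point truncation when indices are computed.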
Technical details 1: Segmental durations
• Segmental alignment & PSOLA processing: Alignment can be manual or automatic (with the help of speech recognition)
[Figure: segment-aligned native and non-native utterances of “…came in…”, segments k, eI, m, i, n]
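Once the two utterances are aligned segment by segment, the duration manipulation reduces to one scaling factor per segment. The following is a minimal sketch of that step, assuming a one-to-one alignment; the function name `duration_factors` and all boundary times are hypothetical and are not measurements from the study.

```python
def duration_factors(native, nonnative):
    """Compute per-segment duration scaling factors.

    native, nonnative: lists of (label, start, end) tuples in seconds,
    in the same order. A factor below 1 means the non-native segment
    must be shortened to match the native one."""
    if [seg[0] for seg in native] != [seg[0] for seg in nonnative]:
        raise ValueError("segment labels must match one-to-one")
    factors = []
    for (label, n_start, n_end), (_, l_start, l_end) in zip(native, nonnative):
        factors.append((label, (n_end - n_start) / (l_end - l_start)))
    return factors

# Hypothetical segment boundaries for "came in" (illustration only).
native = [("k", 0.00, 0.06), ("eI", 0.06, 0.18), ("m", 0.18, 0.24),
          ("i", 0.24, 0.30), ("n", 0.30, 0.38)]
nonnative = [("k", 0.00, 0.08), ("eI", 0.08, 0.32), ("m", 0.32, 0.44),
             ("i", 0.44, 0.50), ("n", 0.50, 0.60)]

for label, factor in duration_factors(native, nonnative):
    print(f"{label}: scale duration by {factor:.2f}")
```

Each factor would then be handed to the PSOLA duration manipulation for the corresponding stretch of the non-native utterance.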
9
Technical details 2F0 contours
• PSOLA processing on duration-treated utterance
[Figure: F0 contours of the native and non-native utterances over segments k, eI, m, i, n; labels: higher F0, lower F0]
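In a pitch-synchronous method, imposing a target F0 contour amounts to re-spacing the pitch marks so that consecutive marks are one target period apart. The sketch below illustrates only that placement rule; `pitch_marks` and `linear_f0` are hypothetical names, and the falling 200 Hz to 100 Hz contour is an invented example rather than data from the study.

```python
def pitch_marks(f0_at, start, end):
    """Return pitch-mark times in [start, end). Each mark lies one local
    period (1 / F0 at the previous mark) after the previous mark."""
    marks = []
    t = start
    while t < end:
        marks.append(t)
        t += 1.0 / f0_at(t)
    return marks

def linear_f0(t, t0=0.0, t1=0.5, f_start=200.0, f_end=100.0):
    """Hypothetical target contour: F0 falls linearly from f_start to f_end."""
    frac = min(max((t - t0) / (t1 - t0), 0.0), 1.0)
    return f_start + frac * (f_end - f_start)

marks = pitch_marks(linear_f0, 0.0, 0.5)
intervals = [b - a for a, b in zip(marks, marks[1:])]
# As F0 falls, the spacing between marks grows.
print(f"first period: {intervals[0] * 1000:.2f} ms")
print(f"last period:  {intervals[-1] * 1000:.2f} ms")
```

Because the non-native utterance has already been duration-treated, its timeline matches the native one, so the native contour can serve directly as the target.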
Technical details 3: Intensity contours
• Mathematically “neutralize” the non-native speaker’s intensity contour and transfer the native speaker’s intensity contour in Praat (Holger Mitterer, personal communication)
[Figure: intensity contours of the native and non-native utterances over segments k, eI, m, i, n]
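One way to read “neutralize and transfer” is as a frame-wise gain: divide out the non-native frame energy and multiply in the native frame energy. The sketch below works on plain lists of samples and assumes both signals are already time-aligned and equally long (that is, after the duration step). It is not the Praat procedure referred to above; `frame_rms` and `transfer_intensity` are hypothetical names.

```python
import math

def frame_rms(samples, frame_len):
    """Root-mean-square level of each consecutive frame."""
    levels = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return levels

def transfer_intensity(nonnative, native, frame_len, floor=1e-9):
    """Scale each non-native frame so that its RMS matches the native frame.
    Assumes equal-length, time-aligned signals."""
    if len(nonnative) != len(native):
        raise ValueError("signals must be time-aligned and equally long")
    source = frame_rms(nonnative, frame_len)
    target = frame_rms(native, frame_len)
    out = []
    for i, x in enumerate(nonnative):
        k = i // frame_len
        # Dividing by the source level flattens the contour; multiplying
        # by the target level imposes the native contour.
        out.append(x * target[k] / max(source[k], floor))
    return out

# Toy signals: the non-native one is soft then loud, the native one the reverse.
nonnative = [0.1, -0.1, 0.1, -0.1, 0.4, -0.4, 0.4, -0.4]
native = [0.5, -0.5, 0.5, -0.5, 0.2, -0.2, 0.2, -0.2]
print(frame_rms(transfer_intensity(nonnative, native, 4), 4))
```

A step-wise gain like this produces discontinuities at frame edges; a practical version would interpolate the gain between frames.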
Technical details
• Weaknesses
  1. Voiceless segments can be made “voiced” in the windowing process (pitch-synchronous technique)
  2. Excessive manipulation results in unnatural synthesis
• Segment alignment should be fine-tuned according to the voiced/voiceless status of the (sub-)segments for better results
Technical details: Examples
Praat script
native utterance
non-native utterance
synthetic non-native
Technical details: Comparison before synthesis – duration, F0 & intensity
[Figure: native utterance vs. non-native utterance (blue & yellow)]
Technical details: Comparison after synthesis – duration, F0 & intensity
[Figure: native utterance vs. synthetic non-native utterance (blue & yellow)]
Implications of the technique
• The technique can be used in second language education to facilitate and motivate acquisition of target-language prosody, and to emphasize the importance of prosody in achieving native-speaker fluency
• ASR (Automatic Speech Recognition) can be employed to automate the segment aligning stage
Preliminary plans for an experiment
• Hypothesis: The new type of audio feedback improves the efficiency of language learning, specifically the learning of prosody.
• Method, key idea: In a listen-and-repeat type of language learning, contrast the “old” type of audio feedback, i.e. playing native utterances only, with the “new” type of audio feedback, i.e. playing native and synthetic utterances.
Preliminary plans for an experiment
• Method
  1. Baseline: Group the non-native learners into two groups (“good” and “bad”).
  2. Administration: Learners practice with either the “old” or the “new” type of audio feedback.
  3. Evaluation: Evaluate the two types of feedback by examining the learners’ recordings.
In steps 1 and 3, a native speaker rates the recordings of the non-native learners on a categorical/numerical scale.
In step 2, each of the two groups (good/bad) is split in two, giving four subgroups (good-A/good-B/bad-A/bad-B); the A subgroups receive the “old” type of audio feedback and the B subgroups receive the “new” type.
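The grouping in steps 1 and 2 can be sketched as follows. The slides do not say how learners are split into the A and B subgroups or where the good/bad cut-off lies; random assignment with a fixed seed and a numeric threshold on the baseline rating are assumptions made here for illustration, and `assign_groups` is a hypothetical name.

```python
import random

def assign_groups(baseline_scores, threshold, seed=0):
    """Split learners into good/bad by baseline score, then alternate a
    shuffled order between feedback types A ("old") and B ("new")."""
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    groups = {"good-A": [], "good-B": [], "bad-A": [], "bad-B": []}
    levels = {
        "good": sorted(l for l, s in baseline_scores.items() if s >= threshold),
        "bad": sorted(l for l, s in baseline_scores.items() if s < threshold),
    }
    for level, learners in levels.items():
        rng.shuffle(learners)
        for i, learner in enumerate(learners):
            feedback = "A" if i % 2 == 0 else "B"
            groups[f"{level}-{feedback}"].append(learner)
    return groups

# Hypothetical baseline ratings from a native-speaker judge (scale 1 to 8).
scores = {f"learner{i}": i for i in range(1, 9)}
for name, members in assign_groups(scores, threshold=5).items():
    print(name, sorted(members))
```

Alternating over a shuffled list keeps the A and B subgroups within each level equal in size, or within one learner of each other.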