English Pronunciation Learning System for Japanese Students
Based on Diagnosis of Critical Pronunciation Errors
Yasushi Tsubota, Tatsuya Kawahara, Masatake Dantsuji
Kyoto University, Japan
HUGO (Pronunciation Learning System)
• Goal: Pinpoint the pronunciation errors that diminish intelligibility and provide effective feedback for improving a student’s pronunciation
• Pronunciation practice consists of 2 phases
– Dialogue-based skit (for natural conversation)
– Practice using individual phrases or words (for correcting specific errors)
Flow of Pronunciation Learning System
Speech dialogue (Role-play) → Pronunciation Error Diagnosis → Training on Specific Errors
• Practice conversation with interesting topics
– Original contents developed at Kyoto University
– Foster ability to explain Japanese history/culture in English to foreign visitors
• Speech recognition program in background
– Error detection optimized for English pronunciation by Japanese students
– Error profile for the student
• Intelligibility Estimation
– Estimated from the error rates for the different types of errors
• Error Priority
– Indicates the student’s performance for a given pronunciation pattern
– Expresses how far behind the student is on one pattern compared to students at the same level
• Training on Specific Errors
– Practice of individual pronunciation skills
– Error feedback providing both stress and segmental instruction
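The slides do not give a formula for error priority. As a minimal sketch of the idea "how far behind the student is compared to students at the same level", the example below assumes priority is the gap between the student's error rate and the peer mean, scaled by the peer spread. The function name, the scaling, and all numbers are illustrative assumptions, not the HUGO implementation.

```python
from statistics import mean, pstdev


def error_priority(student_rates, peer_rates):
    """Return {error_type: priority}; a larger value means further behind peers.

    student_rates: {error_type: error rate in [0, 1]} for one student.
    peer_rates: {error_type: [error rates of students at the same level]}.
    The gap/spread scaling is an assumption made for this sketch.
    """
    priority = {}
    for error_type, rate in student_rates.items():
        peers = peer_rates[error_type]
        gap = rate - mean(peers)
        spread = pstdev(peers)
        # Fall back to the raw gap when all peers share the same rate.
        priority[error_type] = gap / spread if spread > 0 else gap
    return priority


# Made-up rates for two of the error types (R/L and V/B substitution).
student = {"RL": 0.60, "VB": 0.10}
peers = {"RL": [0.20, 0.30, 0.40], "VB": [0.10, 0.20, 0.30]}

priorities = error_priority(student, peers)
ranked = sorted(priorities, key=priorities.get, reverse=True)
print(ranked)
```

Training would then start with the error type ranked first, here the one where the student is worse than peers at the same level.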
Introduction to the Beauties of Kyoto
Pronunciation Error Prediction
• 64 rules for pronunciation errors
• No equivalent syllable in L1
– e.g. sea → she
• No equivalent phoneme in L1
– e.g. l vs. r, v, etc.
• Vowel insertion
– e.g. b-r → b-uh-r
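[Figure: Pronunciation error prediction for the word “breath”. The pronunciation dictionary entry and the rules for errors are combined into a network from start (S) to end (E) that contains the phonemes b, r, eh, th together with the predicted error candidates l, uh, s, uh.]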
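The prediction step can be sketched as applying error rules to a dictionary pronunciation and enumerating every resulting variant. The three rules below (r → l, th → s, v → b substitution, plus "uh" insertion after a consonant that precedes another consonant or ends the word) are a small illustrative subset chosen for this example, not the system's 64 rules, and the phoneme symbols are assumptions.

```python
from itertools import product

# Illustrative phoneme inventory and rules (assumptions for this sketch).
CONSONANTS = {"b", "r", "l", "th", "s", "v", "t", "d"}
SUBSTITUTIONS = {"r": "l", "th": "s", "v": "b"}
INSERTED_VOWEL = "uh"


def predict_pronunciations(phonemes):
    """Return every predicted pronunciation, the correct one included."""
    slots = []
    for i, phoneme in enumerate(phonemes):
        options = [[phoneme]]
        if phoneme in SUBSTITUTIONS:
            options.append([SUBSTITUTIONS[phoneme]])
        following = phonemes[i + 1] if i + 1 < len(phonemes) else None
        # Vowel insertion: inside a consonant cluster or after a final consonant.
        if phoneme in CONSONANTS and (following is None or following in CONSONANTS):
            options += [option + [INSERTED_VOWEL] for option in list(options)]
        slots.append(options)
    # One choice per slot, concatenated into a flat phoneme sequence.
    return [sum(choice, []) for choice in product(*slots)]


variants = predict_pronunciations(["b", "r", "eh", "th"])  # "breath"
print(len(variants))
for variant in variants[:4]:
    print(" ".join(variant))
```

A recognizer can then align the learner's utterance against these variants; the variant that matches best indicates which rules, if any, were triggered.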
2. Sentence Stress Error Detection
[Figure: Two-stage stress error detection for “Put it on the desk” (syllable structures: CVsC CVx VsC CVs CVsC CVs). First stage: ST/NS classification with stress HMMs, using the best weight for ST/NS. Second stage: PS/SS classification with stress HMMs, using the best weight for PS/SS. The network also allows for a pause and for a syllable added by vowel insertion. Recognition result: SS NS NS NS PS NS (syllable labels: H T H M M T).]
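The two-stage decision can be illustrated with a small sketch. It assumes that ST/NS stand for stressed/non-stressed and PS/SS for primary/secondary stress, and that each syllable already has one score per label (for example an HMM log-likelihood). The scores below are invented so that the output matches the recognition result quoted for this slide; the stress HMMs and feature weights themselves are not modelled.

```python
def two_stage_stress(syllable_scores):
    """Label each syllable as NS, PS or SS from per-label scores.

    syllable_scores: list of dicts with keys "ST", "NS", "PS", "SS".
    Higher score means a better match (assumption for this sketch).
    """
    labels = []
    for scores in syllable_scores:
        # First stage: stressed (ST) versus non-stressed (NS).
        if scores["ST"] <= scores["NS"]:
            labels.append("NS")
        # Second stage, only for ST syllables: primary (PS) versus secondary (SS).
        elif scores["PS"] >= scores["SS"]:
            labels.append("PS")
        else:
            labels.append("SS")
    return labels


def stress_errors(recognized, expected):
    """Return indices of syllables whose stress label differs from the target."""
    return [i for i, (r, e) in enumerate(zip(recognized, expected)) if r != e]


# Invented scores for six syllables.
scores = [
    {"ST": -10, "NS": -12, "PS": -11, "SS": -9},
    {"ST": -15, "NS": -10, "PS": -14, "SS": -13},
    {"ST": -14, "NS": -11, "PS": -12, "SS": -13},
    {"ST": -16, "NS": -12, "PS": -15, "SS": -14},
    {"ST": -9, "NS": -13, "PS": -8, "SS": -10},
    {"ST": -13, "NS": -11, "PS": -12, "SS": -12},
]
recognized = two_stage_stress(scores)
print(" ".join(recognized))
```

Comparing `recognized` with a hypothetical target pattern through `stress_errors` gives the syllables on which stress feedback would be needed.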
Pronunciation Errors
V/B substitution (problem)
Final vowel insertion (let)
CCV-cluster insertion (active)
VCC-cluster insertion (study)
H/F substitution (fire)
W/Y deletion (would)
SH/CH substitution (choose)
R/L substitution (road)
ER/A substitution (paper)
Non-reduction (student)
• Built from literature in ESL
• Errors not accurately detected were removed
• Compute error rates of each subject
Average Error Rates per Intelligibility Level
[Figure: average error rate per intelligibility level for each error type: SH, ER, RL, VR, VB, FI, CCV, VCC, HF, WY]
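Averaging per-subject error rates within each intelligibility level is a simple grouping step. The sketch below uses invented rates and level numbers purely to show the computation; the data layout and the function name are assumptions.

```python
from collections import defaultdict
from statistics import mean


def average_rates_by_level(subjects):
    """Average error rates per intelligibility level.

    subjects: list of (level, {error_type: error rate}) pairs, one per subject.
    Returns {level: {error_type: mean error rate}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for level, rates in subjects:
        for error_type, rate in rates.items():
            grouped[level][error_type].append(rate)
    return {
        level: {error_type: mean(values) for error_type, values in by_type.items()}
        for level, by_type in grouped.items()
    }


# Invented example: two subjects at level 1, one subject at level 2.
subjects = [
    (1, {"RL": 0.5, "SH": 0.25}),
    (1, {"RL": 0.25, "SH": 0.75}),
    (2, {"RL": 0.125, "SH": 0.25}),
]
averages = average_rates_by_level(subjects)
print(averages)
```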
Practice in a university classroom
• Implementation
– Java for Windows
– HTK
• Classroom users
– 48 students
– 60 min. of pronunciation practice
• Machine
– Windows 2000
– Pentium 4, 1.5 GHz
– 512 MB memory
CALL room at Kyoto University
English II Syllabus
• 1st session, 1st semester (5/12, 5/19, 5/26, 6/1)
– Topic: Introduction to Jidai Festival
– Grammar, vocabulary building, pronunciation learning
• 2nd session, 1st semester (6/8, 6/15, 6/22, 6/29)
– Topic: Jidai Festival (Edo period)
– Grammar, vocabulary building, pronunciation learning
• 2nd semester (10/27, 11/11)
– Topic: Jidai Festival (Edo period)
– Pronunciation learning
• 16 hours of speech data in total
Questionnaire
Positive comments
• Good practice for pronunciation learning
• This practice is effective because Japanese students are not good at pronunciation.
• I hope to see further improvement in the performance of this system.
• I am for this kind of English learning.
• This practice is good for self-study.
Negative comments
• Sometimes the diagnosis results were not understandable.
• Not enough speech recognition accuracy.
• Sometimes it seemed that the machine improperly recognized my utterance.
• This practice would be better if there were fewer recognition errors.
Summary: satisfied with the concept of the system, but too many errors in speech recognition
Evaluation by the class
Score       <50   51-60   61-70   71-80   81-90   91-100
#Students   2     2       8       11      13      4
Examples of recorded speech
Yes, that’s right. (noise addition)
Yes, that’s right. (noise addition)
But, do you know what the festival of ages is like? (noise addition)
Ah, well, the festival of ages is a series of processions. (noise addition)
Each representing a different period in Japanese history and its relation to Kyoto. (noise addition)
The Edo period which dates from 1603 to 1867, (speech error)
I’d like to stop nowunder
Good Examples
Bad Examples
Analysis of logged data
• Categorize the causes of misrecognition
– To measure system performance
– If automatically detected, a prompt for re-recording is possible.
• Analysis of logged data
– Listen to the logged speech data
– Verify the correctness of the speech recognizer’s alignment with a spectrogram (WaveSurfer)
Analysis of logged data (1929 utterances)
• Errors in automatic detection of the end of a recording session [6.0%, 116]
• Addition of noise [13.1%, 252]
• Hesitation [4.2%, 81]
• Speech errors [1.8%, 34]
• Misalignment by the speech recognition system [12.8%, 246]
• Recognition errors [1.5%, 29]
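The percentages in this breakdown follow directly from the counts and the 1929 logged utterances. The short check below recomputes them; only the dictionary keys are shortened labels introduced for the example.

```python
TOTAL_UTTERANCES = 1929

# Counts per cause, as listed for the logged data.
counts = {
    "end-of-recording detection": 116,
    "noise addition": 252,
    "hesitation": 81,
    "speech errors": 34,
    "misalignment": 246,
    "recognition errors": 29,
}

# Share of all logged utterances, in percent, rounded to one decimal place.
shares = {cause: round(100 * n / TOTAL_UTTERANCES, 1) for cause, n in counts.items()}

for cause, share in shares.items():
    print(f"{cause}: {share}% ({counts[cause]})")
print("affected utterances:", sum(counts.values()))
```

Noise addition and misalignment together account for roughly two thirds of the problematic utterances.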
Cause
• Improper configuration of recording volume
• Directed microphone did not work well
• Unfamiliarity with English sentences
• Unit of utterance is too short (phrase)
Solution
• Instructions on volume settings
• Provide explanation, prompt for re-recording
• Make the utterance longer, e.g. make it into a sentence
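One way an improperly configured recording volume could be detected automatically, so that the learner is prompted to re-record, is a level check on the captured samples. The sketch below assumes 16-bit PCM samples; the thresholds and the function name are hypothetical and would need tuning on real classroom recordings.

```python
import math


def check_recording_level(samples, full_scale=32767,
                          max_clip_fraction=0.01, min_rms_db=-40.0):
    """Classify a recording as "ok", "too_loud", "too_quiet" or "empty".

    samples: sequence of 16-bit PCM sample values (assumption).
    max_clip_fraction and min_rms_db are hypothetical thresholds.
    """
    if not samples:
        return "empty"
    # Too loud: a noticeable share of samples sits at full scale (clipping).
    clipped = sum(1 for s in samples if abs(s) >= full_scale)
    if clipped / len(samples) > max_clip_fraction:
        return "too_loud"
    # Too quiet: RMS level relative to full scale falls below the threshold.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    level_db = 20 * math.log10(rms / full_scale) if rms > 0 else float("-inf")
    if level_db < min_rms_db:
        return "too_quiet"
    return "ok"


print(check_recording_level([32767] * 100))
print(check_recording_level([10] * 100))
print(check_recording_level([3000, -3000] * 50))
```

Any result other than "ok" could trigger the volume-setting instructions and a re-recording prompt.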
Analysis of logged data
            #Utterances                 Error rate (recording)      Error rate (recognition)
1st trial   52.1 (avg.), 1929 (total)   20.4 (avg.), 755 (total)    1.24 (avg.), 46 (total)
2nd trial   111 (avg.), 3982 (total)    4.9 (avg.), 176 (total)     0 (avg.), 0 (total)
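To compare the two trials on the same scale, the totals can be turned into percentages of utterances. This sketch assumes that the "total" figures in the error columns are counts of affected utterances, which the slide does not state explicitly.

```python
# Totals per trial; the error figures are assumed to be utterance counts.
trials = {
    "1st trial": {"utterances": 1929, "recording": 755, "recognition": 46},
    "2nd trial": {"utterances": 3982, "recording": 176, "recognition": 0},
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


rates = {
    name: {
        "recording": percent(t["recording"], t["utterances"]),
        "recognition": percent(t["recognition"], t["utterances"]),
    }
    for name, t in trials.items()
}

for name, r in rates.items():
    print(f"{name}: recording {r['recording']}%, recognition {r['recognition']}%")
```

Under this reading, recording-related errors drop from about 39% of utterances in the first trial to about 4% in the second.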
Conclusions
• Practical use of an autonomous English pronunciation learning system for Japanese students
– Contents designed to teach students how to explain Japanese tradition and culture
– Phoneme and stress error detection, intelligibility estimation
– Practical use in an English II class at Kyoto University
• Practical use and analysis of logged data
– Students were satisfied with the concept of the system
– Analysis of improper operation
  • Errors in automatic detection of the end of a recording session
  • Addition of noise
  • Hesitation
  • Speech errors
  • Misalignment by the speech recognition system
  • Recognition errors