building a corpus of pathological speech
DESCRIPTION
Building a corpus of pathological speech. Gwen Van Nuffelen, Marc De Bodt, Catherine Middag, Jean-Pierre Martens. Dutch Corpus of Pathological and Normal Speech. Dysarthria: disturbed muscular control due to damage of the nervous system; weak, slow, imprecise, uncoordinated movements.
![Page 1: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/1.jpg)
Building a corpus of pathological speech
Catherine Middag
Jean-Pierre Martens
Gwen Van Nuffelen
Marc De Bodt
![Page 2: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/2.jpg)
Dutch Corpus of Pathological and Normal Speech
| Speakers | N |
|---|---|
| Normal (N) | 119 |
| Dysarthria (D) | 102 |
| Hearing impairment (H) | 47 |
| Laryngectomy (L) | 45 |
| Cleft (C) | 39 |
| Articulation disorders (A) | 16 |
| Voice disorder (VD) | 8 |
| Glossectomy (G) | 1 |
| Total | 377 |
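As a quick consistency check on the speaker counts above, the groups can be summed and expressed as shares of the corpus. This is a minimal sketch; the names `speakers`, `total` and `pathological` are illustrative only and do not come from the slides.

```python
# Speaker counts per group, copied from the slide.
speakers = {
    "Normal (N)": 119,
    "Dysarthria (D)": 102,
    "Hearing impairment (H)": 47,
    "Laryngectomy (L)": 45,
    "Cleft (C)": 39,
    "Articulation disorders (A)": 16,
    "Voice disorder (VD)": 8,
    "Glossectomy (G)": 1,
}

total = sum(speakers.values())                 # should match the "Total" row
pathological = total - speakers["Normal (N)"]  # all non-control speakers

for group, n in speakers.items():
    print(f"{group:28s} {n:4d} {100 * n / total:5.1f}%")
print(f"{'Total':28s} {total:4d}")
print(f"{'Pathological speakers':28s} {pathological:4d}")
```

The sum reproduces the stated total of 377 speakers, of whom 258 belong to one of the pathological groups.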
Dysarthria: disturbed muscular control due to damage of the nervous system; weak, slow, imprecise, uncoordinated movements
![Page 3: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/3.jpg)
Dutch Corpus of Pathological and Normal Speech
TL (total laryngectomy): surgical removal of the larynx and separation of the trachea from the mouth, nose, and esophagus
Speech after TL: TE (tracheoesophageal), E (esophageal), electrolarynx (Servox)
PL (partial laryngectomy): partial removal of laryngeal structures, vocal folds
![Page 4: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/4.jpg)
![Page 5: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/5.jpg)
Speakers
• native speakers of Dutch
• adequate language, cognitive, visual and hearing* abilities
![Page 6: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/6.jpg)
Recordings
• Natural, quiet environment ~ clinical setting
  • No sound treated box
• Mini-disc (Sony, MZ-R700)
• Microphone
  • Sony (mouth-microphone distance: 30 cm)
  • Shure head set
• Transferred to a notebook wave file (mono, 44 kHz)
  • 16 kHz
![Page 7: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/7.jpg)
Type of samples
| Sample | N |
|---|---|
| Dutch Intelligibility Assessment | 357 |
| Articulation assessment | 21 |
| Sentences | 211 |
| Text | 172 |
| Text Marloes | 221 |
| Spontaneous speech | 39 |
| Semi spontaneous speech | 136 |
| Sustained vowel | 216 |
| Diadochokinetic rate | 214 |
| Formant transition | 212 |
![Page 8: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/8.jpg)
Dutch Intelligibility Assessment (DIA)
Intelligibility at phoneme level
50 consonant – vowel – consonant words
3 subtests:
• A: initial consonants (19 words)
• B: final consonants (15 words)
• C: medial vowels/diphthongs (16 words)
Balanced mix of existing and non-existing (well pronounceable) words
Large pool of test items: 25 lists per subtest, so 25*25*25 different tests
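The size of the item pool follows directly from the numbers above: one list is drawn per subtest, so there are 25^3 = 15625 possible tests of 50 words each. The sketch below illustrates this; the random draw and the names `draw_test`, `SUBTESTS` and `LISTS_PER_SUBTEST` are assumptions for illustration, since the slides do not say how a list combination is chosen.

```python
import random

LISTS_PER_SUBTEST = 25
# Words per list in each subtest, as given on the slide.
SUBTESTS = {"A": 19, "B": 15, "C": 16}

n_tests = LISTS_PER_SUBTEST ** len(SUBTESTS)  # 25 * 25 * 25
words_per_test = sum(SUBTESTS.values())       # 19 + 15 + 16

def draw_test(rng):
    """Pick one list number (1..25) for each subtest (hypothetical procedure)."""
    return {name: rng.randint(1, LISTS_PER_SUBTEST) for name in SUBTESTS}

rng = random.Random(0)  # fixed seed so the draw is repeatable
test = draw_test(rng)
print(f"{n_tests} possible tests, {words_per_test} words each")
print("drawn lists:", " ".join(f"{s}{n}" for s, n in test.items()))
```

A combination such as A3, B22, C11 is one of these 15625 tests.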
![Page 9: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/9.jpg)
List A3
1. vop, 2. ziep, 3. fuis, 4. deek, 5. koen, 6. hom, 7. dar, 8. paam, 9. mil, 10. boos, 11. son, 12. geur, 13. nee, 14. taf, 15. oes, 16. loon, 17. ruk, 18. joef, 19. wout
List B22
1. geen, 2. diem, 3. zoem, 4. daai, 5. jog, 6. peef, 7. zaar, 8. paat, 9. tik, 10. vang, 11. boop, 12. lieuw, 13. roos, 14. toe, 15. riel
List C11
1. gul, 2. zuut, 3. det, 4. wok, 5. waan, 6. heun, 7. nout, 8. vees, 9. meul, 10. wiel, 11. sas, 12. tuik, 13. oet, 14. rood, 15. min, 16. deil
DIA
16-year-old girl, stroke, dysarthria, PI: 40%
79-year-old male, TL, TE-speech, PI: 68%
![Page 10: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/10.jpg)
DIA
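List A10
Fixed frame for item 1: .op
Response options for the initial consonant: ø b d f g h j k l m n p r s t v w z
Words shown on the slide: 1. dop; 2. nuis; 3. (empty); top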
Intelligibility: percentage of phonemes correctly understood
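The definition above can be written out as a small calculation: count the items where the perceived phoneme equals the target phoneme and divide by the number of items. This is a minimal sketch; the function name `phoneme_intelligibility` and the five listener responses are hypothetical and are not data from the corpus.

```python
def phoneme_intelligibility(targets, perceived):
    """Percentage of target phonemes that the listener identified correctly."""
    if len(targets) != len(perceived):
        raise ValueError("one perceived phoneme is needed per target phoneme")
    if not targets:
        raise ValueError("the test contains no items")
    correct = sum(t == p for t, p in zip(targets, perceived))
    return 100.0 * correct / len(targets)

# Hypothetical responses for five initial-consonant items (made-up data).
targets = ["d", "n", "t", "k", "m"]
perceived = ["t", "n", "t", "k", "n"]

score = phoneme_intelligibility(targets, perceived)
print(f"{score:.0f}%")  # 3 of 5 phonemes correct
```

In a full DIA test the same calculation would run over all 50 target phonemes of the three subtests.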
![Page 11: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/11.jpg)
DIA
[Figure: histogram of the DIA scores. x-axis: score (-20 to 100); y-axis: number of persons (0 to 90)]
![Page 12: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/12.jpg)
Annotations DIA
• Praat
• 2 tiers
  • Tier 1: target word
  • Tier 2: fixed frame + perceived phoneme
    • .VC
    • CV.
    • C.C
• Orthographic transcriptions
![Page 13: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/13.jpg)
List A
Target phoneme: initial consonant
Fixed frame: .VC
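The fixed frame can be derived from a target word by replacing the phoneme under test with a dot. The sketch below does this on the orthographic form; it assumes that each word splits into consonant letters, a run of vowel letters, and consonant letters, which is a simplification of Dutch spelling. The names `fixed_frame` and `CVC` are illustrative and not part of the corpus tooling.

```python
import re

VOWELS = "aeiou"
# Assumed orthographic split: consonant letters, vowel letters, consonant letters.
CVC = re.compile(rf"^([^{VOWELS}]*)([{VOWELS}]+)([^{VOWELS}]*)$")

def fixed_frame(word, subtest):
    """Return the word with the phoneme tested in the given subtest replaced by '.'."""
    match = CVC.match(word.lower())
    if match is None:
        raise ValueError(f"cannot split {word!r} into C-V-C parts")
    onset, nucleus, coda = match.groups()
    if subtest == "A":   # initial consonant is the target: .VC
        return "." + nucleus + coda
    if subtest == "B":   # final consonant is the target: CV.
        return onset + nucleus + "."
    if subtest == "C":   # medial vowel or diphthong is the target: C.C
        return onset + "." + coda
    raise ValueError(f"unknown subtest {subtest!r}")

print(fixed_frame("dop", "A"))   # frame for an initial-consonant item
print(fixed_frame("vang", "B"))  # frame for a final-consonant item
print(fixed_frame("tuik", "C"))  # frame for a medial-vowel item
```

Words without an initial or final consonant, such as "oes" in subtest A, give a frame in which the dot stands for the empty option (ø).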
![Page 14: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/14.jpg)
Articulation assessment
• Children
• Insufficient reading skills
• Logo-Art (Baarda et al, 2001)
• Picture naming test
• Annotations:
  • Orthographic
  • Tier 1: target
  • Tier 2: perceived utterance (no fixed frame)
![Page 15: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/15.jpg)
Sentences
Motor Speech Profile (Kay Elemetrics)
‘Wil je liever de thee of de borrel?’ (Would you rather have the tea or the drink?)
‘Na nieuwjaar was hij weeral hier’ (After New Year he was here again)
N = 211
Orthographic transcriptions
Tier 1 – tier 2; no word boundaries
man, no speech pathology
18-year-old male, congenital dysarthria
![Page 16: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/16.jpg)
Text Marloes and Text
• Text ‘Papa en Marloes’
  • standardized text
  • balanced representation of Dutch phonemes
  • often used in clinical practice
• Text
  • different texts with the same reading level
• orthographic transcriptions
  • 2 tiers
  • boundaries between sentences
![Page 17: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/17.jpg)
(Semi) Spontaneous speech
• Spontaneous
• Semi spontaneous: randomly selected sequence of pictures
• No annotations available
![Page 18: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/18.jpg)
Future
• Gradually increase the number of samples
• DIA validation SPACE intelligibility assessment
• DIA sentence level: > 200 control speakers, 3*6 sentences + annotations + pathological samples
![Page 19: Building a corpus of pathological speech](https://reader036.vdocuments.mx/reader036/viewer/2022062309/56815a9a550346895dc81731/html5/thumbnails/19.jpg)
Thank you!