TRANSCRIPT
STARDUST PROJECT – Speech Recognition for People with Severe Dysarthria
Mark Parker, Specialist Speech and Language Therapist
Project Team
DoH NEAT
University of Sheffield
Barnsley District General Hospital
Prof P Enderby / M Parker – Clinical Speech Therapy
Prof P Green / Dr Athanassios Hatzis – Computer Sciences
Prof M Hawley / Dr Simon Brownsall – Medical Physics
What is Dysarthria?
A neurological motor speech impairment characterised by slow, weak, imprecise and/or uncoordinated movements of the speech musculature.
May be congenital or acquired
Prevalence: 170 per 100,000 (Emerson & Enderby, 1995)
Severity Rating
Typically based on ‘intelligibility’: ‘…the extent a listener understands the speech produced…’ (Yorkston et al., 1999)
Not a pure measure – interaction of events
Mild 70–90%
Moderate 40–70%
Severe 10–40%
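The severity bands above amount to a simple threshold rule on the intelligibility percentage. A minimal sketch in Python (how exact boundary values such as 70% are assigned, and the labels outside the three quoted bands, are assumptions – the slide does not specify them):

```python
def severity_band(intelligibility: float) -> str:
    """Map an intelligibility percentage (0-100) to a severity band.

    The 70/40/10 cut-offs come from the slide; boundary handling
    (lower bound inclusive) and the two out-of-band labels are
    assumptions made here for illustration.
    """
    if intelligibility > 90:
        return "within normal limits"
    if intelligibility >= 70:
        return "mild"
    if intelligibility >= 40:
        return "moderate"
    if intelligibility >= 10:
        return "severe"
    return "unintelligible"

print(severity_band(80))  # -> mild
print(severity_band(25))  # -> severe
```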
Aim
VRS used to access other technology
Many people with severe dysarthria will have an associated severe physical disability
ECA operated with switching systems – slow, laborious, dependent on positioning
VRS to supplement or replace switching
Background
Voice recognition systems: commercially available packages – mobile phones, WP packages (Dragon Dictate); continuous vs discrete
Normal speech: with recognition training can achieve >90% recognition rates (Rose and Galdo, 1999)
Dysarthric speech: mild – 10–15% lower recognition rates (Ferrier, 1992), declining rapidly as speech deteriorates; 30–40% on single words (Thomas-Stonell, 1998) – functionally useless
Intelligibility vs Consistency
Difference between machine recognition and human perception
‘Normal’ speech may be 100% intelligible and show a narrow band of differences across time (consistency).
‘Severe’ dysarthria may be completely unintelligible but may show consistency of key elements (or not).
Development of the system
10–12 volunteers – severe dysarthria and physical disability
Speech <30% intelligibility rating
Video/DAT recording/computer sampling
Assessing for the range of phonetic contrasts that can be achieved
Development of a system (2)
Discrete system - the number of contrasts that can be achieved will determine the number of commands that the VRS can handle
Don’t need intelligibility - need consistency
Determine what word/sound/phonetic contrast will represent what command
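The discrete-system idea above – each reliably produced sound represents one command – can be sketched as a simple lookup from a recognised label to an action. The labels and actions below are hypothetical placeholders, not the project's actual command set:

```python
# Hypothetical command vocabulary: each entry pairs a phonetic target
# the speaker can produce consistently with the environmental-control
# action it should trigger. All names here are illustrative.
COMMAND_MAP = {
    "aa": "lights_on",
    "ii": "lights_off",
    "tv": "television_power",
    "al": "alarm",
}

def dispatch(recognised_label: str) -> str:
    """Return the command for a recognised label, or a safe no-op
    when the recogniser produces a label outside the vocabulary."""
    return COMMAND_MAP.get(recognised_label, "no_action")

print(dispatch("aa"))  # -> lights_on
print(dispatch("zz"))  # -> no_action
```

The size of `COMMAND_MAP` is bounded by the number of phonetic contrasts the speaker can achieve – exactly the constraint the slide describes.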
Development of a system (3)
Train the VRS – neural networks and hidden Markov modelling
Speech consistency training
Implement the system
Current position
Software development – sophisticated recording and data-logging facility to be combined with a ‘consistency’ measure and spectrography package.
Developing ‘user friendliness’ and possibility of ‘remote’ usage.
Identifying & recording EC commands
‘Labelling’ the sample
Attempting to define measures of baseline consistency at an ‘acoustic’ level
Experimenting with recognition accuracy of a commercially available product – Sicare
Labelling
Breaking an utterance into component parts
To establish the extent of variance over time
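One simple way to quantify the ‘variance over time’ of labelled components is the coefficient of variation of each component's duration across repeated utterances. A sketch under assumed inputs (the input format – one dict of component label to duration per repetition – and the choice of metric are illustrative, not the project's actual consistency measure):

```python
from statistics import mean, pstdev

def duration_consistency(repetitions):
    """Coefficient of variation (std/mean) of each labelled
    component's duration across repeated utterances. Lower values
    mean the speaker reproduces that component more consistently.

    `repetitions`: list of dicts, each mapping a component label to
    its measured duration in seconds for one repetition.
    """
    labels = repetitions[0].keys()
    return {lab: pstdev([r[lab] for r in repetitions])
                 / mean([r[lab] for r in repetitions])
            for lab in labels}

# Toy durations (seconds) for one component over three repetitions.
reps = [{"t": 0.10}, {"t": 0.12}, {"t": 0.11}]
print(duration_consistency(reps))
```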
Sicare testing
Recognition rates compatible with previous research
Begins to illustrate the points at which a recogniser becomes ‘confused’
May illustrate the areas where distinction has to be made
May start to illustrate some of the key acoustic factors that are crucial in dysarthric speech and VR
Non-adapted commercial product functionally useless for this population
Subsidiary Questions
Is dysarthric speech consistent?
Does the underlying acoustic/soundwave pattern contain consistent differences in contrasts that are not perceptually distinguishable?
Can consistency be trained in the absence of intelligibility?
Does increasing consistency increase intelligibility?
Normal speech “alarm” 1&2
Normal speech “alarm” 2
Normal speech “television”
Dysarthric speech “television”