
Information Extraction from Spoken Language

Dr Pierre Dumouchel
Scientific Vice-President, CRIM

Full Professor, ÉTS

PUT RAW DATA NOW and then LINK DATA

• http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html

PUT RAW DATA NOW

• Text
• Data (numbers, statistics)
• Data (audio, video)

LINKED DATA

• Information is in the relationships between data
• Find the relationships between them

IBM’s Watson and Jeopardy

Proposal

• Information Extraction in radio and television documents
– Industrial Partners:
  • CEDROM Sni
  • Irosoft
– Universities and Research Centers:
  • CRIM
  • ÉTS
  • INRS-EMT
  • McGill

• NSERC Strategic Project Proposal

Process Raw Audio Data

• Automatic Speech Recognition (ASR)
• Parsing
• Indexation

ASR → Parsing → Indexation
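The last stage of this chain can be illustrated with a minimal sketch. It assumes the ASR stage has already produced time-stamped text segments (the `segments` list below is hypothetical), skips the parsing stage entirely, and builds a plain inverted index from words to segment start times. It is not the project's actual indexer.

```python
from collections import defaultdict

def build_index(segments):
    """Map each lowercased word to the start times (seconds) of the segments containing it."""
    index = defaultdict(list)
    for start, text in segments:
        # set() ensures a word repeated inside one segment is indexed once.
        for word in set(text.lower().split()):
            index[word].append(start)
    return dict(index)

# Hypothetical ASR output: (start time in seconds, recognized text).
segments = [
    (0.0, "good evening here is the news"),
    (12.5, "the election results are in"),
    (47.0, "weather news for tomorrow"),
]
index = build_index(segments)
news_hits = index["news"]  # start times of the segments that mention "news"
```

A lookup such as `index["news"]` then points a user to the places in the recording where the word was recognized.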

Closed-captioning / Subtitling

VOICEWRITER

Closed-captioning / Subtitling

• Done with the help of a VoiceWriter that:
– Respeaks
– Adds punctuation
– Selects the proper dictionary
– Does not speak during advertising
– Condenses the information when more than one speaker talks at the same time or when the speech rate is too fast
– Translates

How to process raw audio data?

• Audio Diarization
• Speaker Diarization
• Speaker Recognition
• Speaker Role
• Punctuation
• Structural Segmentation
• Topic Recognition

Audio Diarization

• Aims to segment an audio recording into acoustically homogeneous parts
– Distinguish between speech and music
– Distinguish between advertising and news
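As a rough illustration of what "homogeneous parts" means in practice, the sketch below assumes that some classifier (not shown) has already labelled each fixed-length frame as speech or music, and only merges consecutive frames with the same label into segments. The frame labels and the one-second frame length are hypothetical.

```python
from itertools import groupby

def merge_frames(labels, frame_sec=1.0):
    """Merge runs of identical frame labels into (start_sec, end_sec, label) segments."""
    segments = []
    position = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((position * frame_sec, (position + length) * frame_sec, label))
        position += length
    return segments

# Hypothetical per-frame decisions from an upstream speech/music classifier.
frames = ["music", "music", "speech", "speech", "speech", "music"]
audio_segments = merge_frames(frames)
```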

Speaker diarization

• Aims to segment a speech signal into its speech turns

Speaker Recognition

Speaker Role

• In broadcast news, most speech comes from anchors and reporters. The remainder consists of excerpts from quotations or interviews, referred to as sound bites.

• Detecting the speaker role is important to improve:
– the acoustic speech recognizer
– information extraction

Punctuation

• Some language analysis tasks, such as parsing and entity extraction, need punctuation (periods and commas) in order to work properly.
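One simple way to picture punctuation restoration is a pause-based heuristic, sketched below. This is an assumption for illustration, not the method used in the project: it presumes the recognizer returns word timings, and the two pause thresholds are made-up values.

```python
def punctuate(words, long_pause=0.6, short_pause=0.25):
    """Insert periods and commas from inter-word pauses.

    words: list of (token, start_sec, end_sec) tuples in time order.
    The thresholds are illustrative assumptions, not tuned values.
    """
    pieces = []
    for i, (token, _start, end) in enumerate(words):
        if i + 1 < len(words):
            pause = words[i + 1][1] - end
        else:
            pause = float("inf")  # the end of the recording closes the sentence
        if pause >= long_pause:
            token += "."
        elif pause >= short_pause:
            token += ","
        pieces.append(token)
    return " ".join(pieces)

# Hypothetical word timings from an ASR system.
timed_words = [
    ("hello", 0.0, 0.4), ("everyone", 0.5, 1.0), ("tonight", 1.3, 1.8),
    ("we", 1.85, 1.95), ("begin", 2.0, 2.4), ("thank", 3.2, 3.5), ("you", 3.55, 3.8),
]
punctuated = punctuate(timed_words)
```

Real systems usually combine prosodic cues like these with lexical cues; the sketch only shows the prosodic half.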

Structural Segmentation

• Sentence segmentation, paragraph segmentation, and story segmentation are important for speech understanding applications, starting with parsing and information extraction at the most basic level.

• This problem is absent in text processing but has to be solved in speech processing.

Topic Spotting

• Aims to identify the topic of a speech signal. It is useful for adapting the different components of the system as well as for adding metatags to a speech signal.

• Example: La belle ferme le voile
– La: the, her
– Belle: beautiful, beauty
– Ferme: farm, closes
– Le: the, his
– Voile: veil, blocks the view
– Two hypothetical translations:
  • The veil is closed by the beauty
  • The beautiful farm blocks his view
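Topic spotting can be sketched in its simplest form as keyword counting over a transcript. The topic names and keyword sets below are hypothetical, and a real system would more likely use a trained classifier; the sketch only shows how a topic label could be attached to a speech segment as a metatag.

```python
# Hypothetical keyword lists; not taken from the presentation.
TOPIC_KEYWORDS = {
    "weather": {"rain", "snow", "forecast", "temperature"},
    "sports": {"game", "score", "team", "season"},
}

def spot_topic(transcript, topic_keywords=TOPIC_KEYWORDS):
    """Return the topic whose keywords occur most often, or None if none occur."""
    words = transcript.lower().split()
    scores = {
        topic: sum(word in keywords for word in words)
        for topic, keywords in topic_keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

topic = spot_topic("The team won the game last night")
```

Once a topic is known, it can steer which dictionary or language model is used, which is how an ambiguous phrase like the one above could be resolved.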

How to improve Information Extraction from speech?

By improving ASR Components

Automatic Speech Recognizer

• Performance drops with:
– Out-of-vocabulary words (lexical models)
– Multiple users (acoustic models)
– Multiple microphones (acoustic models)
– Multiple topics (language models)
– Overlapping speech (all models)
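The out-of-vocabulary problem can be made concrete by measuring the share of words in a reference text that the recognizer's lexicon does not contain. The sketch below assumes a tiny hypothetical vocabulary and sentence; the function name `oov_rate` is introduced here for illustration only.

```python
def oov_rate(words, vocabulary):
    """Fraction of words that are missing from the recognizer vocabulary."""
    if not words:
        return 0.0
    missing = [word for word in words if word.lower() not in vocabulary]
    return len(missing) / len(words)

# Hypothetical lexicon and reference sentence.
vocabulary = {"the", "prime", "minister", "visited"}
sentence = "The prime minister visited Tokyo".split()
rate = oov_rate(sentence, vocabulary)
```

A word outside the lexicon can never be recognized correctly, so this rate gives a floor on the word error rate for that material.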

How to improve Information Extraction from speech?

• More data are better data.
• More similar data are better data. Similar in terms of:
– Topic
– Time period: coming from the same period and, specifically, more recent.
  • Example: Japan
– Prediction of what will happen and who will speak.
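One common way to favour recent, on-topic data is linear interpolation of language models, sketched here with unigram models. This is an assumed technique chosen for illustration, not necessarily what the project used; the two small corpora and the weight of 0.7 are hypothetical.

```python
from collections import Counter

def unigram_probs(text):
    """Relative frequency of each lowercased word in the text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def interpolate(recent, general, weight=0.7):
    """P(w) = weight * P_recent(w) + (1 - weight) * P_general(w)."""
    vocab = set(recent) | set(general)
    return {
        word: weight * recent.get(word, 0.0) + (1 - weight) * general.get(word, 0.0)
        for word in vocab
    }

# Hypothetical corpora: a large general one and a small recent one.
general = unigram_probs("the market rose the market fell")
recent = unigram_probs("earthquake in japan")
mixed = interpolate(recent, general)
```

With the weight placed on the recent corpus, words from the latest news receive more probability mass than they would from the general corpus alone.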

More data are better data

• Use of the huge amount of web information
• Use of supercomputer infrastructure in order to model it in a reasonable time:
– Compute Canada infrastructure: CLUMEQ
– Cluster of university computers

More similar data are better data

• Exploiting redundancies in different media information:
– Anchor speech is predominant.
– Reporters often appear at specific times, day after day.
– Advertisements appear (and repeat) near specific time slots, day after day.
– The same news is often reused from one medium to another.
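The day-after-day regularities listed above could be detected by grouping observations by time slot. The sketch below assumes each audio item has already been reduced to some fingerprint string by an upstream process (not shown); the observation tuples and the `min_days` threshold are hypothetical.

```python
from collections import defaultdict

def recurring_slots(observations, min_days=2):
    """Find (slot, fingerprint) pairs seen on at least min_days different days.

    observations: iterable of (day, slot, fingerprint) tuples.
    """
    days_seen = defaultdict(set)
    for day, slot, fingerprint in observations:
        days_seen[(slot, fingerprint)].add(day)
    return {
        key: sorted(days)
        for key, days in days_seen.items()
        if len(days) >= min_days
    }

# Hypothetical log of detected items across three broadcast days.
observations = [
    ("day1", "18:15", "ad-car"),
    ("day2", "18:15", "ad-car"),
    ("day2", "18:20", "story-budget"),
    ("day3", "18:15", "ad-car"),
]
repeats = recurring_slots(observations)
```

Items flagged this way (repeated advertisements, for example) can be skipped or handled with a dedicated model instead of being recognized from scratch each day.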

Exploiting redundancies in different media information

And then …

• Audio Diarization
• Speaker Diarization
• Speaker Recognition
• Speaker Role
• Punctuation
• Structural Segmentation
• Topic Recognition
