Searching in audio files

Audio search Andrzej Dudziec

Upload: enterprise-search-warsaw-meetup

Posted on 18-Jul-2015



TRANSCRIPT

Page 1: Searching in audio files

Audio search

Andrzej Dudziec

Page 2: Searching in audio files

Outline

● Introduction

● Speech recognition

● Phonetic algorithms

● Evaluation

● Results

● Conclusions

Page 3: Searching in audio files

Introduction

Page 4: Searching in audio files

Introduction

audio → text

Page 5: Searching in audio files

Speech recognition

● Words consist of letters, e.g. ‘ONE’ - ‘O’, ‘N’, ‘E’
● Speech consists of phonemes, e.g. /wʌn/ - ‘W’, ‘AH’, ‘N’

Page 6: Searching in audio files

Speech recognition

AM (acoustic model) → phonemes

Page 7: Searching in audio files

Speech recognition

● one: W AH N
● two: T UW
● three: TH R IY
● four: F AO R
● five: F AY V
● six: S IH K S
● seven: S EH V AH N
● eight: EY T
● nine: N AY N
● ten: T EH N

AM (acoustic model) → phonemes

dict (pronunciation dictionary) → words

LM (language model) → sentences
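The dictionary step can be pictured as a plain lookup table from words to phoneme sequences. The following is a minimal sketch that uses only the ten entries listed on this slide; the names `PRONUNCIATIONS` and `to_phonemes` are hypothetical, and a real recognizer would rely on a full pronunciation dictionary rather than this toy table.

```python
# Pronunciation dictionary holding only the ten entries from the slide.
PRONUNCIATIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
    "three": ["TH", "R", "IY"],
    "four": ["F", "AO", "R"],
    "five": ["F", "AY", "V"],
    "six": ["S", "IH", "K", "S"],
    "seven": ["S", "EH", "V", "AH", "N"],
    "eight": ["EY", "T"],
    "nine": ["N", "AY", "N"],
    "ten": ["T", "EH", "N"],
}


def to_phonemes(sentence):
    """Map each word to its phoneme sequence (KeyError for unknown words)."""
    return [PRONUNCIATIONS[word] for word in sentence.lower().split()]


print(to_phonemes("one two three"))
# → [['W', 'AH', 'N'], ['T', 'UW'], ['TH', 'R', 'IY']]
```

A recognizer works in the opposite direction (from phonemes to words), which is where the ambiguities discussed on the next slide come from.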

Page 8: Searching in audio files

Speech recognition

AM (acoustic model) → phonemes

dict (pronunciation dictionary) → words

LM (language model) → sentences

Issues

● Acoustic level
  ○ background noise
  ○ multiple speakers
  ○ accent, dialect, sex, mood
  ○ coarticulation

● Dictionary level
  ○ homophones (be & bee, I scream & ice cream)
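The dictionary-level issue can be made concrete by inverting a pronunciation table: any phoneme sequence that maps back to more than one word is ambiguous for the recognizer. This is only an illustrative sketch; the ARPAbet transcriptions for "be", "bee", and "too" are supplied here as assumptions in the same notation as the earlier slide, and `words_by_pronunciation` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy table; "be", "bee" and "too" are added for illustration (assumed transcriptions).
PRONUNCIATIONS = {
    "be": ("B", "IY"),
    "bee": ("B", "IY"),
    "two": ("T", "UW"),
    "too": ("T", "UW"),
    "ten": ("T", "EH", "N"),
}


def words_by_pronunciation(pronunciations):
    """Invert the table: phoneme sequence -> alphabetically sorted list of words."""
    index = defaultdict(list)
    for word, phonemes in sorted(pronunciations.items()):
        index[phonemes].append(word)
    return dict(index)


index = words_by_pronunciation(PRONUNCIATIONS)
# Keep only the phoneme sequences shared by several words.
ambiguous = {p: words for p, words in index.items() if len(words) > 1}
print(ambiguous)
# → {('B', 'IY'): ['be', 'bee'], ('T', 'UW'): ['too', 'two']}
```

Choosing between such candidates is the job of the language model, which scores word sequences rather than individual words.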

Page 9: Searching in audio files

Phonetic algorithms

● Soundex
● Metaphone
● Caverphone

Caverphone example:

Thompson -> thompson
thompson -> th3mps3n
th3mps3n -> th3mpS3n
th3mpS3n -> Th3mpS3n
Th3mpS3n -> Th3mPS3n
Th3mPS3n -> Th3MPS3n
Th3MPS3n -> Th3MPS3N
Th3MPS3N -> T23MPS3N
T23MPS3N -> TMPSN
TMPSN111111 -> TMPSN1

Phonetic codes for "sixteen" and "sixty":

Algorithm    sixteen   sixty
Soundex      S235      S230
Metaphone    SKST      SKST
Caverphone   SKTN11    SKTA11

Page 10: Searching in audio files

Phonetic algorithms

Name        Soundex
Ackermann   A265
Azuron      A265
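The Soundex codes quoted on these slides can be reproduced with a short function. This is a sketch of the common American Soundex variant (an assumption, since the slides do not say which variant was used); it reproduces the four codes shown for "sixteen", "sixty", "Ackermann", and "Azuron".

```python
# Letter groups of American Soundex; vowels, H, W and Y carry no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex code of a name ('' for empty input)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = code
    # Pad with zeros and keep the first letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


for word in ("sixteen", "sixty", "Ackermann", "Azuron"):
    print(word, soundex(word))
# sixteen S235
# sixty S230
# Ackermann A265
# Azuron A265
```

The last two lines show the weakness this slide points at: two unrelated names collapse to the same code, which helps recall and hurts precision.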

Page 11: Searching in audio files

Metaphone code computation algorithm

Remove all repeated adjacent letters, except the letter C.

Transform the beginning of the word using the following rules:

KN → N

GN → N

PN → N

AE → E

WR → R

Remove the letter B at the end of the word if it follows the letter M.

Replace C using the following rules:

With X: CIA → XIA, SCH → SKH, CH → XH

With S: CI → SI, CE → SE, CY → SY

With K: C → K

Replace D using the following rules:

With J: DGE → JGE, DGY → JGY, DGI → JGI

With T: D → T

Replace GH → H, except when it is at the end of the word or before a vowel.

Replace GN → N and GNED → NED if they are at the end of the word.

Replace G using the following rules:

With J: GI → JI, GE → JE, GY → JY

With K: G → K

Remove every H that follows a vowel and does not precede a vowel.

Apply the following transformations:

CK → K

PH → F

Q → K

V → F

Z → S

Replace S with X:

SH → XH

SIO → XIO

SIA → XIA

Replace T using the following rules:

With X: TIA → XIA, TIO → XIO

With 0: TH → 0

Remove T: TCH → CH

Transform WH → W at the beginning of the word. Remove W if no vowel follows it.

If X is at the beginning of the word, replace X → S; otherwise replace X → KS.

Remove every Y that is not followed by a vowel.

Remove all vowels except a vowel at the start of the word.
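To show how such rules are applied one after another, here is a deliberately partial sketch. It implements only a subset of the rules listed on this slide (repeated letters, word-initial pairs, final MB, the context-free replacements CK, PH, Q, V, Z, and vowel removal), so it is not a full Metaphone implementation, and its output differs from real Metaphone for words that involve the omitted rules (C, D, G, H, S, T, W, X, Y). The function name `partial_metaphone_key` is hypothetical.

```python
import re


def partial_metaphone_key(word):
    """Apply a SUBSET of the Metaphone rules from the slide (illustration only)."""
    s = word.upper()
    # Remove repeated adjacent letters, except the letter C.
    s = re.sub(r"([^C])\1+", r"\1", s)
    # Transform the beginning of the word.
    for prefix, replacement in (("KN", "N"), ("GN", "N"), ("PN", "N"),
                                ("AE", "E"), ("WR", "R")):
        if s.startswith(prefix):
            s = replacement + s[len(prefix):]
            break
    # Remove B at the end of the word if it follows M.
    if s.endswith("MB"):
        s = s[:-1]
    # Context-free replacements.
    for old, new in (("CK", "K"), ("PH", "F"), ("Q", "K"), ("V", "F"), ("Z", "S")):
        s = s.replace(old, new)
    # Remove all vowels except a vowel at the start of the word.
    return s[:1] + re.sub(r"[AEIOU]", "", s[1:])


for word in ("Knapp", "Lamb", "Phone", "Wrack"):
    print(word, partial_metaphone_key(word))
# Knapp NP
# Lamb LM
# Phone FN
# Wrack RK
```

The order of the steps matters: for example, the word-initial rules must run before vowel removal, otherwise "AE → E" could never fire.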

Page 12: Searching in audio files

Daitch-Mokotoff Soundex

Codes for each group of letter combinations: at the start / before a vowel / other

SCHTSCH, SCHTSH, SCHTCH, SHTCH, SHCH, SHTSH, STCH, STSCH, STRZ, STRS, STSH, SZCZ, SZCS: 2 / 4 / 4

SHT, SCHT, SCHD, ST, SZT, SHD, SZD, SD: 2 / 43 / 43

CSZ, CZS, CS, CZ, DRZ, DRS, DSH, DS, DZH, DZS, DZ, TRZ, TRS, TRCH, TSH, TTSZ, TTZ, TZS, TSZ, SZ, TTCH, TCH, TTSCH, ZSCH, ZHSH, SCH, SH, TTS, TC, TS, TZ, ZH, ZS: 4 / 4 / 4
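A table like this is typically applied with a longest-match lookup: at each position the longest matching letter combination wins, and its code depends on the context. The sketch below covers only the three rows shown on this slide. It assumes that the middle column applies when the combination is followed by a vowel (as in the standard Daitch-Mokotoff coding chart) and that the vowels are A, E, I, O, U; `code_at` is a hypothetical helper, not part of any library.

```python
VOWELS = set("AEIOU")  # assumption: a simplified vowel set

# (letter combinations, (code at the start, code before a vowel, code otherwise))
RULES = [
    ("SCHTSCH SCHTSH SCHTCH SHTCH SHCH SHTSH STCH STSCH STRZ STRS STSH SZCZ SZCS",
     ("2", "4", "4")),
    ("SHT SCHT SCHD ST SZT SHD SZD SD", ("2", "43", "43")),
    ("CSZ CZS CS CZ DRZ DRS DSH DS DZH DZS DZ TRZ TRS TRCH TSH TTSZ TTZ TZS TSZ SZ "
     "TTCH TCH TTSCH ZSCH ZHSH SCH SH TTS TC TS TZ ZH ZS", ("4", "4", "4")),
]
TABLE = {combo: codes for combos, codes in RULES for combo in combos.split()}
LONGEST = max(len(combo) for combo in TABLE)


def code_at(word, pos):
    """Return (combination, code) for the longest table entry at pos, else None."""
    word = word.upper()
    for length in range(min(LONGEST, len(word) - pos), 0, -1):
        combo = word[pos:pos + length]
        if combo in TABLE:
            start, before_vowel, other = TABLE[combo]
            following = word[pos + length:pos + length + 1]
            if pos == 0:
                return combo, start
            if following in VOWELS:
                return combo, before_vowel
            return combo, other
    return None


print(code_at("Szczecin", 0))  # → ('SZCZ', '2')
print(code_at("Foster", 2))    # → ('ST', '43')
print(code_at("Bush", 2))      # → ('SH', '4')
```

Trying the longest combination first is what keeps "SZCZ" from being split into "SZ" and "CZ".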

Page 13: Searching in audio files

Phonetic algorithms

Page 14: Searching in audio files

Evaluation

Page 15: Searching in audio files

Results

help ≠ helped

hell ≠ heaven

Page 16: Searching in audio files

Results

Page 17: Searching in audio files

Results

Page 18: Searching in audio files

Results

Diagram labels: preprocessing, audio snippets, XML, text

Page 19: Searching in audio files

Results

Page 20: Searching in audio files

Results

Page 21: Searching in audio files

Results

Page 22: Searching in audio files

Conclusions

● a good recognition model and audio preprocessing are crucial; consider the trade-off between speed and accuracy

● phonetic filtering increases recall but decreases precision

● use phonetic filters as an improvement, not as a standalone solution

● consider fuzzy search
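One common way to implement the fuzzy search mentioned in the last bullet is to match on edit distance. The following is a minimal sketch using Levenshtein distance; the talk does not say which fuzzy-matching method was meant, so this choice is an assumption, and `fuzzy_find` is a hypothetical helper. The sample words are the ones shown on the results slide.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def fuzzy_find(query, words, max_distance=1):
    """Return the words within max_distance edits of the query, in input order."""
    return [w for w in words if levenshtein(query, w) <= max_distance]


words = ["help", "helped", "hell", "heaven"]
print(fuzzy_find("help", words))                  # → ['help', 'hell']
print(fuzzy_find("help", words, max_distance=2))  # → ['help', 'helped', 'hell']
```

Raising `max_distance` shows the same recall versus precision trade-off as phonetic filtering: "helped" is recovered only at distance 2, and at that threshold the unrelated "hell" is matched as well.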

Page 23: Searching in audio files

Use cases

● audio archive

● looking up broadcasts
  ○ opinion mining
  ○ collecting information

● voice control

● dictation
  ○ short notes
  ○ voice mail -> text messages

Page 24: Searching in audio files

Discussion?