Post on 28-Jan-2020

A case study on using speech-to-translation alignments for language documentation

Antonis Anastasopoulos, David Chiang

http://www.worldmapper.org/images/largepng/583.png

Goal

• Collect data now; analyze later

• The data must be:

• Sufficient

• Interpretable

How much data?

[Bar chart. Axis: millions of words / hundreds of hours (0 to 30). Bars: Qur'an, New Testament, Hebrew Bible, all classical Hebrew, all classical Latin, all classical Greek]

The linguistic graveyard

Making an audio Rosetta Stone

• Aikuma: Android/web-based app

• Push-to-talk, push-to-translate

Interpreting the audio Rosetta Stone

Gila abur-u-n ferma hamišaluǧ güǧüna amuqʼ-da-č

Now their farm will not stay behind forever.

Background

K-means Clustering

Aligning Speech to Translation

[tanta plata] [plata]

[Mexico] [playa]

Money

Beach

Mexico

Example output

é Valeria meletá o’ giornále

Valeria legge il giornale [Valeria reads the newspaper]

Score: 0.82

User Study

Griko

Results

[Bar chart. y-axis: phone error rate (lower is better). x-axis: speech-to-translation alignments (none, auto, gold). Bar labels: 24.5, 25.7, 27]

Error rates averaged across 6 Italian-speaking participants

Results

[Bar chart. y-axis: phone error rate (lower is better). x-axis: speech-to-translation alignments; bars: Italian, Spanish, English. Bar labels: 34.3, 28.3, 25.7]

Error rates averaged across all participants (6 Italian, 3 Spanish, 3 English)

Consensus transcription

user      transcription                  distance
it1       o ladro i so ndze mia buttu    5
it2       o ladro isodZenti dabol tu     6
it3       o ladro isodzeem biabiddu      5
correct   o ladro isodZe embi apo ttu
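The distance column reads like an edit distance between each participant's transcription and the correct one, and a phone error rate is conventionally that distance divided by the reference length. The following is a minimal sketch under those assumptions. The names `levenshtein` and `error_rate` are illustrative, and the slides do not say whether distances were counted over characters or phones, or how spaces were treated, so the numbers it prints need not match the ones shown.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # delete r
                           cur[j - 1] + 1,            # insert h
                           prev[j - 1] + (r != h)))   # substitute, or match at no cost
        prev = cur
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance as a percentage of the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * levenshtein(ref, hyp) / len(ref)


# Transcriptions from the slide. Removing spaces and comparing character by
# character is an assumption about tokenization, not something the slides state.
correct = "o ladro isodZe embi apo ttu".replace(" ", "")
users = {
    "it1": "o ladro i so ndze mia buttu",
    "it2": "o ladro isodZenti dabol tu",
    "it3": "o ladro isodzeem biabiddu",
}
for name, text in users.items():
    hyp = text.replace(" ", "")
    print(name, levenshtein(correct, hyp), round(error_rate(correct, hyp), 1))
```

Both functions accept any sequence, so a list of phone symbols can be passed instead of a string if multi-character phones should count as single units.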

Can we do better?

Combine them!

String averaging

          transcription                                  distance
          o l a d r o i s o n d z e m i a b u t t u      5
          o l a d r o i s o d Z e n t i d a b o l t u    6
          o l a d r o i s o d z e e m b i a b i d d u    5
Average:  o l a d r o i s o d z e m b i a b u t t u      3

0 1 1 … 1
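One way to read "string averaging" is as a search for a string that minimizes the total edit distance to all of the individual transcriptions, i.e. an approximate median string. The sketch below illustrates that general idea only and is not the authors' algorithm, which the slides do not spell out. It starts from the input closest to the others and greedily applies single-character edits while they reduce the total distance. The names `total_distance`, `set_median`, `neighbors`, and `average_string` are hypothetical.

```python
from itertools import chain


def levenshtein(a, b):
    """Character-level edit distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def total_distance(candidate, strings):
    """Sum of edit distances from candidate to every input string."""
    return sum(levenshtein(candidate, s) for s in strings)


def set_median(strings):
    """The input string with the smallest total distance to all inputs."""
    return min(strings, key=lambda s: total_distance(s, strings))


def neighbors(s, alphabet):
    """All strings one insertion, deletion or substitution away from s."""
    for i in range(len(s) + 1):
        for c in alphabet:
            yield s[:i] + c + s[i:]              # insertion
    for i in range(len(s)):
        yield s[:i] + s[i + 1:]                  # deletion
        for c in alphabet:
            if c != s[i]:
                yield s[:i] + c + s[i + 1:]      # substitution


def average_string(strings):
    """Greedy local search for an approximate median string."""
    alphabet = sorted(set(chain.from_iterable(strings)))
    best = set_median(strings)
    best_cost = total_distance(best, strings)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(best, alphabet):
            cost = total_distance(cand, strings)
            if cost < best_cost:                 # accept the first improving edit
                best, best_cost, improved = cand, cost, True
                break
    return best


# The three transcriptions from the slide, with the spacing removed.
transcriptions = [
    "oladroisondzemiabuttu",
    "oladroisodZentidaboltu",
    "oladroisodzeembiabiddu",
]
print(average_string(transcriptions))
```

The search stops at a local optimum, so its result is never worse than the best single input in total distance to the set, but it is not guaranteed to equal the average shown on the slide.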

Results

[Bar chart: consensus transcription. y-axis: phone error rate (lower is better). x-axis: subset for averaging (Eng, Spa+Eng, Spa, Ita+Eng, all, Ita, Ita+Spa). Bar labels: 23.2, 23.9, 24, 24.7, 26.9, 27.7, 32.1. An additional value, 25.6, is also shown]

Conclusions

• Alignment of speech with its translation is possible even in small corpora

• And *possibly* helpful for manual transcription (crowdsourced?)

• What are the needs for interface design?

• Next: large-scale case study with collected data

Talk to me if you want to share comments/ideas/data!
