
Initiation of Standardization on Network-based Speech-to-speech Translation at ITU-T SG16

Satoshi Nakamura, Chiori Hori

National Institute of Information and Communications Technology, Japan

Contact:
Name: Satoshi Nakamura, Chiori Hori
Organization: NICT
Country: Japan
Tel: +81-774 95 1370
Fax: +81-774 95 1308
Email: satoshi.nakamura@nict.go.jp, chiori.hori@nict.go.jp

INTERNATIONAL TELECOMMUNICATION UNION

TELECOMMUNICATION STANDARDIZATION SECTOR

STUDY PERIOD 2009-2012

COM 16 – C 196 – E

October 2009

English only

Original: English

Question(s): 7, 21, 22/16

Many Languages All Over the World

http://en.wikipedia.org/wiki/List_of_language_families

Breaking Language Boundaries

Language boundaries are one of the barriers to mutual understanding.

Speech-to-Speech Translation (S2ST) technologies are an effective means of removing language boundaries between people who speak different languages.

S2ST technologies have been studied.

Speech-to-Speech Translation (S2ST)

[Figure: S2ST pipeline translating Japanese speech 「私は学校に行く」 into English speech "I go to school". Each module draws on corpora.]

Speech Recognition (ASR):
- Convert to Japanese phoneme sequence: "w", "a", "t", … (w a t a sh i w a g a xtu k o o n i …)
- Convert to word sequence using lexicon and grammar: 私は学校に行く

Machine Translation (MT):
- Convert to English word sequence: 「私は」⇒ "I", 「学校に」⇒ "to school", 「行く」⇒ "go" (giving "I to school go")
- Reorder word sequences according to English grammar: "I" "to school" "go" ⇒ "I" "go" "to school" (giving "I go to school")

Speech Synthesis (TTS):
- Select appropriate waveform for English text
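The two MT steps in the figure (phrase conversion, then reordering according to English grammar) can be illustrated with a toy sketch. This is not how a real MT system works: the phrase table, the role labels, and the single reordering rule below are hypothetical and cover only the slide's one example sentence.

```python
# Toy illustration of the MT stage for one sentence.
# PHRASE_TABLE, the role labels, and ENGLISH_ORDER are hypothetical;
# they exist only to reproduce the example on the slide.

# Japanese phrase -> (English phrase, grammatical role)
PHRASE_TABLE = {
    "私は": ("I", "subject"),
    "学校に": ("to school", "complement"),
    "行く": ("go", "verb"),
}

# Japanese is verb-final; English puts the verb right after the subject.
ENGLISH_ORDER = {"subject": 0, "verb": 1, "complement": 2}


def translate_phrases(japanese_phrases):
    """Convert each Japanese phrase to English, keeping the source order."""
    return [PHRASE_TABLE[phrase] for phrase in japanese_phrases]


def reorder_for_english(tagged_phrases):
    """Reorder (phrase, role) pairs into subject-verb-complement order."""
    return sorted(tagged_phrases, key=lambda pair: ENGLISH_ORDER[pair[1]])


def machine_translate(japanese_phrases):
    """Run both MT steps and join the result into one English sentence."""
    tagged = translate_phrases(japanese_phrases)
    ordered = reorder_for_english(tagged)
    return " ".join(phrase for phrase, _ in ordered)


asr_output = ["私は", "学校に", "行く"]  # word sequence produced by ASR
print(machine_translate(asr_output))  # I go to school
```

Before reordering, the intermediate word sequence is "I to school go", which matches the intermediate result shown in the figure.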

Stand-alone and Client-server S2ST Systems

Stand-alone system: packages all speech translation functions into a handheld PC.

Example: Japanese speech 「おはようございます。」 ⇒ English speech "Good morning."

Languages shown in the figure: Japanese, English, Chinese, Indonesian

Client-server system

Why Network-based?

Stand-alone systems have limited resources, so the language pairs they can support are limited.

ASR/MT/TTS systems for many languages are available and need to be maintained by each country.

Broadband networks are available.

Standardization on Network-based S2ST

[Figure: two S2ST clients, one for Language A and one for Language B, connected through ASR, MT, and TTS modules. Each client sends speech of its own language and receives synthesized speech. The ASR and TTS modules for each language use a lexicon and speech data; the MT modules use a parallel corpus and a lexicon.]

Items highlighted for standardization in the figure:

- Parallel corpus, speech data, lexicon
- Data format for ASR and MT results
- Communication protocol among modules
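To make "data format for ASR and MT results" concrete, here is a minimal sketch of what a shared result format between modules might look like. The slides do not define any format, so every field name, the use of JSON, and the language codes "ja" and "en" are assumptions made purely for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ModuleResult:
    """Hypothetical container for the output of one S2ST module."""
    module: str    # which module produced the result, e.g. "ASR" or "MT"
    language: str  # language of the text (assumed code such as "ja" or "en")
    text: str      # recognized or translated text


def encode(result: ModuleResult) -> str:
    """Serialize a result to JSON so another module can consume it."""
    return json.dumps(asdict(result), ensure_ascii=False, sort_keys=True)


def decode(payload: str) -> ModuleResult:
    """Rebuild a result from its JSON form."""
    return ModuleResult(**json.loads(payload))


asr_result = ModuleResult(module="ASR", language="ja", text="私は学校に行く")
wire = encode(asr_result)
print(wire)
print(decode(wire) == asr_result)  # the round trip preserves the result
```

The point of agreeing on such a format is that an ASR module maintained in one country and an MT module maintained in another can exchange results without module-specific adapters.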

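Lexicon for overall S2ST systems

An example of a lexicon for overall modules in S2ST systems

Table columns: Entry, Attribute (Surface, Pronunciation, Accent), and one column per language (Japanese, Korean, Chinese, English). Cells shown as ・・ are elided in the original.

Entry: 大阪 (Osaka)
- Japanese: surface 大阪; pronunciation おおさか; accent 4モーラ0型 (4 morae, accent type 0)
- Korean: ・・
- Chinese: surface 大阪; pronunciation ダーバン (Daban, Da4ban3); accent 四声三声 (fourth tone, third tone)
- English: surface Ōsaka; pronunciation ɔː s a k a

Entry: 東京 (Tokyo)
- Japanese: surface 東京; pronunciation とうきょう; accent ・・
- Korean: ・・
- Chinese: surface 東京; pronunciation トンジン (Tong1jing1); accent ・・
- English: surface Tōkyō; pronunciation ・・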

Global standardization of the lexicon format, together with a system to collect and provide lexicons for all languages, is required to maintain a reliable lexicon for overall S2ST systems.
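As a rough sketch of how one entry of such a shared lexicon could be represented in software, the example below stores the surface form, pronunciation, and accent per language. The class names, the `entry_id` field, and the `lookup` helper are hypothetical; the values are taken from the 大阪 entry in the example table, and languages elided there are simply left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LexiconForm:
    """Attributes of one entry in one language (hypothetical structure)."""
    surface: str
    pronunciation: Optional[str] = None
    accent: Optional[str] = None


@dataclass
class LexiconEntry:
    """One multilingual entry, keyed by language name."""
    entry_id: str
    forms: Dict[str, LexiconForm] = field(default_factory=dict)

    def lookup(self, language: str, attribute: str) -> Optional[str]:
        """Return an attribute for a language, or None if the language is missing."""
        form = self.forms.get(language)
        return getattr(form, attribute) if form else None


osaka = LexiconEntry(
    entry_id="osaka",  # hypothetical identifier
    forms={
        "Japanese": LexiconForm("大阪", "おおさか", "4モーラ0型"),
        "Chinese": LexiconForm("大阪", "Da4ban3", "四声三声"),
        "English": LexiconForm("Ōsaka", "ɔː s a k a"),
    },
)

print(osaka.lookup("Japanese", "pronunciation"))  # おおさか
print(osaka.lookup("Korean", "surface"))          # None (elided in the table)
```

Keeping all languages under one entry lets the ASR, MT, and TTS modules refer to the same item, which is what a lexicon "for overall S2ST systems" requires.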

Asian Network-Based S2ST System by A-STAR Consortium

1. National Institute of Information and Communications Technology (NICT), Japan
2. Electronics and Telecommunications Research Institute (ETRI), Korea
3. Chinese Academy of Sciences (CASIA), China
4. National Electronics and Computer Technology Center (NECTEC), Thailand
5. Agency for the Assessment and Application of Technology (BPPT), Indonesia
6. Center for Development of Advanced Computing (CDAC), India
7. Institute of Information Technology (IOIT), Vietnam
8. Institute for Infocomm Research (I2R), Singapore

Server Location for Network-based S2ST

Speech Translation using Distributed Service Servers

Example: From Korean to Thai Speech Translation

[Figure: a speech translation service client connected to distributed ASR, MT, and TTS servers.]

① Speech recognition (Korean): Speech (Korean) ⇒ ASR server ⇒ Text (Korean)

② Language translation (Korean→Thai): Text (Korean) ⇒ MT server ⇒ Translated text (Thai)

③ Speech synthesis (Thai): Translated text (Thai) ⇒ TTS server ⇒ Synthesized speech (Thai)
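The three steps of the Korean-to-Thai example can be sketched as a client that selects one server per step and chains their results. The functions below are in-process stand-ins for remote servers that return placeholder strings; the registry names, function names, and placeholder data are all hypothetical, and no real network protocol is shown.

```python
# In-process stand-ins for remote servers. They perform no real recognition,
# translation, or synthesis; they return placeholders so the flow is visible.


def korean_asr_server(speech: bytes) -> str:
    """Step 1: speech recognition (Korean speech -> Korean text)."""
    return "[Korean text]"


def korean_thai_mt_server(text: str) -> str:
    """Step 2: language translation (Korean text -> Thai text)."""
    return "[Thai translation of " + text + "]"


def thai_tts_server(text: str) -> bytes:
    """Step 3: speech synthesis (Thai text -> Thai speech)."""
    return ("[Thai speech for " + text + "]").encode("utf-8")


# Hypothetical registries: the client picks a server by language.
ASR_SERVERS = {"Korean": korean_asr_server}
MT_SERVERS = {("Korean", "Thai"): korean_thai_mt_server}
TTS_SERVERS = {"Thai": thai_tts_server}


def translate_speech(speech: bytes, source: str, target: str) -> bytes:
    """Chain the ASR, MT, and TTS servers for one language pair."""
    text = ASR_SERVERS[source](speech)
    translated = MT_SERVERS[(source, target)](text)
    return TTS_SERVERS[target](translated)


audio = translate_speech(b"[Korean speech]", "Korean", "Thai")
print(audio.decode("utf-8"))
```

Because each step is looked up by language, adding a new language pair in this sketch means registering additional servers rather than changing the client.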

S2ST Client and Server

Scope of Standardization

Table: Draft roadmap to develop standards for network-based S2ST

Draft | Title | Scope | Target Date
F.S2STreqs | Functional Requirements for Network-based S2ST | Definition of network-based S2ST; functions and service requirements of network-based S2ST | During this Study Period (2009-2012)
H.S2STarch | Architectural Requirements for Network-based S2ST | Functional architectures, mechanisms and interfaces of network-based S2ST | During this Study Period (2009-2012)

Conclusion

We would like to invite more people to join the standardization activities on network-based S2ST systems.

Through standardization, network-based S2ST systems can cover more languages.
