Post on 24-Dec-2015
Initiation of Standardization on Network-based Speech-to-speech Translation
at ITU-T SG16
National Institute of Information and Communications Technology, Japan
Satoshi Nakamura
Chiori Hori
Contact:
Name: Satoshi Nakamura, Chiori Hori
Organization: NICT
Country: Japan
Tel: +81-774 95 1370
Fax: +81-774 95 1308
Email: satoshi.nakamura@nict.go.jp, chiori.hori@nict.go.jp
INTERNATIONAL TELECOMMUNICATION UNION
TELECOMMUNICATION STANDARDIZATION SECTOR
STUDY PERIOD 2009-2012
COM 16 – C 196 – E
October 2009
English only
Original: English
Question(s): 7, 21, 22/16
Many Languages All Over the World
http://en.wikipedia.org/wiki/List_of_language_families
Breaking Language Boundaries
Language boundaries are one of the barriers to mutual understanding.
Speech-to-Speech Translation (S2ST) technologies are an effective means of removing language boundaries between people who speak different languages.
S2ST technologies have been studied.
Speech-to-Speech Translation (S2ST)
[Figure: S2ST pipeline from Japanese to English; the modules are built from corpora]
Input: Japanese speech 「私は学校に行く」
Speech Recognition (ASR)
  Convert to Japanese phoneme sequence: "w", "a", "t"… (w a t a sh i w a g a xtu k o o n i…..)
  Convert to word sequence using lexicon and grammar: 私は学校に行く
Machine Translation (MT)
  Convert to English word sequence: 「私は」⇒ "I", 「学校に」⇒ "to school", 「行く」⇒ "go" (I to school go)
  Reorder word sequences according to English grammar: "I" "to school" "go" ⇒ "I" "go" "to school" (I go to school)
Speech Synthesis (TTS)
  Select appropriate waveform for English text
Output: English speech "I go to school"
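The ASR, MT and TTS steps of this example can be sketched in a few lines of Python. This is only a toy illustration: the phrase table, the role labels and the function names are assumptions made here, and each function is a stand-in for a real corpus-trained module.

```python
# Hypothetical phrase table built from the slide's example sentence.
# Each Japanese phrase maps to (English phrase, grammatical role).
PHRASE_TABLE = {
    "私は": ("I", "subject"),
    "学校に": ("to school", "destination"),
    "行く": ("go", "verb"),
}

# Assumed English ordering of the roles used above.
ENGLISH_ORDER = {"subject": 0, "verb": 1, "destination": 2}


def recognize(audio: bytes) -> list:
    """ASR stand-in: a real recognizer decodes audio into phonemes, then words."""
    return ["私は", "学校に", "行く"]


def translate(words: list) -> str:
    """MT stand-in: convert phrase by phrase, then reorder for English grammar."""
    converted = [PHRASE_TABLE[w] for w in words]  # I / to school / go
    reordered = sorted(converted, key=lambda pair: ENGLISH_ORDER[pair[1]])
    return " ".join(text for text, _role in reordered)


def synthesize(text: str) -> bytes:
    """TTS stand-in: a real synthesizer selects waveforms for the text."""
    return text.encode("utf-8")


def speech_to_speech(audio: bytes) -> bytes:
    """Chain the three modules: ASR, then MT, then TTS."""
    return synthesize(translate(recognize(audio)))


print(translate(["私は", "学校に", "行く"]))  # I go to school
```

The reordering step is the part that differs between language pairs, which is one reason each module is trained and maintained per language.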
Stand-Alone and Client-Server S2ST Systems
Stand-alone system
  Packages the entire speech translation function into a handheld PC
  Languages shown: Japanese, English, Chinese, Indonesian
  Example: Japanese speech 「おはようございます。」 ⇒ English speech "Good morning."
Client-server system
Why Network-based?
Stand-alone systems have limited resources and support only a limited number of language pairs.
ASR, MT and TTS systems are available for many languages, and each needs to be maintained in its own country.
Broadband networks are available.
Standardization on Network-based S2ST
[Figure: network-based S2ST between two S2ST clients. Speech of Language A passes through ASR, MT and TTS modules and arrives as synthesized speech in Language B; speech of Language B is handled the same way in the opposite direction.]
Resources shown in the figure:
  ASR and TTS: lexicon and speech data for Language A and Language B
  MT: parallel corpus and lexicon
Items for standardization:
  Parallel corpus, speech data, lexicon
  Data format for ASR and MT results
  Communication protocol among modules
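To make "data format for ASR and MT results" concrete, a result message passed between modules might look like the sketch below. The field names, the language codes and the choice of JSON are assumptions for illustration only; the slides do not define a format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ModuleResult:
    """Hypothetical result message exchanged between S2ST modules."""
    module: str       # which module produced the result: "ASR" or "MT"
    source_lang: str  # language of the module's input (assumed ISO 639-1 code)
    target_lang: str  # language of the module's output
    text: str         # recognized or translated text


def encode(result: ModuleResult) -> str:
    """Serialize a result to JSON, keeping non-ASCII text readable."""
    return json.dumps(asdict(result), ensure_ascii=False, sort_keys=True)


def decode(payload: str) -> ModuleResult:
    """Rebuild a result from its JSON form."""
    return ModuleResult(**json.loads(payload))


asr_result = ModuleResult("ASR", "ja", "ja", "私は学校に行く")
wire = encode(asr_result)
print(wire)
print(decode(wire) == asr_result)  # True
```

A shared message shape like this is what lets an ASR server in one country hand its output to an MT server maintained in another.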
Lexicon for overall S2ST systems
An example of a lexicon for overall modules in S2ST systems
Columns: Entry, Attribute (Surface, Pronunciation, Accent), then one column per language (Japanese, Korean, Chinese, English). "・・" marks cells left open on the slide.
Entry: Osaka
  Japanese: 大阪; おおさか; 4モーラ0型 (4 morae, accent type 0)
  Korean: Osaka; ・・
  Chinese: 大阪; ダーバン, Daban, Da4ban3; 四声三声 (fourth tone, third tone)
  English: Osaka; Ōsaka; ɔː s a k a
Entry: Tokyo
  Japanese: 東京; とうきょう; ・・
  Korean: ・・; ・・
  Chinese: 東京; トンジン, Tong1jing1; ・・
  English: Tokyo; Tōkyō; ・・
Global standardization of the lexicon format, together with a system to collect and provide lexicons for all languages, is required to maintain a reliable lexicon for overall S2ST systems.
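One way to picture a shared lexicon format is an entry that carries the same three attributes for every language. The sketch below is an assumption, not the format proposed on the slides: the class names are invented here, and the assignment of the slide's values to Surface, Pronunciation and Accent follows one reading of the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Form:
    """Attributes of one entry in one language (names are assumptions)."""
    surface: str
    pronunciation: Optional[str] = None
    accent: Optional[str] = None


@dataclass
class LexiconEntry:
    """A single entry shared by the ASR, MT and TTS modules."""
    entry: str
    forms: Dict[str, Form] = field(default_factory=dict)

    def lookup(self, language: str) -> Optional[Form]:
        """Return the form for a language, or None if it is not filled in yet."""
        return self.forms.get(language)


osaka = LexiconEntry(
    entry="Osaka",
    forms={
        "Japanese": Form("大阪", "おおさか", "4モーラ0型"),
        "Chinese": Form("大阪", "Da4ban3", "四声三声"),
        "English": Form("Osaka", "Ōsaka"),
    },
)

print(osaka.lookup("Japanese").pronunciation)  # おおさか
print(osaka.lookup("Korean"))  # None
```

Returning None for a missing language mirrors the open cells in the example, where some languages have not been filled in.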
Asian Network-Based S2ST System by A-STAR Consortium
1 National Institute of Information and Communications Technology (NICT), Japan
2 Electronics and Telecommunications Research Institute (ETRI), Korea
3 Chinese Academy of Sciences (CASIA), China
4 National Electronics and Computer Technology Center (NECTEC), Thailand
5 Agency for the Assessment and Application of Technology (BPPT), Indonesia
6 Center for Development of Advanced Computing (CDAC), India
7 Institute of Information Technology (IOIT), Vietnam
8 Institute for Infocomm Research (I2R), Singapore
Speech Translation using Distributed Service Servers
Example: From Korean to Thai Speech Translation
Speech translation service client
  Speech (Korean) → ASR server: ① Speech recognition (Korean) → Text (Korean)
  Text (Korean) → MT server: ② Language translation (Korean→Thai) → Translated text (Thai)
  Translated text (Thai) → TTS server: ③ Speech synthesis (Thai) → Synthesized speech (Thai)
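The Korean-to-Thai example can be sketched as a client that picks one server per step from a registry. Everything below is hypothetical: the servers are in-process stub functions returning placeholder strings, the registries and language codes are assumptions, and no real network protocol is shown.

```python
from typing import Callable, Dict, List, Tuple


def korean_asr(speech: bytes) -> str:
    """Stub ASR server: returns a placeholder instead of real recognized text."""
    return "<Korean text>"


def korean_thai_mt(text: str) -> str:
    """Stub MT server: returns a placeholder instead of a real translation."""
    return "<Thai text>"


def thai_tts(text: str) -> bytes:
    """Stub TTS server: returns a placeholder instead of real audio."""
    return b"<Thai speech>"


# Hypothetical registries: which server handles which language or language pair.
ASR_SERVERS: Dict[str, Callable[[bytes], str]] = {"ko": korean_asr}
MT_SERVERS: Dict[Tuple[str, str], Callable[[str], str]] = {("ko", "th"): korean_thai_mt}
TTS_SERVERS: Dict[str, Callable[[str], bytes]] = {"th": thai_tts}


def translate_speech(speech: bytes, source: str, target: str) -> Tuple[bytes, List[str]]:
    """Client side: call the ASR, MT and TTS servers in order and log each step."""
    trace: List[str] = []
    text = ASR_SERVERS[source](speech)
    trace.append(f"1 ASR({source})")
    translated = MT_SERVERS[(source, target)](text)
    trace.append(f"2 MT({source}->{target})")
    audio = TTS_SERVERS[target](translated)
    trace.append(f"3 TTS({target})")
    return audio, trace


audio, trace = translate_speech(b"<Korean speech>", "ko", "th")
print(trace)  # ['1 ASR(ko)', '2 MT(ko->th)', '3 TTS(th)']
```

Keeping the registries separate reflects the idea that each server can be run and maintained by a different institute.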
Scope of Standardization
Draft | Title | Scope | Target Date
F.S2STreqs | Functional Requirements for Network-based S2ST | Definition of network-based S2ST; functions and service requirements of network-based S2ST | During this Study Period (2009-2012)
H.S2STarch | Architectural Requirements for Network-based S2ST | Functional architectures, mechanisms and interface of network-based S2ST | During this Study Period (2009-2012)
Table: Draft roadmap to develop standards for network-based S2ST