Augmented Human Communication Laboratory
Prof. Satoshi Nakamura
Graduate School of Information Science, Nara Institute of Science and Technology
Prof. Satoshi Nakamura @AHC Lab IS NAIST
Augmented Human Communication Lab
Honnyaku Konnyaku
Translation Robot C3PO
Intelligent Robot
Talking Vehicle
Voice Conversion
What is Augmented Human Communication?
Brain Communication
Communication Empowerment Technologies
Voice QoL
Voice Conversion
Speech Synthesis
Brain Analysis
Spoken Dialog
Why don’t you join our lab?
Persona Modeling
Speech Translation
Dr. Nakamura
Dr. Toda
Dr. Sakti
Dr. Neubig
Ms. Matsuda
Research for the Future
SRG (Super Research Group): a new NAIST framework for wider collaboration inside and outside the lab.
Integrating fundamental technologies into the augmented human-communication systems
Big Data Analysis
Prof. Satoshi Nakamura
Assoc. Prof. Tomoki Toda
Assist. Prof. Sakriani Sakti
Assist. Prof. Graham Neubig
World-wide
Bandung Institute of Technology
University of Illinois
Carnegie Mellon University
Rutgers, The State University of New Jersey
University of Ulm
Karlsruhe Institute of Technology
Cambridge University
Speech Translation
Speech Recognition
Dialog Control
Cognitive Communication
Big Data Analysis
Speech Analysis, Conversion, Synthesis
Speech Signal Processing
Dialog Generation
Voice QoL
(Voice Bank)
Speech Recognition
Multilingual SR
Cognitive Communication
Graphical Models
Machine Translation
Speech Translation
Natural Language Processing
Machine Learning
ATR, NICT
Members: D: 3, M2: 9
Spoken Dialog: M: 2
Speech Translation: D: 1, M: 3
Cognitive Communication: D: 1, M: 1
Speech Synthesis: D: 1, M: 3
KIT Students: 2
New members from Apr. 2012: 11 new M1 students
Speech Translation
はじめまして! → Speech Recognition (ASR) → Machine Translation (MT) → Speech Synthesis (SS) → Nice to meet you!
Speech Translation System
History of Speech Translation Research In Japan
Fundamentals
Read Speech
• Syntactically correct
• Clear utterance
• Limited domain
Ex. “Conference Registration”
Daily Conversation
• Standard expression
• Unclear utterance
• Limited domain
Ex. “Hotel Reservation”
Wider and Real Domain
• Wider and real domain, e.g. “International Travel”
• Realistic expressions
• Noisy speech
• J-E, J-C speech translation
1986 1992 1999 2006
Rule-based technology: hand-made
Corpus-based technology: large-scale corpus + machine learning
2008 ATR NICT
A-STAR
+ More languages for translation
• Multilateral translation for 8 Asian languages
• Network-based S2ST
2010
• 21-language multilateral text translation
C-STAR
• Multilateral translation for 7 world languages
IWSLT
• Evaluation Campaign of S2S technologies
2011
VoiceTra
NAIST
ATR ATR
Speech Translation
Optimization Delay Naturalness
はじめまして! → Speech Recognition (ASR) → Machine Translation (MT) → Speech Synthesis (SS) → Nice to meet you!
Speech Translation System
Simultaneous Translation
Low-delay speech translation from Japanese to English
Speech to speech
Richer expressive speech synthesis for speech translation
Optimization of ASR considering MT performance
Optimization of ASR
Speech Synthesis
Convert a voice to any target speaker and voice quality, including voice conversion for speech-to-speech translation.
Voice Conversion
Input speaker: “Hello” → Target speaker: “Hello”
Speech synthesis from text
Text-to-Speech Synthesis
Thank you
Synthesis with emotions, individuality, quality, communicability.
“Thank you”
Communicate without disturbing people nearby by using non-audible speech input
Silent Speech Interface
Card No. is ○○.
Card No. is ○○○!
For people with vocal disabilities who cannot speak in the usual way: recover their natural voices by converting their artificially generated voices into more natural-sounding voices.
Voice QoL
2011©Yamauchi AHC‐Lab, IS, NAIST
Persuasive Dialog System
Estimate user interests and steer users toward the target topics and interests
Want user to join AHC lab.
Dialog
Persuasion
OK!
Which labs suit me?
Want to join AHC lab!
Target Topics
・Find the user's interests
・Persuade the user to take an interest in the target topics through dialog
Conditions for persuasion
Estimate interests from user utterances
Estimate Interests
Dialog control by persuasive responses toward target topics and interests
Persuasive Dialog
Cognitive Communication
• Detection of cognitive and semantic mismatch
Automatic Communication Measurements
Objective communication measurement from EEG
Social Communication Skill Training by iPad
Gaze tracking during conversation
InterACT
Karlsruhe Institute of Technology
Carnegie Mellon University
Hong Kong University of Science and Technology
Italian Institute of Technology
University of Southern California
National Institute of ICT
Nara Institute of Science and Technology
Waseda University
Joint Research Project, Student Exchange, Faculty Exchange
Professor Satoshi Nakamura
Background
1981.4-1994.3 Sharp Corp., Central Research Labs
1986-1989 ATR Interpreting Telephony Res. Labs.
1994.4-2000.3 Associate Prof., Nara Institute of Science and Technology
2000.4 Advanced Telecommunications Research Institute International (ATR): Vice President of ATR, Director of Spoken Language Communication Labs., ATR Fellow
2006.4 National Institute of Information and Communications Technology (NICT): Director, MASTAR Project; Director, KCCC Research Center; Director General, Keihanna Research Laboratories
Dec. 2003 Honorary Professor (Honorarprofessor), University of Karlsruhe, Germany
Apr. 2011 Prof. at Nara Institute of Science and Technology
Spoken Language Communication Research Laboratories
Speech-to-Speech Translation
・“VoiceTra”: network-based speech translation, released in Jul. 2010
・21 language pairs for text I/O
・6 language pairs for speech I/O
500k downloads and 4M accesses worldwide so far.
Japanese, English, Mandarin, Taiwanese Mandarin, German, French, Dutch, Danish, Italian, Spanish, Portuguese, Brazilian Portuguese, Russian, Arabic, Hindi, Indonesian, Malay, Thai, Tagalog, Vietnamese, Korean
※ Languages in red support voice input and output.
※ There is no text input support for Hindi or Vietnamese.
VoiceTra
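Speech translation: 「しゃべって翻訳」 ("Shabette Honyaku", "Speak and Translate")
・Japanese-English, bidirectional
・NTT DOCOMO
Top screen
Speech input screen
Translation result output screen
Launched in November 2007
The first network-based speech-to-speech (STS) translation service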
Associate Professor Tomoki Toda
Education
1999.4 Graduate School of Information Science, NAIST
- Master degree in engineering, 2001.3
- Doctor degree in engineering, 2003.3
Professional Experience
2003.4 JSPS Research Fellow (ATR, CMU, NITECH)
2005.4 NAIST Assistant Professor
2011.4 NAIST Associate Professor
Award
IEEE Signal Processing Society: The 2009 Young Author Best Paper Award
Nippon Ericsson K.K.: The 10th Ericsson Young Scientist Award
Assistant Professor Sakriani Sakti
Education
2005-2008 Doctorate degree (Dr.-Ing.) in Engineering Science, University of Ulm, Germany
2000-2002 Master degree (MSc) in Communication Technology, University of Ulm, Germany
1995-1999 Bachelor degree (BSc) in Informatics, Bandung Institute of Technology, Indonesia
Work Experience
2011-Now Assistant Professor, Augmented Human Communication Labs, NAIST, Japan
2009-2011 Visiting Professor, Faculty of Computer Science, University of Indonesia, Indonesia
2006-2011 Expert Researcher, Spoken Language Communication Research Groups, NICT, Japan
2003-2009 Research Engineer - Researcher, Spoken Language Communication Research Labs, ATR, Japan
2001-2002 Master's thesis (Masterarbeit), Speech Understanding Dept., DaimlerChrysler Research Center, Germany
1999-2000 Junior Software Consultant, Sumarno Pabotingi Associate, Indonesia
Assistant Professor Graham Neubig
Background
2001-2005 University of Illinois, Urbana-Champaign, B.E. Computer Science
2005-2008 Worked as English Teacher/Translator
2008-2012 Kyoto University, Doctor in Informatics
2012- Assistant Professor, NAIST
Machine Translation
Spoken Language Processing
Natural Language Analysis
奈良先端へようこそ!
Welcome to NAIST!
Character-based MT
奈良
na ra
Nara
Learning from Speech
this is a pen
Speech Transformation
That’s, um, real good
That’s really good
言葉を扱う ("handling language")
Segmentation: 言葉 を 扱う
Pronunciation: kotoba o atsukau
Part-of-Speech: N P V
… and many more!