Download - Augmented Human Communication Laboratory

Augmented Human Communication Laboratory

Prof. Satoshi Nakamura

Graduate School of Information Science, NARA INSTITUTE OF SCIENCE AND TECHNOLOGY

Prof. Satoshi Nakamura @AHC Lab IS NAIST


Augmented Human Communication Lab

Honnyaku Konnyaku

Translation Robot C3PO

Intelligent Robot

Talking Vehicle

Voice Conversion

What is Augmented Human Communication?

Brain Communication

Communication Empowerment Technologies

Voice QoL

Voice Conversion, Speech Synthesis

Brain Analysis

Spoken Dialog

Why don’t you join our lab?

Persona Modeling

Speech Translation

Dr. Nakamura

Dr. Toda

Dr. Sakti

Dr. Neubig

Ms. Matsuda

Research for the Future


SRG (Super Research Group): a new NAIST framework for wider collaboration inside and outside the lab.

Integrating fundamental technologies into augmented human communication systems

Big Data Analysis


Prof. Satoshi Nakamura

Assoc. Prof. Tomoki Toda

Assist. Prof. Sakriani Sakti

Assist. Prof. Graham Neubig

World-wide

Bandung Institute of Technology

University of Illinois

Carnegie Mellon University

Rutgers, The State University of New Jersey

University of Ulm

Karlsruhe Institute of Technology

Cambridge University

Speech Translation

Speech Recognition

Dialog Control

Cognitive Communication

Big Data Analysis

Speech Analysis, Conversion, Synthesis

Speech Signal Processing

Dialog Generation

Voice QoL (Voice Bank)

Speech Recognition

Multilingual SR

Cognitive Communication

Graphical Models

Machine Translation

Speech Translation

Natural Language Processing

Machine Learning

ATR, NICT


Members: D: 3, M2: 9

Spoken Dialog: M: 2

Speech Translation: D: 1, M: 3

Cognitive Communication: D: 1, M: 1

Speech Synthesis: D: 1, M: 3

KIT Students: 2

New Members from Apr. 2012: 11 new M1 students

Speech Translation

はじめまして！ (Japanese input, "Nice to meet you!")

Speech Recognition (ASR)

Machine Translation (MT)

Speech Synthesis (SS)

Nice to meet you!

Speech Translation System
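
The cascade named on this slide (ASR, then MT, then SS) can be sketched as three composed stages. This is only an illustrative sketch under stated assumptions, not the lab's system: the function names, the toy lookup tables, and the byte strings standing in for audio are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins for real models. A real system would replace
# these toy lookups with trained ASR, MT, and SS components.

def toy_asr(audio: bytes) -> str:
    """Pretend recognizer: maps an 'audio' payload to a Japanese transcript."""
    known = {b"<audio:hajimemashite>": "はじめまして！"}
    return known.get(audio, "")

def toy_mt(source_text: str) -> str:
    """Pretend translator: Japanese text to English text."""
    known = {"はじめまして！": "Nice to meet you!"}
    return known.get(source_text, "")

def toy_ss(target_text: str) -> bytes:
    """Pretend synthesizer: wraps the text in a fake 'audio' payload."""
    return f"<audio:{target_text}>".encode("utf-8")

@dataclass
class SpeechTranslationSystem:
    """Cascade in which each stage consumes the previous stage's output."""
    asr: Callable[[bytes], str]
    mt: Callable[[str], str]
    ss: Callable[[str], bytes]

    def translate(self, audio: bytes) -> bytes:
        transcript = self.asr(audio)       # speech -> source-language text
        translation = self.mt(transcript)  # source text -> target text
        return self.ss(translation)        # target text -> speech

system = SpeechTranslationSystem(asr=toy_asr, mt=toy_mt, ss=toy_ss)
output_audio = system.translate(b"<audio:hajimemashite>")
print(output_audio.decode("utf-8"))  # <audio:Nice to meet you!>
```

Because each stage sees only the previous stage's output, a recognition error is passed straight into translation. In this toy version an unrecognized input simply yields an empty translation.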

History of Speech Translation Research In Japan

Fundamentals

Read Speech

• Syntactically correct
• Clear utterance
• Limited domain

Ex. “Conference Registration”

Daily Conversation

• Standard expression
• Unclear utterance
• Limited domain

Ex. “Hotel Reservation”

Wider and Real Domain

• Wider and real domain: “International Travel”
• Realistic expressions
• Noisy speech
• J-E, J-C speech translation

1986 1992 1999 2006

Rule-based Technology

Corpus-based Technology

Hand-made

Large-scale corpus + machine learning

2008

ATR, NICT

A-STAR

+ More languages for translation

• Multilateral translation for 8 Asian languages
• Network-based S2ST

2010

• 21 multilateral text translation

C-STAR

• Multilateral translation for 7 world languages

IWSLT

• Evaluation Campaign of S2S technologies

2011

VoiceTra

NAIST

ATR ATR


Speech Translation

Optimization, Delay, Naturalness

はじめまして！ (Japanese input, "Nice to meet you!")

Speech Recognition (ASR)

Machine Translation (MT)

Speech Synthesis (SS)

Nice to meet you!

Speech Translation System

Simultaneous Translation

Low-delay speech translation from Japanese to English

Speech to speech

More expressive speech synthesis for speech translation

Optimization of ASR considering MT performance

Optimization of ASR


Speech Synthesis

Convert a voice to any target speaker and voice quality, and apply voice conversion to speech-to-speech translation.

Voice Conversion

Input speaker

Hello

Hello

Target Speaker

Speech synthesis from text

Text-to-Speech Synthesis

Thank you

Synthesis with emotion, individuality, quality, and communicability.

“Thank you”

Communicate without disturbing people nearby by using non-audible speech input

Silent Speech Interface

Card No. is ○○

Card No. is ○○○!

？

？

？

For people with vocal disabilities who cannot speak in the usual way: recover their natural voices by converting artificially generated voices into more natural-sounding ones.

Voice QoL

2011©Yamauchi AHC‐Lab, IS, NAIST

Persuasive Dialog System

Estimate the user's interests and persuade the user toward the target topics and interests

Wants the user to join the AHC lab.

Dialog

Persuasion

OK!

Which labs suit me?

Want to join AHC lab!

Target Topics

・Find the user's interests
・Persuade the user to become interested in the target topics through dialog

Conditions for persuasion

Estimate interests from user utterances

Estimate Interests

Dialog control through persuasive responses toward target topics and interests

Persuasive Dialog


Cognitive Communication

• Detection of cognitive and semantic mismatch

Automatic Communication Measurements

Objective communication measurement from EEG

Social Communication Skill Training by iPad

Gaze tracking during conversation


InterACT

Karlsruhe Institute of Technology

Carnegie Mellon University

Hong Kong University of Science and Technology

Italian Institute of Technology

University of Southern California

National Institute of ICT

Nara Institute of Science and Technology

Waseda University

Joint Research Project, Student Exchange, Faculty Exchange


Professor Satoshi Nakamura

Background

1981.4-1994.3 Sharp Corp., Central Research Labs
1986-1989 ATR Interpreting Telephony Research Labs
1994.4-2000.3 Associate Prof., Nara Institute of Science and Technology
2000.4 Advanced Telecommunications Research Institute International (ATR): Vice President of ATR; Director of Spoken Language Communication Labs; ATR Fellow
2006.4 National Institute of Information and Communications Technology (NICT): Director, MASTAR Project; Director, KCCC Research Center; Director General, Keihanna Research Laboratories
Dec. 2003 Honorary Professor (Honorarprofessor), University of Karlsruhe, Germany

Apr. 2011 Prof. at Nara Institute of Science and Technology


Spoken Language Communication Research Laboratories


Speech-to-Speech Translation
・“VoiceTra” network-based speech translation, released in Jul. 2010
・21 language pairs for text I/O
・6 language pairs for speech I/O
・500k downloads and 4M accesses worldwide so far

Japanese, English, Mandarin, Taiwanese Mandarin, German, French, Dutch, Danish, Italian, Spanish, Portuguese, Brazilian Portuguese, Russian, Arabic, Hindi, Indonesian, Malay, Thai, Tagalog, Vietnamese, Korean
※ Languages in red support voice input/output.
※ There is no text input support for Hindi or Vietnamese.

VoiceTra


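Speech translation: 「しゃべって翻訳」 (Shabette Honyaku, "Speak and Translate")

・Japanese-English, bidirectional

・NTT DOCOMO

Top screen

Speech input screen, translation result output screen

Launched in November 2007

The first network-based speech-to-speech (STS) translation service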

Associate Professor Tomoki Toda

Education
1999.4 Graduate School of Information Science, NAIST
- Master's degree in engineering, 2001.3
- Doctoral degree in engineering, 2003.3

Professional Experience
2003.4 JSPS Research Fellow (ATR, CMU, NITECH)
2005.4 NAIST Assistant Professor
2011.4 NAIST Associate Professor

Award
IEEE Signal Processing Society: The 2009 Young Author Best Paper Award
Nippon Ericsson K.K.: The 10th Ericsson Young Scientist Award


Assistant Professor Sakriani Sakti

Education
2005-2008 Doctorate degree (Dr.-Ing) in Engineering Science, University of Ulm, GERMANY
2000-2002 Master degree (MSc) in Communication Technology, University of Ulm, GERMANY
1995-1999 Bachelor degree (BSc) in Informatics, Bandung Institute of Technology, INDONESIA

Work Experience
2011-Now Assistant Professor, Augmented Human Communication Labs, NAIST, JAPAN
2009-2011 Visiting Professor, Faculty of Computer Science, University of Indonesia, INDONESIA
2006-2011 Expert Researcher, Spoken Language Communication Research Groups, NICT, JAPAN
2003-2009 Research Engineer - Researcher, Spoken Language Communication Research Labs, ATR, JAPAN
2001-2002 Masterarbeit (master's thesis), Speech Understanding Dept, Daimler Chrysler Research Center, GERMANY
1999-2000 Junior Software Consultant, Sumarno Pabotingi Associate, INDONESIA


Assistant Professor Graham Neubig

Background
2001-2005 University of Illinois, Urbana-Champaign, B.E. Computer Science
2005-2008 Worked as English Teacher/Translator
2008-2012 Kyoto University, Doctor in Informatics
2012- Assistant Professor, NAIST

Machine Translation

Spoken Language Processing

Natural Language Analysis

奈良先端へようこそ！

Welcome to NAIST!

Character-based MT

奈良

na ra

Nara

Learning from Speech

this isa pen

Speech Transformation

That’s, um, real good

That’s really good

言葉を扱う ("to handle words")

Segmentation: 言葉 を 扱う

Pronunciation: kotoba o atsukau

Part-of-Speech: N P V

… and many more!
