Improving Speech Recognition with Embodied Cognition
and Behaviour-based Robotics
Jorge Davila-Chacon
University of Hamburg - Knowledge Technology
www.informatik.uni-hamburg.de/WTM/
Spotify ML Meetup – November 3rd 2014
Motivation
Jorge Davila-Chacon Bio-Inspired SSL for Robot ASR 2
• Why is bio-inspired SSL interesting / useful?
Neurobotic Experiments
Virtual Reality Lab
Bauer, J., Davila-Chacon, J., Strahl, E., Wermter, S. Smoke and Mirrors — Virtual Realities for Sensor Fusion Experiments in Biomimetic Robotics. In: Multisensor Fusion and Integration for Intelligent Systems, 2012
Bio-Inspired Sound Source Localisation
Spatial cues allow sound source localisation:
• Interaural Time Difference (ITD): ITDs from low frequencies
• Interaural Level Difference (ILD): ILDs from high frequencies
Both cues compare the same frequency component at the two ears.
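The two spatial cues above can be illustrated with a minimal sketch. It is not the spiking model presented in the talk: it assumes a plain cross-correlation peak for the ITD and an RMS level ratio for the ILD, computed on the full-band signal rather than per frequency component. The function names, the test signal and all parameter values are hypothetical.

```python
import math
import random


def estimate_itd(left, right, sample_rate, max_lag):
    """Estimate the ITD in seconds from the peak of the cross-correlation.

    A positive result means the right channel lags the left one,
    i.e. the sound reached the left microphone first.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, left_sample in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += left_sample * right[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


def estimate_ild(left, right):
    """Estimate the ILD in dB; positive means the left channel is louder."""
    def rms(signal):
        return math.sqrt(sum(s * s for s in signal) / len(signal))
    return 20.0 * math.log10(rms(left) / rms(right))


# Hypothetical test signal: noise that reaches the right microphone
# 5 samples later and at half the amplitude.
rng = random.Random(0)
sample_rate = 16000
delay = 5
left = [rng.uniform(-1.0, 1.0) for _ in range(400)]
right = [0.0] * delay + [0.5 * s for s in left[:-delay]]

itd = estimate_itd(left, right, sample_rate, max_lag=10)
ild = estimate_ild(left, right)
print(f"ITD: {itd * 1e6:.1f} microseconds, ILD: {ild:.2f} dB")
```

In a system closer to the one described here, both estimates would presumably be computed per frequency band after a filterbank, using ITDs in the low bands and ILDs in the high bands.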
Interaural Time Differences: Neuroanatomy
ITDs extracted in Medial Superior Olive (MSO)
• AVCN - Anterior Ventral Cochlear Nucleus
• AN - Auditory Nerve
• IC - Inferior Colliculus
Interaural Time Differences: Computational Principle
Interaural Level Differences: Neuroanatomy
ILDs extracted in Lateral Superior Olive (LSO)
• MNTB - Medial Nucleus of the Trapezoid Body
• IC - Inferior Colliculus
Output of MSO and LSO integrated in IC
J. Dávila-Chacón, S. Heinrich, J. Liu, S. Wermter. Biomimetic Binaural Sound Source Localisation with Ego-Noise Cancellation. International Conference on Artificial Neural Networks, 2012.
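One way to picture the integration of MSO and LSO output in the IC is as a Bayesian combination of two evidence vectors over candidate angles. The sketch below is only an illustration under strong assumptions: it treats the two cues as conditionally independent given the angle (naive Bayes) and uses made-up likelihood values. The function name `fuse_cues` and the angle grid are hypothetical, and the cited paper's model may differ.

```python
def fuse_cues(mso_likelihood, lso_likelihood, prior=None):
    """Combine ITD (MSO) and ILD (LSO) evidence into a posterior over angles.

    Assumes the two cues are conditionally independent given the angle,
    so the posterior is proportional to prior * p(mso|angle) * p(lso|angle).
    """
    n = len(mso_likelihood)
    if len(lso_likelihood) != n:
        raise ValueError("both cues must cover the same candidate angles")
    if prior is None:
        prior = [1.0 / n] * n  # uniform prior over angles
    unnormalised = [p * m * l
                    for p, m, l in zip(prior, mso_likelihood, lso_likelihood)]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("the cues assign zero probability to every angle")
    return [u / total for u in unnormalised]


# Hypothetical candidate angles (degrees) and made-up likelihoods.
angles = [-90, -45, 0, 45, 90]
mso = [0.05, 0.15, 0.40, 0.30, 0.10]
lso = [0.05, 0.10, 0.25, 0.45, 0.15]

posterior = fuse_cues(mso, lso)
best_angle = angles[posterior.index(max(posterior))]
print(best_angle, [round(p, 3) for p in posterior])
```

Here the MSO evidence alone favours 0 degrees and the LSO evidence alone favours 45 degrees; the fused posterior settles on the angle that both cues support reasonably well.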
[Figure labels: MLP, IC]
J. Dávila-Chacón, S. Magg, J. Liu, S. Wermter. Neural and Statistical Processing of Spatial Cues for Sound Source Localisation. International Joint Conference on Neural Networks, 2013.
Simple IC output
Complex IC output
Static SSL
Dynamic SSL
Feedforward neural network
Robotic Automatic Speech Recognition
Platforms used for ASR: iCub and Soundman
J. Dávila-Chacón, J. Twiefel, J. Liu, S. Wermter. Improving Humanoid Robot Speech Recognition with Sound Source Localisation. International Conference on Artificial Neural Networks, 2014.
Binary measure - Static ASR
Continuous measure - Static ASR
Conclusion
● Robotics as a “sandbox” for learning ML
● Neuroscience provides clues for computational principles
● Embodiment
  • iCub allows computation of spatial cues
  • Interaction with environment can reduce noise
● Signal processing with ANN
  • Spiking ANNs are an effective representation of spatial cues
  • Bayesian integration important for dimensionality reduction
  • Softmax neural layer robust to ego-noise and reverberation
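As a reminder of what a softmax layer computes, the sketch below turns a vector of output activations, one per candidate angle, into a probability distribution. The activation values are made up, and the code shows only the normalisation step; the robustness to ego-noise and reverberation is the talk's empirical finding and is not demonstrated by this snippet.

```python
import math


def softmax(activations):
    """Map raw activations to probabilities that sum to one."""
    peak = max(activations)  # subtract the maximum for numerical stability
    exps = [math.exp(a - peak) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical activations of five output units, one per candidate angle.
activations = [0.2, 1.5, 3.1, 0.9, -0.4]
probabilities = softmax(activations)
winner = probabilities.index(max(probabilities))
print(winner, [round(p, 3) for p in probabilities])
```

Because the outputs are normalised jointly, a uniform offset added to all activations leaves the resulting distribution unchanged.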
Future Work
● Neural SSL
  • Integrate GPU version of MSO and LSO
  • Propagation of probabilities through time
  • From discrete to continuous
● Integration with vision
  • From supervised to unsupervised SSL
  • Possible extension to sensorimotor contingencies
  • Vision to select between multiple sound sources
  • Vision for speech segregation
Thank you for your attention.
LinkedIn: Jorge Davila Chacon
• J. Liu, D. Perez-Gonzalez, A. Rees, H. Erwin, S. Wermter. A biologically inspired spiking neural network model of the auditory midbrain for sound source localisation. Neurocomputing (2010)
• J. Dávila-Chacón, S. Heinrich, J. Liu, S. Wermter. Biomimetic Binaural Sound Source Localisation with Ego-Noise Cancellation. International Conference on Artificial Neural Networks (2012)
• J. Bauer, J. Davila-Chacon, E. Strahl, S. Wermter. Smoke and Mirrors — Virtual Realities for Sensor Fusion Experiments in Biomimetic Robotics. Multisensor Fusion and Integration for Intelligent Systems (2012)
• J. Dávila-Chacón, S. Magg, J. Liu, S. Wermter. Neural and Statistical Processing of Spatial Cues for Sound Source Localisation. International Joint Conference on Neural Networks (2013)
• J. Dávila-Chacón, J. Twiefel, J. Liu, S. Wermter. Improving Humanoid Robot Speech Recognition with Sound Source Localisation. International Conference on Artificial Neural Networks (2014)
Appendix
Best performances with clustering layer
Bayesian IC model
Levenshtein distance
J. Dávila-Chacón, J. Twiefel, J. Liu, S. Wermter. Improving Humanoid Robot Speech Recognition with Sound Source Localisation. International Conference on Artificial Neural Networks, 2014.
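The Levenshtein distance named in the appendix can be sketched as follows. This is the standard dynamic-programming formulation, applied here at word level to a reference sentence and a recogniser hypothesis; how exactly the talk turns the distance into its ASR measure is not recoverable from the slides, so the example sentences and the word-level choice are assumptions.

```python
def levenshtein(reference, hypothesis):
    """Return the minimum number of insertions, deletions and substitutions
    needed to turn `reference` into `hypothesis` (any two sequences)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_item != hyp_item)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Hypothetical reference sentence and recogniser output, compared word by word.
reference = "put the red ball on the table".split()
hypothesis = "put a red ball on table".split()

word_errors = levenshtein(reference, hypothesis)
print(word_errors, "word edits for", len(reference), "reference words")
```

The same function works on characters when given strings, since it only relies on indexing and equality of the sequence items.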