Multimodal Sensory Fusion and Self Organization

Group Members:

  • Jorg Conradt, TU Munich
  • Garrick Orchard, Singapore Institute for Neurotechnology SINAPSE
  • Malcolm Slaney, Microsoft Conversational Systems Lab and Stanford
  • Mounya Elhilali, Johns Hopkins University
  • Michele Rucci, Boston University
  • Ryad Benjamin Benosman, University Pierre and Marie Curie
  • Ralph Etienne-Cummings, JHU/ECE Department
  • Song Hui Chon, McGill University
  • Shih-Chii Liu, Institute of Neuroinformatics, UNI/ETHZ
  • Steve Kalik, Toyota Research Institute
  • Timmer Horiuchi, University of Maryland
  • Tobi Delbruck, Institute of Neuroinformatics, UZH/ETH Zurich
  • Andre van Schaik, The University of Western Sydney

Organizers: Bert Shi & Patrick Sheridan

See wiki:2010/results/sf for results of this topic area.


  1. Ryad Benjamin Benosman (UPMC) (3 weeks)
  2. Piotr Dudek (Manchester) (3 weeks)
  3. Alan Schoen (UPenn) (27 Jun - 9 July)
  4. Michele Rucci (BU) (27 Jun - 7 July)
  5. Daniel Bates (UMD) (9 Jul - 17 Jul)
  6. Dan Lee (UPenn) (1 July - 9 July)

Focus and Goals

"Can we make a BabyBot; a robot that develops like a baby?"

This topic area will look at how neural systems and smart robots can learn to work in and interact with an unpredictable environment. In particular, we will look at mechanisms and circuitry that combine information from multiple modalities to form a coherent percept of the world. We will look at self-organizing mechanisms for the development of neural systems and study how animals and robots can learn a spatial representation of their world from a combination of visual, auditory, and motor representations.

Neural systems and smart robots have to work in and interact with unpredictable environments. Thus they must develop autonomously and adjust to the environment they find themselves in. How is this environment represented, and how is this representation learned? How do visual and motor representations interact with this representation of space in order to generate coordinated behaviors? The focus here is to explore autonomous learning in neural systems. To do this we need, first, to incorporate feedback from the environment and, second, to make the “experience” of the environment richer by adding multiple sensory percepts. Multimodal integration of visual and auditory cues is an ideal model system because it allows the use of localization of, and orientation to, a sound or light source. It therefore lets us provide the system with performance feedback.

These are some of the general scientific questions in which this workgroup will be interested. However, consistent with the spirit of the Telluride Neuromorphic Engineering Workshop, we will approach these questions from an engineering point of view while using circuits and learning rules present in real brains. We will build neuromorphic perceptual systems that enable a robot to interact with its environment by gathering visual and auditory information and moving a 7 degree of freedom robotic arm.

The main goal here is to have the system develop autonomously, using its experience of the environment. As a concrete starting point, we will use two projects started at the 2009 Telluride Neuromorphic Engineering Workshop. In the first project (https://neuromorphs.net/ws2009/wiki/aud09-LearningTonotopy) we worked on a model of auditory cortex that developed tonotopic organization using realistic cortical circuitry, spiking neurons, and STDP, and explored how this circuit behaved and learned. In the second project (https://neuromorphs.net/ws2009/wiki/sensormotor09 and described in more detail in this paper) a robot learned to point the end effector of its arm at the object it is looking at simply by watching its arm move in front of itself. This architecture was neuromorphic in the sense that it used biologically plausible Hebbian learning rules to learn the mapping from visual coordinates to arm joint coordinates. Through these projects we were able to identify several interesting questions that form the basis of individual projects.
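To make the Hebbian map-learning idea concrete, here is a minimal sketch in Python. It is not the 2009 project's implementation. It assumes a hypothetical one-joint arm whose tip is seen at horizontal position sin(theta), Gaussian population codes for both the joint angle and the visual position, and a plain (unnormalized) Hebbian rule. All names (camera, learn_map, reach) and all parameter values are illustrative assumptions.

```python
import math
import random

def linspace(lo, hi, n):
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def population_code(value, preferred, sigma):
    # Gaussian tuning curves: one rate per unit.
    return [math.exp(-0.5 * ((value - p) / sigma) ** 2) for p in preferred]

def camera(theta):
    # Assumed toy kinematics: horizontal position of a unit-length arm tip.
    return math.sin(theta)

MOTOR_PREF = linspace(-1.0, 1.0, 25)    # preferred joint angles (rad)
VISUAL_PREF = linspace(-0.9, 0.9, 25)   # preferred visual positions
SIGMA = 0.1                             # tuning width (assumed)

def learn_map(n_samples=1000, lr=0.01, seed=0):
    """Motor babbling: move randomly, watch the arm, strengthen co-active pairs."""
    rng = random.Random(seed)
    w = [[0.0] * len(VISUAL_PREF) for _ in MOTOR_PREF]
    for _ in range(n_samples):
        theta = rng.uniform(-1.0, 1.0)
        motor = population_code(theta, MOTOR_PREF, SIGMA)
        visual = population_code(camera(theta), VISUAL_PREF, SIGMA)
        for j, m in enumerate(motor):
            row = w[j]
            for i, v in enumerate(visual):
                # Plain Hebbian rule; weights grow without bound, which is
                # harmless here because the readout below is normalized.
                row[i] += lr * m * v
    return w

def reach(w, target_x):
    """Map a visual target to a joint angle via the learned weights."""
    visual = population_code(target_x, VISUAL_PREF, SIGMA)
    motor = [sum(wji * v for wji, v in zip(row, visual)) for row in w]
    # Population-vector decode: activity-weighted mean of preferred angles.
    return sum(m * p for m, p in zip(motor, MOTOR_PREF)) / sum(motor)

W = learn_map()
```

Calling reach(W, camera(0.3)) should return an angle close to 0.3 rad for targets well inside the workspace. Near the edges of the joint range the population-vector readout is biased toward the center, which is one reason a real system would need a more careful decoder or weight normalization.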

Big questions

  • What is the best coordinate frame and how is it represented?
  • How do we link coordinate frames from different sensory modalities and motor space?
  • How can a system like this develop? How are the links between senses learned?
  • What are the temporal limits on learning? Can we do this in minutes or hours? Does the speed of learning depend on the statistics (richness) of the experienced environment or does it depend on neuronal dynamics?
  • How does one turn learning off without an outside observer? What is the measure of “rightness”? What quantity has to be maximized for these learning processes to occur?

Specific Projects

Multimodal map formation and alignment

Group Members:

  • Garrick Orchard, Singapore Institute for Neurotechnology SINAPSE
  • Michele Rucci, Boston University
  • Shih-Chii Liu, Institute of Neuroinformatics, UNI/ETHZ

Vergence with Tobi’s dynamic vision sensor

Group Members:

  • Michele Rucci, Boston University
  • Ryad Benjamin Benosman, University Pierre and Marie Curie

How does eye position/movement modulate maps and map formation?

Realistic cortical model of topographic development and plasticity

Group Members:

Rate vs. Spike in map formation

Group Members:

The subplate cortical map formation model is based on spiking neurons. Can we get similar behavior with rate-based models that are computationally less complex?
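As a point of reference for what a rate-based alternative might look like, the sketch below trains a one-dimensional Kohonen-style self-organizing map on scalar inputs. This is a generic stand-in, not the subplate model and not a method the group has committed to. The unit count, learning-rate schedule, and neighborhood schedule are all assumed values chosen for illustration.

```python
import math
import random

def train_som(n_units=10, n_steps=3000, seed=1):
    """1-D self-organizing map over scalar inputs drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]  # random initial tuning
    for t in range(n_steps):
        frac = t / n_steps
        lr = 0.5 * (0.05 / 0.5) ** frac      # learning rate decays 0.5 -> 0.05
        sigma = 3.0 * (0.5 / 3.0) ** frac    # neighborhood shrinks 3.0 -> 0.5
        x = rng.random()
        # Winner: the unit whose preferred value is closest to the input.
        bmu = min(range(n_units), key=lambda k: abs(x - weights[k]))
        for k in range(n_units):
            # Units near the winner (in map index) learn more.
            h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
            weights[k] += lr * h * (x - weights[k])
    return weights

def order_violations(weights):
    """Number of adjacent pairs that break the map's dominant direction."""
    ups = sum(b > a for a, b in zip(weights, weights[1:]))
    downs = len(weights) - 1 - ups
    return min(ups, downs)

som = train_som()
```

A value of order_violations(som) at or near zero means that neighboring units ended up with neighboring preferred values, which is the topographic property of interest. Each update costs a handful of multiplications per unit, with no spike times or membrane dynamics to integrate, which is the kind of saving the question above is asking about.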

Initial projects that were merged into other projects or not adopted

Integrating reward

Group Members:

How do we tap the reward? Where does it have to go? How does it change learning? What if reward drives exploratory behavior?

Body self image

Group Members:

  • Michele Rucci, Boston University

Distinguish self generated and external motion. How do the two drive development differently?

Hardware acceleration

Group Members:

Implement map formation and map linking algorithms on the available parallel hardware, such as GPUs and/or the SpiNNaker system.


  • 7 degree of freedom robotic arm
  • active binocular vision head (pan/tilt controllable)
  • GPU-based cortically inspired vision system
  • SCAMP-based vision system (analog SIMD)
  • APRON software for topographic array processing
  • Binocular DVS system (Tobi's retina) with controllable vergence angle