Word Recognition with Auditory Spikes

Members: Shih-Chii Liu, Jonathan Tapson

The aim of this project is to spot words in auditory spike trains recorded from the AEREAR2, using the SKIM dendritic model. The sentences used are samples from the work described in Mesgarani and Chang 2012. Speakers produce sentences such as "Tiger Go to Blue Two (Five) now" or "Ringo Go to Blue Two (Five) now". In the experiments, subjects listen to a mixture of two sentences; when the keyword "Tiger" is heard, the subject has to attend to one of the speakers in the mixture and report the number at the end of that speaker's sentence. The figure below shows the individual sentences, the mixture of the two sentences, and the spikes recorded in response to the sentences.

The first step was to build a dataset of labeled spike data. The data carry two sets of labels: the actual words, and the residence times in the Markov chain states. In the spike data shown, the spikes from both cochleas have been combined to reduce the dimensionality of the input. In hindsight, combining them may not have been a good choice.

Labeled spike data

The labels are the lines below the spike raster, coinciding in time with the words in the spike stream.
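The labeling step can be illustrated with a minimal sketch, assuming each spike is stored as a time, a channel index and an ear flag, and each word as a start and end time. The event values, the word boundaries and the function name label_spikes are all made up for illustration; they are not taken from the recorded dataset.

```python
import numpy as np

def label_spikes(spike_times, word_intervals):
    """Return, per spike, the index of the word interval containing it, or -1."""
    labels = np.full(len(spike_times), -1, dtype=int)
    for idx, (start, end) in enumerate(word_intervals):
        inside = (spike_times >= start) & (spike_times < end)
        labels[inside] = idx
    return labels

# Hypothetical events: spike time in seconds, channel 0..63, ear 0 or 1.
times = np.array([0.05, 0.12, 0.30, 0.42, 0.77])
channels = np.array([3, 10, 3, 40, 12])
ears = np.array([0, 1, 1, 0, 0])

# Combining both cochleas: the ear flag is simply dropped, so left and
# right spikes share the same 64-channel axis.
merged_channels = channels

# Hypothetical word boundaries (start, end) in seconds.
word_intervals = [(0.0, 0.2), (0.25, 0.5)]
labels = label_spikes(times, word_intervals)
```

Spikes that fall outside every word interval keep the label -1, which corresponds to the gaps between the label lines under the raster.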

This dataset was then fed through a SKIM network with 64 cochlea spike input channels and 6 output neurons, one for each word. Each output neuron had 300 dendritic branches, and the input channels were connected to these branches with all-to-all synapses. The synaptic kernels were alpha-function impulse responses with explicit axonal delays.
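A rough sketch of one output neuron of such a network is given below. Only the sizes (64 inputs, 300 branches) and the alpha-function kernels with delays come from the description above. The random fixed input weights, the tanh nonlinearity, the ridge-regression solution for the output weights, the 1 ms time step, the parameter ranges and the synthetic spike raster are all assumptions made for illustration, not the settings used in the project.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_BRANCH = 64, 300      # sizes from the text (branches per output neuron)
DT = 1e-3                     # assumed time step of 1 ms

def alpha_kernel(tau, delay, length):
    """Alpha function (t/tau)*exp(1 - t/tau), shifted by an axonal delay."""
    t = np.arange(length) * DT - delay
    return np.where(t > 0, (t / tau) * np.exp(1 - t / tau), 0.0)

def branch_activity(spikes, w_in, taus, delays, klen=200):
    """Map a (T, N_IN) spike raster to (T, N_BRANCH) dendritic activations."""
    drive = spikes @ w_in                       # all-to-all random projection
    out = np.empty_like(drive)
    for j in range(drive.shape[1]):
        k = alpha_kernel(taus[j], delays[j], klen)
        out[:, j] = np.convolve(drive[:, j], k)[: drive.shape[0]]  # causal filter
    return np.tanh(out)                         # assumed compressive nonlinearity

# Synthetic stand-in data (not AEREAR2 recordings).
T = 400
spikes = (rng.random((T, N_IN)) < 0.05).astype(float)
target = np.zeros(T)
target[150:200] = 1.0                           # made-up label for one word

# Fixed random branch parameters (assumed ranges).
w_in = rng.normal(scale=0.1, size=(N_IN, N_BRANCH))
taus = rng.uniform(0.01, 0.05, N_BRANCH)
delays = rng.uniform(0.0, 0.1, N_BRANCH)

A = branch_activity(spikes, w_in, taus, delays)

# Output weights by ridge regression (a regularized pseudo-inverse).
lam = 1e-2
w_out = np.linalg.solve(A.T @ A + lam * np.eye(N_BRANCH), A.T @ target)
pred = A @ w_out
```

For the six-word network, the same procedure would be repeated with an independent set of 300 branches and a separate target trace for each output neuron.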

The wordspotting network was followed by a Markov chain network which was intended to implement and enforce the following state diagram:

The image shows that the output of the wordspotting network was not always visibly correct on its own, but the Markov chain network that followed it was able to enforce the correct outcome.
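The effect of enforcing a state sequence on noisy word scores can be sketched with ordinary Viterbi decoding. This is a stand-in technique, not the Markov chain network used in the project, and the chain assumed here (start in the first state, end in the last, and at each step either stay or advance by one state, with equal transition cost) is a simplification rather than the project's state diagram. The score values are invented.

```python
import numpy as np

def viterbi_left_to_right(scores):
    """Best state path for (T, S) word scores under a stay-or-advance chain."""
    T, S = scores.shape
    log_e = np.log(scores + 1e-9)
    best = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    best[0, 0] = log_e[0, 0]                    # must start in the first state
    for t in range(1, T):
        for s in range(S):
            stay = best[t - 1, s]
            advance = best[t - 1, s - 1] if s > 0 else -np.inf
            if advance > stay:
                best[t, s] = advance + log_e[t, s]
                back[t, s] = s - 1
            else:
                best[t, s] = stay + log_e[t, s]
                back[t, s] = s
    path = [S - 1]                              # must end in the last state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Invented scores for 3 words over 6 time steps; frame 2 favors the wrong word.
scores = np.array([
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.20, 0.10, 0.70],
    [0.10, 0.80, 0.10],
    [0.10, 0.20, 0.70],
    [0.05, 0.05, 0.90],
])
framewise = scores.argmax(axis=1).tolist()
decoded = viterbi_left_to_right(scores)
```

Here the frame-by-frame winner jumps to the third word too early and then back to the second, while the decoded path keeps the words in order. This mirrors, in a simplified way, how a sequence constraint can correct individual wordspotting errors.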