Robot/Human Gesture Recognition from Multimodal Acoustic Signatures

Thomas Murray, Andreas Andreou

Recognizing actions and gestures is a critical task for artificial systems designed to interact with humans in natural environments. To develop algorithms for gesture recognition, we used both active ultrasound and passive acoustic sensors to characterize the movements of a toy WowWee Robosapien. Spectrograms of the acoustic data allow visualization of both the temporal and frequency modulations induced by the robot's different actions. Using these spectrogram images as input, we trained a deep belief network (DBN) and used the result to explore some of the data representations that can be learned from the acoustic sensor data. DBNs are composed of hierarchical layers, each of which learns to represent the data provided by the previous layer. The first layer receives the raw spectrogram input, while the last layer can implement any of a number of classifier algorithms, such as support vector machines or Bayesian classifiers.
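As an illustrative sketch of the front end described above, the following Python snippet computes a log-magnitude spectrogram of the kind that could serve as input to the first DBN layer. The signal here is a synthetic frequency-modulated tone standing in for the acoustic sensor data; the sampling rate and window parameters are assumptions, not values from the paper.

```python
import numpy as np
from scipy.signal import spectrogram

# Hypothetical parameters: a 1-second synthetic signal at 44.1 kHz standing in
# for the passive acoustic sensor recording described in the abstract.
fs = 44100
t = np.arange(fs) / fs

# A chirp-like tone, loosely imitating a frequency-modulated movement signature.
signal = np.sin(2 * np.pi * (1000 + 500 * t) * t)

# Short-time Fourier analysis: a (frequency bins x time frames) power array.
freqs, times, Sxx = spectrogram(signal, fs=fs, nperseg=1024, noverlap=512)

# Log-magnitude image; frames of this array could be flattened and fed to the
# visible layer of a DBN.
log_spec = 10 * np.log10(Sxx + 1e-12)
print(log_spec.shape)
```

Each column of `log_spec` is one analysis frame; temporal modulations of the gesture appear across columns, frequency modulations down the rows.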