Two Phase Classification for Early Hand Gesture Recognition in 3D Top View Data

Aditya Tewari; Bertram Taetz; Frédéric Grandidier; Didier Stricker

In: International Symposium on Visual Computing (ISVC-16): Advances in Visual Computing, December 12-14, Las Vegas, Nevada, USA, Pages 353-363, Vol. 10072, Springer, Cham, 2016.


This work classifies top-view hand gestures observed by a Time-of-Flight (ToF) camera using a Long Short-Term Memory (LSTM) neural network architecture. We demonstrate that a two-phase classification improves performance: the number of classes to be separated in each phase is reduced, and the output probabilities are combined. The modified system architecture achieves an average cross-validation accuracy of 90.75% on a 9-gesture dataset, an improvement over the single all-class LSTM approach. The networks are trained to predict the class label continuously throughout the sequence. We introduce a frame-based gesture prediction that uses the gesture probabilities accumulated per frame of the video sequence. This eliminates the latency incurred when the gesture is predicted only at the end of the sequence, as is usually the case with majority-voting-based methods.
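The two ideas in the abstract, combining the probabilities of two classification phases and predicting from probabilities accumulated frame by frame, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the method of the paper: the split of the 9 gestures into three groups (`GROUPS`), the product rule P(class) = P(group) · P(class | group) used to combine the phases, the running sum used for accumulation, and the function names `combine_two_phase` and `running_predictions`. The per-frame probabilities would come from the trained LSTM networks, which are not modelled here.

```python
from typing import Dict, List, Sequence

# Hypothetical grouping of 9 gesture classes into 3 coarse groups.
# Assumption: the abstract does not state how classes are split between phases.
GROUPS: Dict[int, List[int]] = {0: [0, 1, 2], 1: [3, 4, 5], 2: [6, 7, 8]}


def combine_two_phase(group_probs: Sequence[float],
                      within_probs: Dict[int, Sequence[float]]) -> List[float]:
    """Combine phase outputs as P(class) = P(group) * P(class | group).

    Assumption: a product rule; the paper may combine probabilities differently.
    """
    n_classes = sum(len(members) for members in GROUPS.values())
    combined = [0.0] * n_classes
    for group, members in GROUPS.items():
        for local_idx, cls in enumerate(members):
            combined[cls] = group_probs[group] * within_probs[group][local_idx]
    return combined


def running_predictions(frame_probs: Sequence[Sequence[float]]) -> List[int]:
    """Return the predicted class after each frame.

    Per-frame class probabilities are accumulated with a running sum
    (an assumption), and the class with the largest total so far is
    reported at every frame, so no end-of-sequence vote is needed.
    """
    if not frame_probs:
        return []
    totals = [0.0] * len(frame_probs[0])
    predictions = []
    for probs in frame_probs:
        totals = [t + p for t, p in zip(totals, probs)]
        predictions.append(max(range(len(totals)), key=totals.__getitem__))
    return predictions
```

In use, `combine_two_phase` would be called once per frame on the outputs of the two phases, and the resulting per-frame class probabilities passed to `running_predictions`, which yields a prediction at every frame instead of only after the last one.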

German Research Center for Artificial Intelligence