Sparse-MVRVMs Tree for Fast and Accurate Head Pose Estimation in the Wild

Mohamed Selim, Alain Pagani, Didier Stricker

In: Michael Felsberg, Anders Heyden, Norbert Krüger (Eds.). Proceedings of the International Conference on Computer Analysis of Images and Patterns. International Conference on Computer Analysis of Images and Patterns (CAIP-17), August 22-24, Ystad, Sweden. ISBN 978-3-319-64689-3. Springer, 2017.


Head pose estimation is an important problem in computer vision and facial analysis. We model it as a regression problem in which the three rotation angles (yaw, pitch, roll) are functions of the face appearance, and we learn that appearance with a tree cascade of sparse Multi-Variate Relevance Vector Machines (MVRVMs). The method is computationally inexpensive, which makes it suitable for real-time applications. We evaluated our approach on two challenging datasets, YouTube Faces and the Point and Shoot Challenge (PaSC) dataset. On the PaSC dataset, we estimate head pose (yaw, pitch, roll) with a mean error below 5 degrees and an error tolerance below 4. One prediction takes around 6 milliseconds, which suits real-time applications, including those running at high frame rates.
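The idea of a regression tree cascade can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not describe the tree layout, so the sketch uses a hypothetical two-level tree that routes on coarse yaw, and it substitutes plain multi-output ridge regression for the MVRVMs. The names `TwoLevelCascade`, `fit_ridge`, `yaw_edges`, and the synthetic features are likewise hypothetical. The last helper only converts the reported 6 ms per prediction into a prediction rate.

```python
import numpy as np


def fit_ridge(X, Y, lam=1e-3):
    """Multi-output ridge regression (a stand-in for an MVRVM, not an MVRVM)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)  # weights, shape (d + 1, 3)


def predict_ridge(W, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W


class TwoLevelCascade:
    """Root regressor gives a coarse pose; a child trained on one yaw range refines it."""

    def __init__(self, yaw_edges):
        # Hypothetical yaw bin boundaries in degrees, e.g. [-15, 15].
        self.yaw_edges = np.asarray(yaw_edges, dtype=float)

    def _route(self, yaw):
        return np.digitize(yaw, self.yaw_edges)

    def fit(self, X, Y):
        # Y has columns (yaw, pitch, roll).
        self.root = fit_ridge(X, Y)
        bins = self._route(Y[:, 0])  # route training samples by ground-truth yaw
        self.children = {}
        for b in np.unique(bins):
            mask = bins == b
            self.children[int(b)] = fit_ridge(X[mask], Y[mask])
        return self

    def predict(self, X):
        coarse = predict_ridge(self.root, X)
        bins = self._route(coarse[:, 0])  # route test samples by predicted yaw
        out = coarse.copy()  # keep the coarse estimate if a bin has no child
        for b, W in self.children.items():
            mask = bins == b
            if mask.any():
                out[mask] = predict_ridge(W, X[mask])
        return out


def mean_abs_error(pred, truth):
    """Mean absolute error per angle, in the units of the inputs."""
    return np.abs(pred - truth).mean(axis=0)


def throughput_hz(ms_per_prediction):
    """Predictions per second for a given per-prediction latency."""
    return 1000.0 / ms_per_prediction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 8))               # synthetic appearance features
    Y = 10.0 * (X @ rng.normal(size=(8, 3)))    # synthetic (yaw, pitch, roll)
    model = TwoLevelCascade([-15, 15]).fit(X, Y)
    print(mean_abs_error(model.predict(X), Y))
    print(throughput_hz(6.0))
```

At 6 ms per prediction, `throughput_hz` gives roughly 166 predictions per second, which is consistent with the high-frame-rate claim. Swapping the ridge stand-in for a sparse Bayesian regressor would change only `fit_ridge` and `predict_ridge`.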

SelimCaip2017.pdf (pdf, 2 MB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence