Pornography Detection in Video Benefits (a lot) from a Multi-modal Approach

Adrian Ulges, Christian Schulze, Damian Borth, Armin Stahl

In: Proceedings of the International Conference on Multimedia. ACM International Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis (AMVA-2012), located at ACM Multimedia 2012, October 29-November 2, Nara, Japan, ACM, 10/2012.


We address the challenge of detecting pornographic content in video streams. On offensive material crawled from different pornographic websites and non-offensive clips from YouTube (a total of 500 hours of video), we first study a compressed-domain activity descriptor based on MPEG motion compensation vectors. We show that the approach offers an interesting alternative but generalizes poorly between videos compressed with different codecs, a problem that can be overcome to some extent by adding noise to the image data prior to video compression. Our main contribution is an evaluation that benchmarks the above motion-based descriptor as well as three other widely used features (audio-based MFCC features, skin color detection, and visual words). Here, we show that a multi-modal approach is a key strategy for accurate detection of adult content: a combination of the different features gives considerable improvements in accuracy, reducing the equal error rate by 36–56% compared to the best uni-modal system.
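To make the evaluation idea concrete, the following is a minimal sketch of how per-modality detector scores might be combined and compared by equal error rate. It is not the authors' method: the abstract does not state how the features are fused, so the equally weighted averaging (a simple form of late fusion) is an assumption, as are the function names `equal_error_rate` and `late_fusion`. The scores are made-up toy values for illustration only, not results from the paper.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate the equal error rate of a scoring detector.

    Scans every observed score as a decision threshold and returns the
    mean of false-positive and false-negative rate at the threshold
    where the two rates are closest. Assumes both classes are present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    for threshold in np.unique(scores):
        flagged = scores >= threshold
        fpr = np.mean(flagged[~labels])   # non-offensive clips flagged
        fnr = np.mean(~flagged[labels])   # offensive clips missed
        gap = abs(fpr - fnr)
        if gap < best_gap:
            best_gap, eer = gap, (fpr + fnr) / 2
    return eer


def late_fusion(score_lists, weights=None):
    """Combine per-modality scores by a weighted average (assumed scheme).

    score_lists has one row per modality and one column per clip.
    Without explicit weights, every modality counts equally.
    """
    scores = np.asarray(score_lists, dtype=float)
    if weights is None:
        weights = np.full(scores.shape[0], 1.0 / scores.shape[0])
    return np.asarray(weights, dtype=float) @ scores


# Toy data (hypothetical): 3 offensive clips followed by 3 non-offensive ones.
labels = [1, 1, 1, 0, 0, 0]
audio_scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]   # e.g. an audio-based detector
skin_scores = [0.6, 0.4, 0.9, 0.2, 0.7, 0.1]    # e.g. a skin-color detector

fused_scores = late_fusion([audio_scores, skin_scores])
eer_audio = equal_error_rate(audio_scores, labels)
eer_skin = equal_error_rate(skin_scores, labels)
eer_fused = equal_error_rate(fused_scores, labels)
```

In this toy setup each detector misranks a different clip, so averaging the two score lists separates the classes better than either list alone. That is the intuition behind combining modalities, though real gains depend on how complementary the features are.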


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence