Towards Cross-Media Feature Extraction

Thierry Declerck, Paul Buitelaar, Jan Nemrava, David Sadlier

In: Mark Maybury, Sharon Walter (eds.). Multimedia Information Extraction (AAAI Fall Symposium). AAAI Fall Symposium, November 7-9, Arlington, Virginia, United States, pages 41-46, Technical Report FS-08-05, AAAI Press, Menlo Park, California, 11/2008.


In this paper we describe past and present work on the use of textual resources from which semantic information can be extracted to support the semantic annotation and indexing of associated image or video material. With the emergence of semantic web technologies and resources, entities, relations and events extracted from textual resources by means of Information Extraction (IE) can be marked up with semantic classes derived from ontologies, and those classes can be used for the semantic annotation and indexing of related image and video material. More recently, our work has also aimed to take into account extracted audio-video (A/V) features (such as motion, audio pitch, and close-ups) and to combine them with the results of Ontology-Based Information Extraction for the annotation and indexing of specific event types. Since the extraction of A/V features is then supported by textual evidence, and possibly also the other way around, our work can be seen as a step towards "cross-media feature extraction", which can be guided by shared ontologies (Multimedia, Linguistic and Domain ontologies).


Further Links

AAAI--Kspace-full-paper-final.pdf (pdf, 177 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence