Proppian Content Descriptors in an Integrated Annotation Schema for Fairy Tales

Thierry Declerck, Antonia Scheidel, Piroska Lendvai
Language Technology for Cultural Heritage. Selected Papers from the LaTeCH Workshop Series,
Theory and Applications of Natural Language Processing, Pages 155-169, Springer, Heidelberg, 2011

This chapter describes the current state of development of a markup scheme that combines narrative and linguistic information for the fine-grained annotation of folktales. The scheme builds on and extends an existing markup language called PftML (Proppian fairy tale Markup Language) and combines it with the textual and linguistic annotation standards proposed by the TEI (Text Encoding Initiative) and by ISO TC37/SC4 on language resources management. We therefore call our scheme APftML (Augmented Proppian fairy tale Markup Language). One aim of this scheme is to support combined Natural Language Technology and Digital Humanities research, exemplified in the fairy tale domain. The ultimate goal is to annotate fairy tales semi-automatically, in particular to locate and mark up fairy tale characters and the actions they are involved in, so that these can subsequently be queried in a corpus by both linguists and specialists in the field. The characters and actions are defined in Propp's structural analysis of folk tales, which, unlike existing resources, we aim to implement in full. We argue that the approach provides a means for the linguistic processing of folk tale texts that supports their automated semantic annotation in terms of narrative units and functions.
