Propp Revisited: Integration of Linguistic Markup into Structured Content Descriptors of Tales

Piroska Lendvai; Thierry Declerck; Sándor Darányi; Scott Malec

In: Digital Humanities 2010. Digital Humanities - Annual International Conference for Digital Scholarship in the Humanities (DH-10), July 7-10, London, United Kingdom, Oxford University Press, 7/2010.


Metadata that serve as semantic markup are important additional resources for digital humanities research. In folk narratives, such metadata include conceptual categories that describe the macrostructure of a plot in terms of actors and their mutual relationships, actions, and their ingredients. These categories originate in structural analysis: in fairy tales they are called functions (Propp, 1968), and in myths, mythemes (Levi-Strauss, 1955); a related, overarching type of content metadata is the folklore motif (Uther, 2004; Jason, 2000). In his influential study, Propp analyzed a corpus of tales from Afanasev's collection (Afanasev, 1945), establishing basic recurrent units of the plot (functions), such as Villainy, Liquidation of misfortune, Reward, or Test of Hero, as well as the combinations and sequences of elements employed to arrange them into moves. His aim was to describe the DNA-like structure of the magic tale sub-genre as a novel basis for comparison. As a first step towards a story grammar, the Proppian model is relatively straightforward to formalize for computational semantic annotation, analysis, and generation of fairy tales. Our study describes an effort towards creating a comprehensive XML markup of fairy tales following Propp's functions, using an approach that integrates functional text annotation with grammatical markup so that it can be used across text types, genres and languages.
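
To illustrate how a functional annotation layer might sit on top of grammatical markup, the following is a minimal sketch in Python. It is not the schema used in the study: the element names (tale, function, token), the attribute names (name, pos), the part-of-speech labels, and the toy tokens are all hypothetical and chosen only for illustration. Only the function labels Villainy and Reward come from the text above.

```python
import xml.etree.ElementTree as ET


def annotate_function(function_name, tokens):
    """Wrap part-of-speech-tagged tokens in one Proppian function element.

    Element and attribute names are hypothetical; the abstract does not
    specify the actual markup schema.
    """
    func = ET.Element("function", {"name": function_name})
    for surface, pos in tokens:
        # Grammatical layer: each token carries its own linguistic tag.
        tok = ET.SubElement(func, "token", {"pos": pos})
        tok.text = surface
    return func


# Toy, invented token sequences standing in for tale passages.
tale = ET.Element("tale")
tale.append(annotate_function(
    "Villainy",
    [("dragon", "NOUN"), ("kidnaps", "VERB"), ("princess", "NOUN")],
))
tale.append(annotate_function(
    "Reward",
    [("hero", "NOUN"), ("marries", "VERB"), ("princess", "NOUN")],
))

# Serialize the combined functional and grammatical annotation.
xml_text = ET.tostring(tale, encoding="unicode")
print(xml_text)
```

Nesting the token elements inside the function elements keeps both layers queryable together, for example to list the verbs that occur within every passage labeled Villainy.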


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence