Publication
Automatic Assignment of Semantic Frames in Dialogue
Natalia Skachkova
Master's thesis, University of Saarland, 1/2021.
Abstract
Natural language understanding (NLU) is one of the most challenging NLP tasks. There
exist different approaches to ‘teaching’ a machine to ‘understand’ the meaning of words,
sentences and larger units of text. One of them employs the ideas of frame semantics.
According to frame semantics, a semantic frame represents some event or situation.
The meaning of a word can be understood only in context, and both the word and the
expressions that introduce the context take certain slots of a frame.
Currently, the application of frame semantics to NLU tasks is not a well-explored
research topic, and the task of automatically assigning semantic frames in dialogue
has hardly been studied at all. An additional challenge here is the lack of annotated data.
However, classifying utterances with frames offers several advantages: it captures
the semantics of the whole utterance, gives it structure, and provides the inventory
needed to formalize this structure together with the relations between its elements.
In this thesis we investigate the potential of frame semantics as a meaning representation
framework for team communication in a disaster response scenario. We
focus on automatic frame assignment and retrain PAFIBERT, one of the
state-of-the-art frame classifiers, on English and German TRADR data. We examine
the performance of both models and discuss several adjustments to them, including
sampling additional instances from an unrelated domain, adding extra features to the
input token representations, and applying frame filtering to exclude unlikely
candidate frames.
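To illustrate the idea of frame filtering mentioned above, here is a minimal sketch in Python. It is not the implementation used in the thesis or in PAFIBERT: the function names, the toy lexicon, and the scores are hypothetical, and the sketch assumes that filtering means restricting the classifier's candidates to the frames a lexicon lists for the target, with a fallback to all frames when the target is unknown.

```python
def filter_frames(scores, target_lemma, lexicon):
    """Keep only the frames that the lexicon lists for the target lemma.

    Falls back to the unfiltered scores when the target is not in the
    lexicon or when none of its listed frames has a score.
    """
    allowed = lexicon.get(target_lemma)
    if not allowed:
        return dict(scores)
    filtered = {frame: s for frame, s in scores.items() if frame in allowed}
    return filtered or dict(scores)


def predict_frame(scores, target_lemma, lexicon):
    """Return the highest-scoring frame among the filtered candidates."""
    candidates = filter_frames(scores, target_lemma, lexicon)
    return max(candidates, key=candidates.get)


# Toy data (hypothetical): classifier scores per frame and a small lexicon
# mapping target lemmas to the frames they are assumed to evoke.
lexicon = {"search": {"Scrutiny", "Seeking"}}
scores = {"Scrutiny": 1.2, "Seeking": 0.7, "Motion": 2.5}

with_filter = predict_frame(scores, "search", lexicon)
unknown_target = predict_frame(scores, "fly", lexicon)
```

In this toy case the unfiltered best frame would be Motion, but filtering for the target "search" leaves only Scrutiny and Seeking as candidates. For a target that the lexicon does not cover, the sketch leaves the scores untouched, so such a filter can only help when the lexicon covers the frame-evoking targets well.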
We show that sampling extra out-of-domain training data has a limited positive
effect if the original training set is small, and that filtering can also improve
classifier performance if the training set is large enough and has
good coverage of frame-evoking targets. We discuss an unexpected impact of extra
features on the models’ behaviour and perform a careful study of the mistakes made
by the classifiers.
In addition, we present a detailed analysis of all the corpora used in our experiments
with respect to the distributions of semantic frames, lexical units that evoke them,
parts of speech of these units and so on. The results of this analysis can be useful for
further research in the area of frame semantics. Finally, we summarize our experience
of annotating TRADR data with frames in the form of annotation guidelines.