DFKI-LT - Prepositions in Applications: A Survey and Introduction to the Special Issue

Timothy Baldwin, Valia Kordoni, Aline Villavicencio
Prepositions in Applications: A Survey and Introduction to the Special Issue
in: Robert Dale (ed.):
Computational Linguistics, Volume 35, Number 2, Pages 119-149, MIT Press, 2009
Prepositions, as well as prepositional phrases (PPs) and markers of various sorts, have a mixed history in computational linguistics (CL) and in related fields such as artificial intelligence, information retrieval (IR), and computational psycholinguistics. On the one hand, they have been championed as vital to precise language understanding (e.g., in information extraction). On the other, they have been ignored on the grounds of being syntactically promiscuous and semantically vacuous, and relegated to the ignominious rank of "stop word" (e.g., in text classification and IR). Although NLP in general has benefitted from advances in those areas where prepositions have received attention, there are still many issues to be addressed. For example, in machine translation, generating a preposition (or "case marker" in languages such as Japanese) incorrectly in the target language can lead to critical semantic divergences from the source language string. Similarly, in information retrieval and information extraction, it would seem desirable to be able to predict that "book on NLP" and "book about NLP" mean largely the same thing, but that "paranoid about drugs" and "paranoid on drugs" suggest very different things.

Prepositions are often among the most frequent words in a language. For example, based on the British National Corpus (BNC; Burnard 2000), four of the ten most frequent words in English are prepositions (of, to, in, and for). In terms of both parsing and generation, therefore, accurate models of preposition usage are essential to avoid repeatedly making errors. Despite their frequency, however, prepositions are notoriously difficult to master, even for humans (Chodorow, Tetreault, and Han 2007). For example, Lindstromberg (2001) estimates that fewer than 10% of upper-level English as a Second Language (ESL) students can use and understand prepositions correctly, and Izumi et al. (2003) reported error rates of up to 10% in English preposition usage by Japanese speakers.

The purpose of this special issue is to showcase recent research on prepositions across the spectrum of computational linguistics, focusing on computational syntax and semantics. More importantly, however, we hope to reignite interest in the systematic treatment of prepositions in applications. To this end, this article is intended to present a cross-sectional view of research on prepositions and their use in NLP applications. We begin by outlining the syntax of prepositions and its relevance to NLP applications, focusing on PP attachment and prepositions in multiword expressions (Section 2). Next, we discuss formal and lexical semantic aspects of prepositions, and again their relevance to NLP applications (Section 3), and describe instances of applied research where prepositions have featured prominently (Section 4). Finally, we outline the contributions of the papers included in this special issue (Section 5) and conclude with a discussion of research areas relevant to prepositions which we believe are ripe for further exploration (Section 6).
Files: BibTeX, coli.2009.35.2.119