DFKI-LT - Preprocessing for Unification Parsing of Spoken Language

Mark-Jan Nederhof
Preprocessing for Unification Parsing of Spoken Language
in: Dimitris Christodoulakis (ed.):
Proceedings of the Natural Language Processing - NLP 2000, June 2-4,
Lecture Notes in Artificial Intelligence number 1835, Pages 118-129, Patras, Greece, Springer, 2000

Wordgraphs are structures that may be output by speech recognizers. We discuss various methods for turning wordgraphs into smaller structures. One of these methods is novel; this method relies on a new kind of determinization of acyclic weighted finite automata that is language-preserving but not fully weight-preserving, and results in smaller automata than in the case of traditional determinization of weighted finite automata. We present empirical data comparing the respective methods. The methods are relevant for systems in which wordgraphs form the input to kinds of syntactic analysis that are very time consuming, such as unification parsing.
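To make the idea of a language-preserving but not fully weight-preserving determinization concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details the abstract does not give. It assumes weights are costs (lower is better) and uses a plain subset construction that ignores weights when merging states, keeping only the minimum cost on each merged transition. The names `determinize_min_weight` and `path_cost` and the arc format `(src, word, weight, dst)` are hypothetical.

```python
from collections import defaultdict


def determinize_min_weight(start, finals, arcs):
    """Subset construction on an acyclic weighted automaton.

    ASSUMPTION: this is an illustrative approximation, not the paper's method.
    arcs is a list of (src, word, weight, dst); weights are costs.
    Returns (start_set, final_sets, det_arcs), where det_arcs maps
    (subset, word) -> (weight, next_subset).
    """
    finals = frozenset(finals)
    out = defaultdict(list)
    for src, word, weight, dst in arcs:
        out[src].append((word, weight, dst))

    start_set = frozenset([start])
    det_arcs = {}
    seen = {start_set}
    stack = [start_set]
    while stack:
        subset = stack.pop()
        # Group all outgoing arcs of the subset by their word label.
        by_word = defaultdict(list)
        for state in subset:
            for word, weight, dst in out[state]:
                by_word[word].append((weight, dst))
        for word, targets in by_word.items():
            nxt = frozenset(dst for _, dst in targets)
            # Keep only the cheapest weight; this is where weights are lost.
            best = min(weight for weight, _ in targets)
            det_arcs[(subset, word)] = (best, nxt)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    final_sets = {s for s in seen if s & finals}
    return start_set, final_sets, det_arcs


def path_cost(start_set, final_sets, det_arcs, words):
    """Return the cost of a word sequence in the result, or None if rejected."""
    state, total = start_set, 0.0
    for word in words:
        if (state, word) not in det_arcs:
            return None
        weight, state = det_arcs[(state, word)]
        total += weight
    return total if state in final_sets else None


# A tiny wordgraph: two competing "a" arcs leading to different continuations.
arcs = [(0, "a", 1.0, 1), (0, "a", 2.0, 2), (1, "b", 3.0, 3), (2, "c", 1.0, 3)]
start_set, final_sets, det_arcs = determinize_min_weight(0, {3}, arcs)
print(path_cost(start_set, final_sets, det_arcs, ["a", "c"]))
```

In this toy wordgraph the accepted word sequences are unchanged ("a b" and "a c"), but "a c" costs 3.0 in the input and 2.0 in the result, so the output weight is only a lower bound on the true cost. Traditional weighted determinization would instead carry residual weights inside the subset states to keep path weights exact, which can produce more states.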
Files: BibTeX, Nederhof:2000:PUPa.pdf