Regular Approximation of Context-Free Grammars through Transformation

Mehryar Mohri, Mark-Jan Nederhof

In: Jean-Claude Junqua , Gertjan van Noord (Hrsg.). Robustness in Language and Speech Technology. Seiten 153-163 Kluwer Academic Publishers Dordrecht 2001.


We present an algorithm for approximating context-free languages with regular languages. The algorithm is based on a simple transformation that applies to any context-free grammar and guarantees that the result can be compiled into a finite automaton. The resulting grammar contains at most one new nonterminal for any nonterminal symbol of the input grammar. The result thus remains readable and if necessary modifiable. We extend the approximation algorithm to the case of weighted context-free grammars. We also report experiments with several grammars showing that the size of the minimal deterministic automata accepting the resulting approximations is of practical use for applications such as speech recognition.

Weitere Links

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence