Middleware Architecture for the Integration of Deep and Shallow Natural Language Processing Components

Heart of Gold is a middleware architecture for the integration of deep and shallow natural language processing components. It provides a uniform and flexible infrastructure for building applications that use Robust Minimal Recursion Semantics (RMRS) and/or general XML standoff annotation produced online by natural language processing components.

Heart of Gold was developed primarily for the tight integration of various shallow natural language processors with the deep parser PET.

The aim of the integration is to increase the robustness of deep grammars for various languages such as English, German, Japanese, Greek and Norwegian. Deep grammars can be developed with the Linguistic Knowledge Builder (LKB), compiled to a binary grammar image, and run within Heart of Gold.

Although the focus of Heart of Gold is deep-shallow integration, the framework itself is generic and can therefore also be used to annotate corpora automatically and multi-dimensionally, to combine multiple purely shallow systems on the basis of XML, or to integrate other deep parsers.
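As a rough illustration of what multi-dimensional standoff annotation means, the following Python sketch stores two annotation layers over the same text, each pointing into it by character offsets. The element names (document, annotation, span), the attribute names and the helper functions are hypothetical and chosen only for this example; they are not Heart of Gold's actual XML format or API.

```python
import xml.etree.ElementTree as ET

# Raw text that every annotation layer refers to (example sentence only).
TEXT = "Heart of Gold integrates parsers."


def standoff(layer, spans):
    """Build one annotation layer; each span points into TEXT by offsets.

    Element and attribute names are hypothetical, for illustration only.
    """
    root = ET.Element("annotation", {"layer": layer})
    for start, end, label in spans:
        ET.SubElement(
            root, "span",
            {"from": str(start), "to": str(end), "label": label},
        )
    return root


def merge(layers):
    """Combine independently produced layers under a single document element."""
    doc = ET.Element("document")
    for layer in layers:
        doc.append(layer)
    return doc


def covered_text(span):
    """Return the stretch of TEXT that a span element refers to."""
    return TEXT[int(span.get("from")):int(span.get("to"))]


if __name__ == "__main__":
    # Two components annotate the same text without modifying it.
    tokens = standoff("tokens", [(0, 5, "tok"), (6, 8, "tok"), (9, 13, "tok"),
                                 (14, 24, "tok"), (25, 32, "tok")])
    names = standoff("names", [(0, 13, "system-name")])
    doc = merge([tokens, names])
    print(ET.tostring(doc, encoding="unicode"))
```

Because the layers only reference the text by offsets, output from several components can be kept side by side and queried together without one annotation interfering with another.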

The core middleware architecture (and also the PET system) is available under the LGPL open source license. However, some of the components for which adapters are provided are available only for research purposes or come with licenses other than the LGPL.

Project Manager: Ulrich Schäfer (Ulrich.Schaefer@dfki.de)
Contact: Ulrich Schäfer (Ulrich.Schaefer@dfki.de)