Using XSLT for the Integration of Deep and Shallow Natural Language Processing Components

Ulrich Schäfer
In: Proceedings of the ESSLLI 2004 Workshop on Combining Shallow and Deep Processing for NLP. European Summer School in Logic, Language and Information (ESSLLI), Nancy, Pages 31-40, 8/2004.


Whiteboard is a hybrid XML-based architecture that integrates deep and shallow natural language processing components. The online system consists of a fast HPSG parser that utilizes tokenization, PoS, morphology, lexical, named entity, phrase chunk and (for German) topological sentence field analyses from shallow components. This integration increases robustness, directs the search space and hence reduces the processing time of the deep parser. In this paper, we focus on one of the central integration facilities, the XSLT-based Whiteboard Annotation Transformer (WHAT), report on the benefits of XSLT-based NLP component integration, and present examples of XSL transformation of shallow and deep annotations used in the integrated architecture. Furthermore, we report on a recent application of XSL transformation for the conversion of XML-encoded typed feature structure representations in the context of the DeepThought project, where deep-shallow integration is performed on the basis of Robust Minimal Recursion Semantics (RMRS).