DFKI-LT - Dissertation Series

Vol. II

Hans-Ulrich Krieger: TDL - A Type Description Language for Constraint-Based Grammars. Foundations, Implementation, and Applications

ISBN: 3-933218-01-2
424 pages
price: € 19

order form

This thesis describes the formal foundations, the implementation, and several applications of the system TDL (Type Description Language), an efficient representation language and inference system for arbitrary constraint-based grammar formalisms. TDL is a general purpose formalism that allows the definition of annotated CF-style grammars a la GPSG, LFG, or PATR-II, but it is also, and especially, designed to support highly-lexicalized and integrated theories like HPSG.

TDL has been fully implemented (and documented) and has been successfully installed at many sites outside the DFKI. The system has incorporated a number of additional features not described here and has been tested with several large grammars. TDL has been integrated into the grammar engineering environment PAGE.

The thesis is separated into three parts: foundations of TDL, implementation of TDL, and applications of TDL.

The first part establishes a precise set-theoretical semantics for TDL that culminates in two fixpoint constructions. In the first chapter, we follow the approach pioneered by Gert Smolka and adapt his framework to our needs. The adaption consists of several extensions and is due to the rich syntactic inventory and the open-world nature of types in TDL. We establish a compilation scheme between TDL and Smolka's feature constraint logic that enables us to transform a typed feature structure into a feature term and a type system of TDL (a set of type definitions) into a sort system. The next chapter comes up with a fixpoint semantics for TDL in order to cope with recursive type definitions. In contrast to other approaches, we have chosen the greatest fixpoint model. This approach is carefully explained in this thesis. We also prove that the greatest fixpoint can be reached in at most omega-many steps.

The second part of the thesis presents a detailed analysis of the implementation of TDL that respects the denotational semantics given so far. We highlight especially the novel features of TDL w.r.t. other systems and study the advanced techniques we have employed in implementing the inference modules. Although we have presented two translation schemata from TDL into first-order logic (feature logic and definite equivalences), the implementation of TDL follows a different line, viz., implementing specialized inference engines for restricted domains of reasoning. This approach benefits from the fact that certain constraints can be processed more efficiently than others. Thus it makes sense to distribute the workload onto different components, instead of proposing a unique reasoner. This is in sharp contrast to many other systems which embody the transformation approach in a single inference engine. We start with a ``top-down'' description of the architecture of TDL in order to give a quick overview of the whole system. The next four chapters describe precisely the main inference engines of TDL, viz., (i) type hierarchy management and reasoning, (ii) symbolic simplifier, (iii) memoization of precomputed results, and (iv) parameterized type expansion.

The last part of the thesis investigates the application of TDL in the well-known domain of computational morphology. We present a satisfactory framework that allow us to widen the scope of ``ordinary'' HPSG beyond the borders of syntax and semantics. We set up the necessary inventory for describing inflection and derivation in our framework. We argue that feature-based inheritance networks are the right framework for the specification of modern computational lexicons and that non-feature based formalisms can be neatly encoded in TDL. We then present our word-and-paradigm treatment of inflection. We argue that the central notion of paradigm should be defined directly in typed feature structures, and that it is more satisfactorily linked to syntactic/semantic information in this fashion. Hence, there is no longer a need for lexical rules. In order to describe allomorphy and morphotactics uniformly, we outline a general treatment for the representation of finite automata (FA) directly within the calculus of typed feature structures. We show how FA can be processed in this framework and furthermore present the correspondences to the closure properties of FA. The next chapter is the synthesis of the previous two. The representation of FA as a family of recursive typed feature structures immediately leads us to a tight coupling of morphotactics and morphophonemics in the same feature description language. Finally, the last chapter presents a novel approach to derivation which is fully in the spirit of HPSG.

We propose principles and rules, together with a rich hierarchical lexicon, that enables us to formulate "word grammars", similar to the HPSG proposal for phrasal/sentential grammars.