TG/2 Practical Text Generation: Some technical details
For a general
overview, you may want to consider the
flyer first - in case you didn't do that already :-).
TG/2 is a shallow generation system. The notion of shallow
(not to be mixed up with surface generation!) emphasizes some
similarity to shallow parsing. In both cases, the use of shallow models
of language sacrifices the completeness of coverage and many linguistic
generalizations. On the other hand, many useful applications can be
in an easy way. For both shallow parsing and shallow generation,
to the more comprehensive and theory-based models remain to be
While TG/2 produces surface strings as output, it is much more than
a surface generator. The most important difference is that TG/2 can
be applied to 'deep' linguistic representations or even domain-semantic
input structures.
You can see a multilingual application
demo with TG/2 at work (in Chinese, French, English, German,
Japanese and Portuguese).
TG/2 is based on restricted production system techniques that preserve
modularity of processing and linguistic knowledge, hence making the
knowledge transparent and reusable for various applications. Here is an overview
of TG/2's architecture that is discussed in the sequel.
Generation rules are written in the language TGL, expressing conditions
and actions in a uniform format (for more details on TGL see [Busemann
1996]). A context-free backbone allows the system to select rules on the basis
of their categories (the left-hand side category is part of the condition;
the right-hand side categories are each assigned to an action).
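To make the condition-action format concrete, here is a minimal Python sketch of what such a rule might look like. The rule representation, the input dictionary, and all field names are invented for illustration; this is not actual TGL syntax:

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical sketch of a TGL-style production rule: a context-free
# category on the left-hand side (part of the condition), a test on the
# input structure, and one action per right-hand side category.

@dataclass
class Rule:
    lhs: str                                   # left-hand side category
    condition: Callable[[dict], bool]          # test on the input structure
    actions: List[Callable[[dict], str]] = field(default_factory=list)

# Invented example input and rule.
greeting_rule = Rule(
    lhs="GREETING",
    condition=lambda inp: inp.get("speech_act") == "greet",
    actions=[lambda inp: "Hello,", lambda inp: inp.get("addressee", "") + "!"],
)

inp = {"speech_act": "greet", "addressee": "world"}
if greeting_rule.condition(inp):
    # Actions are fired from left to right, each contributing a surface part.
    print(" ".join(action(inp) for action in greeting_rule.actions))
```

Uniformity pays off here: because every rule pairs one condition with a list of actions, an interpreter can treat all rules alike when matching and firing them.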
Input to TG/2 is first translated into a system-internal language
- in the figure called GIL - in order to abstract away from many
application-driven requirements on
the input structure representations. A GIL structure is fed to the
interpretation engine, which performs the three-step processing cycle known
from AI production systems on the available TGL rules:
- identify all applicable rules,
- select an applicable rule (e.g. according to preferences),
- fire that rule.
The processing strategy for constructing derivations is top-down and
depth-first. The set of actions in a rule is fired from left to right. Each TGL rule
may pick up some part of the current input structure, which forms the
input for some action. If a TGL rule fails, backtracking is used to try
alternative rules.
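The three steps might be sketched as follows in Python. The rule representation and the default selection function are invented for illustration; they are not TG/2's actual data structures:

```python
# One iteration of the identify-select-fire cycle over TGL-like rules.
# Rules are invented dicts: category, condition, and a single action.

def cycle_step(category, inp, rules, select=lambda candidates: candidates[0]):
    # Step 1: identify all applicable rules (category matches, condition holds).
    applicable = [r for r in rules
                  if r["lhs"] == category and r["cond"](inp)]
    if not applicable:
        return None  # in a full interpreter, this would trigger backtracking
    # Step 2: select an applicable rule (e.g. according to preferences).
    rule = select(applicable)
    # Step 3: fire that rule.
    return rule["action"](inp)

rules = [{"lhs": "NP", "cond": lambda i: "name" in i,
          "action": lambda i: i["name"]}]
print(cycle_step("NP", {"name": "TG/2"}, rules))
```

In a full interpreter this step would recurse over the right-hand side categories; the sketch only shows how the cycle separates rule matching from rule choice, which is what makes the selection strategy exchangeable.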
The interpreter yields all formulations the grammar can generate. It
attempts to generate and output a first formulation, producing possible
alternatives only on external demand. The order in which formulations are
generated can be influenced by parameterizing the generic backtracking
mechanism. This method also allows the user to have the system generate
a preferred formulation first. [Busemann 1996]
and [Wein 1996] give the details.
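The demand-driven behaviour can be sketched with a Python generator: the first formulation comes out immediately, further alternatives only when requested, and a preference function parameterizes the order in which alternatives are tried. This is an illustrative sketch over invented data structures, not TG/2's actual mechanism, which the cited papers describe:

```python
# Lazy enumeration of formulations: alternatives are produced only on
# external demand; `preference` parameterizes the backtracking order.
# Grammar representation and scores are invented for illustration.

def formulations(inp, rules, preference=lambda rule: 0):
    for rule in sorted(rules, key=preference):
        if rule["cond"](inp):
            yield rule["action"](inp)

rules = [
    {"score": 2, "cond": lambda i: True,
     "action": lambda i: f"Ozone level: {i['o3']} ppm."},
    {"score": 1, "cond": lambda i: True,
     "action": lambda i: f"The ozone level is {i['o3']} ppm."},
]

gen = formulations({"o3": 0.08}, rules, preference=lambda r: r["score"])
print(next(gen))  # the preferred formulation, produced first
print(next(gen))  # an alternative, only on demand
```

Because the generator is lazy, asking only for the first formulation costs no work on the alternatives; swapping in a different preference function changes which formulation comes out first without touching the rules.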
A current survey of ten years of TG/2
development and usage is found in [Busemann 2005].
Larger TG/2 grammars are nowadays being developed using the
development environment eGram implemented in Java [Busemann 2004].
The rule format is easier to understand, syntactic consistency is
checked, and grammars can easily be tested with TG/2 or XtraGen, the
sister implementation in Java [Stenzhorn 2002].
Many thanks to Matthias Rinck for considerably improving and debugging
the system, to Michael Wein, who implemented a first version of the
interpreter and the backtracking mechanism (and who drew the above picture), and to
Jan Alexandersson for influential work on an early version of the system.
Work on TG/2 was partially funded by the German Minister for Research and
Technology (BMBF) under contract ITW 9402 (project COSMA,
1994-1996) and by the European Commission (Telematics Application
Programme) under contracts C9-2945 (project TEMSIS,
1996-1998) and MLIS 5015 (project MUSI).
Publications on TG/2
(for download of the papers, search the LT publications list)
- Stephan Busemann. Ten Years After: An Update on TG/2 (and
Friends), in: Graham Wilcock, Kristiina Jokinen, Chris Mellish, and
Ehud Reiter (eds.): Proceedings of the Tenth European Natural Language
Generation Workshop (ENLG 2005), Aberdeen, 2005, 32-39.
- Stephan Busemann. Best-First Surface Realization, in: Donia Scott
(ed.): Proceedings of the Eighth International Natural Language Generation
Workshop (INLG '96), Herstmonceux, Sussex, 1996, 101-110. Also at
the Computation and Language Archive.
- Matthias Rinck. Ein Metaregelformalismus für TG/2 [A Metarule
Formalism for TG/2]. Thesis, Institute for Computational Linguistics,
University of the Saarland.
- Holger Stenzhorn. XtraGen. A Natural Language Generation System Using
Java- and XML-Technologies. Master's thesis, Institute for Computational
Linguistics, University of the Saarland, 2002.
- Stephan Busemann. A Shallow Formalism for Defining Personalized
Text, in: Workshop Professionelle Erstellung von Papier- und
Online-Dokumentation: Perspektiven für die automatische Textgenerierung
[Professional Production of Paper and Online Documentation: Perspectives
for Automatic Text Generation] at the 22nd Annual German
Conference on Artificial Intelligence (KI-98), Bremen, September 16-17,
1998.
- Michael Wein. Eine parametrisierbare Generierungskomponente mit
Backtracking [A Parameterizable Generation Component with Backtracking].
Master's thesis, Department for Computer Science, University
of the Saarland, 1996.
Publications on Applications Using TG/2
(for download of the papers, search the LT publications list)
- Stephan Busemann. eGram - a Grammar Development Environment and
Its Usage for Natural Language Generation, in Proc. Fourth International Conference on
Language Resources and Evaluation (LREC), Lisbon, Portugal, 2004.
- Stephan Busemann. Language Generation for Cross-Lingual Document
Summarisation, in: Huanye Sheng (ed.): Proceedings of the International Workshop on
Language Technology and Chinese Information Processing, Shanghai, 2001.
Science Press, Beijing.
- Stephan Busemann and Helmut Horacek. A Flexible Shallow Approach
to Text Generation, in: E. Hovy (ed.): Proceedings of the Ninth
International Natural Language Generation Workshop (INLG '98),
Niagara-on-the-Lake, Canada, August 1998. Also at the Computation
and Language Archive.
- Stephan Busemann and Helmut Horacek. Generating Air-Quality Reports
from Environmental Data, in: Tilman Becker, Stephan Busemann, and Wolfgang
Finkler (eds.), DFKI Workshop on Natural Language Generation, April
1997. DFKI Document D-97-06, Saarbrücken.
- Helmut Horacek and Stephan Busemann. Towards a Methodology for
Developing Application-Oriented Report Generation, in: O. Herzog (ed.): KI-98.
Proceedings of the 22nd Annual German Conference on Artificial Intelligence,
Bremen, 1998.
- Stephan Busemann, Thierry Declerck, Abdel Kader Diagne, Luca Dini,
Judith Klein, and Sven Schmeier. Natural Language Dialogue Service for
Appointment Scheduling Agents, in: Proc. 5th Conference on Applied Natural
Language Processing, Washington, DC, 1997. Also at the Computation
and Language Archive.
last modified: October 24, 2005, Stephan Busemann