The Flag project is a research project that aims to develop grammar and controlled language checking tools using state-of-the-art natural language processing. Flag will combine "shallow" techniques (statistical taggers and partial parsers) with "deeper" rule-based parsing to provide sufficient information for reliable diagnosis and, subsequently, user-mediated correction of a broad set of errors. The set of errors is motivated by corpus investigation.
A key assumption in the development of the research prototype is an open system design, which is vital for integrating diverse technologies. This will be maintained through a modular design in which new and existing NLP tools are integrated into a pipeline using the latest software engineering technologies (CORBA, XML, etc.), while taking account of emerging standards in the NLP community, such as the Text Encoding Initiative and the GATE engineering platform. A number of application prototypes are being investigated, including the use of grammar and controlled language checking tools as an aid in translation routing and in pre- and post-editing for MT.
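The modular pipeline idea can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration rather than Flag's actual design: the `Document` class, the component names, and the choice to run the "deep" step only on text that a shallow check has already flagged are all assumptions, and whitespace splitting merely stands in for a real tagger.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Document:
    """Text plus the annotations that pipeline components attach to it."""
    text: str
    tokens: List[str] = field(default_factory=list)
    flags: List[str] = field(default_factory=list)


# A component is any callable that takes a Document and returns it,
# so modules can be added, removed, or reordered independently.
Component = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    # Shallow step (assumption): whitespace splitting stands in for a tagger.
    doc.tokens = doc.text.split()
    return doc


def flag_repeated_words(doc: Document) -> Document:
    # Shallow check: flag immediately repeated tokens such as "the the".
    for prev, cur in zip(doc.tokens, doc.tokens[1:]):
        if prev.lower() == cur.lower():
            doc.flags.append(f"repeated word: {cur}")
    return doc


def deep_check(doc: Document) -> Document:
    # Placeholder for a rule-based parser (hypothetical): it only records
    # that deeper analysis is wanted when a shallow check raised a flag.
    if doc.flags:
        doc.flags.append("deep analysis requested")
    return doc


def run_pipeline(text: str, components: List[Component]) -> Document:
    # Pass one shared Document through each component in order.
    doc = Document(text)
    for component in components:
        doc = component(doc)
    return doc


result = run_pipeline(
    "This is the the test",
    [tokenize, flag_repeated_words, deep_check],
)
print(result.flags)
```

Because every component shares the same signature, an existing tool could in principle be wrapped and dropped into the list without changing the others, which is the property an open system design is meant to preserve.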
A major additional activity in Flag is the creation of large-scale resources to assist in the development and evaluation of grammar and controlled language checking components. Corpora are annotated semi-automatically using tools developed in Saarbrücken.
- Research goal to establish "appropriate technology" for specific NL processing tasks
- Modular system design using shallow and deep processing
- Data-driven error models underlying the research effort