Intelligent Extraction of Information from On-line Documents


As the Internet spreads rapidly, information overload is becoming a serious problem: the more on-line texts are accessible, the harder it is to use this information constructively, i.e. to find relevant information, extract it and present it in a compact and comprehensible form. The ParaDime project is developing an intelligent system, using the Saarbrücker Information Extraction System SMES, for targeted information extraction from German on-line documents (press releases, economic reports, technical descriptions). Innovative language technology is employed so that even complex facts can be extracted and represented compactly. This approach supports content-based search and indexing, and allows complex information, such as the turnover and profit development of individual companies, to be extracted from on-line reports. To keep pace with constantly changing events, machine learning methods are used to configure SMES automatically and to adapt it to new tasks.

Funded by: Federal Ministry of Education and Research
Project Manager: Hans Uszkoreit (Hans.Uszkoreit@dfki.de)
Contact: Günter Neumann (Guenter.Neumann@dfki.de)
Duration: 1997 - 2000