DFKI-LT - Processing Document Collections to Automatically Extract Linked Data: Semantic Storytelling Technologies for Smart Curation Workflows

Peter Bourgonje, Julian Moreno Schneider, Georg Rehm, Felix Sasaki
Processing Document Collections to Automatically Extract Linked Data: Semantic Storytelling Technologies for Smart Curation Workflows
in: Aldo Gangemi and Claire Gardent (eds.):
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016), Pages 13-16, Edinburgh, United Kingdom, Association for Computational Linguistics (ACL), 209 N. Eighth Street, Stroudsburg, PA 18360, USA, 9/2016
 
We develop a system that operates on a document collection and represents the contained information to enable the intuitive and efficient exploration of the collection. Using various NLP, IE and Semantic Web methods, we generate a semantic layer on top of the collection, from which we take the key concepts. We define templates for structured reorganisation and rearrange the information related to the key concepts to fit the respective template. The use case of the system is to support knowledge workers (journalists, editors, curators, etc.) in their task of processing large amounts of documents by summarising the information contained in these documents and suggesting potential story paths that the knowledge worker can then process further.
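
As a rough illustration of the idea described above, the following minimal sketch picks a key concept from a set of annotated documents and rearranges the related information to fit a template. It is not the authors' implementation: the toy triples stand in for the output of the NLP, IE and Semantic Web components, and the names ANNOTATED_DOCS, BIOGRAPHY_TEMPLATE, key_concepts and fill_template are hypothetical, chosen only for this example.

```python
from collections import Counter

# Hypothetical toy annotations standing in for the semantic layer.
# Each document is a list of (subject, relation, object) triples.
ANNOTATED_DOCS = [
    [("Ada Lovelace", "born_in", "London"),
     ("Ada Lovelace", "worked_with", "Charles Babbage")],
    [("Ada Lovelace", "died_in", "London"),
     ("Charles Babbage", "born_in", "London")],
]

# Hypothetical template: an ordered list of relation slots to fill.
BIOGRAPHY_TEMPLATE = ["born_in", "worked_with", "died_in"]


def key_concepts(docs, top_n=1):
    """Return the top_n most frequent triple subjects across all documents."""
    counts = Counter(subj for doc in docs for subj, _, _ in doc)
    return [concept for concept, _ in counts.most_common(top_n)]


def fill_template(concept, docs, template):
    """Collect, per template slot, the objects related to the given concept."""
    filled = {slot: [] for slot in template}
    for doc in docs:
        for subj, rel, obj in doc:
            if subj == concept and rel in filled and obj not in filled[rel]:
                filled[rel].append(obj)
    return filled


concept = key_concepts(ANNOTATED_DOCS)[0]
story = fill_template(concept, ANNOTATED_DOCS, BIOGRAPHY_TEMPLATE)
```

Here frequency of mention is used as a simple stand-in for whatever criterion selects key concepts; a filled template of this kind could serve as the starting point for a story path that a knowledge worker then refines.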
 