Publication

Towards an ACL Anthology Corpus with Logical Document Structure. An Overview of the ACL 2012 Contributed Task

Ulrich Schäfer, Jonathon Read, Stephan Oepen

In: Proceedings of the ACL-2012 Main Conference Workshop on Rediscovering 50 Years of Discoveries. Annual Meeting of the Association for Computational Linguistics (ACL-2012), July 10, Jeju Island, South Korea, pages 88-97, ISBN 978-1-937284-29-9, Association for Computational Linguistics, 7/2012.

Abstract

The ACL 2012 Contributed Task is a community effort aiming to provide the full ACL Anthology as a high-quality corpus with rich markup, following the TEI P5 guidelines: a new resource dubbed the ACL Anthology Corpus (AAC). The goal of the task is threefold: (a) to provide a shared resource for experimentation on scientific text; (b) to serve as a basis for advanced search over the ACL Anthology, based on textual content and citations; and, by combining the aforementioned goals, (c) to present a showcase of the benefits of natural language processing to a broader audience. The Contributed Task extends the current Anthology Reference Corpus (ARC) in both size and quality, and aims to provide tools that allow the corpus to be automatically extended with new content, be it scanned or born-digital.

Further Links

W12-3210.pdf (PDF, 185 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence