A Graphical Citation Browser for the ACL Anthology

Benjamin Weitz; Ulrich Schäfer
In: Proceedings of LREC-2012. International Conference on Language Resources and Evaluation (LREC-2012), May 23-25, Istanbul, Turkey, Pages 1718-1722, ISBN 978-2-9517408-7-7, ELRA, 5/2012.


Navigation in large scholarly paper collections is tedious and poorly supported in most scientific digital libraries. We describe a novel browser-based graphical tool, implemented using HTML5 Canvas, that displays citation information extracted from the paper text to support navigation. The tool uses a client/server architecture. A citation graph of the digital library is built in the server's memory. On the client side, edges of the displayed citation (sub)graph surrounding a document are labeled with keywords signifying the kind of citation made from one document to another. These keywords were extracted using NLP tools such as tokenization, sentence boundary detection and part-of-speech tagging, applied to the text extracted from the original PDF papers (currently 22,500). By clicking on an edge, the user can inspect the corresponding citation sentence in context, in most cases also highlighted in the original PDF layout. The system is publicly accessible as part of the ACL Anthology Searchbench.
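
To make the server-side idea concrete, the following is a minimal sketch of an in-memory citation graph with keyword-labeled edges and a lookup of the (sub)graph surrounding one document. It is not the authors' implementation: the class name `CitationGraph`, the methods `add_citation` and `neighborhood`, the paper identifiers, and the keyword labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class CitationGraph:
    """Hypothetical in-memory citation graph; edges carry keyword labels."""

    def __init__(self):
        # citing paper id -> {cited paper id -> list of keyword labels}
        self.out_edges = defaultdict(dict)
        # cited paper id -> {citing paper id -> list of keyword labels}
        self.in_edges = defaultdict(dict)

    def add_citation(self, citing, cited, keywords):
        """Record that `citing` cites `cited`, labeled with `keywords`."""
        labels = list(keywords)
        self.out_edges[citing][cited] = labels
        self.in_edges[cited][citing] = labels

    def neighborhood(self, center, depth=1):
        """Return {(citing, cited): keywords} for all edges that touch a
        paper fewer than `depth` citation hops away from `center`."""
        seen = {center}
        queue = deque([(center, 0)])
        edges = {}
        while queue:
            node, dist = queue.popleft()
            if dist == depth:
                continue  # do not expand beyond the requested radius
            # Papers that `node` cites.
            for cited, labels in self.out_edges.get(node, {}).items():
                edges[(node, cited)] = labels
                if cited not in seen:
                    seen.add(cited)
                    queue.append((cited, dist + 1))
            # Papers that cite `node`.
            for citing, labels in self.in_edges.get(node, {}).items():
                edges[(citing, node)] = labels
                if citing not in seen:
                    seen.add(citing)
                    queue.append((citing, dist + 1))
        return edges


# Usage with made-up paper ids and labels.
graph = CitationGraph()
graph.add_citation("A", "B", ["extends"])
graph.add_citation("C", "A", ["compares"])
graph.add_citation("B", "D", ["uses"])
subgraph = graph.neighborhood("A", depth=1)
```

In this sketch a client would request `neighborhood` for the document currently in focus and draw the returned edges with their labels; keeping both outgoing and incoming edge maps lets the lookup follow citations in either direction without scanning the whole graph.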


