DFKI-LT - ParDeepBank: Multiple Parallel Deep Treebanking

Daniel Flickinger, Valia Kordoni, Yi Zhang, António Branco, Kiril Simov, Petya Osenova, Catarina Carvalheiro, Francisco Costa, Sérgio Castro
ParDeepBank: Multiple Parallel Deep Treebanking
Proceedings of the Eleventh International Workshop on Treebanks and Linguistic Theories, Pages 97-108, Lisbon, Portugal, Edições Colibri, Lisbon, 2012
This paper describes the creation of an innovative and highly parallel treebank of three languages from different language groups: English, Portuguese and Bulgarian. The linguistic analyses for the three languages are done by compatible parallel automatic HPSG grammars using the same formalism, tools and implementation strategy. The final analysis for each sentence in each language consists of (1) a detailed feature structure analysis by the corresponding grammar and (2) derivative information such as derivation trees, constituent trees, dependency trees, and Minimal Recursion Semantics structures. The parallel sentences are extracted from the Penn Treebank and translated into the other languages. The Parallel Deep Bank (ParDeepBank) has potentially many applications: for HPSG grammar development; machine translation; evaluation of parsers on comparable data; etc.
Files: BibTeX, ParDeepBank_TLT11.pdf