ParDeepBank: Multiple Parallel Deep Treebanking

Daniel Flickinger, Valia Kordoni, Yi Zhang, António Branco, Kiril Simov, Petya Osenova, Catarina Carvalheiro, Francisco Costa, Sérgio Castro

In: Proceedings of the Eleventh International Workshop on Treebanks and Linguistic Theories (TLT-11), November 30-December 1, Lisbon, Portugal, pages 97-108. Edições Colibri, Lisbon, 2012.


This paper describes the creation of an innovative and highly parallel treebank of three languages from different language groups: English, Portuguese and Bulgarian. The linguistic analyses for the three languages are done by compatible parallel automatic HPSG grammars using the same formalism, tools and implementation strategy. The final analysis for each sentence in each language consists of (1) a detailed feature structure analysis by the corresponding grammar and (2) derivative information such as derivation trees, constituent trees, dependency trees, and Minimal Recursion Semantics structures. The parallel sentences are extracted from the Penn Treebank and translated into the other languages. The Parallel Deep Bank (ParDeepBank) has potentially many applications: for HPSG grammar development; machine translation; evaluation of parsers on comparable data; etc.



Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence