Hybrid Constituent and Dependency Parsing with Tsinghua Chinese Treebank

Rui Wang, Yi Zhang

In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC-10), May 19-21, Valletta, Malta. ELRA, 5/2010.


In this paper, we describe our hybrid parsing model for Mandarin Chinese. The model combines mainstream constituent and dependency parsing, and the dataset we use is the Tsinghua Chinese Treebank, whose annotation contains both constituents and head information. We show how this annotation scheme can be adapted to the standard constituent structure, to the dependency structure, and to the integration of both. We achieve an F1-score of 85.23% for constituent parsing, 82.35% for the partial head information, and 74.27% for the complete head information.

LREC2010b.pdf (pdf, 329 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence