Discriminative Parse Reranking for Chinese with Homogeneous and Heterogeneous Annotations

Weiwei Sun, Rui Wang, Yi Zhang

In: Proceedings of the First CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP-2010), August 28-29, Beijing, China. Chinese Information Processing Society of China, 8/2010.


Discriminative parse reranking has been shown to be an effective technique for improving generative parsing models. In this paper, we present a series of experiments on parsing the Tsinghua Chinese Treebank (TCT) with hierarchically split-merge grammars, with the parser output reranked by a perceptron-based discriminative model. In addition to the homogeneous annotation on TCT, we also incorporate PCTB-based parsing results as heterogeneous annotation into the reranking feature model. The reranking model achieved a 1.12% absolute improvement in F1 over the Berkeley parser on a development set. The head labels in Task 2.1 are annotated with a sequence labeling model. The system achieved 80.32 (B+C+H F1) in CIPS-SIGHAN-2010 Task 2.1 (Open Track) and 76.11 (Overall F1) in Task 2.2 (Open Track).
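
For readers unfamiliar with the technique, the following is a minimal sketch of perceptron-based reranking over n-best parse lists. It is not the system described in the paper: the feature names (`base_logprob`, `pctb_agree`), the toy data, and the use of a plain (unaveraged) perceptron update are all assumptions made for illustration. Each candidate parse is represented as a dictionary of feature values, and the reranker learns weights that push the oracle (best-F1) candidate to the top of its list.

```python
from collections import defaultdict


def score(weights, feats):
    """Linear score of one candidate parse under the current weights."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def rerank(weights, candidates):
    """Return the index of the highest-scoring candidate (first one wins ties)."""
    return max(range(len(candidates)), key=lambda i: score(weights, candidates[i]))


def train_reranker(nbest_lists, epochs=5):
    """Plain perceptron training.

    nbest_lists: list of (candidates, oracle_index) pairs, where candidates
    is a list of feature dicts and oracle_index marks the best candidate.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for candidates, oracle in nbest_lists:
            predicted = rerank(weights, candidates)
            if predicted != oracle:
                # Move weights toward the oracle parse, away from the wrong guess.
                for name, value in candidates[oracle].items():
                    weights[name] += value
                for name, value in candidates[predicted].items():
                    weights[name] -= value
    return weights


# Hypothetical toy data: "base_logprob" stands in for the generative parser's
# score, "pctb_agree" for a feature derived from a heterogeneous (PCTB-based) parse.
toy_data = [
    ([{"base_logprob": -1.0, "pctb_agree": 0.0},
      {"base_logprob": -1.5, "pctb_agree": 1.0}], 1),
    ([{"base_logprob": -2.0, "pctb_agree": 1.0},
      {"base_logprob": -1.0, "pctb_agree": 0.0}], 0),
]

weights = train_reranker(toy_data)
best = rerank(weights, [{"base_logprob": -1.0, "pctb_agree": 0.0},
                        {"base_logprob": -1.2, "pctb_agree": 1.0}])
```

In this sketch, homogeneous and heterogeneous information enter the model the same way, as additional entries in each candidate's feature dictionary, so the learner decides how much weight each source deserves.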


