

Textual Entailment Recognition: A Data-Driven Approach

Rui Wang
Master's thesis, Universität des Saarlandes, 9/2007.


In this thesis, we present our work on Recognizing Textual Entailment (RTE). Broadly, we use three approaches: a main approach and two backup strategies. In the main approach, we propose a novel feature representation extracted from the dependency structure and then apply kernel-based machine learning techniques based on the entailment patterns. One backup strategy is based on local dependency relations; the other is a simple bag-of-words method.

We took part in the RTE-3 Challenge with our system and achieved an accuracy of 66.9% on the test set, which is among the top five of all the results from 26 research groups. Further experiments were performed on the RTE-2 data set (63.6% accuracy, which would have ranked fourth) and on extra data we collected. Note that we used only the output of the dependency parsers, without any external knowledge bases or other resources.

The RTE-centered framework we have established not only explores approaches to the problem itself, but also tests the RTE system on other natural language processing applications, such as binary relation extraction and answer validation. In addition, the graphical user interface can assist annotators and developers.

Some parts of Chapter III, Chapter IV, and the experiments on the RTE-2 data set and the extra data in Chapter V have been published in the Proceedings of the Twenty-Second Conference on Artificial Intelligence (AAAI-07) (Wang and Neumann, 2007a); some parts of Chapter III, Chapter IV, and our participation in the RTE-3 Challenge in Chapter V have been published in the Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing (Wang and Neumann, 2007b); some parts of Chapter III, Chapter IV, and the main parts of Chapter VI will be published in the Working Notes of the AVE task of CLEF2007 (Wang and Neumann, 2007c).