Multilingual Evidence Improves Clustering-based Taxonomy Extraction

Hans Hjelm; Paul Buitelaar

In: Malik Ghallab; C.D. Spyropoulos; N. Fakotakis; N. Avouris (eds.). Proceedings of the 18th European Conference on Artificial Intelligence (ECAI-2008), July 21-25, Patras, Greece, pages 288-292, Frontiers in Artificial Intelligence and Applications, Vol. 178, IOS Press, 2008.


We present a system for taxonomy extraction, aimed at providing a taxonomic backbone in an ontology learning environment. Following previous research, we use hierarchical clustering based on the distributional similarity of terms in text. We show that basing the clustering on a comparable corpus in four languages considerably improves accuracy over using the English texts alone. We also show that hierarchical k-means clustering yields a taxonomy more similar to the original taxonomy than a bottom-up agglomerative clustering approach does.
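The general idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that multilingual evidence is combined by concatenating each term's per-language context vectors, and that hierarchical k-means is realised as recursive 2-means with cosine similarity; both are assumptions made for illustration, as are the function names, the seeding strategy and the toy co-occurrence counts (invented data, not results from the paper).

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def two_means(terms, vecs, iterations=10):
    """Split terms into two groups with k-means (k=2).

    Seeding is deterministic (an illustrative choice): the first term,
    plus the term least similar to it.
    """
    first = terms[0]
    second = min(terms[1:], key=lambda t: cosine(vecs[first], vecs[t]))
    centroids = [vecs[first], vecs[second]]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for t in terms:
            sims = [cosine(vecs[t], c) for c in centroids]
            groups[0 if sims[0] >= sims[1] else 1].append(t)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. identical vectors
        centroids = [mean([vecs[t] for t in g]) for g in groups]
    return groups

def hierarchical_kmeans(terms, vecs):
    """Top-down clustering: recursively split until single terms remain."""
    if len(terms) == 1:
        return terms[0]
    left, right = two_means(terms, vecs)
    if not left or not right:
        return list(terms)  # cannot split further; keep as a flat group
    return [hierarchical_kmeans(left, vecs), hierarchical_kmeans(right, vecs)]

# Toy context-count vectors per language (hypothetical numbers).
english = {"dog": [3, 1, 0, 0], "cat": [2, 2, 0, 1],
           "car": [0, 1, 3, 0], "bus": [0, 0, 2, 2]}
german = {"dog": [4, 0, 0], "cat": [3, 1, 0],
          "car": [0, 1, 4], "bus": [0, 0, 3]}

# Assumed combination scheme: concatenate the per-language vectors.
multilingual = {t: english[t] + german[t] for t in english}

tree = hierarchical_kmeans(["dog", "cat", "car", "bus"], multilingual)
print(tree)
```

On this toy input the sketch groups the two animal terms under one node and the two vehicle terms under another. A bottom-up agglomerative baseline would instead start from single terms and repeatedly merge the most similar pair of clusters.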


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence