
Enriching the crosslingual link structure of Wikipedia - A classification-based approach

Published: June 2008

Book title: Proceedings of the AAAI 2008 Workshop on Wikipedia and Artificial Intelligence

Refereed publication


The crosslingual link structure of Wikipedia represents a valuable resource which can be exploited for crosslingual natural language processing applications. However, this requires that the link structure has reasonable coverage and is accurate. For the specific language pair German/English that we consider in our experiments, we show that roughly 50% of the articles are linked from German to English and only 14% from English to German. These figures clearly corroborate the need for an approach to automatically induce new cross-language links, especially in light of a dynamically growing resource such as Wikipedia. In this paper we present a classification-based approach with the goal of inferring new cross-language links. Our experiments show that this approach has a recall of 70% with a precision of 94% for the task of learning cross-language links on a test dataset.
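
As a reading aid for the recall and precision figures quoted in the abstract, the following minimal sketch shows how the two measures are conventionally computed for a set of predicted cross-language links against a set of known correct links. The function name `precision_recall` and the toy article pairs are hypothetical illustrations; they are not taken from the paper and do not reproduce its classifier or its data.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted links against gold links.

    Both arguments are collections of (german_title, english_title) pairs.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted links that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of correct links that were predicted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data, purely illustrative (assumption, not from the paper).
gold_links = {("Hund", "Dog"), ("Katze", "Cat"), ("Haus", "House"), ("Baum", "Tree")}
predicted_links = {("Hund", "Dog"), ("Katze", "Cat"), ("Haus", "Home")}

p, r = precision_recall(predicted_links, gold_links)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the three predicted links are correct and two of the four gold links are found. A high-precision, lower-recall result such as the one reported in the abstract means that most proposed links are correct while some existing correspondences are missed.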

Download: Media:2008_1758_Sorg_Enriching_the_c_1.pdf






Machine Learning, Knowledge Discovery, Data Mining, Artificial Intelligence