
Multilingual Expert Search using Linked Open Data

Hristina Taneva

Information on the Thesis

Type of Final Thesis: Diplom
Supervisor: Daniel M. Herzig, Rudi Studer
Research Group: Web Science

Archive Number: 3.239
Status of Thesis: Completed
Date of start: 2010-04-01
Date of submission: 2010-09-30

Further Information

Most information retrieval models treat a document as a bag of words, that is, they represent it by the collection of words that occur in it. As a result, they are bound to the language of the documents. The purpose of this diploma thesis is to develop an approach that uses Linked Open Data resources, i.e. Uniform Resource Identifiers (URIs), as interlingual document representations. Documents and queries are summarized by the resources they contain. Wikipedia is used as the corpus of resources, but the approach is not limited to it. First, a mixture language model is introduced that uses only the resources and their labels in different languages. In the second part of the study, the language model is extended to exploit the typed links between resources. The new model uses the rdf:type property, which indicates that the resource being described is an instance of a class, here called a category. Documents are thus summarized by the categories of their resources. The applicability of both approaches to multilingual retrieval is shown in a case study on expert search. The experiments show that both mixture models outperform the standard BM25 + Z-Score (43) baseline and that the approach is suitable for the cross-lingual expert finding task.
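The idea of summarizing documents and queries by the resources they contain can be illustrated with a small Python sketch. It is not the model from the thesis: the label lexicon, the `dbpedia:`-style identifiers, the substring matching, and the two-component mixture with weight `lam` (a document model interpolated with a collection model) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical label-to-URI lexicon (an assumption for this sketch); a real
# system would derive it from resource labels in each language.
LABELS = {
    "information retrieval": "dbpedia:Information_retrieval",
    "semantic web": "dbpedia:Semantic_Web",
    "semantisches web": "dbpedia:Semantic_Web",
    "machine learning": "dbpedia:Machine_learning",
    "maschinelles lernen": "dbpedia:Machine_learning",
}


def to_resources(text):
    """Map a text to a bag of URIs by naive substring matching of labels."""
    lowered = text.lower()
    bag = Counter()
    for label, uri in LABELS.items():
        bag[uri] += lowered.count(label)
    return +bag  # unary plus drops entries with a zero count


def score(query_bag, doc_bag, collection_bag, lam=0.5):
    """Query log-likelihood under lam * P(r|doc) + (1 - lam) * P(r|collection)."""
    doc_len = sum(doc_bag.values())
    col_len = sum(collection_bag.values())
    total = 0.0
    for uri, query_count in query_bag.items():
        p_doc = doc_bag[uri] / doc_len if doc_len else 0.0
        p_col = collection_bag[uri] / col_len if col_len else 0.0
        p = lam * p_doc + (1 - lam) * p_col
        if p == 0.0:
            return float("-inf")  # resource unseen in the whole collection
        total += query_count * math.log(p)
    return total


# Toy documents in two languages (invented for this example).
docs = {
    "en_doc": "Research on information retrieval and the semantic web.",
    "de_doc": "Arbeiten zu maschinelles Lernen.",
}
bags = {name: to_resources(text) for name, text in docs.items()}
collection = sum(bags.values(), Counter())

# A German query is matched against an English document via shared URIs.
query = to_resources("Semantisches Web")
ranking = sorted(bags, key=lambda n: score(query, bags[n], collection), reverse=True)
```

Because the query and both documents are reduced to language-independent identifiers before scoring, the German query ranks the English document first in this toy setup. Replacing each URI with the classes it is linked to would give a category-based variant in the same spirit, again only as an illustration.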