Published: 2011 Oktober
Buchtitel: Proceedings of the Second International Workshop on Consuming Linked Data (COLD2011),
Verlag: CEUR Workshop Proceedings (CEUR-WS.org)
Recently, SPARQL became the standard language for querying RDF data on the Web. Like other formal query languages, it applies a Boolean-match semantics, i.e. results adhere strictly to the query. Thus, queries formulated for one dataset can not easily be reused for querying other datasets. If another target dataset is to be queried, the queries need to be rewritten using the vocabulary of the target dataset, while preserving the captured information need. This is a tedious manual task, which requires the knowledge of the target vocabulary and often relies on computational expensive techniques, such as mapping, data consolidation or reasoning methods. Given the scale as well as the dynamics of Web datasets, even automatic rewriting is often infeasible. In this paper, we elaborate on a novel approach, which allows to reuse existing SPARQL queries adhering to one dataset to search for entities in other dataset, which are neither linked nor otherwise integrated beforehand. We use the results returned by the given seed query, to construct an entity relevance model (ERM), which captures the content and the structure of relevant results. Candidate entities from the target dataset are obtained using existing keyword search techniques and subsequently ranked according to their similarity to the ERM. During this ranking task, we compute mappings between the structure of the ERM and of the candidates on- the-fly. The effectiveness of this approach is shown in experiments using large-scale datasets and compared to a keyword search baseline.
Download: Media:HerzigAndTran COLD2011.pdf
Weitere Informationen unter: Link