Published: December 2011
Institution: Institut AIFB, KIT
In recent years, top-k query processing has attracted much attention in large-scale scenarios, where computing only the k “best” results is often sufficient. Top-k query processing has been studied in different contexts. One line of research targets the so-called top-k join problem, where the k best final results are obtained by joining partial results. In this paper, we study the top-k join in a Linked Data setting, where the partial results to be joined are located in different sources and can only be accessed via URI source lookups. We show how existing work on top-k join processing can be adapted to the Linked Data setting. We elaborate on strategies for better estimating the scores of unprocessed join results (to obtain tighter bounds for early termination) and for pruning results more aggressively. Based on experiments on real-world Linked Data, we show that the proposed top-k join processing technique substantially improves runtime performance.
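To illustrate the general idea of early termination with score bounds, the following is a minimal sketch of a generic threshold-based top-k join over two inputs sorted by descending score. It is not the technique proposed in the paper: the function name `top_k_join`, the in-memory lists standing in for results retrieved via URI lookups, and the use of summation as the score aggregation are all assumptions made for illustration.

```python
import heapq
from itertools import count


def top_k_join(left, right, k):
    """Return the k highest-scoring join results as (score, join_key) pairs.

    `left` and `right` are lists of (join_key, score) tuples, each sorted by
    score in descending order. A join result's score is the sum of the two
    partial scores (an assumption of this sketch).
    """
    if k <= 0 or not left or not right:
        return []

    seen_left, seen_right = {}, {}  # join_key -> scores pulled so far
    results = []                    # min-heap holding the current best k results
    tiebreak = count()              # keeps heap entries comparable on equal scores
    i = j = 0
    top_l, top_r = left[0][1], right[0][1]  # best score of each input
    last_l, last_r = top_l, top_r           # score of the last item pulled

    while i < len(left) or j < len(right):
        # Alternate between the inputs; use the other one if one is exhausted.
        pull_left = j >= len(right) or (i < len(left) and i <= j)
        if pull_left:
            key, score = left[i]
            i += 1
            last_l = score
            seen_left.setdefault(key, []).append(score)
            partners = seen_right.get(key, [])
        else:
            key, score = right[j]
            j += 1
            last_r = score
            seen_right.setdefault(key, []).append(score)
            partners = seen_left.get(key, [])

        # Join the new item with every matching item seen on the other side.
        for partner_score in partners:
            item = (score + partner_score, next(tiebreak), key)
            if len(results) < k:
                heapq.heappush(results, item)
            elif item[0] > results[0][0]:
                heapq.heapreplace(results, item)

        # Upper bound on the score of any join result not yet produced:
        # it must involve at least one item that has not been pulled.
        threshold = max(top_l + last_r, last_l + top_r)
        if len(results) == k and results[0][0] >= threshold:
            break  # no unseen result can enter the top k

    return sorted(((s, key) for s, _, key in results), reverse=True)


left = [("a", 9), ("b", 8), ("c", 3)]
right = [("b", 10), ("a", 7), ("c", 2)]
print(top_k_join(left, right, 2))
```

In this example the loop stops before the last item of `right` is pulled, because the k-th best result already matches the bound on all unseen results. Tighter bounds than the simple one used here allow termination after fewer pulls, which matters most when each pull corresponds to a costly source lookup.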