Published: December 2011
Institution: Institut AIFB, KIT
In recent years, top-k query processing has attracted much attention in large-scale scenarios, where computing only the k “best” results is often sufficient. One line of research targets the so-called top-k join problem, where the k best final results are obtained through joining partial results. In this paper, we study the top-k join problem in a Linked Data setting, where partial results are located at different sources and can only be accessed via URI lookups. We show how existing work on top-k join processing can be adapted to the Linked Data setting. Further, we elaborate on strategies for a better estimation of scores of unprocessed join results (to obtain tighter bounds for early termination) and for an aggressive pruning of partial results. Based on experiments on real-world Linked Data, we show that the proposed top-k join processing technique substantially improves runtime performance.
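To illustrate the idea of early termination via score bounds, the following is a minimal sketch of a generic threshold-based top-k join over two inputs. It is not the technique proposed in the paper. It assumes that both inputs are in-memory lists of (join key, score) pairs sorted by descending score, that the score of a join result is the sum of its two partial scores, and that reading one item stands in for retrieving a partial result from a source. The name `top_k_join` and the sample data are hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_join(left, right, k):
    """Return (top-k join results, number of items read).

    Assumptions (illustrative only): each input is a list of
    (join_key, score) pairs sorted by score descending, and a join
    result's score is the sum of the two partial scores.
    """
    if k <= 0 or not left or not right:
        return [], 0

    inputs = (left, right)
    seen = (defaultdict(list), defaultdict(list))  # scores read so far, per key
    pos = [0, 0]                                   # read position per input
    first = (left[0][1], right[0][1])              # best score of each input
    last = [left[0][1], right[0][1]]               # lowest score read so far
    results = []                                   # min-heap holding at most k results
    reads = 0
    turn = 0

    while pos[0] < len(left) or pos[1] < len(right):
        if pos[turn] >= len(inputs[turn]):
            turn = 1 - turn                        # skip an exhausted input
        key, score = inputs[turn][pos[turn]]
        pos[turn] += 1
        reads += 1
        last[turn] = score
        seen[turn][key].append(score)

        # Join the new item with matching items already read from the other input.
        for other_score in seen[1 - turn][key]:
            entry = (score + other_score, key)
            if len(results) < k:
                heapq.heappush(results, entry)
            elif entry > results[0]:
                heapq.heapreplace(results, entry)

        # Upper bound on the score of any join result not yet produced:
        # such a result contains at least one unread item from some input.
        threshold = max(first[0] + last[1], last[0] + first[1])
        if len(results) == k and results[0][0] >= threshold:
            break                                  # no unseen result can do better
        turn = 1 - turn                            # alternate between inputs

    return sorted(results, reverse=True), reads


left = [("a", 90), ("b", 80), ("c", 30), ("d", 10)]
right = [("b", 95), ("a", 70), ("d", 20), ("c", 10)]
print(top_k_join(left, right, 1))
```

In this sample, the best result is found after reading four of the eight items, because the bound on unseen results drops to the score of the current best result. A tighter bound lets the loop stop earlier, which is the motivation for the better score estimation mentioned in the abstract.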