Veröffentlicht: 2016 Februar
Journal: TLDKS (Large-scale Data and Knowledge-Centered Systems)
We consider the problem of source selection and query decomposition in federations of SPARQL endpoints, where query decompositions of a SPARQL query should reduce execution time and maximize answer completeness. This problem is in general intractable, and performance and answer completeness of SPARQL queries can be consider- ably affected when the number of SPARQL endpoints in a federation in- creases. We devise a formalization of this problem as the Vertex Coloring Problem and propose an approximate algorithm named Fed-DSATUR. We rely on existing results from graph theory to characterize the family of SPARQL queries for which Fed-DSATUR can produce optimal decompositions in polynomial time on the size of the query, i.e., on the number of SPARQL triple patterns in the query. Fed-DSATUR scales up much better to SPARQL queries with a large number of triple patterns, and may exhibit significant improvements in performance while answer completeness remains close to 100%. More importantly, we put our results in perspective, and provide evidence of SPARQL queries that are hard to decompose and constitute new challenges for data management.
Weitere Informationen unter: Link
DOI Link: 10.1007/978-3-662-49534-6_4