Discovering Related Data Sources in Data-Portals
Published: November 2013
Book title: First International Workshop on Semantic Statistics, co-located with the 12th International Semantic Web Conference
To support effective querying on the Web of data, systems frequently rely on data from multiple sources to answer queries. For instance, a user may wish to combine data from sources contained in different statistical catalogs. Given such federated queries, systems must involve the user in data source selection to enable interactive exploration of results. That is, a user should be able to choose the data sources that contribute to query results, and thereby refine or expand current findings. This requires effective recommendations for which data sources to pick, i.e., data source contextualization. Recent work, however, addresses source contextualization only for “Web tables”, and relies heavily on schema information and simple table structures. To address these shortcomings, we draw on work from the field of data mining and show how to enable effective contextualization of Web data sources. Based on a real-world finance use case, we built a contextualization engine and integrated it into our data portal, a Web search system for accessing statistical data sets.