Inproceedings3263: Difference between revisions
Current version as of 26 June 2014, 14:37
Representing Interoperable Provenance Descriptions for ETL Workflows
Published: May 2012
Book title: Proceedings of the 3rd International Workshop on Role of Semantic Web in Provenance Management (SWPM 2012), Extended Semantic Web Conference (ESWC)
Publisher: CEUR-WS.org
Refereed publication
Note: (Selected Workshop Paper for ESWC Post-Proceedings)
BibTeX
Abstract
The increasing availability of data on the Web provided by the emergence of Web 2.0 applications
and, more recently, by Linked Data has brought additional complexity to data management tasks, where the
number of available data sources and their associated heterogeneity drastically increase. In this
scenario, where data is reused and repurposed on a new scale, the pattern expressed as
Extract-Transform-Load (ETL) emerges as a fundamental and recurrent process for both producers
and consumers of data on the Web. In addition to ETL, provenance, the representation of
source artifacts, processes and agents behind data, becomes another cornerstone element for Web data
management, playing a fundamental role in data quality assessment and data semantics, and facilitating
the reproducibility of data transformation processes. This paper proposes the convergence of these
two Web data management concerns, introducing a principled provenance model for ETL processes in the
form of a vocabulary based on the Open Provenance Model (OPM) standard and focusing on the provision of
an interoperable provenance model for Web-based ETL environments. The proposed ETL provenance model is instantiated in a real-world sustainability reporting scenario.
Download: Media:Freitas et al swpm12 provenance ETL workflow.pdf
Further information: Link
There is a refined paper version (to be published soon): Media:Preprint_provenance_ETL_workflow_eswc_highlights.pdf
Presentation given at the conference, available on SlideShare: [1]