Published: June 2014
Book title: INLG2014 - 8th International Natural Language Generation Conference
Publisher: The Association for Computer Linguistics
With the rise of the Semantic Web, more and more data become available encoded in the Semantic Web standard RDF. This representation is geared towards machines: designed to be easily processed by machines, it is difficult for non-experts to understand. Transforming RDF data into human-comprehensible text would help non-experts assess this information. In this paper we present a language-independent method for extracting RDF verbalization templates from a parallel corpus of text and data. Our method is based on distant-supervised simultaneous multi-relation learning and frequent maximal subgraph pattern mining. We demonstrate the feasibility of this method on a parallel corpus of Wikipedia articles and DBpedia data for English and German.
Download: Media:INLG2014 RDF.pdf