Published: October 2013
Book title: International Semantic Web Conference
Text-rich structured data are becoming increasingly ubiquitous on the Web and in enterprise databases. Such data encode heterogeneous structural relationships between entities such as people, locations, or organizations, together with the textual information associated with them. To analyze this type of data, existing topic modeling approaches, which are highly tailored toward document collections, require manually defined regularization terms to exploit structure information and to bias topic learning toward it. We propose the Topical Relational Model, a principled approach for automatically learning topics from both textual and structure information. We show that, as a topic model, our approach is effective in exploiting heterogeneous structure information, outperforming a state-of-the-art approach that requires manually tuned regularization.