Semantic Kernels for Text Classification based on Topological Measures of Feature Similarity
Published: December 2006
Book title: Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 06), Hong Kong, 18-22 December 2006
Pages: 808-812
Recently, there has been increased interest in exploiting background knowledge in text mining tasks, especially text classification. At the same time, kernel-based learning algorithms, especially Support Vector Machines, have become a dominant paradigm in the text mining community. This is also due to their capability to achieve more accurate learning results by incorporating a priori knowledge: the standard linear kernel over bag-of-words representations is replaced with a so-called 'semantic' kernel. In this paper we propose extensions and alternatives to previously proposed approaches to the design of semantic kernels by incorporating a variety of well-known measures of semantic similarity between terms. The experimental evaluation against the standard linear kernel indicates that our approach improves performance in a variety of domains and is consistently superior in cases where little training data is available.
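To make the idea of replacing the linear bag-of-words kernel with a semantic kernel concrete, the following is a minimal sketch and not the paper's implementation. It assumes the common general form k(x, y) = xᵀ S y, where S is a symmetric, positive semi-definite term-similarity matrix. The vocabulary, the similarity values, and the helper names (`bow`, `linear_kernel`, `semantic_kernel`) are all hypothetical and chosen only for illustration.

```python
from typing import List

# Hypothetical three-term vocabulary; not taken from the paper.
VOCAB: List[str] = ["car", "automobile", "banana"]

# Hypothetical term-similarity matrix S (symmetric, ones on the diagonal).
# In practice the entries would come from a semantic similarity measure
# over background knowledge; the numbers here are made up.
S: List[List[float]] = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def bow(text: str) -> List[float]:
    """Bag-of-words count vector over VOCAB (whitespace tokenisation)."""
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in VOCAB]


def linear_kernel(x: List[float], y: List[float]) -> float:
    """Standard linear kernel: the dot product x^T y."""
    return sum(a * b for a, b in zip(x, y))


def semantic_kernel(x: List[float], y: List[float],
                    sim: List[List[float]]) -> float:
    """Assumed semantic kernel form: x^T S y for a similarity matrix S."""
    return sum(
        x[i] * sim[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


doc_a = bow("car")
doc_b = bow("automobile")

# The two documents share no terms, so the linear kernel sees no similarity,
# while the semantic kernel credits the related terms via S.
print(linear_kernel(doc_a, doc_b))
print(semantic_kernel(doc_a, doc_b, S))
```

For this to be a valid kernel, S has to be positive semi-definite; the toy matrix above satisfies this, but a matrix filled in directly from an arbitrary similarity measure may not. With the identity matrix as S, the semantic kernel reduces to the linear kernel.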