Text Clustering Based on Good Aggregations




Published: 2002

Journal: Künstliche Intelligenz (KI)
Number: 4
Pages: 48-54

Volume: 16

Refereed publication




Abstract
Text clustering typically involves clustering in a high-dimensional space, which is difficult in virtually all practical settings. In addition, given a particular clustering result, it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledge during preprocessing in order to improve clustering results and allow for selection between results. We preprocess our input data by applying ontology-based heuristics for feature selection and feature aggregation, and thereby construct a number of alternative text representations. Based on these representations, we compute multiple clustering results using K-Means. The results may be distinguished and explained by the corresponding selection of concepts in the ontology. Our results compare favourably with a sophisticated baseline preprocessing strategy.
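The pipeline outlined in the abstract (aggregate terms into ontology concepts, build several alternative representations, run K-Means on each) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy ontology, the example documents, the view names, and the helper functions `aggregate`, `kmeans` and `cluster_views` are hypothetical, and the simple term-to-parent mapping stands in for the paper's actual heuristics, which are not reproduced here.

```python
import math
import random
from collections import Counter

# Hypothetical toy ontology (term -> parent concept); not taken from the paper.
TOY_ONTOLOGY = {
    "beef": "food", "pork": "food", "bread": "food",
    "hotel": "accommodation", "hostel": "accommodation",
}

def aggregate(tokens, ontology, concepts):
    """Count terms under their parent concept, keeping only the selected concepts."""
    counts = Counter(ontology[t] for t in tokens if t in ontology)
    return [float(counts[c]) for c in concepts]

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-Means with Euclidean distance; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [min(range(k), key=lambda j: math.dist(v, centroids[j]))
                  for v in vectors]
        # Update step: move each non-empty cluster's centroid to its mean.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_views(docs, ontology, views, k):
    """Run K-Means once per concept selection ('view') and collect the labels."""
    results = {}
    for name, concepts in views.items():
        vectors = [aggregate(doc, ontology, concepts) for doc in docs]
        results[name] = kmeans(vectors, k)
    return results

# Hypothetical example documents, already tokenized.
DOCS = [
    ["beef", "bread", "pork"],
    ["pork", "bread"],
    ["hotel", "hostel"],
    ["hostel", "hotel", "hotel"],
]

# Two alternative representations, each defined by a selection of concepts.
VIEWS = {
    "food_and_accommodation": ["food", "accommodation"],
    "food_only": ["food"],
}

if __name__ == "__main__":
    for view, labels in cluster_views(DOCS, TOY_ONTOLOGY, VIEWS, k=2).items():
        print(view, labels)
```

Each result is keyed by the concept selection that produced it, which is what allows a clustering to be explained in terms of the ontology concepts used for that view.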

Download: Media:2002_19_Hotho_Text_Clustering_1.pdf



Research group

Business Information Systems, Complexity Management


Research area
Text Mining