Article977: Difference between revisions
m (Added from ontology) |
m (Added from ontology) |
||
Line 1: | Line 1: | ||
+ | {{Publikation Author | ||
+ | |Rank=2 | ||
+ | |Author=Andreas Hotho | ||
+ | }} | ||
{{Publikation Author | {{Publikation Author | ||
|Rank=3 | |Rank=3 | ||
Line 6: | Line 10: | ||
|Rank=1 | |Rank=1 | ||
|Author=Philipp Cimiano | |Author=Philipp Cimiano | ||
− | |||
− | |||
− | |||
− | |||
}} | }} | ||
{{Article | {{Article | ||
Line 24: | Line 24: | ||
|Abstract=bstract: We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness. | |Abstract=bstract: We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. 
We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness. | ||
|VG Wort-Seiten= | |VG Wort-Seiten= | ||
− | |Download= | + | |Download=2005_977_Cimiano_Learning_Concep_1.pdf, 2005_977_Cimiano_Learning_Concep_2.ps |
|Link=http://www.jair.org/contents/v24.html | |Link=http://www.jair.org/contents/v24.html | ||
− | + | |Projekt=Dot.Kom, SmartWeb, | |
− | |Projekt= | ||
|Forschungsgruppe= | |Forschungsgruppe= | ||
+ | }} | ||
+ | {{Forschungsgebiet Auswahl | ||
+ | |Forschungsgebiet=Ontology Learning | ||
}} | }} |
Revision as of 17:01, 15 August 2009
Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis
Published: August 2005
Journal: Journal of Artificial Intelligence Research (JAIR)
Pages: 305-339
Volume: 24
Refereed publication
Abstract
We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness.
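As a rough illustration of the FCA step mentioned in the abstract, the following minimal Python sketch derives the formal concepts of a small term/attribute context and orders them by extent inclusion. The toy context `CONTEXT`, its attribute names, and the helper functions `extent`, `formal_concepts`, and `is_subconcept` are hypothetical and chosen only for illustration; they are not taken from the article's implementation, and the parsing, weighting, and smoothing steps are not modelled here.

```python
# Hypothetical toy context: each term (object) is mapped to the set of
# verb-derived attributes it was observed with. Illustrative values only.
CONTEXT = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "bike": {"bookable", "rentable", "driveable", "rideable"},
    "excursion": {"bookable", "joinable"},
}


def extent(context, attrs):
    """Return all objects that carry every attribute in attrs."""
    return frozenset(obj for obj, a in context.items() if attrs <= a)


def formal_concepts(context):
    """Return all formal concepts as (extent, intent) pairs.

    Intents are exactly the intersections of object attribute sets,
    together with the full attribute set, so we close the object
    intents under intersection one object at a time.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        attrs = frozenset(attrs)
        intents |= {attrs & known for known in intents}
    return {(extent(context, i), i) for i in intents}


def is_subconcept(c1, c2):
    """c1 <= c2 in the concept lattice iff extent(c1) is a subset of extent(c2)."""
    return c1[0] <= c2[0]


if __name__ == "__main__":
    # Print concepts from the most specific (smallest extent) upwards.
    for ext, itt in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(ext), "<->", sorted(itt))
```

In this sketch, the concept with extent {car, bike} sits below the concept with extent {apartment, car, bike}, which is the kind of partial order that a concept hierarchy could then be read from.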
Download: Media:2005_977_Cimiano_Learning_Concep_1.pdf,Media:2005_977_Cimiano_Learning_Concep_2.ps
Further information: http://www.jair.org/contents/v24.html