
Article977: Difference between revisions

From Aifbportal
m (Added from ontology)
 
m (Added from ontology)
Line 1: Line 1:
{{Publikation Author
 
|Rank=2
 
|Author=Andreas Hotho
 
}}
 
 
{{Publikation Author
 
{{Publikation Author
 
|Rank=3
 
|Rank=3
Line 10: Line 6:
 
|Rank=1
 
|Rank=1
 
|Author=Philipp Cimiano
 
|Author=Philipp Cimiano
 +
}}
 +
{{Publikation Author
 +
|Rank=2
 +
|Author=Andreas Hotho
 
}}
 
}}
 
{{Article
 
{{Article
Line 24: Line 24:
 
|Abstract=bstract: We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness.
 
|Abstract=bstract: We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness.
 
|VG Wort-Seiten=
 
|VG Wort-Seiten=
|Link PDF=http://www.aifb.uni-karlsruhe.de/WBS/pci/Publications/cimiano05.pdf
+
|Download=2005_977_Cimiano_Learning Concep_1.pdf, 2005_977_Cimiano_Learning Concep_2.ps
 
|Link=http://www.jair.org/contents/v24.html
 
|Link=http://www.jair.org/contents/v24.html
|Downloadlink PDF=http://www.aifb.uni-karlsruhe.de/WBS/pci/Publications/cimiano05.pdf
 
|Downloadlink PS=http://www.aifb.uni-karlsruhe.de/WBS/pci/Publications/cimiano05.ps
 
|Link extern=
 
 
|Forschungsgebiet=Ontology Learning,  
 
|Forschungsgebiet=Ontology Learning,  
|Projekt=Dot.Kom, SmartWeb,  
+
|Projekt=SmartWeb, Dot.Kom,  
 
|Forschungsgruppe=
 
|Forschungsgruppe=
 
}}
 
}}

Revision as of 12:59, 7 August 2009


Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis





Published: August 2005

Journal: Journal of Artificial Intelligence Research (JAIR)

Pages: 305-339

Volume: 24


Refereed publication

BibTeX




Abstract
We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness.
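
To make the lattice-building step in the abstract more concrete, here is a minimal sketch of Formal Concept Analysis on a toy term/attribute context. It is not the authors' implementation: the data in CONTEXT and the function names (common_attributes, objects_having, formal_concepts, hasse_edges) are hypothetical and chosen only for illustration, and the enumeration is deliberately naive (exponential in the number of terms). The sketch stops at the lattice and does not perform the further conversion into a concept hierarchy that the abstract mentions.

```python
from itertools import combinations

# Toy formal context (hypothetical data): each term maps to the set of
# attributes it was observed with, e.g. derived from verb-object dependencies.
CONTEXT = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "bike": {"bookable", "rentable", "driveable", "rideable"},
    "excursion": {"bookable", "joinable"},
}


def common_attributes(objects, context):
    """Attributes shared by all given objects (all attributes if none are given)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attrs, context):
    """Objects that carry every attribute in attrs."""
    return {obj for obj, obj_attrs in context.items() if attrs <= obj_attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    objs = sorted(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def hasse_edges(concepts):
    """Cover relation of the lattice: (child, parent) pairs where the child's
    extent is a proper subset of the parent's and no concept lies in between."""
    edges = []
    for child in concepts:
        for parent in concepts:
            if child[0] < parent[0] and not any(
                child[0] < mid[0] < parent[0] for mid in concepts
            ):
                edges.append((child, parent))
    return edges


if __name__ == "__main__":
    lattice = formal_concepts(CONTEXT)
    for extent, intent in sorted(lattice, key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each printed pair is one formal concept: a set of terms together with exactly the attributes they all share. The edges returned by hasse_edges order these concepts by extent inclusion, which is the partial order a concept hierarchy would then be derived from.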

Download: Media:2005_977_Cimiano_Learning Concep_1.pdf, Media:2005_977_Cimiano_Learning Concep_2.ps
Further information: Link

Project

SmartWeb, Dot.Kom



Research area

Ontology Learning