Gimme The Context: Context-driven automatic semantic annotation with C-PANKOW
Published: May 2005
Editors: Allan Ellis, Tatsuya Hagino
Book title: Proceedings of the 14th World Wide Web Conference
Pages: 332 - 341
Publisher: ACM Press
Place of publication: Chiba, Japan
Without the proliferation of formal semantic annotations, the Semantic Web is certainly doomed to failure. In earlier work we presented a new paradigm to avoid this: the 'Self Annotating Web', in which globally available knowledge is used to annotate resources such as web pages. In particular, we presented a concrete method instantiating this paradigm, called PANKOW (Pattern-based ANnotation through Knowledge On the Web). In PANKOW, a named entity to be annotated is put into several linguistic patterns that convey competing semantic meanings. The patterns that are matched most often on the Web indicate the meaning of the named entity, leading to automatic or semi-automatic annotation. In this paper we present C-PANKOW (Context-driven PANKOW), which alleviates several shortcomings of PANKOW. First, by downloading abstracts and processing them off-line, we avoid generating a large number of linguistic patterns and a correspondingly large number of Google queries. Second, by linguistically analyzing and normalizing the downloaded abstracts, we increase the coverage of our pattern matching mechanism and overcome several limitations of the earlier pattern generation process. Third, we use the annotation context to assess the significance of a pattern match for the given annotation task. Our experiments show that C-PANKOW inherits all the advantages of PANKOW (no training required, etc.), and is in addition far more efficient and effective.
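
The pattern-matching idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three pattern templates, the function names `count_matches` and `annotate`, and the in-memory list of text snippets (standing in for Web hits or downloaded abstracts) are all assumptions made for illustration. The sketch instantiates each template with the entity and each candidate concept, counts literal matches, and picks the concept with the most matches.

```python
import re
from collections import Counter

# Hypothetical pattern templates for illustration only;
# the pattern set actually used by PANKOW is not given in this abstract.
PATTERNS = [
    "{concept}s such as {entity}",
    "{entity} is a {concept}",
    "{entity} and other {concept}s",
]

def count_matches(entity, concepts, snippets):
    """Count, per candidate concept, how often its instantiated patterns occur."""
    counts = Counter()
    for concept in concepts:
        for template in PATTERNS:
            phrase = template.format(concept=concept, entity=entity)
            regex = re.compile(re.escape(phrase), re.IGNORECASE)
            counts[concept] += sum(len(regex.findall(s)) for s in snippets)
    return counts

def annotate(entity, concepts, snippets):
    """Return the concept whose patterns match most often, or None if nothing matches."""
    counts = count_matches(entity, concepts, snippets)
    if not counts:
        return None
    best, score = max(counts.items(), key=lambda kv: kv[1])
    return best if score > 0 else None

# Toy snippets standing in for text retrieved from the Web (made-up data).
snippets = [
    "Hotels such as Ritz are expensive.",
    "The Ritz is a hotel in Paris.",
    "Ritz is a cracker brand.",
]
label = annotate("Ritz", ["hotel", "cracker", "city"], snippets)
```

In this toy setting the snippets are matched locally, which loosely mirrors the off-line processing the abstract describes for C-PANKOW; the linguistic normalization and context-based weighting mentioned there are not modeled.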