
Knowledge Based Short Text Categorization Using Entity and Category Embeddings







Published: June 2019



Abstract
Short text categorization is an important task due to the rapid growth of short texts available online in various domains, such as web search snippets. Most traditional methods suffer from the sparsity and shortness of the text. Moreover, supervised learning methods require a significant amount of training data, and manually labeling such data can be very time-consuming and costly. In this study, we propose a novel probabilistic model for Knowledge-Based Short Text Categorization (KBSTC), which does not require any labeled training data to classify a short text. This is achieved by leveraging entities and categories from large knowledge bases, which are further embedded into a common vector space, for which we propose a new entity and category embedding model. Given a short text, its category (e.g., Business, Sports, etc.) can then be derived based on the entities mentioned in the text by exploiting semantic similarity between entities and categories. To validate the effectiveness of the proposed method, we conducted experiments on two real-world datasets, i.e., AG News and Google Snippets. The experimental results show that our approach significantly outperforms the classification approaches which do not require any labeled data, while it comes close to the results of the supervised approaches.

Keywords: Short Text Classification, Dataless Text Classification, Network Embeddings
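The core idea of deriving a category from the entities mentioned in a text can be illustrated with a minimal sketch. It is not the probabilistic KBSTC model from the paper: it simply assumes that entity and category vectors already live in a common space and picks the category with the highest mean cosine similarity to the entities found. The vectors, the entity names, and the helper functions `cosine` and `categorize` are all hypothetical and chosen only for illustration.

```python
import math

# Toy 3-dimensional embeddings; all values are invented for illustration
# and do not come from the paper or from any real knowledge base.
ENTITY_VECTORS = {
    "Apple Inc.": [0.9, 0.1, 0.0],
    "Stock market": [0.8, 0.2, 0.1],
    "FC Barcelona": [0.1, 0.9, 0.1],
}
CATEGORY_VECTORS = {
    "Business": [1.0, 0.0, 0.0],
    "Sports": [0.0, 1.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def categorize(entities):
    """Return the category with the highest mean cosine similarity to the
    known entities, or None if no entity has an embedding.

    Assumption: a plain average of similarities stands in for the paper's
    probabilistic model, which is not reproduced here.
    """
    vectors = [ENTITY_VECTORS[e] for e in entities if e in ENTITY_VECTORS]
    if not vectors:
        return None
    scores = {
        category: sum(cosine(v, c_vec) for v in vectors) / len(vectors)
        for category, c_vec in CATEGORY_VECTORS.items()
    }
    return max(scores, key=scores.get)


# Entities would normally come from an entity linking step applied to the text.
print(categorize(["Apple Inc.", "Stock market"]))
```

No labeled training texts are used anywhere in the sketch; the only inputs are the embeddings and the entities detected in the text, which mirrors the dataless setting described in the abstract.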




Research group

Information Service Engineering