Dataless Short Text Classification for German Language
Supervisors: Rima Türker, Harald Sack
Research group: Information Service Engineering
Partner: FIZ Karlsruhe
Start: 10 October 2019
Short text categorization is an important task due to the rapid growth of short texts available online in various domains, such as web search snippets and short messages. Recently, several supervised learning approaches have been proposed for short text classification. However, most of them require a significant amount of training data, and manually labeling such data can be very time-consuming and costly. Existing approaches also suffer from issues such as data sparsity and insufficient text length. Moreover, due to the lack of contextual information, short texts can be highly ambiguous. Thus, short text classification is much more challenging than the classification of traditional long documents. The task becomes even more challenging if the short text to be classified is not in English, because most of the resources available on the Web, such as text classification benchmarks, are in English.
Announcement: Download (pdf)