

Current revision as of 10 October 2019, 16:01



Dataless Short Text Classification for German Language




Thesis information

Thesis type: Master
Supervisors: Rima Türker, Harald Sack
Research group: Information Service Engineering
Partner: FIZ Karlsruhe
Thesis status: Open
Start: 10 October 2019
Submission: unknown

Further information

Short text categorization is an important task due to the rapid growth of short texts available online in various domains, such as web search snippets and short messages. Recently, several supervised learning approaches have been proposed for short text classification. However, most of them require a significant amount of training data, and manually labeling such data can be very time-consuming and costly. Existing approaches also all suffer from issues such as data sparsity and insufficient text length. Moreover, due to the lack of contextual information, short texts can be highly ambiguous. Thus, short text classification is much more challenging than the classification of traditional long documents. Further, if the short text to be classified is not in English, the task becomes even more challenging, because most of the resources available on the Web, such as text classification benchmarks, are in English.


Thesis announcement: Dataless Text Classification.pdf (PDF download)