Inproceedings3533: Difference between revisions

(Page created: "{{Publikation Erster Autor |ErsterAutorNachname=Färber |ErsterAutorVorname=Michael }} {{Publikation Author |Rank=2 |Author=Achim Rettinger }} {{Publikation Autho…")
 

Revision as of 08:46, 22 November 2016


On Emerging Entity Detection



Published: November 2016

Book title: Proceedings of the 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW'16)
Publisher: Springer

Refereed publication

Abstract
While large Knowledge Graphs (KGs) already cover a broad range of domains to an extent sufficient for general use, they typically lack emerging entities that are just starting to attract public interest. This disqualifies such KGs for tasks like entity-based media monitoring, since a large portion of news inherently covers entities that have not been noted by the public before. Such entities are unlinkable, which ultimately means that they cannot be monitored in media streams. This is the first paper that thoroughly investigates all types of challenges that arise from out-of-KG entities for entity linking tasks. Through large-scale analysis of news streams, we quantify the importance of each challenge for real-world applications. We then propose a machine learning approach that tackles the most frequent but least investigated challenge, i.e., the case in which entities are missing from the KG and cannot be considered by entity linking systems. We construct a publicly available benchmark data set based on English news articles and editing behavior on Wikipedia. Our experiments show that predicting whether an entity will be added to Wikipedia is challenging. However, we can reliably identify emerging entities that could be added to the KG according to Wikipedia’s own notability criteria.

Download: Media:NovelEntityDetection EKAW2016.pdf

Project

XLiMe



Research group

Web Science und Wissensmanagement


Research area

Information Retrieval, Information Extraction, Semantic Web