 

Current version as of 18 March 2020, 22:23


KORE 50^DYWC: An Evaluation Data Set for Entity Linking Based on DBpedia, YAGO, Wikidata and Crunchbase




Published: 2020

Book title: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC'20)
Publisher: European Language Resources Association (ELRA)

Peer-reviewed publication



Abstract
A major domain of research in natural language processing is named entity recognition and disambiguation (NERD). One of the main ways of approaching this task is through the use of Semantic Web technologies and their structured data formats. Because the data is structured, information can be extracted more easily, which in turn allows for the creation of knowledge graphs. In order to properly evaluate a NERD system, gold standard data sets are required. A plethora of evaluation data sets exists, mostly relying on either Wikipedia or DBpedia. We have therefore extended a widely used gold standard data set, KORE 50, to accommodate NERD tasks not only for DBpedia but also for YAGO, Wikidata and Crunchbase. As such, our data set, KORE 50^DYWC, allows for a broader spectrum of evaluation. Among other things, the knowledge graph agnosticity of NERD systems can be evaluated, which, to the best of our knowledge, was not possible until now for this number of knowledge graphs.

Download: KORE50-DYWC_LREC2020.pdf


Linked datasets

KORE 50^DYWC


Research group

Web Science


Research area

Information Retrieval, Information Extraction, Natural Language Processing