




An Extensive Survey of Automatic Evaluation Metrics for Text Generation




Information about the thesis

Thesis type: Bachelor, Master
Supervisor: Shuzhou Yuan
Research group: Web Science

Archive number: 4983
Thesis status: Open
Start: 14 December 2022
Submission: unknown

Further information

Background

Natural Language Generation (NLG) is the task of generating text from various inputs, including graphs, text, and speech [1]. With the breakthrough of pre-trained language models such as ChatGPT [2], the question of how to evaluate the quality of machine-generated text has attracted growing interest in the artificial intelligence research community. Because human judgement is costly, the automatic evaluation metrics developed for machine translation are widely used for most text generation tasks, e.g. graph-to-text generation.

Goal

In this work, you are expected to conduct an extensive survey of evaluation metrics for text generation tasks, e.g. BLEU, METEOR, and ROUGE. Based on the implementation of these metrics, you will have the chance to evaluate the quality of machine-generated text and analyze the reliability of the metrics. An example of research on evaluation metrics can be found in [3].
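To give a feel for how such metrics work, the following is a toy sketch of the clipped n-gram precision and brevity penalty that underlie BLEU. It is a simplified illustration (single reference, n-grams up to bigrams, no smoothing), not the official implementation; for real experiments, established libraries such as NLTK or sacreBLEU should be used. The function names are illustrative only.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of a token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    # Candidate n-gram counts, clipped by their counts in the reference,
    # divided by the total number of candidate n-grams.
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    clipped = sum(min(c, ref[g]) for g, c in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

def toy_bleu(candidate, reference, max_n=2):
    # Geometric mean of modified precisions, times a brevity penalty
    # that punishes candidates shorter than the reference.
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_avg)

candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
print(round(toy_bleu(candidate, reference), 3))  # → 0.707
```

Here the unigram precision is 5/6 and the bigram precision is 3/5, so the score is the geometric mean sqrt(5/6 · 3/5) ≈ 0.707; since candidate and reference have equal length, the brevity penalty is 1.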


Prerequisites

• Solid programming skills (e.g. Python).

• Strong interest in natural language processing, especially natural language generation.

• Experience with pre-trained language models or the HuggingFace library is a plus.


[1] https://www.jair.org/index.php/jair/article/view/11173/26378

[2] https://openai.com/blog/chatgpt/

[3] https://arxiv.org/pdf/2107.10821.pdf