TruthfulLM: Verifying and Ensuring Truthfulness in Large Language Models
This research project focuses on improving the factual correctness of text generated by language models such as ChatGPT. The prevailing approach to improving the quality of generated text, reinforcement learning from human feedback (RLHF), does not directly optimize for factual accuracy and addresses hallucination only indirectly. The risk of relying solely on RLHF to develop better models is that it may make misinformation appear legitimate rather than preventing it. The central objective of this project is therefore to develop and evaluate methods that continuously check the output of language models for factual correctness and automatically correct any inaccuracies.

The proposed approach builds on a previous micro-project by Aleph Alpha and KIT-AIFB, in which structured information was extracted from generated text and compared against a knowledge graph to verify its accuracy. When a hallucination is detected, the method corrects the inaccuracy using knowledge-graph-based decoding strategies. This approach can be applied to pre-trained language models without further training, which significantly increases efficiency and applicability, since training is the most energy- and cost-intensive part of model development.
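To make the verify-and-correct idea concrete, the following is a minimal sketch of the two steps described above: extracting claims as (subject, relation, object) triples, checking them against a knowledge graph, and substituting a knowledge-graph-consistent value when a mismatch is found. All names here (the toy knowledge graph, `verify_triples`, `correct_triple`) are illustrative assumptions, not the project's actual pipeline, and the correction step is a crude stand-in for the knowledge-graph-based decoding strategies mentioned in the text.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Toy knowledge graph: a set of ground-truth (subject, relation, object) triples.
# A real system would query a large external graph instead.
KNOWLEDGE_GRAPH: set = {
    ("KIT", "locatedIn", "Karlsruhe"),
    ("Karlsruhe", "locatedIn", "Germany"),
}


def verify_triples(extracted: List[Triple]) -> List[Tuple[Triple, bool]]:
    """Mark each extracted triple as supported (True) or a potential hallucination (False)."""
    return [(t, t in KNOWLEDGE_GRAPH) for t in extracted]


def correct_triple(triple: Triple) -> Triple:
    """Replace an unsupported object with the knowledge graph's value for (subject, relation).

    This stands in for the decoding-time correction: instead of re-ranking tokens
    during generation, it simply swaps in the first object the graph supports.
    """
    s, r, o = triple
    candidates = [obj for (ks, kr, obj) in KNOWLEDGE_GRAPH if ks == s and kr == r]
    if candidates and o not in candidates:
        return (s, r, candidates[0])
    return triple


if __name__ == "__main__":
    # Triples as they might be extracted from generated text.
    claims = [
        ("KIT", "locatedIn", "Karlsruhe"),  # supported by the graph
        ("KIT", "locatedIn", "Munich"),     # not in the graph: flagged and corrected
    ]
    for triple, supported in verify_triples(claims):
        if supported:
            print(triple, "-> supported")
        else:
            print(triple, "-> flagged, corrected to", correct_triple(triple))
```

Because verification and correction operate purely on the model's output, this kind of check can wrap any pre-trained model without retraining it, which is the efficiency argument made above.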