
Annotation Quality in Medical Image Annotation

Information on the Thesis

Type of Final Thesis: Bachelor, Master
Supervisor: Simon Warsinsky
Research Group: Critical Information Infrastructures

Archive Number: 4.685
Status of Thesis: Open
Date of start: 2023-08-02

Further Information


Problem: Machine learning (ML) models are increasingly used in healthcare, for example as part of cognitive surgical robots or for automated diagnosis. With their rise comes a growing demand for high-quality data to train these models. Poor-quality training data may degrade ML model performance, which may ultimately harm patients’ health. As most ML models in healthcare are supervised, they require annotated training data, that is, training data to which relevant metadata (i.e., annotations) has been added. For example, a chest X-ray may be annotated with a 0 or 1 depending on whether the respective patient has a broken rib; or relevant anatomical structures (e.g., organs) may be traced. Annotation quality therefore plays an important role in the training of ML models. However, what actually constitutes “good” annotation quality for ML training data in healthcare is not a trivial question to answer. In information systems (IS) literature, data quality is usually defined abstractly as “fitness for use” and then operationalized into various dimensions such as accuracy or consistency. In extant medical research, by contrast, annotation quality is usually not viewed at this level of granularity. Hence, it remains unclear (1) how different dimensions of annotation quality relate to ML model performance, and (2) what researchers, annotators, and medical experts actually mean when they speak of annotation quality.
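To illustrate how two of the quality dimensions named above could be operationalized for the binary X-ray example, the following is a minimal sketch, not a prescribed method. It assumes that accuracy is measured as agreement with a reference label set and consistency as chance-corrected agreement between two annotators (Cohen's kappa); both are only one possible operationalization. All labels, variable names, and function names are hypothetical.

```python
from collections import Counter


def accuracy(labels, reference):
    """Fraction of labels that match the reference labels."""
    return sum(a == b for a, b in zip(labels, reference)) / len(reference)


def cohens_kappa(a, b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = broken rib, 0 = no broken rib, for eight X-rays.
reference = [1, 0, 0, 1, 1, 0, 0, 0]    # assumed expert reference labels
annotator_a = [1, 0, 0, 1, 0, 0, 0, 0]
annotator_b = [1, 0, 1, 1, 0, 0, 0, 0]

acc_a = accuracy(annotator_a, reference)         # accuracy dimension
acc_b = accuracy(annotator_b, reference)
kappa = cohens_kappa(annotator_a, annotator_b)   # consistency dimension

print(f"accuracy A: {acc_a:.3f}, accuracy B: {acc_b:.3f}, kappa: {kappa:.3f}")
```

In this toy data, the two annotators agree with each other more often than annotator B agrees with the reference, which shows that consistency and accuracy can diverge and may need to be assessed separately.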

Objective(s): The objectives of this work are (1) to assess how annotation quality is conceptualized in extant research on ML models in healthcare or medical image annotation and (2) to gain insights into the relationships between individual annotation quality dimensions and ML model performance in healthcare. Where it makes sense, the work can be situated in a specific application context for ML models in healthcare (e.g., cognitive surgical robots).

Method(s): I welcome both conceptual approaches (e.g., literature review) and empirical work (e.g., surveys, expert interviews). The method can be discussed during an early meeting.


Literature:

- Batini, C., Cappiello, C., Francalanci, C., & Maurino, A. (2009). Methodologies for data quality assessment and improvement. ACM Computing Surveys, 41(3), 1-52.

- Ward, T. M., Fer, D. M., Ban, Y., Rosman, G., Meireles, O. R., & Hashimoto, D. A. (2021). Challenges in surgical video annotation. Computer Assisted Surgery, 26(1), 58-68.

- Meireles, O. R., et al. (2021). SAGES consensus recommendations on an annotation framework for surgical video. Surgical Endoscopy, 35(9), 4918-4929.