Supervisors: Harald Sack, Mehwish Alam, Genet Asefa Gesese
Research group: Information Service Engineering
Partner: FIZ Karlsruhe
Start: 1 October 2020
Many real-world Knowledge Graphs (KGs), such as Wikidata, have unbalanced distributions of people across genders, ethnicities, religions, and nationalities. Biases arising from these unbalanced distributions propagate to Deep Learning models that build on KGs, such as Knowledge Graph Embeddings (KGEs) trained on them. A recent study has demonstrated that, because of this unequal distribution in Wikidata, biases related to professions appear in KGEs. Once encoded in KGEs, these biases are harmful to downstream tasks that use the embeddings, such as machine translation. The study measures biases in the Wikidata and Freebase KGs by considering one relation (gender, ethnicity, religion, or nationality) at a time. For example, in terms of gender bias, men are more likely to be bankers and women more likely to be homekeepers. In contrast to that study, this thesis will focus on combining the different relations and measuring their joint effect on profession-related biases, for instance by answering questions such as: what are the most likely professions of people who are
- male and from Germany?
- female and from the USA?
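As a rough illustration of what combining two relations means, the sketch below counts professions per (gender, nationality) group and ranks them by relative frequency. It is only a frequency baseline on raw records, not the bias measure of the study mentioned above and not the method the thesis will develop. The records are made up for illustration, and the names `PEOPLE`, `profession_distribution`, and `most_likely_professions` are hypothetical.

```python
from collections import Counter, defaultdict

# Toy, made-up records: (person id, gender, nationality, profession).
# These are illustrative assumptions, not real Wikidata or Freebase data.
PEOPLE = [
    ("p1", "male", "Germany", "banker"),
    ("p2", "male", "Germany", "banker"),
    ("p3", "male", "Germany", "engineer"),
    ("p4", "female", "USA", "lawyer"),
    ("p5", "female", "USA", "lawyer"),
    ("p6", "female", "USA", "banker"),
    ("p7", "female", "Germany", "engineer"),
]


def profession_distribution(people):
    """Return P(profession | gender, nationality) estimated by counting."""
    counts = defaultdict(Counter)
    for _person, gender, nationality, profession in people:
        counts[(gender, nationality)][profession] += 1
    return {
        group: {prof: n / sum(c.values()) for prof, n in c.items()}
        for group, c in counts.items()
    }


def most_likely_professions(people, gender, nationality, k=3):
    """Top-k professions for one (gender, nationality) group, most frequent first."""
    dist = profession_distribution(people).get((gender, nationality), {})
    # Sort by descending probability; break ties alphabetically for stable output.
    return sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    print(most_likely_professions(PEOPLE, "male", "Germany"))
    print(most_likely_professions(PEOPLE, "female", "USA"))
```

Comparing such per-group distributions with the distributions obtained from a single relation (for example, gender alone) would give a first impression of how much the combination changes the picture, before any embedding model is involved.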
This thesis will be supervised by Prof. Dr. Harald Sack, Genet Asefa Gesese, and Dr. Mehwish Alam of the Information Service Engineering group at Institute AIFB, KIT, in collaboration with FIZ Karlsruhe.
Which prerequisites should you have?
Very good programming skills in Python
Interest in Machine/Deep Learning technologies
Announcement: Download (pdf)