
Discovering Connotations as Labels for Weakly Supervised Image-Sentence Data




Published: April 2018

Book title: The Web Conference (Cognitive Computing Track)
Publisher: ACM
Non-refereed publication

Abstract
The growth of multimodal content on the web and social media has generated abundant weakly aligned image-sentence pairs. However, they are hard to interpret directly because of their intrinsic “intension”. In this paper, we aim to annotate such image-sentence pairs with connotations as labels that capture this intrinsic “intension”. We achieve this with a connotation multimodal embedding model (CMEM) that uses a novel loss function. Its unique characteristics compared with previous models include: (i) the exploitation of multimodal data as opposed to only visual information, (ii) robustness to outlier labels in a multi-label scenario, and (iii) effectiveness with large-scale weakly supervised data. With an extensive quantitative evaluation, we show that CMEM outperforms other state-of-the-art approaches in detecting multiple labels. We also show that, in addition to annotating image-sentence pairs with connotation labels, our model inherently supports cross-modal retrieval as a byproduct, i.e., retrieving sentences from an image query.
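The paper itself does not ship code on this page; the following is only a minimal sketch of the general idea the abstract describes, namely projecting an image-sentence pair into a joint embedding space and scoring it against a set of connotation labels. All class names, feature dimensions, and the binary cross-entropy objective are illustrative assumptions standing in for the CMEM architecture and its novel loss, not the authors' implementation.

<pre>
# Illustrative sketch only: joint image-sentence embedding with a
# multi-label objective. Dimensions and loss are assumptions, not CMEM.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultimodalLabelEmbedder(nn.Module):
    """Projects pre-extracted image and sentence features into a shared
    space and scores the fused pair against connotation label vectors."""

    def __init__(self, img_dim=2048, txt_dim=300, embed_dim=256, num_labels=100):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)          # image branch
        self.txt_proj = nn.Linear(txt_dim, embed_dim)          # sentence branch
        self.label_emb = nn.Embedding(num_labels, embed_dim)   # one vector per label

    def forward(self, img_feats, txt_feats):
        # Fuse the two modalities by averaging their L2-normalised projections.
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        pair = F.normalize(img + txt, dim=-1)
        # Cosine-style logits: one score per label for every pair.
        labels = F.normalize(self.label_emb.weight, dim=-1)
        return pair @ labels.t()                                # (batch, num_labels)


# Toy usage with random features; BCE is a common multi-label baseline
# used here only as a placeholder for the paper's loss function.
model = MultimodalLabelEmbedder()
img_feats = torch.randn(4, 2048)                  # e.g. CNN image features
txt_feats = torch.randn(4, 300)                   # e.g. averaged word vectors
targets = torch.randint(0, 2, (4, 100)).float()   # weak multi-label targets
logits = model(img_feats, txt_feats)
loss = F.binary_cross_entropy_with_logits(logits, targets)
loss.backward()
</pre>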

Download: Media:Ctp147-mogadalaA.pdf





Research group

Web Science


Research area
Machine Learning, Information Retrieval, WWW Systems

