Inproceedings3598
Current version as of 28 August 2018, 13:45
BibTeX ID: www2018
Research group: Web Science
Discovering Connotations as Labels for Weakly Supervised Image-Sentence Data
Published: April 2018
Book title: WWW'18: Proceedings of The Web Conference 2018, Lyon, France, April 2018
Pages: 379-386
Publisher: ACM
Refereed publication
Abstract
The growth of multimodal content on the web and social media has generated abundant weakly aligned image-sentence pairs. However, it is hard to interpret them directly due to their intrinsic “intension”. In this paper, we aim to annotate such image-sentence pairs with connotations as labels to capture the intrinsic “intension”. We achieve this with a connotation multimodal embedding model (CMEM) using a novel loss function. Its unique characteristics over previous models include: (i) the exploitation of multimodal data as opposed to only visual information, (ii) robustness to outlier labels in a multi-label scenario, and (iii) effectiveness with large-scale weakly supervised data. With extensive quantitative evaluation, we demonstrate that CMEM detects multiple labels more effectively than other state-of-the-art approaches. We also show that, in addition to annotating image-sentence pairs with connotation labels, a byproduct of our model inherently supports cross-modal retrieval, i.e., retrieving sentences for an image query.
ISBN: 978-1-4503-5640-4
Further information: https://dl.acm.org/citation.cfm?id=3184558.3186352
DOI Link: 10.1145/3184558.3186352
Research areas: Information Retrieval, Machine Learning, Artificial Intelligence, WWW Systems