Memes are used to spread ideas through social networks. Although most memes are created for humor, some become hateful through the combination of pictures and text. Automatically detecting hateful memes can help reduce their harmful social impact. Compared with conventional multimodal tasks, where the visual and textual information is semantically aligned, hateful memes detection is more challenging because the image and text in a meme are weakly aligned or even irrelevant to each other. It therefore requires the model to understand the content deeply and to reason over multiple modalities. This paper focuses on multimodal hateful memes detection and proposes a novel method that incorporates image captioning into the detection process. We conduct extensive experiments on multimodal meme datasets and illustrate the effectiveness of our approach. Our model achieves promising results on the Hateful Memes Detection Challenge. Our code is publicly available on GitHub.
Y. Zhou et al., "Multimodal Learning For Hateful Memes Detection," 2021 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2021, Institute of Electrical and Electronics Engineers, Jan 2021.
The definitive version is available at https://doi.org/10.1109/ICMEW53276.2021.9455994
Keywords and Phrases
Hateful Memes Detection; Multimodal
Article - Conference proceedings
© 2023 Institute of Electrical and Electronics Engineers, All rights reserved.
01 Jan 2021