---
license: llama3.2
base_model: meta-llama/Llama-3.2-11B-Vision-Instruct
datasets:
- QCRI/MemeXplain
language:
- en
- ar
pipeline_tag: image-text-to-text
tags:
- meme-detection
- propaganda
- hate-speech
- multimodal
- vision-language
- explainability
library_name: transformers
---
# MemeIntel: Explainable Detection of Propagandistic and Hateful Memes
MemeIntel is a Vision-Language Model fine-tuned from [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) for detecting propaganda in Arabic memes and hateful content in English memes, with explainable reasoning.
## Model Description
MemeIntel addresses the challenge of understanding and moderating complex, context-dependent multimodal content on social media. The model performs:
- **Label Detection**: Classifies memes into categories (propaganda/not-propaganda/not-meme/other for Arabic; hateful/not-hateful for English)
- **Explanation Generation**: Provides human-readable explanations for its predictions
The model was trained using a novel multi-stage optimization approach on the [MemeXplain](https://huggingface.co/datasets/QCRI/MemeXplain) dataset.
## Usage
```python
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor
from PIL import Image

# Load model and processor
model = MllamaForConditionalGeneration.from_pretrained(
    "QCRI/MemeIntel",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("QCRI/MemeIntel")

# Load your meme image
image = Image.open("path/to/meme.jpg")
```
### Arabic Propaganda Meme Detection (Arabic Explanation)
```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts."},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in Arabic. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: لما يقولي انتي مالكيش عزيز\nاعز ما ليا البطاطس المقلية"},
    ]},
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```
### Arabic Propaganda Meme Detection (English Explanation)
```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts."},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in English. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: وأنا أبكي\n٣\nانت تتمنى وانا البي\n{7"},
    ]},
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```
### English Hateful Meme Detection
```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying hateful content in memes"},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "I will provide you with memes and the text extracted from these images. Your task is to classify the image as one of the following: 'hateful' or 'not-hateful' and provide a brief explanation. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: bows here, bows there, bows everywhere"},
    ]},
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```
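The three examples above repeat the same generate/decode steps. If you process many memes, a small wrapper can reduce the duplication. The helper below is an illustrative sketch (the name `classify_meme` is not part of the release); it assumes `model` and `processor` are loaded as shown in the Usage section.

```python
def classify_meme(model, processor, image, messages, max_new_tokens=256):
    """Run one meme through the model and return the decoded response."""
    # Render the chat messages into the model's prompt format
    input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
    # Tokenize text together with the image and move tensors to the model's device
    inputs = processor(image, input_text, add_special_tokens=False,
                       return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output[0], skip_special_tokens=True)
```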
## Prompt Templates
### Arabic Meme (Arabic Explanation)
```
System: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts.
User: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in Arabic. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```
### Arabic Meme (English Explanation)
```
System: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts.
User: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in English. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```
### English Hateful Meme
```
System: You are an expert social media image analyzer specializing in identifying hateful content in memes
User: I will provide you with memes and the text extracted from these images. Your task is to classify the image as one of the following: 'hateful' or 'not-hateful' and provide a brief explanation. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```
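The `{OCR_TEXT}` placeholder in the templates above can be filled programmatically. The helper below is a hypothetical sketch, not part of the model release: it reproduces the Arabic propaganda template verbatim and substitutes the OCR text and explanation language.

```python
ARABIC_SYSTEM = (
    "You are an expert social media image analyzer specializing in "
    "identifying propaganda in Arabic contexts."
)
# Mirrors the Arabic prompt templates above; {lang} selects the
# explanation language ("Arabic" or "English").
ARABIC_USER_TEMPLATE = (
    ARABIC_SYSTEM + " I will provide you with Arabic memes and the text "
    "extracted from these images. Your task is to classify the image as one "
    "of the following: 'propaganda', 'not-propaganda', 'not-meme', or "
    "'other', and provide a brief explanation in {lang}. Start your response "
    "with 'Label:' followed by the classification label, then on a new line "
    "begin with 'Explanation:' and briefly state your reasoning. "
    "Text extracted: {ocr_text}"
)

def build_messages(ocr_text, explanation_lang="Arabic"):
    """Build the chat messages for one Arabic meme from the templates above."""
    user_text = ARABIC_USER_TEMPLATE.format(lang=explanation_lang,
                                            ocr_text=ocr_text)
    return [
        {"role": "system", "content": ARABIC_SYSTEM},
        {"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": user_text},
        ]},
    ]
```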
## Expected Output Format
The model outputs in the following format:
```
Label: [classification_label]
Explanation: [reasoning for the classification]
```
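For downstream use, this structured response can be split into its two fields. The parser below is an illustrative sketch, not part of the release. Note that `processor.decode` over the full output sequence also returns the prompt, which itself mentions `'Label:'`, so the helper takes the last occurrence of each marker.

```python
import re

def parse_response(text):
    """Extract the final 'Label:' / 'Explanation:' pair from decoded output.

    Returns a dict with keys 'label' and 'explanation'; either value is
    None if its marker is missing.
    """
    label = None
    matches = re.findall(r"Label:\s*([^\n]+)", text)
    if matches:
        # The last occurrence is the model's answer, not the prompt text
        label = matches[-1].strip().strip("'\"")
    explanation = None
    idx = text.rfind("Explanation:")
    if idx != -1:
        explanation = text[idx + len("Explanation:"):].strip()
    return {"label": label, "explanation": explanation}
```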
## Training
- **Base Model**: [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct)
- **Training Dataset**: [QCRI/MemeXplain](https://huggingface.co/datasets/QCRI/MemeXplain)
- **Training Method**: Multi-stage optimization approach
## Performance
MemeIntel achieves state-of-the-art results:
- **ArMeme (Arabic Propaganda)**: ~3% absolute improvement over previous SOTA
- **Hateful Memes (English)**: ~7% absolute improvement over previous SOTA
## Citation
If you use this model, please cite:
```bibtex
@inproceedings{kmainasi-etal-2025-memeintel,
title = "{M}eme{I}ntel: Explainable Detection of Propagandistic and Hateful Memes",
author = "Kmainasi, Mohamed Bayan and
Hasnat, Abul and
Hasan, Md Arid and
Shahroor, Ali Ezzat and
Alam, Firoj",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.emnlp-main.1539/",
doi = "10.18653/v1/2025.emnlp-main.1539",
pages = "30263--30279",
ISBN = "979-8-89176-332-6",
}
```
## License
This model is released under the [Llama 3.2 Community License](https://www.llama.com/llama3_2/license/).
## Authors
- Mohamed Bayan Kmainasi
- Abul Hasnat
- Md Arid Hasan
- Ali Ezzat Shahroor
- Firoj Alam
Qatar Computing Research Institute (QCRI), Hamad Bin Khalifa University