---
base_model: unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- mllama
license: apache-2.0
language:
- en
---

# Llama 3.2-11B-based Hate Detection in Arabic Multimodal Memes

The rise of social media and online communication platforms has made Arabic memes a key form of digital expression.
While memes can be humorous and informative, they are also increasingly used to spread offensive language and hate speech.
Consequently, there is a growing demand for accurate analysis of Arabic meme content.

This work uses Llama 3.2 and its vision capability to identify hateful content in Arabic memes.
The evaluation is conducted on the dataset of Arabic memes proposed in the ArabicNLP MAHED 2025 challenge.
The results underscore the capacity of ***Llama 3.2-11B fine-tuned on Arabic memes*** to deliver strong performance.

The fine-tuned model achieves an **accuracy** of **80.3%** and a **macro F1 score** of **73.3%**.

The proposed solution offers a more nuanced understanding of memes, enabling accurate and efficient Arabic content moderation systems.
 

# Examples of Arabic Memes from the ArabicNLP MAHED 2025 Challenge

| | |
|:-------------------------:|:-------------------------:|
|<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/jBuVCt5163WlugFRXkSgq.jpeg"> |<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/jiPId6f5IiGXxpI898llC.jpeg"> |
|<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/61acyltUsTB--ZOAMkv0a.jpeg"> |<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/_alSRnwG0azE_iYq2BrpP.jpeg"> |


```python
import os
import torch
from unsloth import FastVisionModel
from datasets import load_dataset
from transformers import TextStreamer

os.environ["TOKENIZERS_PARALLELISM"] = "false"

model_name = "NYUAD-ComNets/Llama3.2_MultiModal_Memes_Hate_Detector"

# Load the fine-tuned vision model (pass your own Hugging Face token).
model, tokenizer = FastVisionModel.from_pretrained(model_name, token='xxxxxxxxxxxxxxxxxxxxxx')
FastVisionModel.for_inference(model)  # enable inference mode

# Test split of the Arabic hate-meme dataset.
dataset_test = load_dataset("QCRI/Prop2Hate-Meme", split="test")
print(dataset_test)

# Map the integer hate_label to a textual label.
def add_labels_column(example):
    example["labels"] = "no_hate" if example["hate_label"] == 0 else "hate"
    return example

dataset_test = dataset_test.map(add_labels_column)

pred = []
for k in range(len(dataset_test)):  # 606 test examples
    image = dataset_test[k]["image"]
    text = dataset_test[k]["text"]

    # Build a multimodal chat prompt: one image plus the meme text.
    messages = [
        {"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": text},
        ]}
    ]

    input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    inputs = tokenizer(
        image,
        input_text,
        add_special_tokens=False,
        return_tensors="pt",
    ).to("cuda")

    text_streamer = TextStreamer(tokenizer, skip_prompt=True)
    p = model.generate(**inputs, streamer=text_streamer, max_new_tokens=128,
                       use_cache=False, temperature=0.3, min_p=0.3)
    p = tokenizer.decode(p[0], skip_special_tokens=True)

    # Keep only the assistant's answer ("hate" or "no_hate").
    pred.append(p.split('assistant')[1].strip())

print(pred)
```
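To score the predictions against the gold labels, accuracy and macro F1 can be computed directly from the `pred` list and the dataset's `labels` column. A minimal, dependency-free sketch (the toy `y_true`/`y_pred` lists below are illustrative, not values from the dataset):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, labels=("hate", "no_hate")):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores."""
    per_class = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class.append(f1)
    return sum(per_class) / len(per_class)

# Toy example (illustrative only):
y_true = ["hate", "no_hate", "hate", "no_hate"]
y_pred = ["hate", "no_hate", "no_hate", "no_hate"]
print(accuracy(y_true, y_pred))           # 0.75
print(round(macro_f1(y_true, y_pred), 3)) # 0.733
```

In practice, `y_true` would be `dataset_test["labels"]` and `y_pred` the `pred` list produced above; `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` computes the same quantity.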





![image/png](https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/jRSB8JxqqoV-2E97N5QQM.png)



We used Low-Rank Adaptation (LoRA) as the Parameter-Efficient Fine-Tuning (PEFT) method, with the unsloth framework.

The fine-tuning hyper-parameters of Llama 3.2-11B are as follows:

- training batch size per device: 4
- gradient accumulation steps: 4
- learning-rate warm-up steps: 5
- total training steps: 150
- learning rate: 0.0002
- optimizer: 8-bit AdamW
- weight decay: 0.01
- learning-rate scheduler: linear
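The hyper-parameters above map directly onto the keyword arguments of a trl `SFTConfig` when fine-tuning with unsloth. A hedged sketch of that configuration (only the values come from the list above; the argument names follow the trl/transformers convention, and the surrounding trainer wiring is an assumption, not taken from this card):

```python
# Training configuration mirroring the hyper-parameters stated above,
# written as the kwargs one would pass to trl's SFTConfig.
training_args = dict(
    per_device_train_batch_size=4,  # training batch size per device
    gradient_accumulation_steps=4,  # gradients accumulated over 4 steps
    warmup_steps=5,                 # learning-rate warm-up
    max_steps=150,                  # total training steps
    learning_rate=2e-4,             # 0.0002
    optim="adamw_8bit",             # 8-bit AdamW
    weight_decay=0.01,
    lr_scheduler_type="linear",
)

# Effective batch size seen by the optimizer at each update:
effective_batch = (training_args["per_device_train_batch_size"]
                   * training_args["gradient_accumulation_steps"])
print(effective_batch)  # 16
```

With unsloth, these kwargs would typically be passed as `SFTConfig(**training_args, ...)` to a `trl.SFTTrainer` wrapping the LoRA-adapted model returned by `FastVisionModel.get_peft_model`.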

# BibTeX entry and citation info

```
@inproceedings{aldahoul2025nyuad,
  title={NYUAD at MAHED Shared Task: Detecting Hope, Hate, and Emotion in Arabic Textual Speech and Multi-modal Memes Using Large Language Models},
  author={Aldahoul, Nouar and Zaki, Yasir},
  booktitle={Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks},
  pages={575--584},
  year={2025}
}

@misc{aldahoul2025detectinghopehateemotion,
  title={Detecting Hope, Hate, and Emotion in Arabic Textual Speech and Multi-modal Memes Using Large Language Models},
  author={Nouar AlDahoul and Yasir Zaki},
  year={2025},
  eprint={2508.15810},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.15810}
}
```