---
base_model: unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- mllama
license: apache-2.0
language:
- en
---
# Llama 3.2-11B-Based Hate Detection in Arabic Multimodal Memes
The rise of social media and online communication platforms has led to the spread of Arabic memes as a key form of digital expression.
While this content can be humorous and informative, it is also increasingly being used to spread offensive language and hate speech.
Consequently, there is a growing demand for precise analysis of the content of Arabic memes.
This work uses Llama 3.2 with its vision capability to identify hate content within Arabic memes.
The evaluation is conducted on the dataset of Arabic memes proposed in the ArabicNLP MAHED 2025 challenge.
The results underscore the capacity of ***Llama 3.2-11B, fine-tuned on Arabic memes***, to deliver superior performance,
achieving an **accuracy** of **80.3%** and a **macro F1 score** of **73.3%**.
The proposed solution offers a more nuanced understanding of memes, enabling accurate and efficient Arabic content moderation systems.
# Examples of Arabic Memes from the ArabicNLP MAHED 2025 Challenge
| | |
|:-------------------------:|:-------------------------:|
|<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/jBuVCt5163WlugFRXkSgq.jpeg"> |<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/jiPId6f5IiGXxpI898llC.jpeg"> |
|<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/61acyltUsTB--ZOAMkv0a.jpeg"> |<img width="500" height="500" src="https://cdn-uploads.huggingface.co/production/uploads/656ee240c5ac4733e9ccdd0e/_alSRnwG0azE_iYq2BrpP.jpeg"> |
```python
import os

from datasets import load_dataset
from transformers import TextStreamer
from unsloth import FastVisionModel

os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Load the fine-tuned model and switch it to inference mode
model_name = "NYUAD-ComNets/Llama3.2_MultiModal_Memes_Hate_Detector"
model, tokenizer = FastVisionModel.from_pretrained(model_name, token='xxxxxxxxxxxxxxxxxxxxxx')
FastVisionModel.for_inference(model)

# Test split of the Arabic hateful-meme dataset (606 examples)
dataset_test = load_dataset("QCRI/Prop2Hate-Meme", split="test")
print(dataset_test)

def add_labels_column(example):
    # Map the numeric hate_label (0/1) to a string label
    example["labels"] = "no_hate" if example["hate_label"] == 0 else "hate"
    return example

dataset_test = dataset_test.map(add_labels_column)

pred = []
for k in range(len(dataset_test)):
    image = dataset_test[k]["image"]
    text = dataset_test[k]["text"]

    # Build a multimodal chat prompt: the meme image plus its embedded text
    messages = [
        {"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": text},
        ]}
    ]
    input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    inputs = tokenizer(
        image,
        input_text,
        add_special_tokens=False,
        return_tensors="pt",
    ).to("cuda")

    # Stream the generation, then decode and keep only the assistant's answer
    text_streamer = TextStreamer(tokenizer, skip_prompt=True)
    p = model.generate(**inputs, streamer=text_streamer, max_new_tokens=128,
                       use_cache=False, temperature=0.3, min_p=0.3)
    p = tokenizer.decode(p[0], skip_special_tokens=True)
    pred.append(p.split('assistant')[1].strip())

print(pred)
```
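
The reported accuracy and macro F1 can be reproduced from `pred` and the gold labels added by `add_labels_column`. This snippet is not part of the original card; it is a minimal sketch assuming scikit-learn is installed:

```python
from sklearn.metrics import accuracy_score, f1_score

# Gold labels are the "hate" / "no_hate" strings added by add_labels_column
gold = dataset_test["labels"]

print("accuracy:", accuracy_score(gold, pred))             # reported: 80.3%
print("macro F1:", f1_score(gold, pred, average="macro"))  # reported: 73.3%
```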
We used Low-Rank Adaptation (LoRA) as the Parameter-Efficient Fine-Tuning (PEFT) method, implemented with the Unsloth framework.
The fine-tuning hyper-parameters of Llama 3.2-11B are as follows (a configuration sketch follows the list):
- per-device training batch size: 4
- gradient accumulation steps: 4
- learning-rate warm-up steps: 5
- total training steps: 150
- learning rate: 2e-4
- optimizer: 8-bit AdamW
- weight decay: 0.01
- learning-rate scheduler: linear
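
The training script itself is not included in this card; below is a minimal sketch of what this configuration could look like with Unsloth's vision LoRA API and TRL's `SFTTrainer`. The LoRA rank and alpha, the layer choices, and the `converted_train_dataset` variable are assumptions for illustration, as they are not specified above:

```python
from unsloth import FastVisionModel, is_bf16_supported
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTTrainer, SFTConfig

# Load the 4-bit base model listed in this card's metadata
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit",
    load_in_4bit=True,
)

# Attach LoRA adapters; rank/alpha and layer selection are assumptions
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    r=16,
    lora_alpha=16,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    data_collator=UnslothVisionDataCollator(model, tokenizer),
    train_dataset=converted_train_dataset,  # hypothetical: chat-formatted meme examples
    args=SFTConfig(
        per_device_train_batch_size=4,  # as stated above
        gradient_accumulation_steps=4,  # as stated above
        warmup_steps=5,                 # as stated above
        max_steps=150,                  # as stated above
        learning_rate=2e-4,             # as stated above
        optim="adamw_8bit",             # as stated above
        weight_decay=0.01,              # as stated above
        lr_scheduler_type="linear",     # as stated above
        fp16=not is_bf16_supported(),
        bf16=is_bf16_supported(),
        remove_unused_columns=False,
        dataset_text_field="",
        dataset_kwargs={"skip_prepare_dataset": True},
        max_seq_length=2048,
    ),
)
trainer.train()
```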
# BibTeX entry and citation info
```bibtex
@inproceedings{aldahoul2025nyuad,
title={NYUAD at MAHED Shared Task: Detecting Hope, Hate, and Emotion in Arabic Textual Speech and Multi-modal Memes Using Large Language Models},
author={Aldahoul, Nouar and Zaki, Yasir},
booktitle={Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks},
pages={575--584},
year={2025}
}
@misc{aldahoul2025detectinghopehateemotion,
title={Detecting Hope, Hate, and Emotion in Arabic Textual Speech and Multi-modal Memes Using Large Language Models},
author={Nouar AlDahoul and Yasir Zaki},
year={2025},
eprint={2508.15810},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2508.15810},
}
```