|
|
--- |
|
|
language: |
|
|
- multilingual |
|
|
license: mit |
|
|
tags: |
|
|
- hate-speech |
|
|
- classification |
|
|
- transformer |
|
|
- xlmr |
|
|
- multilingual-hate-speech |
|
|
datasets: |
|
|
- HateXplain |
|
|
- HateCheck |
|
|
model-index: |
|
|
- name: Multilingual Hate Speech Detection (WhiterBB) |
|
|
results: [] |
|
|
--- |
|
|
|
|
|
# Multilingual Hate Speech Detection - XLM-RoBERTa |
|
|
|
|
|
This is a fine-tuned version of **XLM-RoBERTa Base** trained for multilingual hate speech detection in **Spanish 🇪🇸, English 🇬🇧, and French 🇫🇷**.
|
|
It is part of a master's thesis project focused on real-time detection of hate speech in videos and transcripts.
|
|
|
|
|
## 🧠 Intended Use
|
|
|
|
|
This model is designed to work with short- to medium-length text snippets extracted from video subtitles or transcripts. |
|
|
It returns a binary classification (`hate` or `not hate`) with a probability score for further analysis. |
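
For a quick check of the expected input/output format, the model can also be loaded through the `transformers` pipeline API. This is a minimal sketch; the displayed label names depend on the model's `id2label` configuration:

```python
from transformers import pipeline

# Illustrative sketch: label names depend on the model's id2label mapping
# (they may appear as LABEL_0 / LABEL_1 if no custom mapping is configured).
classifier = pipeline("text-classification", model="WhiterBB/multilingual-hatespeech-detection")

result = classifier("I can't stand people like you", top_k=None)  # scores for every class
print(result)  # per-class labels and scores for the input text
```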
|
|
|
|
|
## 📚 Training Data
|
|
|
|
|
This model was fine-tuned on a **custom multilingual dataset** composed of selected and preprocessed samples from **multiple public corpora** and **custom-curated sets**. The training set was carefully constructed to achieve **language balance** and mitigate **demographic bias** in hate speech detection. |
|
|
|
|
|
| Source Dataset | Language(s) | Description |
|----------------|-------------|-------------|
| [`manueltonneau/spanish-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/spanish-hate-speech-superset) | Spanish 🇪🇸 | Aggregated Spanish hate speech datasets. |
| [`manueltonneau/english-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/english-hate-speech-superset) | English 🇬🇧 | Extensive superset with over 300k samples from English corpora. |
| [`manueltonneau/french-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/french-hate-speech-superset) | French 🇫🇷 | Curated superset from multiple French datasets. |
| `HateCheck` | English (original) + Spanish + French 🌍 | Translated into Spanish and French to test multilingual generalization and error cases. |
| `Custom Bias Correction Dataset` | Multilingual 🌍 | Designed to mitigate gender, racial, and cultural bias in predictions. |
|
|
|
|
|
> 🧩 The final dataset consists of **~60,000 balanced samples**, with **comparable representation across Spanish, English, and French**, ensuring no language dominates the training phase.
|
|
|
|
|
This balancing process involved **sampling**, **filtering**, and **label unification** from larger sources. The result is a compact, diverse, and inclusive dataset designed to generalize across cultures and languages while avoiding common pitfalls in hate speech modeling. |
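
The exact preprocessing scripts are not published with this card, but a rough sketch of this kind of per-language, per-label balancing (with hypothetical file and column names) could look like the following:

```python
import pandas as pd

# Hypothetical sketch of the balancing step (file and column names are placeholders):
# each source corpus is assumed to be unified into columns "text" and "label" (0/1).
sources = {
    "es": pd.read_csv("spanish_superset.csv"),
    "en": pd.read_csv("english_superset.csv"),
    "fr": pd.read_csv("french_superset.csv"),
}

per_language = 20_000  # roughly 60k samples in total across the three languages

parts = []
for lang, df in sources.items():
    df = df.dropna(subset=["text"]).drop_duplicates(subset=["text"])
    for label in (0, 1):  # same number of hate / non-hate examples per language
        subset = df[df["label"] == label]
        n = min(per_language // 2, len(subset))
        parts.append(subset.sample(n=n, random_state=42).assign(lang=lang))

train_df = pd.concat(parts).sample(frac=1, random_state=42).reset_index(drop=True)
```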
|
|
|
|
|
|
|
|
## 🚀 How to use
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForSequenceClassification.from_pretrained("WhiterBB/multilingual-hatespeech-detection")
tokenizer = AutoTokenizer.from_pretrained("WhiterBB/multilingual-hatespeech-detection")

text = "Je déteste cette personne"  # French: "I hate this person"
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities
probs = F.softmax(logits, dim=-1)
predicted_class = torch.argmax(probs).item()
confidence = probs[0][predicted_class].item()

label = "Hate" if predicted_class == 1 else "Not Hate"
print(f"{label} ({confidence:.2%})")
```
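
For the subtitle/transcript use case described under Intended Use, the same model can score a batch of segments at once. The helper function and threshold below are illustrative only, not part of the released code:

```python
def flag_segments(segments, threshold=0.7):
    """Return (text, hate probability) for the segments the model flags as hate.

    Illustrative helper; it reuses `model`, `tokenizer`, `torch`, and `F` from
    the snippet above and assumes class index 1 corresponds to "Hate".
    """
    inputs = tokenizer(segments, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        probs = F.softmax(model(**inputs).logits, dim=-1)
    hate_probs = probs[:, 1]
    return [
        (text, p.item())
        for text, p in zip(segments, hate_probs)
        if p.item() >= threshold
    ]

subtitles = [
    "Bienvenidos a todos al directo de hoy",  # Spanish: "Welcome everyone to today's stream"
    "I hate everyone from that country",
    "Merci d'avoir regardé la vidéo",         # French: "Thanks for watching the video"
]
print(flag_segments(subtitles))
```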
|
|
|
|
|
## 🧪 Metrics
|
|
|
|
|
The model was evaluated on a balanced multilingual dataset consisting of over 56,000 examples. Below are the performance metrics: |
|
|
|
|
|
| Class | Precision | Recall | F1-score | Support |
|-----------|-----------|--------|----------|---------|
| Not Hate | 0.85 | 0.83 | 0.84 | 30,352 |
| Hate | 0.81 | 0.83 | 0.82 | 26,609 |
|
|
|
|
|
**Overall Accuracy:** 0.83 |
|
|
**Macro Average:** Precision: 0.83, Recall: 0.83, F1-score: 0.83 |
|
|
**Weighted Average:** Precision: 0.83, Recall: 0.83, F1-score: 0.83 |
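
Per-class metrics of this kind can be reproduced on any labeled evaluation set with scikit-learn's `classification_report`; a minimal sketch with placeholder labels:

```python
from sklearn.metrics import classification_report

# Illustrative only: y_true holds gold labels and y_pred the model's predictions
# (0 = Not Hate, 1 = Hate); the values below are placeholders, not the thesis data.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print(classification_report(y_true, y_pred, target_names=["Not Hate", "Hate"], digits=2))
```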
|
|
|
|
|
## 📄 License
|
|
|
|
|
MIT License. Feel free to use this model for academic and non-commercial projects.
|
|
|
|
|
## ✍️ Author
|
|
|
|
|
Made with ❤️ by [WhiterBB](https://github.com/WhiterBB) as part of a master's thesis (TFM) in Artificial Intelligence.