---
language:
- multilingual
license: mit
tags:
- hate-speech
- classification
- transformer
- xlmr
- multilingual-hate-speech
datasets:
- HateXplain
- HateCheck
model-index:
- name: Multilingual Hate Speech Detection (WhiterBB)
  results: []
---
# Multilingual Hate Speech Detection - XLM-RoBERTa
This is a fine-tuned version of **XLM-RoBERTa Base** trained for multilingual hate speech detection in **Spanish 🇪🇸, English 🇬🇧, and French 🇫🇷**.
It is part of a master's thesis project focused on real-time detection of hate in videos and transcripts.
## 🧠 Intended Use
This model is designed to work with short- to medium-length text snippets extracted from video subtitles or transcripts.
It returns a binary classification (`hate` or `not hate`) with a probability score for further analysis.
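
As a rough illustration of how the probability score can feed further analysis, the sketch below turns a pair of raw logits into a hate probability and applies an adjustable decision threshold. It assumes the logits are ordered `[not hate, hate]`, which matches the `predicted_class == 1` check in the usage snippet of this card. The `decide` helper and the threshold values are hypothetical and not part of the released model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold=0.5):
    """Return (label, p_hate) for a [not_hate, hate] logit pair.

    `threshold` is a hypothetical knob: raising it trades recall on the
    hate class for precision.
    """
    p_hate = softmax(logits)[1]  # assumes index 1 is the hate class
    label = "Hate" if p_hate >= threshold else "Not Hate"
    return label, p_hate


if __name__ == "__main__":
    for logits in ([2.0, 0.0], [0.0, 1.0]):
        label, p_hate = decide(logits, threshold=0.5)
        print(f"{logits} -> {label} (p_hate={p_hate:.2%})")
```

With two classes and a threshold of 0.5, this gives the same decision as taking the arg max of the probabilities (ties aside). A stricter threshold may suit pipelines where false positives are costly.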
## 📊 Training Data
This model was fine-tuned on a **custom multilingual dataset** composed of selected and preprocessed samples from **multiple public corpora** and **custom-curated sets**. The training set was carefully constructed to achieve **language balance** and mitigate **demographic bias** in hate speech detection.
| Source Dataset | Language(s) | Description |
|----------------|-------------|-------------|
| [`manueltonneau/spanish-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/spanish-hate-speech-superset) | Spanish 🇪🇸 | Aggregated Spanish hate speech datasets. |
| [`manueltonneau/english-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/english-hate-speech-superset) | English 🇬🇧 | Extensive superset with over 300k samples from English corpora. |
| [`manueltonneau/french-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/french-hate-speech-superset) | French 🇫🇷 | Curated superset from multiple French datasets. |
| `HateCheck` | English (original) + Spanish + French 🌐 | Translated into Spanish and French to test multilingual generalization and error cases. |
| `Custom Bias Correction Dataset` | Multilingual 🌍 | Designed to mitigate gender, racial, and cultural bias in predictions. |
> 🧩 The final dataset consists of **~60,000 balanced samples**, with **comparable representation across Spanish, English, and French**, ensuring no language dominates the training phase.
This balancing process involved **sampling**, **filtering**, and **label unification** from larger sources. The result is a compact, diverse, and inclusive dataset designed to generalize across cultures and languages while avoiding common pitfalls in hate speech modeling.
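
The exact preprocessing pipeline is not published in this card. The following is only a minimal sketch of what sampling, filtering, and label unification could look like. The field names (`text`, `lang`, `label`), the `LABEL_MAP` entries, and the `balance` helper are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical mapping from source-specific labels to the binary scheme
# (1 = hate, 0 = not hate). Real corpora use their own label sets.
LABEL_MAP = {"hate": 1, "hateful": 1, "not_hate": 0, "non-hate": 0, "normal": 0}


def unify_label(raw):
    """Map a raw label to 0/1, or None when it cannot be mapped."""
    return LABEL_MAP.get(str(raw).strip().lower())


def balance(rows, per_group, seed=0):
    """Draw up to `per_group` rows for every (language, label) pair."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        label = unify_label(row["label"])
        # Filtering: skip rows with unmapped labels or empty text.
        if label is None or not row["text"].strip():
            continue
        groups[(row["lang"], label)].append(
            {"text": row["text"], "lang": row["lang"], "label": label}
        )
    balanced = []
    for key in sorted(groups):
        items = groups[key]
        balanced.extend(rng.sample(items, min(per_group, len(items))))
    rng.shuffle(balanced)  # avoid blocks of one language or label
    return balanced


if __name__ == "__main__":
    rows = [
        {"text": f"sample {i}", "lang": lang, "label": raw}
        for lang in ("es", "en", "fr")
        for raw in ("hate", "normal")
        for i in range(5)
    ]
    print(len(balance(rows, per_group=3)))
```

Capping every (language, label) group at the same size is one simple way to keep any single language from dominating. A group smaller than the cap is kept whole rather than oversampled.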
## 🔎 How to use
```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "WhiterBB/multilingual-hatespeech-detection"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Je déteste cette personne"
# Truncate inputs that exceed the model's maximum sequence length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():  # gradients are not needed for inference
    logits = model(**inputs).logits

probs = F.softmax(logits, dim=-1)  # shape: (1, num_labels)
predicted_class = torch.argmax(probs, dim=-1).item()
confidence = probs[0][predicted_class].item()

# Class index 1 is "Hate"; any other index is reported as "Not Hate"
label = "Hate" if predicted_class == 1 else "Not Hate"
print(f"{label} ({confidence:.2%})")
```
## 🧪 Metrics
The model was evaluated on a balanced multilingual dataset of 56,961 examples (the sum of the per-class supports in the table). Performance metrics:
| Class | Precision | Recall | F1-score | Support |
|-----------|-----------|--------|----------|---------|
| Not Hate | 0.85 | 0.83 | 0.84 | 30,352 |
| Hate | 0.81 | 0.83 | 0.82 | 26,609 |
**Overall Accuracy:** 0.83
**Macro Average:** Precision: 0.83, Recall: 0.83, F1-score: 0.83
**Weighted Average:** Precision: 0.83, Recall: 0.83, F1-score: 0.83
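
For readers who want to check how the averages relate to the per-class rows, here is a small sketch that recomputes them from the rounded table values. The `report` dictionary and helper names are illustrative only. Because the inputs are already rounded to two decimals, the results are approximate.

```python
# Per-class figures copied from the metrics table (already rounded).
report = {
    "Not Hate": {"precision": 0.85, "recall": 0.83, "f1": 0.84, "support": 30352},
    "Hate": {"precision": 0.81, "recall": 0.83, "f1": 0.82, "support": 26609},
}

total_support = sum(row["support"] for row in report.values())


def macro_average(metric):
    """Unweighted mean over classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_average(metric):
    """Mean over classes, weighted by each class's support."""
    weighted_sum = sum(row[metric] * row["support"] for row in report.values())
    return weighted_sum / total_support


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(f"total support: {total_support}")
    for metric in ("precision", "recall", "f1"):
        print(
            f"{metric}: macro={macro_average(metric):.2f} "
            f"weighted={weighted_average(metric):.2f}"
        )
    # In single-label classification, accuracy equals support-weighted recall.
    print(f"accuracy (weighted recall): {weighted_average('recall'):.2f}")
```

Recomputed this way, the macro and weighted averages for all three metrics round to 0.83, which agrees with the reported figures.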
## 📄 License
MIT License – feel free to use for academic and non-commercial projects.
## ✍️ Author
Made with ❤️ by [WhiterBB](https://github.com/WhiterBB) as part of a final master's thesis (TFM) in Artificial Intelligence.