---
language:
- multilingual
license: mit
tags:
- hate-speech
- classification
- transformer
- xlmr
- multilingual-hate-speech
datasets:
- HateXplain
- HateCheck
model-index:
- name: Multilingual Hate Speech Detection (WhiterBB)
  results: []
---

# Multilingual Hate Speech Detection - XLM-RoBERTa

This is a fine-tuned version of **XLM-RoBERTa Base** trained for multilingual hate speech detection in **Spanish 🇪🇸, English 🇬🇧, and French 🇫🇷**. It is part of a master's thesis project focused on real-time detection of hate speech in videos and transcripts.

## 🧠 Intended Use

This model is designed to work with short- to medium-length text snippets extracted from video subtitles or transcripts. It returns a binary classification (`hate` or `not hate`) together with a probability score that can be used for further analysis.

## 📊 Training Data

This model was fine-tuned on a **custom multilingual dataset** composed of selected and preprocessed samples from **multiple public corpora** and **custom-curated sets**. The training set was constructed to achieve **language balance** and to mitigate **demographic bias** in hate speech detection.

| Source Dataset | Language(s) | Description |
|----------------|-------------|-------------|
| [`manueltonneau/spanish-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/spanish-hate-speech-superset) | Spanish 🇪🇸 | Aggregated Spanish hate speech datasets. |
| [`manueltonneau/english-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/english-hate-speech-superset) | English 🇬🇧 | Extensive superset with over 300k samples from English corpora. |
| [`manueltonneau/french-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/french-hate-speech-superset) | French 🇫🇷 | Curated superset from multiple French datasets. |
| `HateCheck` | English (original) + Spanish + French 🌐 | Translated into Spanish and French to test multilingual generalization and error cases. |
| `Custom Bias Correction Dataset` | Multilingual 🌍 | Designed to mitigate gender, racial, and cultural bias in predictions. |

> 🧩 The final dataset consists of **~60,000 balanced samples**, with **comparable representation across Spanish, English, and French**, so that no language dominates the training phase.

This balancing process involved **sampling**, **filtering**, and **label unification** from the larger sources. The result is a compact, diverse, and inclusive dataset designed to generalize across cultures and languages while avoiding common pitfalls in hate speech modeling.

## 🔎 How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch.nn.functional as F
import torch

model_id = "WhiterBB/multilingual-hatespeech-detection"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

text = "Je déteste cette personne"
# Truncate so longer snippets stay within the model's maximum input length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities and pick the most likely class
probs = F.softmax(logits, dim=-1)
predicted_class = torch.argmax(probs, dim=-1).item()
confidence = probs[0][predicted_class].item()

# Class index 1 corresponds to "Hate", index 0 to "Not Hate"
label = "Hate" if predicted_class == 1 else "Not Hate"
print(f"{label} ({confidence:.2%})")
```

## 🧪 Metrics

The model was evaluated on a balanced multilingual dataset of over 56,000 examples (56,961 in total, the sum of the supports below). The performance metrics are:

| Class     | Precision | Recall | F1-score | Support |
|-----------|-----------|--------|----------|---------|
| Not Hate  | 0.85      | 0.83   | 0.84     | 30,352  |
| Hate      | 0.81      | 0.83   | 0.82     | 26,609  |

**Overall Accuracy:** 0.83

**Macro Average:** Precision: 0.83, Recall: 0.83, F1-score: 0.83

**Weighted Average:** Precision: 0.83, Recall: 0.83, F1-score: 0.83

## 📄 License

MIT License. Feel free to use it for academic and non-commercial projects.
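
## 🧮 Checking the reported averages

The macro and weighted averages in the Metrics section follow from the per-class rows of the table. The sketch below recomputes them in plain Python from the rounded values shown there. The names `per_class`, `macro_average`, and `weighted_average` are illustrative only and are not part of the model's code. Because the inputs are already rounded to two decimals, the results can only be expected to agree with the reported figures after rounding.

```python
# Per-class metrics copied from the Metrics table (already rounded to two decimals).
per_class = {
    "Not Hate": {"precision": 0.85, "recall": 0.83, "f1": 0.84, "support": 30352},
    "Hate": {"precision": 0.81, "recall": 0.83, "f1": 0.82, "support": 26609},
}


def macro_average(metric: str) -> float:
    """Unweighted mean of a metric over the classes."""
    values = [row[metric] for row in per_class.values()]
    return sum(values) / len(values)


def weighted_average(metric: str) -> float:
    """Mean of a metric over the classes, weighted by class support."""
    total = sum(row["support"] for row in per_class.values())
    return sum(row[metric] * row["support"] for row in per_class.values()) / total


total_support = sum(row["support"] for row in per_class.values())
print(f"total examples: {total_support}")
for metric in ("precision", "recall", "f1"):
    print(
        f"{metric}: macro={macro_average(metric):.2f}, "
        f"weighted={weighted_average(metric):.2f}"
    )
```

All six averages round to 0.83, matching the reported values. For single-label classification, the support-weighted recall equals overall accuracy, which is consistent with the reported accuracy of 0.83.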

## ✍️ Author

Made with ❤️ by [WhiterBB](https://github.com/WhiterBB) as part of a final master's thesis (TFM) in Artificial Intelligence.