	---
	language:
	- multilingual
	license: mit
	tags:
	- hate-speech
	- classification
	- transformer
	- xlmr
	- multilingual-hate-speech
	datasets:
	- HateXplain
	- HateCheck
	model-index:
	- name: Multilingual Hate Speech Detection (WhiterBB)
	  results: []
	---

	# Multilingual Hate Speech Detection - XLM-RoBERTa

	This is a fine-tuned version of XLM-RoBERTa Base trained for multilingual hate speech detection in Spanish 🇪🇸, English 🇬🇧, and French 🇫🇷.
	It is part of a master's thesis project focused on real-time detection of hate in videos and transcripts.

	## 🧠 Intended Use

	This model is designed to work with short- to medium-length text snippets extracted from video subtitles or transcripts.
	It returns a binary classification (`hate` or `not hate`) with a probability score for further analysis.
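
Long transcripts usually need to be cut into snippets before classification. The following is a minimal sketch of one way to do that, assuming a simple word-count window; the helper name `split_into_snippets` and the default of 40 words are illustrative assumptions, not part of this model's pipeline.

```python
from typing import List


def split_into_snippets(transcript: str, max_words: int = 40) -> List[str]:
    """Group whitespace-separated words into snippets of at most max_words words.

    Hypothetical helper: the 40-word default is an assumption, not a value
    taken from the model's training setup.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = transcript.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


snippets = split_into_snippets("one two three four five", max_words=2)
print(snippets)  # ['one two', 'three four', 'five']
```

Each resulting snippet can then be classified on its own. A sentence-aware or timestamp-aware splitter may suit subtitles better than a fixed word window.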

	## 📊 Training Data

	This model was fine-tuned on a custom multilingual dataset composed of selected and preprocessed samples from multiple public corpora and custom-curated sets. The training set was carefully constructed to achieve language balance and mitigate demographic bias in hate speech detection.

	| Source Dataset | Language(s) | Description |
	|----------------|-------------|-------------|
	| [`manueltonneau/spanish-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/spanish-hate-speech-superset) | Spanish 🇪🇸 | Aggregated Spanish hate speech datasets. |
	| [`manueltonneau/english-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/english-hate-speech-superset) | English 🇬🇧 | Extensive superset with over 300k samples from English corpora. |
	| [`manueltonneau/french-hate-speech-superset`](https://huggingface.co/datasets/manueltonneau/french-hate-speech-superset) | French 🇫🇷 | Curated superset from multiple French datasets. |
	| `HateCheck` | English (original) + Spanish + French 🌐 | Translated into Spanish and French to test multilingual generalization and error cases. |
	| `Custom Bias Correction Dataset` | Multilingual 🌍 | Designed to mitigate gender, racial, and cultural bias in predictions. |

	> 🧩 The final dataset consists of ~60,000 balanced samples, with comparable representation across Spanish, English, and French, ensuring no language dominates the training phase.

	This balancing process involved sampling, filtering, and label unification from larger sources. The result is a compact, diverse, and inclusive dataset designed to generalize across cultures and languages while avoiding common pitfalls in hate speech modeling.
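
The sampling and label-unification steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script used to build the dataset: the `LABEL_MAP` entries, the `(text, language, label)` tuple layout, and the `per_group` size are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical mapping from source-specific label names to the binary scheme
# (1 = hate, 0 = not hate). The real source labels may differ.
LABEL_MAP = {"hate": 1, "hateful": 1, "normal": 0, "non-hateful": 0, "not_hate": 0}

Example = Tuple[str, str, int]  # (text, language code, unified label)


def unify_label(raw_label: str) -> int:
    """Map a source-specific label string onto the binary label."""
    return LABEL_MAP[raw_label.strip().lower()]


def balance(examples: List[Example], per_group: int, seed: int = 0) -> List[Example]:
    """Sample up to per_group examples for every (language, label) pair."""
    rng = random.Random(seed)
    groups: Dict[Tuple[str, int], List[Example]] = defaultdict(list)
    for example in examples:
        groups[(example[1], example[2])].append(example)
    balanced: List[Example] = []
    for key in sorted(groups):  # sorted for reproducible output order
        members = groups[key]
        balanced.extend(rng.sample(members, min(per_group, len(members))))
    return balanced
```

Grouping by both language and label keeps any single language or class from dominating the sample; groups smaller than `per_group` are kept whole rather than upsampled.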


	## 🔎 How to use

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch.nn.functional as F
	import torch

	model_id = "WhiterBB/multilingual-hatespeech-detection"
	model = AutoModelForSequenceClassification.from_pretrained(model_id)
	tokenizer = AutoTokenizer.from_pretrained(model_id)

	text = "Je déteste cette personne"
	# Truncate so long transcripts stay within the model's maximum input length
	inputs = tokenizer(text, return_tensors="pt", truncation=True)
	with torch.no_grad():
	    logits = model(**inputs).logits

	# Convert logits to probabilities and pick the most likely class
	probs = F.softmax(logits, dim=-1)
	predicted_class = torch.argmax(probs, dim=-1).item()
	confidence = probs[0][predicted_class].item()

	# Class index 1 is "Hate", index 0 is "Not Hate"
	label = "Hate" if predicted_class == 1 else "Not Hate"
	print(f"{label} ({confidence:.2%})")
	```
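
The example above takes the most likely class. Because the model also returns a probability, some applications may prefer a stricter cutoff before flagging text as hate. A minimal sketch, assuming class index 1 is the hate class as in the example above; the function name and the 0.8 threshold are illustrative choices, not recommended settings.

```python
from typing import List, Tuple

HATE_INDEX = 1  # class 1 means "Hate" in the usage example


def label_with_threshold(probs: List[float], threshold: float = 0.5) -> Tuple[str, float]:
    """Return ("Hate" | "Not Hate", hate probability) for one [p_not_hate, p_hate] pair.

    Hypothetical helper: flags text as hate only when the hate probability
    reaches the given threshold.
    """
    hate_prob = probs[HATE_INDEX]
    label = "Hate" if hate_prob >= threshold else "Not Hate"
    return label, hate_prob


print(label_with_threshold([0.35, 0.65], threshold=0.8))  # ('Not Hate', 0.65)
```

With the tensors from the usage example, the input would be `probs[0].tolist()`. Raising the threshold trades recall for precision on the hate class.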

	## 🧪 Metrics

	The model was evaluated on an approximately balanced multilingual dataset of 56,961 examples (30,352 labeled not hate and 26,609 labeled hate). Below are the performance metrics:

	| Class | Precision | Recall | F1-score | Support |
	|-----------|-----------|--------|----------|---------|
	| Not Hate | 0.85 | 0.83 | 0.84 | 30,352 |
	| Hate | 0.81 | 0.83 | 0.82 | 26,609 |

	- Overall Accuracy: 0.83
	- Macro Average: Precision: 0.83, Recall: 0.83, F1-score: 0.83
	- Weighted Average: Precision: 0.83, Recall: 0.83, F1-score: 0.83
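
For readers who want to see how the summary rows follow from the per-class rows, the short check below recomputes them from the reported values. It uses only the numbers in the table; the dictionary layout and function names are illustrative.

```python
# Per-class values copied from the metrics table above.
report = {
    "Not Hate": {"precision": 0.85, "recall": 0.83, "f1": 0.84, "support": 30352},
    "Hate": {"precision": 0.81, "recall": 0.83, "f1": 0.82, "support": 26609},
}


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_average(metric: str) -> float:
    """Unweighted mean of a metric over the classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_average(metric: str) -> float:
    """Mean of a metric over the classes, weighted by class support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


for metric in ("precision", "recall", "f1"):
    print(metric, round(macro_average(metric), 2), round(weighted_average(metric), 2))
```

All six averages round to 0.83, matching the reported summary. The per-class F1 scores also agree with `f1_from` applied to the rounded precision and recall.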

	## 📄 License

	MIT License. Feel free to use, modify, and redistribute this model under the MIT terms, including in academic and non-commercial projects.

	## ✍️ Author

	Made with ❤️ by [WhiterBB](https://github.com/WhiterBB) as part of a final master's thesis (TFM) in Artificial Intelligence.