EA-HS: East Africa Hate Speech Classifier v3
Multilingual hate speech classifier for East African languages, built for conflict monitoring and peacebuilding applications.
Model Details
- Base model: Davlan/afro-xlmr-base (Africa-focused XLM-RoBERTa)
- Fine-tuned on: AfriHate (7 African languages) + HatEval (Arabic) + HateXplain (English)
- Labels: 0 (not hate), 1 (hate), 2 (offensive)
- Languages: Swahili, Somali, Amharic, Oromo, Tigrinya, Kinyarwanda, Nigerian Pidgin, Arabic, English
Performance
| Version | Base Model | Accuracy | F1 |
|---|---|---|---|
| v3 (current) | afro-xlmr-base | 77.10% | 76.87% |
| v2 | xlm-roberta-base | 76.18% | 75.99% |
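The table does not say how F1 is averaged across the three classes. As a rough sketch, the snippet below computes accuracy together with both macro and support-weighted F1 so the two conventions can be compared. The function name `f1_scores` and the toy labels are illustrative assumptions, not part of the released evaluation code.

```python
def f1_scores(y_true, y_pred, classes=(0, 1, 2)):
    """Return (accuracy, macro F1, support-weighted F1)."""
    per_class, support = {}, {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        per_class[c] = 2 * tp / denom if denom else 0.0
        support[c] = tp + fn
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro = sum(per_class.values()) / len(classes)
    weighted = sum(per_class[c] * support[c] for c in classes) / len(y_true)
    return accuracy, macro, weighted


# Toy labels, invented for illustration only.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]
print(f1_scores(y_true, y_pred))
```

On imbalanced data the macro and weighted figures can differ noticeably, so it helps to know which one a reported F1 refers to.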
Usage
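from transformers import pipeline
# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline('text-classification', model='KSvendsen/EA-HS')
# The result is a list with one dict per input text, each holding a 'label' and a 'score'
result = classifier('This is a test sentence')
print(result)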
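The pipeline returns label strings, not the class names listed under Model Details. A minimal sketch of mapping them back is shown below. It assumes the labels arrive as `LABEL_<id>` (the transformers default when a model config defines no custom names) or as a bare id string. The helper name `readable` and the example prediction are hypothetical. If the model config defines its own names, inspect `classifier.model.config.id2label` instead.

```python
# Label ids as listed under "Model Details" in this card.
ID2NAME = {0: "not hate", 1: "hate", 2: "offensive"}


def readable(prediction: dict) -> tuple:
    """Map one pipeline output dict to (class name, score).

    Assumption: the label is 'LABEL_<id>' or a bare id string such as '2'.
    """
    raw = str(prediction["label"])
    class_id = int(raw.rsplit("_", 1)[-1])
    return ID2NAME[class_id], float(prediction["score"])


# Hypothetical output, shaped like what the text-classification pipeline returns.
example = [{"label": "LABEL_2", "score": 0.91}]
print([readable(p) for p in example])
```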
Training
- 5 epochs, batch size 16, learning rate 2e-5
- Class-weighted loss + minority upsampling
- ~95k training samples across 9 languages
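The card does not specify which weighting formula or upsampling ratio was used. The sketch below shows one common way to combine the two ideas: inverse-frequency class weights and resampling minority classes with replacement up to the majority size. The function names, the toy label counts, and the choice of full balancing are all assumptions for illustration.

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in sorted(counts)}


def upsample_minorities(samples, labels, seed=0):
    """Resample each minority class with replacement up to the majority size."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(items) for items in by_class.values())
    out = []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out.extend((s, y) for s in items + extra)
    rng.shuffle(out)
    return out


# Toy label distribution (made up for illustration, not the real training data).
labels = [0] * 6 + [1] * 2 + [2] * 4
samples = [f"text {i}" for i in range(len(labels))]

print(class_weights(labels))
balanced = upsample_minorities(samples, labels)
print(Counter(y for _, y in balanced))
```

In a PyTorch training loop, weights like these would typically be passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`. Note that applying full upsampling and full inverse-frequency weighting together corrects for imbalance twice, so in practice one or both are often applied only partially.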
Developed by
MERLx / RIKO - AI-augmented conflict monitoring
Evaluation results
- Accuracy (self-reported): 0.771
- F1 (self-reported): 0.769