# MAKINI-Hate v1 – African Language Hate Speech Detection

Developed by Algedi Intelligence Labs (Nairobi, Kenya).
## Model Summary
MAKINI-Hate v1 is a fine-tuned XLM-RoBERTa model for detecting hate speech in Swahili and French text. It is the first hate speech detection model trained on both Swahili (AfriHate) and French data with explicit documentation of African context gaps.
Binary classification: Hate vs Normal.
### Labels
| ID | Label | Description |
|---|---|---|
| 0 | Normal | No hate speech detected |
| 1 | Hate | Hate or abusive language detected |
## Performance (Test Set – 4,976 examples)

### Overall
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Normal | 0.91 | 0.89 | 0.90 |
| Hate | 0.87 | 0.89 | 0.88 |
| Macro avg | 0.89 | 0.89 | 0.89 |
| Accuracy | – | – | 0.89 |
### Benchmark – MAKINI-Hate v1 vs Existing Models
Evaluated on AfriHate Swahili test set (3,168 examples).
| Model | Swahili F1 Macro |
|---|---|
| MAKINI-Hate v1 (ours) | 0.92 |
| AfroXLMR-76L (monolingual) | 0.78* |
| GPT-4o (20-shot) | 0.75 |
| SetFit (20-shot) | 0.75 |
| Mistral-7B (5-shot) | 0.59 |
*AfroXLMR-76L results from Muhammad et al. (2025) AfriHate paper. Note: comparisons are not perfectly controlled – AfriHate baselines use 3-class labels while MAKINI uses binary, and binary is an easier task. This limitation is documented transparently.
### By Language
| Language | F1 Macro | Accuracy | Notes |
|---|---|---|---|
| Swahili | 0.92 | 0.92 | Strong – native African annotations |
| French | 0.79 | 0.84 | Weaker – European French training data |

The 13-point gap between Swahili and French performance is a direct consequence of training data geography, not model architecture. This is a documented limitation and an active research gap – see Known Limitations below.
## Training Data
| Dataset | Language | Size | Source |
|---|---|---|---|
| AfriHate (swa) | Swahili | 21,092 | Native-annotated African tweets |
| French Hate Speech Superset | French (European) | 18,071 | 5 European French datasets merged |
| Total | | 39,163 | |
### Label Mapping
AfriHate uses three labels (Hate, Abuse, Normal). The French superset
is binary (1, 0). Both were mapped to binary for this model:
| Original | Source | → Mapped |
|---|---|---|
| Hate | AfriHate | Hate (1) |
| Abuse | AfriHate | Hate (1) |
| Normal | AfriHate | Normal (0) |
| 1 | French superset | Hate (1) |
| 0 | French superset | Normal (0) |
Important: The French superset binarized its source datasets, meaning that what was originally Abusive was collapsed into 1 or 0 by the original authors. The model therefore has no French-language Abusive examples – the Abusive vs Hate distinction in French is an open research gap.
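The mapping above can be sketched as a small helper (the function name and source tags are illustrative, not part of the released preprocessing code):

```python
def to_binary(label, source):
    """Map an original dataset label to the binary scheme above.

    AfriHate's Hate and Abuse both collapse to Hate (1); the French
    superset is already binary and passes through unchanged.
    """
    if source == "afrihate":
        return 1 if label in ("Hate", "Abuse") else 0
    if source == "french_superset":
        return int(label)
    raise ValueError(f"unknown source: {source!r}")
```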
## Training Approach

### Architecture
Fine-tuned xlm-roberta-base (125M parameters) for binary sequence
classification. XLM-RoBERTa was chosen for its multilingual pretraining
across 100 languages including French and Swahili.
### Why We Changed Approach Mid-Training
Attempt 1 (failed): Our first training run produced 0.881 F1 macro on
the validation set across 4 epochs – numbers that looked strong. However,
test set evaluation revealed the model was predicting only Normal for
every input. Hate F1 = 0.00.
Root cause: The WeightedTrainer subclass was incompatible with the
num_items_in_batch handling introduced in recent versions of the
Transformers library. The weighted cross-entropy loss was being silently
bypassed during training. The model learned to predict the majority class
(Normal, ~60% of data) and achieved artificially high accuracy – a classic
majority-class collapse on imbalanced data.
The fix: We rewrote compute_loss to explicitly move class weights to
the same device as the logits (logits.device) and apply
CrossEntropyLoss directly on reshaped tensors, bypassing the Trainer's
internal loss handling entirely. This confirmed working weighted loss from
epoch 1 (F1 0.858, both classes learning).
Lesson: Always evaluate per-class F1 on the test set immediately after training. Aggregate validation metrics alone are insufficient to detect majority-class collapse.
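A minimal sketch of the fix, reduced to the loss computation itself (names here are illustrative; in the actual training run this logic lives inside the rewritten compute_loss of the Trainer subclass):

```python
import torch
from torch import nn

def weighted_ce_loss(logits, labels, class_weights):
    """Weighted cross-entropy applied explicitly on reshaped tensors,
    with the class weights moved to the logits' device so the
    weighting cannot be silently bypassed."""
    weight = torch.as_tensor(class_weights, dtype=logits.dtype,
                             device=logits.device)
    loss_fct = nn.CrossEntropyLoss(weight=weight)
    return loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
```

Inside the Trainer subclass, compute_loss pops the labels from the batch, runs the model, and returns weighted_ce_loss(outputs.logits, labels, [0.83, 1.25]), ignoring the Trainer's num_items_in_batch handling entirely.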
### Hyperparameters
| Parameter | Value |
|---|---|
| Base model | xlm-roberta-base |
| Epochs | 4 |
| Batch size | 32 (train), 64 (eval) |
| Learning rate | 2e-5 |
| Warmup steps | 100 |
| Weight decay | 0.01 |
| Max sequence length | 128 |
| Loss function | Weighted CrossEntropyLoss |
| Class weights | [0.83, 1.25] (Normal, Hate) |
| Hardware | NVIDIA Tesla T4 |
| Training time | ~31 minutes |
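The class weights [0.83, 1.25] are consistent with the standard inverse-frequency ("balanced") heuristic applied to the roughly 60/40 Normal/Hate split described above; a sketch, with illustrative per-class counts derived from that split (the exact counts are not published in this card):

```python
def balanced_weights(class_counts):
    """sklearn-style 'balanced' weights: n_total / (n_classes * n_c)."""
    total = sum(class_counts)
    k = len(class_counts)
    return [round(total / (k * n), 2) for n in class_counts]

# ~60% Normal / ~40% Hate over the 39,163 training examples:
print(balanced_weights([23498, 15665]))  # [0.83, 1.25]
```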
## Usage
```python
from transformers import pipeline

makini_hate = pipeline(
    "text-classification",
    model="Daudipdg/makini-hate-v1",
    return_all_scores=True,  # return scores for both labels
)

# Swahili ("This person deserves to die")
result = makini_hate("Mtu huyu anastahili kufa")
print(result)

# French ("These people don't deserve to live here")
result = makini_hate("Ces gens ne méritent pas de vivre ici")
print(result)
```
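With return_all_scores=True the pipeline returns one list of {label, score} dicts per input. A hedged sketch for turning that output into a moderation decision (the helper and threshold are illustrative; label spellings depend on the model's id2label config, so both variants are matched):

```python
def is_hate(scores, threshold=0.5):
    """Return True if the Hate score crosses the threshold."""
    for entry in scores:
        if entry["label"] in ("Hate", "LABEL_1"):
            return entry["score"] >= threshold
    return False

# decision = is_hate(makini_hate("some text")[0])
```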
## Known Limitations

### 1. European French Bias (Critical)
The French training data comes entirely from European sources (France-based Twitter, NGO datasets). African French – as spoken in Senegal, Côte d'Ivoire, DRC, Cameroon, and other Francophone African countries – has distinct slang, code-switching patterns, and culturally-specific hate targets that this model has never seen.
Impact: The 13-point F1 gap between Swahili (0.92) and French (0.79) is directly attributable to this. A model trained on European French hate speech will systematically underperform on African French content.
What is needed: A purpose-built African French hate speech corpus annotated by native speakers from Francophone Africa. This does not currently exist. Building it is Algedi Intelligence Labs' highest-priority data collection goal for v2.
### 2. No Abusive/Hate Distinction in French
The French superset binarized all labels before release. The model cannot distinguish between Abusive and Hate in French – only in Swahili, where AfriHate preserves this distinction. A future 3-class version requires properly annotated French data.
### 3. Code-Switching Not Covered
Swahili-English (Sheng), French-Wolof (as in AWOFRO), and Camfranglais are common in African online spaces. The model was not trained on code-switched text. We evaluated AWOFRO (3,510 Wolof-French tweets) as a potential training source but excluded it due to inconsistent annotation quality. This remains an open gap.
### 4. Tweet-Format Bias
Both training datasets are sourced from Twitter. Performance on other platforms (Facebook, WhatsApp forwards, forum posts) is untested and may degrade, particularly for longer texts.
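One mitigation for longer content, given the 128-token training limit, is window-and-aggregate scoring. A sketch under assumed heuristics (80-word windows, max-score aggregation), not part of the released model:

```python
def chunk_text(text, max_words=80):
    """Split text into word windows that should stay under the
    model's 128-subword limit after tokenization (a rough heuristic)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)] or [text]

def aggregate_hate(chunk_scores):
    """Flag the whole document on its most hateful window."""
    return max(chunk_scores)
```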
### 5. Languages
Swahili and French only. No support for Hausa, Yoruba, Amharic, Zulu, or other African languages in v1.
## What's Next: MAKINI-Hate v2 Roadmap
| Priority | Task | Impact |
|---|---|---|
| 🔴 High | Collect African French hate speech corpus | Fix 13-point FR gap |
| 🔴 High | Add SHAP/attention explainability layer | Cultural marker extraction |
| 🟡 Medium | Expand to Hausa + Yoruba (AfriHate data exists) | Broader coverage |
| 🟡 Medium | 3-class model (Hate / Abusive / Normal) | Finer-grained output |
| 🟡 Medium | Code-switching robustness benchmark | Sheng, Camfranglais |
| 🟢 Low | Fine-tune on longer-form content | Beyond tweet format |
### Explainability (Planned for v2)
MAKINI-Hate v2 will return structured output including:
```json
{
  "label": "Hate",
  "confidence": 0.94,
  "cultural_markers": ["term_1", "term_2"],
  "explanation": "Detected hate targeting ethnicity",
  "recommendation": "High confidence – automated action defensible",
  "data_warning": null
}
```
This positions the model as a moderation assistant rather than a black-box classifier β critical for enterprise deployment where decisions must be explainable and defensible.
## The African French Gap: A Research Note
This model quantifies for the first time the performance cost of using European French hate speech data to moderate African French content. The 13-point F1 gap (0.92 Swahili vs 0.79 French) is not a modeling failure β it is a data infrastructure failure.
No publicly available African French hate speech dataset exists as of March 2026. The French Hate Speech Superset authors themselves noted that their dataset overrepresents France relative to Francophone Africa. AWOFRO (Ndao et al., 2024) covers Wolof-French code-switching but not standard African French, and was excluded from this work due to annotation inconsistency.
This gap affects every NLP system deployed for content moderation in Francophone Africa. We document it here explicitly so that future work can measure against this baseline.
## Citation
```bibtex
@misc{wachira2026makinihate,
  title     = {MAKINI-Hate v1: African Language Hate Speech Detection},
  author    = {Wachira, David Maina},
  year      = {2026},
  publisher = {Algedi Intelligence Labs},
  url       = {https://huggingface.co/Daudipdg/makini-hate-v1},
  note      = {Fine-tuned on AfriHate (Swahili) and the French Hate Speech
               Superset. First hate speech model with explicit African
               French context gap documentation.}
}
```
## License

CC BY-NC 4.0 – Free for research and non-commercial use.
Commercial use requires explicit permission from Algedi Intelligence Labs.
## Contact

David Maina Wachira – david@makini.tech
Algedi Intelligence Labs – makini.tech