MAKINI-Hate v1 β€” African Language Hate Speech Detection

Developed by Algedi Intelligence Labs (Nairobi, Kenya).

Model Summary

MAKINI-Hate v1 is a fine-tuned XLM-RoBERTa model for detecting hate speech in Swahili and French text. It is the first hate speech detection model trained on both Swahili (AfriHate) and French data with explicit documentation of African context gaps.

Binary classification: Hate vs Normal.


Labels

| ID | Label | Description |
|----|-------|-------------|
| 0 | Normal | No hate speech detected |
| 1 | Hate | Hate or abusive language detected |
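In a transformers model config these IDs are wired up via `id2label`/`label2id`; a minimal sketch of that mapping (the dict literals below are ours, written to match the table above):

```python
# label mapping as it would appear in the model config (a sketch)
id2label = {0: "Normal", 1: "Hate"}
label2id = {label: i for i, label in id2label.items()}
```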

Performance (Test Set: 4,976 examples)

Overall

| Class | Precision | Recall | F1 |
|-------|-----------|--------|----|
| Normal | 0.91 | 0.89 | 0.90 |
| Hate | 0.87 | 0.89 | 0.88 |
| Macro avg | 0.89 | 0.89 | 0.89 |
| Accuracy | | | 0.89 |

Benchmark: MAKINI-Hate v1 vs Existing Models

Evaluated on AfriHate Swahili test set (3,168 examples).

| Model | Swahili F1 Macro |
|-------|------------------|
| MAKINI-Hate v1 (ours) | 0.92 |
| AfroXLMR-76L (monolingual) | 0.78* |
| GPT-4o (20-shot) | 0.75 |
| SetFit (20-shot) | 0.75 |
| Mistral-7B (5-shot) | 0.59 |

*AfroXLMR-76L results are from Muhammad et al. (2025), the AfriHate paper. Note: the comparison is not perfectly controlled. The AfriHate baselines use 3-class labels while MAKINI uses binary, and binary is an easier task.

By Language

| Language | F1 Macro | Accuracy | Notes |
|----------|----------|----------|-------|
| Swahili | 0.92 | 0.92 | Strong: native African annotations |
| French | 0.79 | 0.84 | Weaker: European French training data |

The 13-point gap between Swahili and French performance is a direct consequence of training data geography, not model architecture. This is a documented limitation and an active research gap; see Known Limitations below.


Training Data

| Dataset | Language | Size | Source |
|---------|----------|------|--------|
| AfriHate (swa) | Swahili | 21,092 | Native-annotated African tweets |
| French Hate Speech Superset | French (European) | 18,071 | 5 European French datasets merged |
| Total | | 39,163 | |

Label Mapping

AfriHate uses three labels (Hate, Abuse, Normal). The French superset is binary (1, 0). Both were mapped to binary for this model:

| Original | Source | Mapped |
|----------|--------|--------|
| Hate | AfriHate | Hate (1) |
| Abuse | AfriHate | Hate (1) |
| Normal | AfriHate | Normal (0) |
| 1 | French superset | Hate (1) |
| 0 | French superset | Normal (0) |

Important: The French superset binarized its source datasets, meaning what was originally Abusive was collapsed into 1 or 0 by the original authors. The model therefore has no French-language Abusive examples; the Abusive vs Hate distinction in French is an open research gap.
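The mapping above can be sketched as a small helper (the function name `to_binary` and the source identifiers are illustrative, not from the training code):

```python
# sketch of the binary label mapping described above
def to_binary(source, label):
    if source == "afrihate":
        # AfriHate's Hate and Abuse both collapse to 1
        return 1 if label in {"Hate", "Abuse"} else 0
    # French superset labels are already binary (1 = Hate, 0 = Normal)
    return int(label)
```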


Training Approach

Architecture

Fine-tuned xlm-roberta-base (125M parameters) for binary sequence classification. XLM-RoBERTa was chosen for its multilingual pretraining across 100 languages including French and Swahili.

Why We Changed Approach Mid-Training

Attempt 1 (failed): Our first training run reported 0.881 F1 macro on the validation set across 4 epochs, numbers that looked strong. However, test-set evaluation revealed the model was predicting only Normal for every input: Hate F1 = 0.00.

Root cause: The WeightedTrainer subclass was incompatible with the num_items_in_batch handling introduced in recent versions of the Transformers library, so the weighted cross-entropy loss was silently bypassed during training. The model learned to predict the majority class (Normal, ~60% of the data) and achieved artificially high accuracy: a classic majority-class collapse on imbalanced data.

The fix: We rewrote compute_loss to explicitly move class weights to the same device as the logits (logits.device) and apply CrossEntropyLoss directly on reshaped tensors, bypassing the Trainer's internal loss handling entirely. This confirmed working weighted loss from epoch 1 (F1 0.858, both classes learning).
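A minimal stand-alone sketch of that corrected loss computation, using the class weights from the hyperparameter table; the toy logits and labels are illustrative, not real training data:

```python
import torch
from torch import nn

# class weights reported for this model: (Normal, Hate)
class_weights = torch.tensor([0.83, 1.25])

# toy batch standing in for model output and targets
logits = torch.tensor([[2.0, 0.5], [0.2, 1.8], [1.5, 1.4]])
labels = torch.tensor([0, 1, 0])

# move the weights to the logits' device before building the loss,
# then apply CrossEntropyLoss directly on reshaped tensors
loss_fct = nn.CrossEntropyLoss(weight=class_weights.to(logits.device))
loss = loss_fct(logits.view(-1, 2), labels.view(-1))
```

Because the weight tensor follows the logits' device, the same code runs unchanged on CPU or GPU.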

Lesson: Always evaluate per-class F1 on the test set immediately after training. An aggregate validation metric alone is insufficient to detect majority-class collapse.
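A toy illustration of why per-class F1 matters: a collapsed model that predicts Normal everywhere can still post a respectable accuracy, while its Hate F1 is exactly zero. (The labels below are made up for the demonstration.)

```python
# toy labels and predictions; a collapsed model predicts 0 (Normal) everywhere
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0]

def f1(y_true, y_pred, cls):
    """Per-class F1 from raw label lists."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.6
normal_f1 = f1(y_true, y_pred, 0)  # 0.75
hate_f1 = f1(y_true, y_pred, 1)    # 0.0, despite the "decent" accuracy
```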

Hyperparameters

| Parameter | Value |
|-----------|-------|
| Base model | xlm-roberta-base |
| Epochs | 4 |
| Batch size | 32 (train), 64 (eval) |
| Learning rate | 2e-5 |
| Warmup steps | 100 |
| Weight decay | 0.01 |
| Max sequence length | 128 |
| Loss function | Weighted CrossEntropyLoss |
| Class weights | [0.83, 1.25] (Normal, Hate) |
| Hardware | NVIDIA Tesla T4 |
| Training time | ~31 minutes |
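The reported class weights [0.83, 1.25] are consistent with inverse-frequency weighting on a roughly 60/40 Normal/Hate split; the derivation below is our reconstruction, not taken from the training code:

```python
# sketch: inverse-frequency class weights, assuming a ~60/40 Normal/Hate split
p_normal, p_hate = 0.60, 0.40
w_normal = 1 / (2 * p_normal)  # ~0.83
w_hate = 1 / (2 * p_hate)      # 1.25
```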

Usage

```python
from transformers import pipeline

makini_hate = pipeline(
    "text-classification",
    model="Daudipdg/makini-hate-v1",
    top_k=None,  # return scores for both classes (replaces deprecated return_all_scores=True)
)

# Swahili: "This person deserves to die"
result = makini_hate("Mtu huyu anastahili kufa")
print(result)

# French: "These people do not deserve to live here"
result = makini_hate("Ces gens ne méritent pas de vivre ici")
print(result)
```
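Downstream, the per-class scores can be turned into a moderation decision with a confidence threshold; a minimal sketch (the `scores` list below is a hypothetical example, not real model output, and the 0.80 threshold is an assumption to tune per deployment):

```python
# sketch: converting pipeline scores into a moderation decision
scores = [
    {"label": "Normal", "score": 0.08},
    {"label": "Hate", "score": 0.92},  # hypothetical values
]

best = max(scores, key=lambda s: s["score"])
# only flag when the model is both predicting Hate and confident about it
flag_for_review = best["label"] == "Hate" and best["score"] >= 0.80
```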

Known Limitations

1. European French Bias (Critical)

The French training data comes entirely from European sources (France-based Twitter, NGO datasets). African French, as spoken in Senegal, Côte d'Ivoire, DRC, Cameroon, and other Francophone African countries, has distinct slang, code-switching patterns, and culturally specific hate targets that this model has never seen.

Impact: The 13-point F1 gap between Swahili (0.92) and French (0.79) is directly attributable to this. A model trained on European French hate speech will systematically underperform on African French content.

What is needed: A purpose-built African French hate speech corpus annotated by native speakers from Francophone Africa. This does not currently exist. Building it is Algedi Intelligence Labs' highest-priority data collection goal for v2.

2. No Abusive/Hate Distinction in French

The French superset binarized all labels before release, so the model cannot distinguish Abusive from Hate in French; the distinction survives only in Swahili, where AfriHate preserves it. A future 3-class version requires properly annotated French data.

3. Code-Switching Not Covered

Swahili-English (Sheng), French-Wolof (as in AWOFRO), and Camfranglais are common in African online spaces. The model was not trained on code-switched text. We evaluated AWOFRO (3,510 Wolof-French tweets) as a potential training source but excluded it due to inconsistent annotation quality. This remains an open gap.

4. Tweet-Format Bias

Both training datasets are sourced from Twitter. Performance on other platforms (Facebook, WhatsApp forwards, forum posts) is untested and may degrade, particularly for longer texts.

5. Languages

Swahili and French only. No support for Hausa, Yoruba, Amharic, Zulu, or other African languages in v1.


What's Next: MAKINI-Hate v2 Roadmap

| Priority | Task | Impact |
|----------|------|--------|
| 🔴 High | Collect African French hate speech corpus | Fix 13-point FR gap |
| 🔴 High | Add SHAP/attention explainability layer | Cultural marker extraction |
| 🟡 Medium | Expand to Hausa + Yoruba (AfriHate data exists) | Broader coverage |
| 🟡 Medium | 3-class model (Hate / Abusive / Normal) | Finer-grained output |
| 🟡 Medium | Code-switching robustness benchmark | Sheng, Camfranglais |
| 🟢 Low | Fine-tune on longer-form content | Beyond tweet format |

Explainability (Planned for v2)

MAKINI-Hate v2 will return structured output including:

```json
{
  "label": "Hate",
  "confidence": 0.94,
  "cultural_markers": ["term_1", "term_2"],
  "explanation": "Detected hate targeting ethnicity",
  "recommendation": "High confidence: automated action defensible",
  "data_warning": null
}
```

This positions the model as a moderation assistant rather than a black-box classifier, which is critical for enterprise deployment where decisions must be explainable and defensible.


The African French Gap: A Research Note

This model quantifies for the first time the performance cost of using European French hate speech data to moderate African French content. The 13-point F1 gap (0.92 Swahili vs 0.79 French) is not a modeling failure; it is a data infrastructure failure.

No publicly available African French hate speech dataset exists as of March 2026. The French Hate Speech Superset authors themselves noted that their dataset overrepresents France relative to Francophone Africa. AWOFRO (Ndao et al., 2024) covers Wolof-French code-switching but not standard African French, and was excluded from this work due to annotation inconsistency.

This gap affects every NLP system deployed for content moderation in Francophone Africa. We document it here explicitly so that future work can measure against this baseline.


Citation

```bibtex
@misc{wachira2026makinihate,
  title     = {MAKINI-Hate v1: African Language Hate Speech Detection},
  author    = {Wachira, David Maina},
  year      = {2026},
  publisher = {Algedi Intelligence Labs},
  url       = {https://huggingface.co/Daudipdg/makini-hate-v1},
  note      = {Fine-tuned on AfriHate (Swahili) and French Hate Speech
               Superset. First hate speech model with explicit African
               French context gap documentation.}
}
```

License

CC BY-NC 4.0 – Free for research and non-commercial use.
Commercial use requires explicit permission from Algedi Intelligence Labs.


Contact

David Maina Wachira – david@makini.tech
Algedi Intelligence Labs – makini.tech
