AfriAware — Amharic Hate Speech Classifier 🛡️

A fine-tuned Amharic hate speech classifier for digital safety research in Ethiopia and across Africa.

Usage

from transformers import pipeline

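# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline(
    "text-classification",
    model="abrhamLearns/afriaware-amharic-hate-speech"
)

# Example input, in English: "Ethiopia is our country"
result = classifier("ኢትዮጵያ ሀገራችን ናት")
# Example output: [{'label': 'normal', 'score': 0.94}]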

Labels

| Label   | ID | Meaning                              |
|---------|----|--------------------------------------|
| hate    | 0  | Hatred directed at a person or group |
| abusive | 1  | Offensive/aggressive language        |
| normal  | 2  | Non-harmful content                  |
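
The label IDs above can be mirrored in code when post-processing raw model scores. This is a minimal sketch, assuming the checkpoint's class indices follow the IDs listed in this section. The names `ID2LABEL`, `LABEL2ID`, and `label_from_scores` are illustrative and are not part of the released model.

```python
# Label mapping copied from the Labels section
# (assumed to match the checkpoint's own configuration).
ID2LABEL = {0: "hate", 1: "abusive", 2: "normal"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def label_from_scores(scores):
    """Return the label name of the highest-scoring class.

    `scores` is a sequence of three numbers (logits or probabilities),
    ordered by label ID.
    """
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    best_id = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best_id]


print(label_from_scores([0.02, 0.04, 0.94]))  # prints "normal"
```

Keeping the mapping in one place avoids hard-coding integer IDs throughout downstream analysis code.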

Training Details

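| Setting       | Value                                                 |
|---------------|-------------------------------------------------------|
| Base model    | Davlan/bert-base-multilingual-cased-finetuned-amharic |
| Dataset       | AfriHate (amh)                                        |
| Epochs        | 5 (early stopping, patience=2)                        |
| Learning rate | 2e-5                                                  |
| Batch size    | 16                                                    |
| Max length    | 128 tokens                                            |
| Optimizer     | AdamW, weight decay 0.01                              |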
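
For readers who want to reproduce a similar setup, the settings above could be collected as follows. This is a hedged sketch, not the original training script, which this card does not include. It assumes "Batch size 16" means the per-device training batch size, and the constant names (`TRAINING_KWARGS`, `TOKENIZER_KWARGS`, `EARLY_STOPPING_PATIENCE`) are hypothetical. The dictionary keys are existing `transformers.TrainingArguments` parameters. The evaluation and save strategies are left out because the card does not state them.

```python
# Hyperparameters taken from the Training Details section.
MODEL_NAME = "Davlan/bert-base-multilingual-cased-finetuned-amharic"

# Keyword arguments intended for transformers.TrainingArguments.
# Assumption: the reported batch size is per device.
TRAINING_KWARGS = {
    "num_train_epochs": 5,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "weight_decay": 0.01,            # the Trainer defaults to an AdamW variant
    "load_best_model_at_end": True,  # lets early stopping restore the best checkpoint
}

# Keyword arguments intended for the tokenizer call.
TOKENIZER_KWARGS = {"truncation": True, "max_length": 128}

# Intended for transformers.EarlyStoppingCallback(early_stopping_patience=...).
EARLY_STOPPING_PATIENCE = 2

# Possible wiring (not executed here; requires the transformers package,
# plus matching evaluation and save strategies chosen by the user):
#   from transformers import TrainingArguments, EarlyStoppingCallback
#   args = TrainingArguments(output_dir="out", **TRAINING_KWARGS)
#   callbacks = [EarlyStoppingCallback(early_stopping_patience=EARLY_STOPPING_PATIENCE)]

print(TRAINING_KWARGS)
```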

Dataset

The model was trained on the Amharic subset of AfriHate (Muhammad et al., NAACL 2025), a collection of tweets annotated by native Amharic speakers with full sociocultural context.

Ethical Considerations

  • Intended for research and journalist support tools, not automated removal
  • Annotations reflect Ethiopian social media from 2020–2022
  • Model should always be combined with human review
  • Do not use for surveillance of individuals
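
One way to keep a human in the loop, as the points above recommend, is to route model output into a review queue instead of acting on it automatically. This is a minimal sketch under stated assumptions: the function name `triage`, the queue names, and the 0.8 confidence threshold are all hypothetical choices, not values published with this model.

```python
# Labels that should always be looked at by a person before any action is taken.
FLAGGED_LABELS = {"hate", "abusive"}


def triage(prediction, min_confidence=0.8):
    """Decide where a single prediction goes; never removes content.

    `prediction` is a dict shaped like the pipeline output in Usage,
    e.g. {"label": "normal", "score": 0.94}.
    `min_confidence` is an illustrative threshold, not a tuned value.
    """
    label = prediction["label"]
    score = prediction["score"]
    if label in FLAGGED_LABELS:
        return "human_review"  # flagged content is reviewed, not auto-removed
    if score < min_confidence:
        return "human_review"  # uncertain "normal" predictions are also reviewed
    return "no_action"


print(triage({"label": "normal", "score": 0.94}))  # prints "no_action"
print(triage({"label": "hate", "score": 0.99}))    # prints "human_review"
```

The design choice here is that the model only prioritizes what reviewers see; the final decision stays with a person.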

Citation

@inproceedings{muhammad-etal-2025-afrihate,
    title = "{A}fri{H}ate: A Multilingual Collection of Hate Speech and Abusive
             Language Datasets for {A}frican Languages",
    author = {Muhammad, Shamsuddeen Hassan and others},
    booktitle = "Proceedings of NAACL 2025",
    year = "2025"
}