metadata
language:
  - sw
  - so
  - am
  - om
  - ti
  - rw
  - pcm
  - ar
  - en
license: apache-2.0
tags:
  - hate-speech
  - text-classification
  - african-languages
  - east-africa
  - afro-xlmr
  - peacebuilding
datasets:
  - AfriHate
  - hateval
  - hatexplain
metrics:
  - accuracy
  - f1
pipeline_tag: text-classification
model-index:
  - name: EA-HS-v3
    results:
      - task:
          type: text-classification
          name: Hate Speech Detection
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.771
          - name: F1
            type: f1
            value: 0.7687

EA-HS: East Africa Hate Speech Classifier v3

Multilingual hate speech classifier for East African languages, built for conflict monitoring and peacebuilding applications.

Model Details

  • Base model: Davlan/afro-xlmr-base (Africa-focused XLM-RoBERTa)
  • Fine-tuned on: AfriHate (7 African languages) + HatEval (Arabic) + HateXplain (English)
  • Labels: 0 (not hate), 1 (hate), 2 (offensive)
  • Languages: Swahili, Somali, Amharic, Oromo, Tigrinya, Kinyarwanda, Nigerian Pidgin, Arabic, English
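
The label ids above can be turned into readable names with a small helper. This is a minimal sketch: it assumes the checkpoint reports generic `LABEL_<id>` strings, which this card does not confirm, and `ID2NAME` and `decode_label` are illustrative names, not part of the model.

```python
# Label ids as listed in this model card.
ID2NAME = {0: "not hate", 1: "hate", 2: "offensive"}


def decode_label(raw_label):
    """Map a label such as 'LABEL_2' (or a bare id like 2 or '2') to its name.

    Assumption: the checkpoint exposes generic 'LABEL_<id>' names. Anything
    that does not end in a known id is returned unchanged.
    """
    text = str(raw_label)
    suffix = text.rsplit("_", 1)[-1]
    if suffix.isdecimal() and int(suffix) in ID2NAME:
        return ID2NAME[int(suffix)]
    return text


# Hypothetical pipeline output, for illustration only.
example = {"label": "LABEL_2", "score": 0.91}
print(decode_label(example["label"]))
```

If the model config already carries readable label names, the helper passes them through untouched.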

Performance

| Version | Base Model | Accuracy | F1 |
|---|---|---|---|
| v3 (current) | afro-xlmr-base | 77.10% | 76.87% |
| v2 | xlm-roberta-base | 76.18% | 75.99% |
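
For readers who want to compute comparable numbers on their own data, the sketch below calculates accuracy and a macro-averaged F1 in plain Python. The card does not state which F1 averaging was used, so macro averaging is an assumption here, and the toy labels are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1, using F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (not from the evaluation set): 0 = not hate, 1 = hate, 2 = offensive.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(f"accuracy: {accuracy(y_true, y_pred):.4f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.4f}")
```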

Usage

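from transformers import pipeline
# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline('text-classification', model='KSvendsen/EA-HS')
result = classifier('This is a test sentence')
print(result)  # a list with one dict holding 'label' and 'score'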
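
When monitoring a stream of posts, it is often useful to score several texts at once and keep the full score distribution rather than only the top label. The sketch below is one way to do that, assuming a `transformers` version in which the text-classification pipeline accepts `top_k=None`. `load_classifier` and `classify_batch` are hypothetical helper names.

```python
def load_classifier(model_id="KSvendsen/EA-HS"):
    """Build the pipeline lazily so importing this module needs no download."""
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_batch(classifier, texts):
    """Return the top label and all label scores for each input text."""
    texts = list(texts)
    # top_k=None asks the pipeline for the score of every label.
    all_scores = classifier(texts, top_k=None)
    results = []
    for text, scores in zip(texts, all_scores):
        best = max(scores, key=lambda s: s["score"])
        results.append(
            {
                "text": text,
                "label": best["label"],
                "score": best["score"],
                "scores": {s["label"]: s["score"] for s in scores},
            }
        )
    return results
```

A call such as `classify_batch(load_classifier(), ["first post", "second post"])` downloads the model on first use and returns one result dict per text.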

Training

  • 5 epochs, batch size 16, learning rate 2e-5
  • Class-weighted loss + minority upsampling
  • ~95k training samples across 9 languages
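
The class-weighted loss and minority upsampling listed above can be illustrated as follows. The card does not give the exact weighting or sampling scheme, so this sketch assumes the common "balanced" heuristic (weight = n_samples / (n_classes * class_count)) and upsampling with replacement to the majority class size. Function names and the toy data are illustrative.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: rarer classes get larger loss weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def upsample_minorities(samples, labels, seed=0):
    """Resample each class with replacement up to the majority class size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out = []
    for label, items in by_class.items():
        # The majority class draws zero extras; minorities are padded.
        extra = rng.choices(items, k=target - len(items))
        out.extend((sample, label) for sample in items + extra)
    rng.shuffle(out)
    return out


# Toy data: 6 "not hate", 3 "hate", 1 "offensive".
toy_labels = [0] * 6 + [1] * 3 + [2] * 1
toy_texts = [f"text {i}" for i in range(len(toy_labels))]
weights = balanced_class_weights(toy_labels)
balanced = upsample_minorities(toy_texts, toy_labels)
print(weights)
print(Counter(label for _, label in balanced))
```

In a training loop, weights like these would typically be passed to the loss function as a per-class weight tensor. Upsampling should be applied to the training split only, so that evaluation data keeps its original class distribution.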

Developed by

MERLx / RIKO - AI-augmented conflict monitoring