---
license: apache-2.0
language: sw
tags:
  - hate-speech
  - swahili
  - text-classification
  - bert
  - offensive-language
  - political-hate-speech
datasets:
  - custom
pipeline_tag: text-classification
---

# Swahili Hate Speech Classification Model

This is a fine-tuned BERT model for **multi-class text classification** in Swahili. It predicts whether a given text is:

- **Non-hate speech**
- **Political hate speech**
- **Offensive language**

## 🧠 Model Details

- **Architecture**: BERT (base)
- **Language**: Swahili
- **Classes**: 3
- **Model size**: 178M parameters
- **Framework**: PyTorch
- **Training data**: A custom, non-public labeled dataset of Swahili social media or online comments

## 🏷️ Labels

| Label ID | Class Name              |
|----------|--------------------------|
| `LABEL_0` | Non-hate speech         |
| `LABEL_1` | Political hate speech   |
| `LABEL_2` | Offensive language      |
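
Because the model reports generic IDs such as `LABEL_0`, it can help to translate them into the class names from the table before showing results. The snippet below is a minimal sketch of that step: the `ID2NAME` dictionary and the `readable` helper are hypothetical names introduced here for illustration, and the sample prediction (including its score) is made up to show the shape of the data.

```python
# Mapping copied from the label table above.
# ID2NAME and readable() are illustrative names, not part of the model or library.
ID2NAME = {
    "LABEL_0": "Non-hate speech",
    "LABEL_1": "Political hate speech",
    "LABEL_2": "Offensive language",
}


def readable(predictions):
    """Return a copy of pipeline-style predictions with class names as labels.

    Labels that are not in ID2NAME are passed through unchanged.
    """
    return [
        {"label": ID2NAME.get(p["label"], p["label"]), "score": p["score"]}
        for p in predictions
    ]


# Made-up prediction in the list-of-dicts shape used for classification results.
sample = [{"label": "LABEL_1", "score": 0.91}]
print(readable(sample))
```

Keeping the mapping in one dictionary means any downstream code only deals with class names, and an unexpected label is surfaced as-is rather than silently dropped.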

## 🚀 Usage

You can load and test the model using the `transformers` library:

```python
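from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="sandbox338/hatespeech")

# Classify a neutral Swahili sentence ("This is an ordinary message without insults.").
result = classifier("Hii ni ujumbe wa kawaida bila matusi.")
print(result)  # e.g. [{'label': 'LABEL_0', 'score': 0.98}]
```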