---
license: apache-2.0
language: sw
tags:
- hate-speech
- swahili
- text-classification
- bert
- offensive-language
- political-hate-speech
datasets:
- custom
pipeline_tag: text-classification
---
# Swahili Hate Speech Classification Model
This is a BERT model fine-tuned for **multi-class text classification** of Swahili text. It assigns each input to one of three classes:
- **Non-hate speech**
- **Political hate speech**
- **Offensive language**
## 🧠 Model Details
- **Architecture**: BERT (base)
- **Languages**: Swahili
- **Classes**: 3
- **Model size**: 178M parameters
- **Framework**: PyTorch
- **Training data**: A custom labeled dataset of Swahili social media and online comments (not publicly available)
## 🏷️ Labels
| Label ID | Class Name |
|----------|--------------------------|
| `LABEL_0` | Non-hate speech |
| `LABEL_1` | Political hate speech |
| `LABEL_2` | Offensive language |
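
The generic `LABEL_n` IDs can be mapped to the class names in the table above after prediction. The snippet below is a minimal sketch, not part of the model itself: `LABEL_NAMES` and `to_class_names` are hypothetical helper names, and the input is assumed to be a list of dicts with `label` and `score` keys, as in this card's usage example.

```python
# Mapping copied from the label table; the names below are illustrative helpers.
LABEL_NAMES = {
    "LABEL_0": "Non-hate speech",
    "LABEL_1": "Political hate speech",
    "LABEL_2": "Offensive language",
}


def to_class_names(predictions):
    """Return copies of pipeline-style predictions with a readable class name added."""
    return [
        {**pred, "class_name": LABEL_NAMES.get(pred["label"], "unknown")}
        for pred in predictions
    ]


# Hand-written sample in the assumed output shape (the score is illustrative).
sample = [{"label": "LABEL_0", "score": 0.98}]
print(to_class_names(sample))
```

Keeping the mapping in one dictionary means downstream code never has to hard-code the meaning of `LABEL_1` or `LABEL_2`; an ID outside the table falls back to `"unknown"` rather than raising an error.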
## 🚀 Usage
You can load and test the model using the `transformers` library:
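```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="sandbox338/hatespeech")

# Input means: "This is an ordinary message without insults."
result = classifier("Hii ni ujumbe wa kawaida bila matusi.")
print(result)  # e.g. [{'label': 'LABEL_0', 'score': 0.98}]
```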