---
library_name: transformers
license: mit
language:
- am
---
# Amharic Hate Speech Detection Model using Fine-Tuned mBERT
## Overview
This repository presents a **hate speech detection model for the Amharic language**, fine-tuned from multilingual BERT (mBERT) using the **Hugging Face Trainer API**. The fine-tuned model reports an F1-score of 0.9172 and an accuracy of 91.59%.
## Model Details
The base model is Davlan's **bert-base-multilingual-cased-finetuned-amharic**, available on the Hugging Face Hub. This pretrained model was further fine-tuned on a labeled dataset for the downstream task of **hate speech detection** in Amharic.
### Key Highlights:
- **Model Architecture**: mBERT (Multilingual BERT)
- **Training Framework**: HuggingFace's Trainer API
- **Performance**:
  - **F1-Score**: 0.9172
  - **Accuracy**: 91.59%
- **Training Parameters**:
  - **Epochs**: 15
  - **Learning Rate**: 5e-5
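
The training parameters and metrics listed above could be wired into the Trainer API roughly as follows. This is a minimal sketch, not the notebook's actual code: only the epoch count and learning rate come from this model card. The output directory name, the helper names (`binary_metrics`, `compute_metrics`, `build_training_args`), and the choice of binary F1 with label `1` as the positive (hate) class are assumptions, since the card does not state the averaging method or label encoding.

```python
# Hyperparameters stated in this model card; everything else is an assumption.
HYPERPARAMS = {"num_train_epochs": 15, "learning_rate": 5e-5}


def binary_metrics(labels, preds, positive=1):
    """Return accuracy and F1 for the `positive` class (assumed to mean 'hate')."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    correct = sum(1 for y, p in pairs if y == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": correct / len(pairs), "f1": f1}


def compute_metrics(eval_pred):
    """Callback in the shape Trainer expects: (logits, labels) -> dict of floats."""
    logits, labels = eval_pred
    # Arg-max over each row of logits gives the predicted class index.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    return binary_metrics(list(labels), preds)


def build_training_args(output_dir="mbert-amharic-hate-speech"):
    """Build TrainingArguments; imported lazily so the helpers above
    run even where transformers is not installed."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The result of `build_training_args()` and the `compute_metrics` function would then be passed to `Trainer(...)` together with the model and the tokenized train and evaluation splits.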
## Dataset
The model was fine-tuned on a dataset sourced from [Mendeley Data](https://data.mendeley.com/datasets/ymtmxx385m). The dataset consists of **30,000 labeled instances**, a comparatively large resource for Amharic hate speech detection.
### Dataset Overview:
- **Total Samples**: 30,000
- **Source**: Mendeley Data Repository
- **Language**: Amharic
## Model Usage
The complete Google Colab notebook, covering the training process and performance metrics, is available on GitHub:
**[Google Colab Notebook: Amharic Hate Speech Detection Using mBERT](https://github.com/dawit2123/amharic-hate-speech-detection-using-ML/blob/main/Hate_speech_detection_using_amharic_language.ipynb)**
## How to Use
To use this model for Amharic hate speech detection, follow the steps in the Google Colab notebook to load the model and test it on new data. The notebook includes instructions for:
- Loading the fine-tuned mBERT model
- Preprocessing Amharic text data
- Making predictions on new instances
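
Outside the notebook, the three steps above might look like the sketch below. It relies on the standard `transformers` auto classes; `MODEL_ID` is a hypothetical placeholder (this card does not state the Hub repository ID), and `clean_text` is an assumed minimal preprocessing step rather than the notebook's actual pipeline. Label names are read from the model's own `config.id2label`, so no label mapping is assumed here.

```python
import math
import re

# Hypothetical placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-username/amharic-hate-speech-mbert"

URL_RE = re.compile(r"https?://\S+")


def clean_text(text):
    """Assumed minimal preprocessing: drop URLs and collapse whitespace."""
    return " ".join(URL_RE.sub(" ", text).split())


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(texts, model_id=MODEL_ID, max_length=128):
    """Return one (label, probability) pair per input text.

    torch and transformers are imported lazily so the helpers above
    stay usable without them.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(
        [clean_text(t) for t in texts],
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**batch).logits

    results = []
    for row in logits.tolist():
        probs = softmax(row)
        best = max(range(len(probs)), key=lambda i: probs[i])
        results.append((model.config.id2label[best], probs[best]))
    return results
```

Once `MODEL_ID` points at the real repository, calling `classify(["..."])` with a list of Amharic sentences returns the predicted label and its probability for each one.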
---
### Contact Information
If you have any questions or suggestions, feel free to reach out or contribute via GitHub.