Update README.md
README.md
CHANGED
@@ -11,11 +11,11 @@ base_model: aubmindlab/bert-base-arabertv2
 library_name: transformers
 ---
 
-#
+# Egyptian Dialect Text Classification by Fine-tuning AraBERT 🇪🇬
 
 This model is a fine-tuned version of `aubmindlab/bert-base-arabertv2`, specifically trained on Egyptian Arabic text for hate speech and offensive language classification
 
-with
+with 92% accuracy.
 
 It classifies text into 6 categories :
 
@@ -30,7 +30,7 @@ It classifies text into 6 categories :
 
 ---
 
-##
+## Training Details
 
 - **Base Model:** AraBERT v2 (`aubmindlab/bert-base-arabertv2`)
 - **Dataset:** Egyptian Arabic hate speech dataset with 6 labeled categories
@@ -40,7 +40,7 @@ It classifies text into 6 categories :
 
 ---
 
-##
+## Performance Metrics
 
 Training was done using a GPU (`cuda`). Below is a snapshot of model performance across the first 15 epochs:
 
@@ -63,7 +63,7 @@ Training was done using a GPU (`cuda`). Below is a snapshot of model performance
 | 15 | 0.4289 | 0.6864 | 0.9153 | 0.9171 | 0.9153 | 0.9157 |
 
 ---
-##
+## Using the Model Without a Token (For End Users)
 
 Since the model `Woolv7007/egyptian-text-classification` and the `labels.json` file are publicly available, you can load the model, tokenizer, and labels directly without needing any Hugging Face token or special setup.
 
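The token-free loading described in the last hunk could look roughly like the sketch below. It is not taken from the README. It assumes the `transformers`, `huggingface_hub`, and `torch` packages are installed, and that `labels.json` is either a list indexed by class id or a dict keyed by the class id as a string; the real file layout may differ. The helper names `load_classifier`, `top_label`, and `classify` are hypothetical.

```python
import json
import math


def load_classifier(repo_id="Woolv7007/egyptian-text-classification"):
    """Download the public model, tokenizer, and labels.json (no token passed)."""
    # Imported lazily so top_label() below stays usable without these packages.
    from huggingface_hub import hf_hub_download
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    labels_path = hf_hub_download(repo_id=repo_id, filename="labels.json")
    with open(labels_path, encoding="utf-8") as f:
        labels = json.load(f)
    return tokenizer, model, labels


def top_label(logits, labels):
    """Return (label, probability) for the highest-scoring class.

    ASSUMPTION: `labels` is a list indexed by class id, or a dict keyed by
    the class id as a string. Adjust this lookup to the actual labels.json.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # numerically stable softmax
    best = max(range(len(logits)), key=logits.__getitem__)
    prob = exps[best] / sum(exps)
    name = labels[best] if isinstance(labels, list) else labels[str(best)]
    return name, prob


def classify(text, tokenizer, model, labels):
    """Tokenize one string, run the model, and decode the top class."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits, labels)
```

A typical call would be `tokenizer, model, labels = load_classifier()` followed by `classify(some_text, tokenizer, model, labels)`. Keeping `top_label` free of framework imports makes the label decoding easy to check separately from the download step.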