Update readme.md
README.md
CHANGED
@@ -16,7 +16,11 @@ tags:
 # ModernBERT Arabic Model Card
 
 ## Overview
-
+
+[!NOTE]
+> This model demonstrates how ModernBERT can be adapted to Arabic for tasks like topic classification.
+
+This is an Experimental Arabic version of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base),trained ONLY on Topic Classification Task using the base model of original modernbert with a custom Arabic trained tokenizer with the following details:
 - **Dataset:** Arabic Wikipedia
 - **Size:** 1.8 GB
 - **Tokens:** 228,788,529 tokens
@@ -32,7 +36,7 @@ This model demonstrates how ModernBERT can be adapted to Arabic for tasks like t
 
 ## Dataset Used For Training:
 
-- [SANAD
+- [SANAD DATASET](https://huggingface.co/datasets/arbml/SANAD) was used for training and testing which contains 7 different topics such as Politics, Finance, Medical, Culture, Sport , Tech and Religion.
 
 ## How to Use
 