Update README.md
README.md
CHANGED
@@ -9,6 +9,38 @@ tags:
 - classification
 library_name: PyTorch
 ---
+
 # Dzarashield
 
-
+Dzarashield is a fine-tuned model based on [DzaraBert](https://huggingface.co/Sifal/dzarabert) designed for text classification tasks. It specializes in hate speech detection for Arabic text.
+
+
+## Limitations
+
+Note that this model has been fine-tuned solely on Arabic characters, so tokens from other languages have been pruned.
+
+## How to use
+
+```
+!git lfs install
+!git clone https://huggingface.co/Sifal/dzarashield
+%cd dzarashield
+
+import torch  # needed for torch.load below
+from model import BertClassifier
+from transformers import PreTrainedTokenizerFast
+
+dzarashield = BertClassifier()
+PATH = "./pytorch_model.bin"
+dzarashield.load_state_dict(torch.load(PATH))  # load the fine-tuned weights
+tokenizer = PreTrainedTokenizerFast(tokenizer_file="tokenizer.json")
+
+```
+
+## Acknowledgments
+
+Dzarashield is built upon the foundations of [Dziribert](https://huggingface.co/alger-ia/dziribert), and I am grateful for their work in making this project possible.
+
+## References
+
+- [Dziribert](https://arxiv.org/pdf/2109.12346.pdf)
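
The Limitations note in the README says the model was fine-tuned only on Arabic characters and that tokens from other languages were pruned. One way to act on that before classifying a text is to measure how much of it is written in Arabic script. The sketch below is only an illustration: the helper name `arabic_ratio` and the chosen Unicode ranges are assumptions and are not part of the Dzarashield repository, and the model's real vocabulary may cover a different set of characters.

```python
import re

# Assumption: "Arabic characters" is approximated by the main Arabic Unicode
# blocks (Arabic, Arabic Supplement, Arabic Extended-A). The tokenizer's
# actual vocabulary may differ.
ARABIC_CHARS = re.compile(r"[\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF]")


def arabic_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters in `text` that are Arabic-script.

    Hypothetical helper for illustration; returns 0.0 when the text
    contains no alphabetic characters at all.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(1 for ch in letters if ARABIC_CHARS.match(ch))
    return arabic / len(letters)
```

A caller could compare the returned ratio against a threshold of its own choosing and skip or flag inputs that are mostly non-Arabic, since the classifier may not represent such text meaningfully.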