Update README.md
README.md CHANGED
@@ -3,7 +3,7 @@ language: ar
 widget:
 - text: "لكي نتجنب فيروس [MASK]"
 ---
-# arabert_c19: An Arabert model pretrained on 1.5 million COVID-19 multi-dialect Arabic tweets
+# arabert_c19 (https://arxiv.org/pdf/2105.03143.pdf): An Arabert model pretrained on 1.5 million COVID-19 multi-dialect Arabic tweets
 **ARABERT COVID-19** is a pretrained (fine-tuned) version of the AraBERT v2 model (https://huggingface.co/aubmindlab/bert-base-arabertv02). The pretraining was done using 1.5 million multi-dialect Arabic tweets regarding the COVID-19 pandemic from the “Large Arabic Twitter Dataset on COVID-19” (https://arxiv.org/abs/2004.04315).
 The model can achieve better results for the tasks that deal with multi-dialect Arabic tweets in relation to the COVID-19 pandemic.

@@ -28,6 +28,20 @@ text = "للوقايه من عدم انتشار كورونا عليك اولا
 arabert_prep.preprocess(text)
 ```

+# Citation
+
+Please cite as:
+
+``` bibtex
+@misc{ameur2021aracovid19mfh,
+title={AraCOVID19-MFH: Arabic COVID-19 Multi-label Fake News and Hate Speech Detection Dataset},
+author={Mohamed Seghir Hadj Ameur and Hassina Aliane},
+year={2021},
+eprint={2105.03143},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
+```

 # Contacts
 **Hadj Ameur**: [Github](https://github.com/MohamedHadjAmeur) | <mohamedhadjameur@gmail.com> | <mhadjameur@cerist.dz>