| | --- |
| | language: |
| | - ar |
| | tags: |
| | - ner |
| | - token-classification |
| | - Arabic-NER |
| | metrics: |
| | - accuracy |
| | - f1 |
| | - precision |
| | - recall |
| | widget: |
| | - text: النجم محمد صلاح لاعب المنتخب المصري يعيش في مصر بالتحديد من نجريج, الشرقية |
| | example_title: Mohamed Salah |
| | - text: انا ساكن في حدايق الزتون و بدرس في جامعه عين شمس |
| | example_title: Egyptian Dialect |
| | - text: يقع نهر الأمازون في قارة أمريكا الجنوبية |
| | example_title: Standard Arabic |
| | datasets: |
| | - Fine-grained-Arabic-Named-Entity-Corpora |
| | pipeline_tag: token-classification |
| | --- |
| | |
| |
|
| | # Arabic Named Entity Recognition |
| |
|
| | This project aims to enrich Arabic Named Entity Recognition (ANER). Arabic is a challenging language to process and poses many difficulties for NER. |
| | We built a model based on AraBERT that supports 50 entity types. |
| |
|
| | # Dataset |
| |
|
| | - [Fine-grained Arabic Named Entity Corpora](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx) |
| |
|
| |
|
| | # Evaluation results |
| |
|
| | The model achieves the following results: |
| |
|
| | | Dataset | WikiFANE Gold | WikiFANE Gold | WikiFANE Gold | NewsFANE Gold | NewsFANE Gold | NewsFANE Gold |
| | |:--------:|:-------:|:-------:|:------:|:------:|:---------:|:------:| |
| | | (metric) | (Recall) | (Precision) | (F1) | (Recall) | (Precision) | (F1) |
| | | | 87.0 | 90.5 | 88.7 | 78.1 | 77.4 | 77.7 |
| |
|
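As a quick consistency check on the table above, the reported F1 values can be recomputed from the reported precision and recall. This is a minimal sketch that assumes F1 is the harmonic mean of the precision and recall shown in the same column group; the model card does not state which averaging scheme was used, and the helper name `f1_score` is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision) pairs copied from the results table
reported = {
    "WikiFANE Gold": (87.0, 90.5),
    "NewsFANE Gold": (78.1, 77.4),
}

for name, (recall, precision) in reported.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.1f}")
```

Under that assumption, the recomputed values round to 88.7 and 77.7, matching the F1 columns in the table.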
| |
|
| | # Usage |
| |
|
| | The model is available on the Hugging Face model hub under the name [boda/ANER](https://huggingface.co/boda/ANER). At the time of writing, checkpoints are available only in PyTorch format. |
| |
|
| | ### Use in Python: |
| |
|
| | ```python |
| | from transformers import AutoTokenizer, AutoModelForTokenClassification |
| | |
| | # Load the tokenizer and the fine-tuned token-classification model from the Hugging Face Hub |
| | tokenizer = AutoTokenizer.from_pretrained("boda/ANER") |
| | |
| | model = AutoModelForTokenClassification.from_pretrained("boda/ANER") |
| | ``` |
| |
|
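To go from a loaded checkpoint to tagged entities, one option is the `transformers` token-classification pipeline. The sketch below is not taken from the original project: the helper names `load_ner_pipeline` and `summarize_entities` are hypothetical, and it assumes the checkpoint works with the standard pipeline without extra text preprocessing.

```python
from typing import Dict, List, Tuple


def load_ner_pipeline(model_name: str = "boda/ANER"):
    """Build a token-classification pipeline (downloads the checkpoint on first use)."""
    # Imported here so the formatting helper below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_name,
        aggregation_strategy="simple",
    )


def summarize_entities(entities: List[Dict]) -> List[Tuple[str, str, float]]:
    """Reduce aggregated pipeline output to (text, label, rounded score) tuples."""
    return [
        (ent["word"], ent["entity_group"], round(float(ent["score"]), 3))
        for ent in entities
    ]
```

For example, `summarize_entities(load_ner_pipeline()("يقع نهر الأمازون في قارة أمريكا الجنوبية"))` would return one tuple per detected entity. The label names depend on the model's own configuration and are not listed here.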
| |
|
| | # Acknowledgments |
| |
|
| | Thanks to [AraBERT](https://github.com/aub-mind/arabert) for providing the Arabic BERT model, which we used as the base model for our work. |
| |
|
| | We would also like to thank [Prof. Fahd Saleh S Alotaibi](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx) of the Faculty of Computing and Information Technology, King Abdulaziz University, for providing the dataset we used to train our model. |
| |
|
| | # Contacts |
| |
|
| | **Abdelrahman Atef** |
| |
|
| | - [LinkedIn](https://linkedin.com/in/boda-sadalla) |
| | - [GitHub](https://github.com/BodaSadalla98) |
| | - <bodasadallah@yahoo.com> |