ychenNLP/arabic-ner-ace

@@ -1,8 +1,26 @@
 ---
+language:
+- en
+- ar
+tags:
+- exbert
 license: mit
+datasets:
+- ACE2005
 ---
 # Arabic NER
+### Model
+[GigaBERTv4](https://huggingface.co/lanwuwei/GigaBERT-v4-Arabic-and-English)
+### Hyperparameters
+learning_rate=2e-5
+num_train_epochs=10
+weight_decay=0.01
+### ACE2005 Evaluation results
+| Language | Arabic | English  |
+|:----:|:-----------:|:----:|
+|      | 89.4   | 88.8 |
+### How to use
 ```python
 >>> from transformers import pipeline, AutoModelForTokenClassification, AutoTokenizer
 >>> output = ner_pip('قال وزير العدل التركي بكير بوزداغ إن أنقرة تريد 12 مشتبهاً بهم من فنلندا و 21 من السويد')
 >>> print(output)
 [{'entity_group': 'PER', 'score': 0.9996214, 'word': 'وزير', 'start': 4, 'end': 8}, {'entity_group': 'ORG', 'score': 0.9952383, 'word': 'العدل', 'start': 9, 'end': 14}, {'entity_group': 'GPE', 'score': 0.9996675, 'word': 'التركي', 'start': 15, 'end': 21}, {'entity_group': 'PER', 'score': 0.9978992, 'word': 'بكير بوزداغ', 'start': 22, 'end': 33}, {'entity_group': 'GPE', 'score': 0.9997154, 'word': 'انقرة', 'start': 37, 'end': 42}, {'entity_group': 'PER', 'score': 0.9946885, 'word': 'مشتبها بهم', 'start': 51, 'end': 62}, {'entity_group': 'GPE', 'score': 0.99967396, 'word': 'فنلندا', 'start': 66, 'end': 72}, {'entity_group': 'PER', 'score': 0.99694425, 'word': '21', 'start': 75, 'end': 77}, {'entity_group': 'GPE', 'score': 0.99963355, 'word': 'السويد', 'start': 81, 'end': 87}]
+```
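The pipeline returns a flat list of dictionaries, one per detected entity span. As a minimal sketch of how that list could be post-processed, the snippet below groups entity words by label and drops low-confidence predictions. The helper name `group_by_label` and the `min_score` parameter are hypothetical and not part of `transformers`; the sample entries are copied from the example output above so the snippet runs without downloading the model.

```python
from collections import defaultdict

# Three entries copied verbatim from the example output above.
sample_output = [
    {"entity_group": "PER", "score": 0.9978992, "word": "بكير بوزداغ", "start": 22, "end": 33},
    {"entity_group": "GPE", "score": 0.99967396, "word": "فنلندا", "start": 66, "end": 72},
    {"entity_group": "GPE", "score": 0.99963355, "word": "السويد", "start": 81, "end": 87},
]


def group_by_label(entities, min_score=0.0):
    """Hypothetical helper: map each entity label to the words tagged with it.

    Entities whose score is below `min_score` are skipped.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


grouped = group_by_label(sample_output, min_score=0.99)
print(grouped)
```

Raising `min_score` trades recall for precision. The `start` and `end` fields are character offsets into the input sentence, so they can be used to recover the original surface form when the `word` field has been normalized by the tokenizer.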
+### BibTeX entry and citation info
+```bibtex
+@inproceedings{lan2020gigabert,
+  author     = {Lan, Wuwei and Chen, Yang and Xu, Wei and Ritter, Alan},
+  title      = {Giga{BERT}: Zero-shot Transfer Learning from {E}nglish to {A}rabic},
+  booktitle  = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
+  year       = {2020}
+}
+```