Ebtihal committed on
Commit
15f7242
·
1 Parent(s): b42f599

Update README.md

Files changed (1)
  1. README.md +39 -11
README.md CHANGED
@@ -1,15 +1,43 @@
1
  Arabic Model AraBertMo_base_V7
 
2
  ---
3
- language:
4
- - ar
5
- tags:
6
- - Fill-Mask
7
- license: apache-2.0
8
- datasets:
9
- - OSCAR
10
  widget:
11
  - text: " السلام عليكم ورحمة[MASK] وبركاتة"
12
- example_title: "Example 1"
13
- - text: "مرحبا بك عزيزي الزائر [MASK] موقعنا"
14
- example_title: "Example 2"
15
- ---
 
1
  Arabic Model AraBertMo_base_V7
2
+
3
  ---
4
+ language: ar
5
+ tags: Fill-Mask
6
+ datasets: OSCAR
7
  widget:
8
  - text: " السلام عليكم ورحمة[MASK] وبركاتة"
9
+ - text: " اهلا وسهلا بكم في [MASK] من سيربح المليون"
10
+ - text: " مرحبا بك عزيزي الزائر [MASK] موقعنا "
11
+
12
+ ---
13
+ # Arabic BERT Model
14
+ **AraBERTMo** is an Arabic pre-trained language model based on [Google's BERT architecture](https://github.com/google-research/bert).
15
+ AraBERTMo_base uses the same BERT-Base config.
16
+ AraBERTMo_base now comes in 10 new variants.
17
+ All models are available on the `HuggingFace` model page under the [Ebtihal](https://huggingface.co/Ebtihal/) name.
18
+ Checkpoints are available in PyTorch format.
19
+
20
+ ## Pretraining Corpus
21
+ The `AraBertMo_base_V7` model was pre-trained on ~3 million words:
22
+ - [OSCAR](https://traces1.inria.fr/oscar/) - Arabic version "unshuffled_deduplicated_ar".
23
+
24
+ ## Training results
25
+ This model achieves the following results:
26
+
27
+ | Task | Num examples | Num epochs | Batch size | Steps | Wall time | Training loss |
28
+ |:----:|:----:|:----:|:----:|:-----:|:----:|:-----:|
29
+ | Fill-Mask| 50046| 7 | 64 | 5915 | 5h 23m 5s | 7.1381 |
30
+
31
+ ## Load Pretrained Model
32
+ To use this model, install `torch` or `tensorflow` together with the Hugging Face `transformers` library, then load it directly like this:
33
+ ```python
34
+ from transformers import AutoTokenizer, AutoModelForMaskedLM
35
+ tokenizer = AutoTokenizer.from_pretrained("Ebtihal/AraBertMo_base_V7")
36
+ model = AutoModelForMaskedLM.from_pretrained("Ebtihal/AraBertMo_base_V7")
37
+ ```
38
+
39
+ ## This model was built for master's degree research at the following organization:
40
+ - [University of Kufa](https://uokufa.edu.iq/).
41
+ - [Faculty of Computer Science and Mathematics](https://mathcomp.uokufa.edu.iq/).
42
+ - **Department of Computer Science**
43
+
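The "Load Pretrained Model" section added in this commit only loads the tokenizer and model; it does not show a prediction. The following is a minimal sketch of how the widget sentences could be queried, assuming the checkpoint works with the standard `fill-mask` pipeline in `transformers` and that its mask token is the literal `[MASK]` used in the widget examples. The helper names `check_single_mask` and `fill_mask` are hypothetical and are not part of the README.

```python
from typing import Dict, List

# Model id as given in the README.
MODEL_ID = "Ebtihal/AraBertMo_base_V7"


def check_single_mask(text: str, mask_token: str = "[MASK]") -> str:
    """Return the text unchanged if it holds exactly one mask token, else raise.

    Assumption: the tokenizer's mask token is "[MASK]", as in the widget examples.
    """
    count = text.count(mask_token)
    if count != 1:
        raise ValueError(f"expected exactly one {mask_token}, found {count}")
    return text


def fill_mask(text: str, top_k: int = 5) -> List[Dict]:
    """Return the top_k candidate fillers for the single masked position."""
    # Imported lazily so the input check above works without transformers installed.
    from transformers import pipeline

    # Downloads the checkpoint from the Hugging Face Hub on first use.
    unmasker = pipeline("fill-mask", model=MODEL_ID)
    return unmasker(check_single_mask(text), top_k=top_k)
```

Calling `fill_mask(" السلام عليكم ورحمة[MASK] وبركاتة")` with the first widget sentence should return a list of dictionaries, each carrying a candidate `token_str` and its `score`. Validating the input first keeps the return shape predictable, since the pipeline returns a nested list when a sentence contains more than one mask.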