MSadeck riotu-lab committed
Commit 4d9261b · verified · 0 Parent(s)

Duplicate from riotu-lab/ArabianGPT-03B


Co-authored-by: Robotics and Internet-of-Things <riotu-lab@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,79 @@
---
license: apache-2.0
language:
- ar
pipeline_tag: text-generation
tags:
- arabic
- text-generation
widget:
- text: "أعلنت وزارة الحج في المملكة العربية السعودية"
  example_title: "مثال ١"
- text: "يبدو اليوم جميلا، سأقوم بتحضير"
  example_title: "مثال ٢"
- text: "إن التقنيات الحديثة"
  example_title: "مثال ٣"
---
# ArabianGPT Model Overview

## Disclaimer for the Use of Large Language Models (LLMs) for Text Generation

<p style="color: red;">We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.3B; users engage with and apply the model's outputs at their own risk.</p>

> **Important Note:** Currently, we offer a raw pre-trained model. Our team is actively working on releasing instruction-based LLMs that are fine-tuned and augmented with RLHF. The first set of pre-trained models has been made available for community exploration. While we do have models fine-tuned for specific tasks such as summarization and sentiment analysis, they are still in the development phase.

## How Can You Use This Pre-Trained Model?
You are invited to use this pre-trained, native Arabic language model as an experimental tool: to assess its capabilities, aid in its fine-tuning, and evaluate its performance across a variety of downstream tasks. We encourage you to review our technical report for a comprehensive understanding of the model's performance metrics and the specific downstream tasks it has been tested on. This will provide valuable insight into its applicability and effectiveness in diverse applications.

## Introduction
ArabianGPT-0.3B, developed under the ArabianLLM initiatives, is a specialized GPT-2 model optimized for Arabic language modeling.
It's a product of the collaborative efforts at Prince Sultan University's Robotics and Internet of Things Lab, focusing on enhancing natural language modeling and generation in Arabic.
This model represents a significant stride in LLM research, specifically addressing the linguistic complexities and nuances of the Arabic language.

## Key Features
- **Architecture**: GPT-2 (see the config sketch after this list)
- **Model Size**: 345 million parameters
- **Layers**: 24
- **Attention Heads**: 16
- **Context Window Size**: 1024 tokens
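
To make these numbers concrete, here is a minimal sketch (not part of the original model card) that builds a matching Hugging Face `GPT2Config`; the values mirror the `config.json` added later in this commit:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Values mirror config.json in this commit: 24 layers, 16 attention
# heads per layer, 1024-dim embeddings, a 1024-token context window,
# and a 64,002-entry vocabulary with <EOS> at id 64000.
config = GPT2Config(
    vocab_size=64002,
    n_positions=1024,
    n_embd=1024,
    n_layer=24,
    n_head=16,
    eos_token_id=64000,
)

# Instantiating from the config yields a randomly initialized model
# with the same shape as ArabianGPT-0.3B; printing the parameter
# count is a quick way to sanity-check the architecture.
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.0f}M parameters")
```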

## Training
- **Dataset**: Scraped texts comprising scientific articles and general texts
- **Data Size**: 23 GB
- **Tokenizer**: Aranizer 64K (see the tokenizer check after this list)
- **Tokens**: Over 3.3 billion
- **Hardware**: 4 NVIDIA A100 GPUs
- **Training Duration**: 45 days
- **Performance**: final training loss of 3.82
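
As a quick sanity check on the tokenizer: the repo ships it as a `PreTrainedTokenizerFast` (see `tokenizer_config.json` below), so it loads with `AutoTokenizer`. This is an illustrative sketch, assuming network access to the Hub:

```python
from transformers import AutoTokenizer

# Loads the fast-tokenizer files committed in this repo
# (tokenizer.json, tokenizer_config.json, special_tokens_map.json).
tokenizer = AutoTokenizer.from_pretrained("riotu-lab/ArabianGPT-03B")

print(len(tokenizer))       # vocabulary size; config.json reports 64002
print(tokenizer.eos_token)  # "<EOS>" per special_tokens_map.json
print(tokenizer.pad_token)  # also "<EOS>"; padding reuses the EOS token
```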

## Role in ArabianLLM Initiatives
ArabianGPT-0.3B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.

## Usage
The model is suitable for Arabic text-generation tasks. Example usage with the Transformers `pipeline`:
```python
from transformers import pipeline

# Load the model and tokenizer from the Hub.
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)

# Prompt taken from the widget examples above.
text = "أعلنت وزارة الحج في المملكة العربية السعودية"
print(pipe(text))
```
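
Alternatively, the weights can be loaded directly and driven through the standard `generate` API; this is a sketch of the generic Transformers workflow, not a recipe from the model authors:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "riotu-lab/ArabianGPT-03B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Another prompt from the widget examples above.
inputs = tokenizer("إن التقنيات الحديثة", return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,      # config.json enables sampling for text-generation
        eos_token_id=64000,  # <EOS> id from generation_config.json
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```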

## Limitations and Ethical Considerations

- The model may show limitations in context understanding or text generation in certain scenarios.
- We emphasize ethical use to prevent the propagation of misinformation or harmful content.

## Acknowledgments

Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.

## Contact Information

For inquiries: [riotu@psu.edu.sa](mailto:riotu@psu.edu.sa).
config.json ADDED
@@ -0,0 +1,39 @@
{
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": null,
  "embd_pdrop": 0.1,
  "eos_token_id": 64000,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_inner": null,
  "n_layer": 24,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "torch_dtype": "float32",
  "transformers_version": "4.26.1",
  "use_cache": true,
  "vocab_size": 64002
}
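
The config above can be cross-checked against the Key Features section of the README; a small sketch, assuming Hub access:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("riotu-lab/ArabianGPT-03B")

# Cross-check against the README's Key Features section.
assert config.n_layer == 24        # Layers
assert config.n_head == 16         # Attention heads
assert config.n_positions == 1024  # Context window
assert config.vocab_size == 64002  # Aranizer 64K vocabulary + specials
print(config.model_type, config.eos_token_id)  # gpt2 64000
```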
generation_config.json ADDED
@@ -0,0 +1,5 @@
{
  "_from_model_config": true,
  "eos_token_id": 64000,
  "transformers_version": "4.26.1"
}
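
The generation defaults are minimal; `GenerationConfig` (available in transformers 4.26, the version recorded above) can confirm that generation stops on the `<EOS>` token. A hedged sketch:

```python
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("riotu-lab/ArabianGPT-03B")
print(gen_config.eos_token_id)  # 64000, i.e. the <EOS> token
```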
optimizer.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c52f2dd9ad7753b66160394f51f6c28e079e194def34b82aba9bf289b89cdfaa
size 2951356165
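
This file, like the other large binaries in this commit, is stored as a Git LFS pointer: three lines giving the spec version, a SHA-256 object id, and the true blob size in bytes. A minimal parsing sketch (illustrative only; real tooling should use the `git lfs` CLI):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the three-line Git LFS pointer format: version, oid, size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"],               # e.g. "sha256:c52f2dd9..."
        "size_bytes": int(fields["size"]),  # size of the actual blob
    }

pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c52f2dd9ad7753b66160394f51f6c28e079e194def34b82aba9bf289b89cdfaa\n"
    "size 2951356165\n"
)
print(f"optimizer.pt is ~{pointer['size_bytes'] / 1e9:.2f} GB")  # ~2.95 GB
```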
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:109bdb895735575bfb56d1314f83d9b56e60e92915c45c51fa104415f1adb8d9
size 1500868893
rng_state.pth ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6955beb6b4d79272de0d5dc7c183a7aa1bbca6701f27e5fbc2a90e42a8a71161
size 17641
scaler.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9823af53e106220ef46f06c7e5f6850dbdad37eaa61409b2eacabc3fb257e46e
size 557
scheduler.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a7a9dc3aeeefb0c0aa7633fb094390c7cac2c64618f5577b903d68a26ca280d9
size 627
special_tokens_map.json ADDED
@@ -0,0 +1,4 @@
{
  "eos_token": "<EOS>",
  "pad_token": "<EOS>"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,4 @@
{
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_class": "PreTrainedTokenizerFast"
}
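
The enormous `model_max_length` is not corruption: it is `int(1e30)`, the sentinel Transformers stores when no explicit maximum length is configured (the practical limit here is the model's 1024-token context). A one-line check:

```python
# 1e30 is a float64, and int(1e30) rounds to exactly this sentinel value.
assert int(1e30) == 1000000000000000019884624838656
```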
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff
 
training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5a82c962654672bda14ea9dd629fd693b93c772f2a5c0c3748f06df0c32968b3
size 3515