Duplicate from riotu-lab/ArabianGPT-03B

Co-authored-by: Robotics and Internet-of-Things <riotu-lab@users.noreply.huggingface.co>

Files changed (14)

.gitattributes +35 -0
README.md +79 -0
config.json +39 -0
generation_config.json +5 -0
optimizer.pt +3 -0
pytorch_model.bin +3 -0
rng_state.pth +3 -0
scaler.pt +3 -0
scheduler.pt +3 -0
special_tokens_map.json +4 -0
tokenizer.json +0 -0
tokenizer_config.json +4 -0
trainer_state.json +0 -0
training_args.bin +3 -0

.gitattributes ADDED

	@@ -0,0 +1,35 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
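
The patterns above decide which files in this commit are stored through Git LFS rather than as plain Git blobs. As a rough sketch, the matching can be approximated with Python's `fnmatch` on file names. Note that `fnmatch` is only an approximation of gitattributes pattern rules (it is adequate for simple `*.ext` patterns), and `LFS_PATTERNS` is an illustrative subset of the full list, not the complete file.

```python
from fnmatch import fnmatch

# Illustrative subset of the patterns added to .gitattributes in this commit.
LFS_PATTERNS = ["*.bin", "*.pt", "*.pth", "*.safetensors", "*.h5", "*.onnx"]

# Files changed in this commit, as listed at the top of the page.
COMMIT_FILES = [
    ".gitattributes", "README.md", "config.json", "generation_config.json",
    "optimizer.pt", "pytorch_model.bin", "rng_state.pth", "scaler.pt",
    "scheduler.pt", "special_tokens_map.json", "tokenizer.json",
    "tokenizer_config.json", "trainer_state.json", "training_args.bin",
]


def is_lfs_tracked(filename, patterns=LFS_PATTERNS):
    """Return True if the file name matches any of the LFS glob patterns."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


lfs_files = [name for name in COMMIT_FILES if is_lfs_tracked(name)]
print(lfs_files)
```

The files this selects are the same ones whose diffs on this page show a three-line LFS pointer (`version`, `oid`, `size`) instead of their real content.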

README.md ADDED

	@@ -0,0 +1,79 @@

+---
+license: apache-2.0
+language:
+- ar
+pipeline_tag: text-generation
+tags:
+- arabic
+- text-generation
+widget:
+- text: "أعلنت وزارة الحج في المملكة العربية السعودية"
+  example_title: "مثال ١"
+- text: "يبدو اليوم جميلا، سأقوم بتحضير"
+  example_title: "مثال ٢"
+- text: "إن التقنيات الحديثة"
+  example_title: "مثال ٣"
+---
+# ArabianGPT Model Overview
+## Disclaimer for the Use of Large Language Models (LLMs) for Text Generation
+<p style="color: red;">We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.3B, and users engage with and apply the model's outputs at their own risk.</p>
+> **Important Note:** Currently, we offer a raw pre-trained model. Our team is actively working on releasing instruction-based LLMs that are fine-tuned and augmented with RLHF. The first set of pre-trained models has been made available for community exploration. While we do have models fine-tuned for specific tasks such as summarization and sentiment analysis, they are still in the development phase.
+## How can you use this pre-trained model?
+You are invited to utilize this pre-trained, native Arabic language model as an experimental tool to assess its capabilities, aid in its fine-tuning, and evaluate its performance across a variety of downstream tasks. We encourage you to review our technical report for a comprehensive understanding of the model's performance metrics and the specific downstream tasks it has been tested on. This will provide valuable insights into its applicability and effectiveness in diverse applications.
+## Introduction
+ArabianGPT-0.3B, developed under the ArabianLLM initiatives, is a specialized GPT-2 model optimized for Arabic language modeling.
+It's a product of the collaborative efforts at Prince Sultan University's Robotics and Internet of Things Lab, focusing on enhancing natural language modeling and generation in Arabic.
+This model represents a significant stride in LLM research, specifically addressing the linguistic complexities and nuances of the Arabic language.
+## Key Features
+- **Architecture**: GPT-2
+- **Model Size**: 345 million parameters
+- **Layers**: 24
+- **Attention Heads**: 16
+- **Context Window Size**: 1024 tokens
+## Training
+- **Dataset**: Scraped texts containing scientific articles and general texts
+- **Data Size**: 23 GB
+- **Tokenizer**: Aranizer 64K
+- **Tokens**: Over 3.3 billion
+- **Hardware**: 4 NVIDIA A100 GPUs
+- **Training Duration**: 45 days
+- **Performance**: Loss of 3.82
+## Role in ArabianLLM Initiatives
+ArabianGPT-0.3B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.
+## Usage
+The model is suitable for Arabic text generation tasks. Example usage with the Transformers pipeline:
+```python
+from transformers import pipeline
+pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)
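+text = "إن التقنيات الحديثة"  # any Arabic prompt, e.g. one of the widget examples
+print(pipe(text)[0]["generated_text"])  # the pipeline returns a list of dicts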
+```
+## Limitations and Ethical Considerations
+- The model may misunderstand context or generate low-quality text in some scenarios.
+- Use the model ethically; do not use it to propagate misinformation or harmful content.
+## Acknowledgments
+Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.
+## Contact Information
+For inquiries: [riotu@psu.edu.sa](mailto:riotu@psu.edu.sa).

config.json ADDED

	@@ -0,0 +1,39 @@

+{
+  "_name_or_path": "gpt2",
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": null,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 64000,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 1024,
+  "n_embd": 1024,
+  "n_head": 16,
+  "n_inner": null,
+  "n_layer": 24,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 50
+    }
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.26.1",
+  "use_cache": true,
+  "vocab_size": 64002
+}
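
The shape fields in this config (`n_embd`, `n_layer`, `n_head`, `n_positions`, `vocab_size`) are enough for a back-of-the-envelope size estimate. The sketch below assumes the standard GPT-2 block layout (fused QKV projection, an MLP four times the embedding width because `n_inner` is null, two LayerNorms per block, and an output head tied to the token embeddings). It is an estimate under those assumptions, not a count read from the checkpoint, and it can differ from the headline figure in the README because the embedding table here covers a 64K vocabulary.

```python
def gpt2_parameter_count(vocab_size, n_positions, n_embd, n_layer):
    """Estimate trainable parameters for a GPT-2 style decoder with tied output embeddings."""
    d = n_embd
    token_embeddings = vocab_size * d
    position_embeddings = n_positions * d
    # Per block: fused QKV projection plus the attention output projection.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Per block: MLP with a hidden width of 4 * d (up-projection and down-projection).
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    # Per block: two LayerNorms, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d)
    per_block = attention + mlp + layer_norms
    final_layer_norm = 2 * d
    return token_embeddings + position_embeddings + n_layer * per_block + final_layer_norm


# Values copied from config.json above.
config = {"vocab_size": 64002, "n_positions": 1024, "n_embd": 1024, "n_layer": 24, "n_head": 16}

total = gpt2_parameter_count(
    config["vocab_size"], config["n_positions"], config["n_embd"], config["n_layer"]
)
head_dim = config["n_embd"] // config["n_head"]
print(f"{total:,} parameters, {head_dim} dimensions per attention head")
```

At four bytes per `float32` value, an estimate of this size is in the same range as the `pytorch_model.bin` size reported in its LFS pointer.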

generation_config.json ADDED

	@@ -0,0 +1,5 @@

+{
+  "_from_model_config": true,
+  "eos_token_id": 64000,
+  "transformers_version": "4.26.1"
+}

optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c52f2dd9ad7753b66160394f51f6c28e079e194def34b82aba9bf289b89cdfaa
+size 2951356165

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:109bdb895735575bfb56d1314f83d9b56e60e92915c45c51fa104415f1adb8d9
+size 1500868893
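
Each binary file in this commit appears as a three-line Git LFS pointer (`version`, `oid`, `size`) rather than as its content. A minimal sketch of reading such a pointer and checking downloaded bytes against it follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and written for illustration, not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, info):
    """Check that raw bytes have the size and SHA-256 digest recorded in the pointer."""
    return len(data) == info["size"] and hashlib.sha256(data).hexdigest() == info["oid"]


# Pointer content for pytorch_model.bin, copied from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:109bdb895735575bfb56d1314f83d9b56e60e92915c45c51fa104415f1adb8d9
size 1500868893
"""

info = parse_lfs_pointer(pointer)
print(f"{info['size'] / 1e9:.2f} GB, {info['algorithm']} digest of {len(info['oid'])} hex characters")
```

For a file of this size, a real check would hash the download in chunks instead of holding it all in memory; `matches_pointer` keeps the whole payload in one `bytes` object only to stay short.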

rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6955beb6b4d79272de0d5dc7c183a7aa1bbca6701f27e5fbc2a90e42a8a71161
+size 17641

scaler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9823af53e106220ef46f06c7e5f6850dbdad37eaa61409b2eacabc3fb257e46e
+size 557

scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a7a9dc3aeeefb0c0aa7633fb094390c7cac2c64618f5577b903d68a26ca280d9
+size 627

special_tokens_map.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "eos_token": "<EOS>",
+  "pad_token": "<EOS>"
+}
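
Because `pad_token` and `eos_token` are the same string here, padded positions and genuine end-of-sequence tokens share one id, so only the attention mask can tell them apart in a batch. The sketch below illustrates this with plain Python lists. It assumes that `<EOS>` maps to id 64000 (the `eos_token_id` in `config.json`) and uses left padding, which is a common choice for batched generation with decoder-only models; `pad_batch` is a hypothetical helper, not a library function.

```python
EOS_TOKEN_ID = 64000         # eos_token_id from config.json
PAD_TOKEN_ID = EOS_TOKEN_ID  # assumption: "<EOS>" serves as both pad and EOS with id 64000


def pad_batch(sequences, pad_id=PAD_TOKEN_ID, side="left"):
    """Pad token-id lists to equal length and return (input_ids, attention_mask)."""
    width = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        padding = [pad_id] * (width - len(seq))
        mask_padding = [0] * len(padding)
        if side == "left":
            input_ids.append(padding + list(seq))
            attention_mask.append(mask_padding + [1] * len(seq))
        else:
            input_ids.append(list(seq) + padding)
            attention_mask.append([1] * len(seq) + mask_padding)
    return input_ids, attention_mask


# The second sequence ends with a real EOS token; the first position is padding.
ids, mask = pad_batch([[5, 6, 7], [8, EOS_TOKEN_ID]])
print(ids)
print(mask)
```

In the second row the id 64000 appears twice, once as padding (mask 0) and once as a real token (mask 1), which is why the mask should be passed along with the ids.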

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "model_max_length": 1000000000000000019884624838656,
+  "tokenizer_class": "PreTrainedTokenizerFast"
+}
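
The `model_max_length` value here is so large that it places no practical limit on input length, while `config.json` sets `n_positions` to 1024. A caller therefore probably needs to enforce the context limit explicitly. The following sketch shows one way to budget prompt tokens against the number of tokens to be generated; `prompt_budget` and `truncate_prompt` are hypothetical helpers, and keeping the most recent tokens is an assumed truncation policy.

```python
N_POSITIONS = 1024  # n_positions from config.json
TOKENIZER_MAX = 1000000000000000019884624838656  # model_max_length from tokenizer_config.json


def prompt_budget(max_new_tokens, n_positions=N_POSITIONS):
    """Number of prompt tokens that still leaves room for max_new_tokens."""
    if not 0 <= max_new_tokens < n_positions:
        raise ValueError("max_new_tokens must be in [0, n_positions)")
    return n_positions - max_new_tokens


def truncate_prompt(token_ids, max_new_tokens):
    """Keep the most recent tokens so that prompt plus generation fits the position table."""
    budget = min(prompt_budget(max_new_tokens), TOKENIZER_MAX)
    return list(token_ids[-budget:])


# With max_new_tokens=512, as in the README usage example, 512 prompt tokens remain.
trimmed = truncate_prompt(list(range(2000)), max_new_tokens=512)
print(len(trimmed), trimmed[0])
```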

trainer_state.json ADDED

The diff for this file is too large to render. See raw diff

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5a82c962654672bda14ea9dd629fd693b93c772f2a5c0c3748f06df0c32968b3
+size 3515