Push model using huggingface_hub.

Files changed:
- README.md +6 -85
- config.json +7 -0
- model.safetensors +3 -0

README.md
CHANGED
@@ -1,88 +1,9 @@
---

---

## Model Details

* **Developed by:** Orature AI (S.M Ali Naqvi, Zainab Haider, Haya Fatima, Ali M Asad, Hammad Sajid)
<!-- * **Supervised by:** Dr. Abdul Samad (Habib University) -->
* **Model type:** Decoder-only Transformer, GPT-like
* **Variant:** ALIF-Base-100M
* **Language(s) (NLP):** Urdu (ur)
* **License:** Apache 2.0
<!-- * **Finetuned from model (if applicable):** [e.g., `OratureAI/ALIF-Base-1B`] -->
<!-- * **Related Models:** Other models in the ALIF الف series by Orature AI. -->
<!-- * **Project Repository/Paper:** [Link to ALIF GitHub Repo or Paper arXiv/Website] -->
* **Architecture:** Transformer (GPT-based)
* **Framework:** PyTorch
* **Tokenizer:** Custom SentencePiece tokenizer
* **Hyperparameters:**
  * **Vocabulary Size:** 32000
  * **Embedding Size:** 768
  * **Attention Heads:** 12
  * **Layers:** 12
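A back-of-the-envelope check shows these hyperparameters land in the 100M-parameter range (a sketch assuming a standard GPT block with a tied output head and learned positional embeddings over the 1024-token context given in the repo's config.json; not an official count):

```python
# Rough GPT parameter estimate from the hyperparameters listed above.
vocab_size = 32000   # vocabulary size
n_embd = 768         # embedding size
n_layer = 12         # transformer layers
block_size = 1024    # context length (from config.json)

token_emb = vocab_size * n_embd   # token embeddings, tied with the output head
pos_emb = block_size * n_embd     # learned positional embeddings
per_layer = 12 * n_embd ** 2      # 4*d^2 for attention + 8*d^2 for the MLP
total = token_emb + pos_emb + n_layer * per_layer

print(f"~{total / 1e6:.0f}M parameters")  # ~110M
```

This is consistent with the ~440 MB fp32 `model.safetensors` file added in this commit.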
## How to Get Started with the Model

First, download the `modeling_gpt.py` file from the repo. You can then use the following code in a separate file to generate text from the model:

```python
from modeling_gpt import GPTLanguageModel
from transformers import AutoTokenizer
import torch

model_name = "OratureAI/[MODEL_NAME_ON_HF_HUB]"  # e.g., OratureAI/ALIF-Instruct-1B
model = GPTLanguageModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Encode the prompt and add a batch dimension
prompt_urdu = "ایک دفعہ کا ذکر ہے کہ "  # "Once upon a time, "
inputs = tokenizer.encode(prompt_urdu)
inputs_tensor = torch.tensor(inputs).unsqueeze(0)

# Generate text and decode it back to a string
outputs = model.generate(inputs_tensor, max_new_tokens=128, temperature=0.7)
generated_text = tokenizer.decode(outputs[0].squeeze().tolist())

print(f"Prompt: {prompt_urdu}")
print(f"Generated Text: {generated_text}")
```
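The `temperature` argument passed to `generate` scales the logits before sampling. A minimal standalone illustration of the idea (a sketch of the general technique, not ALIF's actual sampling code):

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, seed=0):
    """Scale logits by 1/temperature, softmax them, and sample one index."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = random.Random(seed).choices(range(len(probs)), weights=probs, k=1)[0]
    return idx, probs

# Lower temperature concentrates probability on the highest-logit token.
_, sharp = sample_with_temperature([2.0, 1.0, 0.1], temperature=0.5)
_, flat = sample_with_temperature([2.0, 1.0, 0.1], temperature=2.0)
print(round(sharp[0], 3), round(flat[0], 3))  # 0.864 0.502
```

At `temperature=0.7`, as in the snippet above, sampling is mildly sharpened relative to the raw distribution, trading some diversity for coherence.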
## Model Description

ALIF Base 100M is designed to generate coherent and contextually relevant Urdu text. It leverages a custom Urdu tokenizer trained on the ALIF-Urdu-Corpus and was pretrained on a large corpus of diverse Urdu text.

**Key Features:**
* Optimized for the nuances of the Urdu language.
* Strong foundational capabilities for further fine-tuning (for base models).
* Capable of generating next tokens in a sequence, making it suitable for various text generation tasks.
* Part of a series aiming to provide efficient and accessible SLMs for Urdu.

## Intended Uses & Limitations

**Intended Uses:**
* **Text Generation:** Creative writing, content generation, and story completion in Urdu.
* **Research:** A base for further research in Urdu NLP and low-resource language modeling.
* **Fine-tuning:** Can be fine-tuned for specific downstream tasks such as sentiment analysis, summarization, or domain-specific chatbots in Urdu.
* **Educational Purposes:** Understanding SLM behavior for Urdu.
* **(For Instruct Models):** Conversational AI, Q&A, and task completion in Urdu.

**Limitations:**
* The model is trained primarily on Urdu and may not perform well on other languages or code-switched text unless specifically designed for it (e.g., an Ur-En variant).
* As a base generative model (especially for non-instruct versions), it may generate plausible-sounding but incorrect or nonsensical information (hallucinations).
* The model may reflect biases present in the training data. The ALIF-Urdu-Corpus was curated from diverse sources, but biases (e.g., societal, gender, regional) may still exist.
* Performance on highly specific or technical domains may be limited without further fine-tuning.
* The model has no real-time knowledge; its information is limited to its training data.
* Safety: While efforts were made to curate the data, the model might still generate offensive, harmful, or inappropriate content. Users should implement appropriate safeguards for downstream applications.

**Out-of-Scope Uses:**
* Generating high-stakes advice (medical, legal, financial) without human oversight.
* Impersonation or generating misleading information.
* Applications that could lead to harm or discrimination.
* Complex scientific, technical, mathematical, or legal reasoning without further fine-tuning.
* Any use that violates ethical guidelines or legal standards.
---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---

This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:

- Library: [More Information Needed]
- Docs: [More Information Needed]
config.json
ADDED
@@ -0,0 +1,7 @@

```json
{
  "block_size": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_layer": 12,
  "vocab_size": 32000
}
```
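These fields can be sanity-checked with only the standard library; in particular, the embedding size must split evenly across the attention heads (a quick sketch; the field names come from the config.json above):

```python
import json

config_text = """
{
  "block_size": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_layer": 12,
  "vocab_size": 32000
}
"""

cfg = json.loads(config_text)

# The embedding size must divide evenly by the number of attention heads.
assert cfg["n_embd"] % cfg["n_head"] == 0
head_dim = cfg["n_embd"] // cfg["n_head"]
print(head_dim)  # 64
```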
model.safetensors
ADDED
@@ -0,0 +1,3 @@

```
version https://git-lfs.github.com/spec/v1
oid sha256:c48224872f903c46b56d28661b21f1add2e0347681fd992989886b10d59638cf
size 441815720
```
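Note that the blob above is a Git LFS pointer, not the weights themselves: Git LFS keeps only the `version`, `oid`, and `size` lines in the repository and stores the real file elsewhere. The `size` field also doubles as a parameter-count sanity check (a sketch assuming the weights are stored as fp32, i.e. 4 bytes per parameter):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict keyed by the first word of each line."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:c48224872f903c46b56d28661b21f1add2e0347681fd992989886b10d59638cf
size 441815720"""

fields = parse_lfs_pointer(pointer)
params = int(fields["size"]) // 4  # bytes / 4 bytes per fp32 weight
print(f"~{params / 1e6:.0f}M parameters")  # ~110M
```

The result is close to the rough estimate derived from the model's hyperparameters, as expected for a 100M-scale model.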