Update README.md
- Docs: [More Information Needed]

# ALIF Base 100M

**ALIF Base 100M** is an Urdu generative language model from the **ALIF الف** series (a Final Year Project at Habib University), developed by **Orature AI**.

## Model Details

* **Developed by:** Orature AI (S.M Ali Naqvi, Zainab Haider, Haya Fatima, Ali M Asad, Hammad Sajid)
* **Supervised by:** Dr. Abdul Samad (Habib University)
* **Model type:** Decoder-only Transformer, GPT-like
* **Variant:** ALIF-Base-100M
* **Language(s) (NLP):** Urdu (ur)
* **License:** Apache 2.0
* **Architecture:** Transformer (GPT-based)
* **Framework:** PyTorch
* **Tokenizer:** Custom SentencePiece tokenizer
* **Hyperparameters:**
  * **Vocabulary Size:** 32000
  * **Embedding Size:** 768
  * **Attention Heads:** 12
  * **Layers:** 12
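As a rough sanity check on the "100M" in the model's name, the hyperparameters above can be turned into a back-of-the-envelope parameter count. The sketch below is illustrative only: tied input/output embeddings, a 4x-wide MLP, learned positional embeddings, and a 1024-token context are all assumptions not stated on this card, and biases and layer norms are ignored.

```python
# Back-of-the-envelope parameter count from the card's hyperparameters.
# Assumptions (NOT stated on the card): tied input/output embeddings,
# 4x-wide MLP, learned positional embeddings, 1024-token context;
# biases and layer-norm parameters are ignored.
vocab_size, d_model, n_layers, n_ctx = 32_000, 768, 12, 1024

embeddings = vocab_size * d_model + n_ctx * d_model  # token + positional
per_layer = (
    4 * d_model * d_model          # attention: Q, K, V, output projections
    + 2 * d_model * (4 * d_model)  # MLP: up- and down-projections
)
total = embeddings + n_layers * per_layer
print(f"~{total / 1e6:.0f}M parameters")  # prints: ~110M parameters
```

Under these assumptions the count lands near the model's nominal 100M size, with most parameters in the transformer blocks and roughly a quarter in the embedding table.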

## Model Description

**ALIF Base 100M** is designed to generate coherent and contextually relevant Urdu text. It leverages a custom Urdu tokenizer trained on the ALIF-Urdu-Corpus and was pretrained on a large corpus of diverse Urdu text.

**Key Features:**
* Optimized for Urdu language nuances.
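Being a decoder-only (GPT-like) model, ALIF produces text one token at a time, feeding each prediction back in as input. The sketch below illustrates that autoregressive loop with greedy decoding; `toy_logits` is a hypothetical stand-in for the model's forward pass, not part of ALIF.

```python
def greedy_generate(prompt_ids, next_token_logits, max_new_tokens=8, eos_id=2):
    """Greedy autoregressive decoding: append the most likely token each step."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)  # stand-in for a model forward pass
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:  # stop at end-of-sequence
            break
        ids.append(next_id)
    return ids

# Hypothetical toy "model": always favors (last token + 1) mod vocab size.
VOCAB = 10
def toy_logits(ids):
    return [1.0 if t == (ids[-1] + 1) % VOCAB else 0.0 for t in range(VOCAB)]

print(greedy_generate([3], toy_logits, max_new_tokens=4))  # → [3, 4, 5, 6, 7]
```

In practice sampling strategies such as temperature or top-k are used instead of pure greedy decoding, but the token-by-token loop is the same.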

* **Research:** Base for further research in Urdu NLP and low-resource language modeling.
* **Fine-tuning:** Can be fine-tuned for downstream tasks such as sentiment analysis, summarization, or domain-specific chatbots in Urdu.
* **Educational Purposes:** Understanding small language model (SLM) behavior for Urdu.

**Limitations:**
* The model is trained primarily on Urdu and may not perform well on other languages or code-switched text unless a variant is specifically designed for it (e.g., an Ur-En variant).
* As a base generative model, it may generate plausible-sounding but incorrect or nonsensical information (hallucinations).
* The model may reflect biases present in its training data. The ALIF-Urdu-Corpus was curated from diverse sources, but societal, gender, or regional biases may still exist.
* Performance on highly specific or technical domains may be limited without further fine-tuning.
* The model has no real-time knowledge; its information is limited to its training data.