AliMuhammad73 committed on
Commit 5003c37 · verified · 1 Parent(s): 7bc80e5

Push model using huggingface_hub.

Files changed (2)
  1. README.md +1 -113
  2. config.json +0 -1
README.md CHANGED
@@ -2,120 +2,8 @@
  tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
- language:
- - ur
  ---
- <!-- ---
- tags:
- - model_hub_mixin
- - pytorch_model_hub_mixin

  This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
  - Library: [More Information Needed]
- - Docs: [More Information Needed] -->
- ---
- license: apache-2.0
-
-
- ---
-
- # ALIF Base 100M
-
- **ALIF Base 100M** is an Urdu generative language model from the **ALIF الف** series (a Final Year Project at Habib University), developed by **Orature AI**. It uses a decoder-only, GPT-2-style Transformer architecture pretrained specifically for the Urdu language.
-
- ## Model Details
-
- * **Developed by:** Orature AI (S.M Ali Naqvi, Zainab Haider, Haya Fatima, Ali M Asad, Hammad Sajid)
- <!-- * **Supervised by:** Dr. Abdul Samad (Habib University) -->
- * **Model type:** Decoder-only Transformer, GPT-like
- * **Variant:** ALIF-Base-100M
- * **Language(s) (NLP):** Urdu (ur)
- * **License:** Apache 2.0
- <!-- * **Finetuned from model (if applicable):** [e.g., `OratureAI/ALIF-Base-1B`] -->
- <!-- * **Related Models:** Other models in the ALIF الف series by Orature AI. -->
- <!-- * **Project Repository/Paper:** [Link to ALIF GitHub Repo or Paper arXiv/Website] -->
- * **Architecture:** Transformer (GPT-based)
- * **Framework:** PyTorch
- * **Tokenizer:** Custom SentencePiece tokenizer
- * **Hyperparameters:**
-   * **Vocabulary Size:** 20000
-   * **Embedding Size:** 768
-   * **Attention Heads:** 12
-   * **Layers:** 12
-
- ## How to Get Started with the Model
-
- First, download the modeling_gpt.py file from the repo. Then, in a separate file, use the following code to generate text from the model:
-
- ```python
- from modeling_gpt import GPTLanguageModel
- from transformers import AutoTokenizer
- import torch
-
- model_name = "orature/ALIF-Base-100M"
- # GPTLanguageModel uses the PyTorchModelHubMixin, so it can load weights
- # directly from the Hub
- model = GPTLanguageModel.from_pretrained(model_name)
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- # Encode the prompt and add a batch dimension
- prompt_urdu = "ایک دفعہ کا ذکر ہے کہ "  # "Once upon a time, "
- inputs = tokenizer.encode(prompt_urdu)
- inputs_tensor = torch.tensor(inputs).unsqueeze(0)
-
- # Generate text; generate() already returns a batched sequence
- outputs = model.generate(inputs_tensor, max_new_tokens=128, temperature=0.7)
- generated_text = tokenizer.decode(outputs[0].tolist())
-
- print(f"Prompt: {prompt_urdu}")
- print(f"Generated Text: {generated_text}")
- ```
-
- ## Model Description
-
- **ALIF Base 100M** is designed to generate coherent and contextually relevant Urdu text. It leverages a custom Urdu tokenizer trained on the ALIF-Urdu-Corpus and was pretrained on a large corpus of diverse Urdu text.
-
- **Key Features:**
- * Optimized for the nuances of the Urdu language.
- * Strong foundational capabilities for further fine-tuning.
- * Generates next tokens in a sequence, making it suitable for a range of text generation tasks.
- * Part of a series aiming to provide efficient and accessible SLMs for Urdu.
-
- ## Intended Uses & Limitations
-
- **Intended Uses:**
- * **Text Generation:** Creative writing, content generation, story completion in Urdu.
- * **Research:** Base for further research in Urdu NLP and low-resource language modeling.
- * **Fine-tuning:** Can be fine-tuned for downstream tasks such as sentiment analysis, summarization, or domain-specific chatbots in Urdu.
- * **Educational Purposes:** Studying SLM behavior for Urdu.
- * **(For instruct variants):** Conversational AI, Q&A, task completion in Urdu.
-
- **Limitations:**
- * The model is trained primarily on Urdu and may not perform well on other languages or code-switched text unless specifically designed for it (e.g., an Ur-En variant).
- * As a base generative model, it may generate plausible-sounding but incorrect or nonsensical information (hallucinations).
- * The model may reflect biases present in the training data. The ALIF-Urdu-Corpus was curated from diverse sources, but biases (e.g., societal, gender, regional) may still exist.
- * Performance on highly specific or technical domains may be limited without further fine-tuning.
- * The model has no real-time knowledge; its information is limited to its training data.
- * Safety: Despite data curation efforts, the model might generate offensive, harmful, or inappropriate content. Users should implement appropriate safeguards for downstream applications.
-
- **Out-of-Scope Uses:**
- * Generating high-stakes advice (medical, legal, financial) without human oversight.
- * Impersonation or generating misleading information.
- * Applications that could lead to harm or discrimination.
- * Complex scientific, technical, mathematical, or legal reasoning without further fine-tuning.
- * Any use that violates ethical guidelines or legal standards.
-
-
- <!-- ## Citation
-
- If you use this model in your research or applications, please cite the following paper:
-
- @misc{alif2025,
-   title={},
-   author={},
-   year={},
-   publisher={},
-   howpublished={},
-   note={},
-   url={}
- }
- -->
 
  tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin

  ---

  This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
  - Library: [More Information Needed]
+ - Docs: [More Information Needed]
config.json CHANGED
@@ -1,5 +1,4 @@
  {
- "model_type": "llama",
  "block_size": 1024,
  "n_embd": 768,
  "n_head": 12,
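
The fields kept in config.json after this commit can be sanity-checked against the "100M" in the model name. A minimal sketch, using an inline stand-in for the repo's config.json (the `n_layer` and `vocab_size` values are not visible in the truncated diff above and are taken from the model card's hyperparameter list; the 12·n_layer·d² rule of thumb for GPT-style blocks is an approximation, not the exact count):

```python
import json

# Inline stand-in for the repo's config.json after this commit
# ("model_type" was removed; n_layer and vocab_size are assumed from
# the model card: 12 layers, 20000-token vocabulary)
config = json.loads("""
{
  "block_size": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_layer": 12,
  "vocab_size": 20000
}
""")
assert "model_type" not in config  # the key this commit removes

# Rough GPT-style parameter estimate: ~12 * n_layer * n_embd^2 for the
# transformer blocks, plus token and position embedding tables
n_params = (
    12 * config["n_layer"] * config["n_embd"] ** 2
    + config["vocab_size"] * config["n_embd"]
    + config["block_size"] * config["n_embd"]
)
print(f"~{n_params / 1e6:.0f}M parameters")  # ≈ 101M, consistent with the 100M name
```

The estimate landing near 100M is a useful cross-check that the config fields and the model card's hyperparameters describe the same network.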