DannyAI committed
Commit d4d48c0 · verified · 1 Parent(s): b926e35

update to Readme file

Files changed (1): README.md +136 -30
README.md CHANGED
@@ -19,9 +19,6 @@ metrics:
 - bertscore
 ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 <details><summary>See axolotl config</summary>
 
@@ -98,41 +95,86 @@ hub_private_repo: false
 
 # phi4_lora_axolotl
 
- This model is a fine-tuned version of [microsoft/Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct) on the DannyAI/African-History-QA-Dataset dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.7479
- - Ppl: 5.7428
- - Memory/max Active (gib): 14.84
- - Memory/max Allocated (gib): 14.84
- - Memory/device Reserved (gib): 31.79
 
- ## Model description
 
- More information needed
 
- ## Intended uses & limitations
 
- More information needed
 
- ## Training and evaluation data
 
- More information needed
 
- ## Training procedure
 
- ### Training hyperparameters
 
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 2
- - eval_batch_size: 2
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 8
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 20
- - training_steps: 650
 
 ### Training results
 
@@ -154,10 +196,74 @@ The following hyperparameters were used during training:
 | 2.5727 | 50.0 | 650 | 1.7479 | 5.7428 | 14.84 | 14.84 | 31.79 |
 
 
 ### Framework versions
 
 - PEFT 0.18.1
 - Transformers 4.57.6
 - Pytorch 2.9.1+cu128
 - Datasets 4.5.0
- - Tokenizers 0.22.2
+ This is a LoRA fine-tuned version of **microsoft/Phi-4-mini-instruct** for African history question answering, trained on the **DannyAI/African-History-QA-Dataset** dataset.
+ It achieves a loss of 1.7479 (perplexity 5.7428) on the validation set.
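For a causal language model, the reported perplexity is simply the exponential of the mean cross-entropy loss, so the two evaluation figures can be cross-checked directly:

```python
import math

# Perplexity of a causal LM is exp(mean cross-entropy loss).
eval_loss = 1.7479
perplexity = math.exp(eval_loss)

print(round(perplexity, 2))  # 5.74, matching the reported Ppl of 5.7428
```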
 
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** Daniel Ihenacho
+ - **Funded by:** Daniel Ihenacho
+ - **Shared by:** Daniel Ihenacho
+ - **Model type:** Text Generation
+ - **Language(s) (NLP):** English
+ - **License:** MIT
+ - **Finetuned from model:** microsoft/Phi-4-mini-instruct
+
+ ## Uses
+
+ This model can be used for question answering about African history.
+
+ ### Out-of-Scope Use
+
+ The model can generate text on topics beyond African history, but it is not intended or evaluated for such use.
+
+ ## How to Get Started with the Model
+
+ ```python
+ import torch
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+
+ model_id = "microsoft/Phi-4-mini-instruct"
+ tokeniser = AutoTokenizer.from_pretrained(model_id)
+
+ # Load the base model
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     device_map="auto",
+     torch_dtype=torch.bfloat16,
+     trust_remote_code=False,
+ )
+
+ # Load the fine-tuned LoRA adapter on top of the base model
+ lora_id = "DannyAI/phi4_lora_axolotl"
+ lora_model = PeftModel.from_pretrained(model, lora_id)
+
+ generator = pipeline(
+     "text-generation",
+     model=lora_model,
+     tokenizer=tokeniser,
+ )
+
+ def generate_answer(question: str) -> str:
+     """Generate an answer for the given question using the fine-tuned LoRA model."""
+     messages = [
+         {"role": "system", "content": "You are a helpful AI assistant specialised in African history which gives concise answers to questions asked."},
+         {"role": "user", "content": question},
+     ]
+     output = generator(
+         messages,
+         max_new_tokens=2048,
+         do_sample=False,  # greedy decoding; temperature has no effect when sampling is off
+         return_full_text=False,
+     )
+     return output[0]["generated_text"].strip()
+
+ question = "What is the significance of African feminist scholarly activism in contemporary resistance movements?"
+ print(generate_answer(question))
+ ```
+ ```
+ # Example output
+ African feminist scholarly activism is significant in contemporary resistance movements as it provides a critical framework for understanding and addressing the specific challenges faced by African women in the context of global capitalism, neocolonialism, and patriarchal structures.
+ ```
+
+ ## Training Details
 
 ### Training results
 
 | 2.5727 | 50.0 | 650 | 1.7479 | 5.7428 | 14.84 | 14.84 | 31.79 |
 
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 8
+ - optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 20
+ - training_steps: 650
+
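The scheduler settings above imply a learning rate that warms up linearly for 20 steps and then decays along a cosine curve to zero at step 650. A small sketch of that shape (assuming the standard linear-warmup cosine schedule, as in transformers' cosine scheduler):

```python
import math

BASE_LR = 2e-05       # learning_rate
WARMUP_STEPS = 20     # lr_scheduler_warmup_steps
TRAINING_STEPS = 650  # training_steps

def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(10))   # mid-warmup: 1e-05
print(lr_at(20))   # peak: 2e-05
print(lr_at(650))  # end of training: ~0.0
```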
+ ### LoRA Configuration
+ - r: 8
+ - lora_alpha: 16
+ - target_modules: ["q_proj", "v_proj", "k_proj", "o_proj"]
+ - lora_dropout: 0.05 # dataset is small, hence a low dropout value
+ - bias: "none"
+ - task_type: "CAUSAL_LM"
+
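At merge time each targeted projection receives a low-rank update, W' = W + (lora_alpha / r) · B·A, where only B and A were trained. A toy pure-Python illustration of that arithmetic with made-up rank-1, 4×4 shapes (the real adapter uses r=8, lora_alpha=16 on the attention projections):

```python
R = 1              # toy rank; the real config uses r=8
ALPHA = 2          # toy lora_alpha; the real config uses 16
SCALE = ALPHA / R  # scaling applied to the low-rank update

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Frozen base weight W (4x4 identity for illustration) and the two
# trainable LoRA factors: B is 4xR, A is Rx4.
W = [[float(i == j) for j in range(4)] for i in range(4)]
B = [[1.0], [0.0], [0.0], [0.0]]
A = [[0.0, 0.5, 0.0, 0.0]]

# Merged weight: W' = W + (alpha / r) * B @ A
BA = matmul(B, A)
W_merged = [[W[i][j] + SCALE * BA[i][j] for j in range(4)] for i in range(4)]

print(W_merged[0])  # first row gains the scaled update: [1.0, 1.0, 0.0, 0.0]
```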
+ ## Evaluation
+
+ ### Metrics
+
+ | Model | BERTScore | TinyMMLU | TinyTruthfulQA |
+ |------|------|------|------|
+ | Base model | 0.88868 | 0.6837 | 0.49745 |
+ | Fine-tuned model | 0.88981 | 0.67371 | 0.46626 |
+
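Reading the table as deltas makes the trade-off explicit: the domain fine-tune buys a small BERTScore gain at the cost of slight regressions on the general-knowledge probes:

```python
# Scores copied from the evaluation table above (fine-tuned minus base).
base  = {"BERTScore": 0.88868, "TinyMMLU": 0.6837, "TinyTruthfulQA": 0.49745}
tuned = {"BERTScore": 0.88981, "TinyMMLU": 0.67371, "TinyTruthfulQA": 0.46626}

deltas = {name: round(tuned[name] - base[name], 5) for name in base}
print(deltas)  # {'BERTScore': 0.00113, 'TinyMMLU': -0.00999, 'TinyTruthfulQA': -0.03119}
```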
+ ## Compute Infrastructure
+
+ Training was run on [Runpod](https://console.runpod.io/).
+
+ ### Hardware
+
+ Runpod A40 GPU instance
+
 ### Framework versions
 
 - PEFT 0.18.1
 - Transformers 4.57.6
 - Pytorch 2.9.1+cu128
 - Datasets 4.5.0
+ - Tokenizers 0.22.2
+
+ ## Citation
+
+ If you use this model, please cite:
+ ```
+ @misc{Ihenacho2026phi4_lora_axolotl,
+   author = {Daniel Ihenacho},
+   title = {phi4_lora_axolotl},
+   year = {2026},
+   publisher = {Hugging Face Models},
+   url = {https://huggingface.co/DannyAI/phi4_lora_axolotl},
+   urldate = {2026-01-27},
+ }
+ ```
+
+ ## Model Card Authors
+
+ Daniel Ihenacho
+
+ ## Model Card Contact
+
+ - [LinkedIn](https://www.linkedin.com/in/daniel-ihenacho-637467223)
+ - [GitHub](https://github.com/daniau23)