RobinMillford committed · Commit 8da1f6a · verified · 1 Parent(s): b63a55a

Updated Readme.md

Files changed (1): README.md (+96 -12)

README.md:
---
base_model: unsloth/phi-4-unsloth-bnb-4bit
tags:
- text-generation
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---

# 🧮 Phi-4 Math Reasoning Model (LoRA Finetuned)

## 📌 Model Overview
This model is a **LoRA fine-tuned version** of [unsloth/phi-4-unsloth-bnb-4bit](https://huggingface.co/unsloth/phi-4-unsloth-bnb-4bit).
It has been fine-tuned specifically for **math reasoning tasks** and can produce step-by-step solutions to arithmetic, algebra, and logic problems.

The base model is **Phi-4**, a 14B-parameter model, optimized with [Unsloth](https://github.com/unslothai/unsloth) for **2x faster training** using Hugging Face's [TRL](https://huggingface.co/docs/trl) library.
This version uses **bnb-4bit quantization**, making it memory-efficient and suitable for single-GPU setups such as a **Tesla T4 (16 GB)** or consumer GPUs.
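
For context, LoRA freezes the base weights and trains only a pair of small low-rank matrices on top of them, which is why the adapter is tiny compared to the 14B base model. A minimal NumPy sketch of the idea (illustrative only — the shapes, rank, and `alpha` below are toy values, not this model's actual configuration):

```python
import numpy as np

# Frozen base weight matrix W of a linear layer (d_out x d_in).
d_out, d_in, r = 8, 8, 2            # r is the LoRA rank, r << d_in
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))

# LoRA trains two small matrices; the learned update is B @ A.
A = rng.normal(size=(r, d_in))      # (r x d_in)
B = np.zeros((d_out, r))            # (d_out x r), zero-initialized, so
                                    # training starts from the base model

alpha = 16                          # LoRA scaling hyperparameter
W_eff = W + (alpha / r) * (B @ A)   # effective weight used at inference

# Only A and B are trained: 2*r*d parameters instead of d*d.
lora_params = A.size + B.size       # 32
full_params = W.size                # 64
```

Because `B` starts at zero, the effective weight initially equals the frozen base weight, and fine-tuning only has to learn the small update.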

---

## ⚡ Key Features
- 🧠 Fine-tuned for **math reasoning and step-by-step solutions**
- ⚡ Efficient: 4-bit quantized, runs on **a single GPU** or even CPU (slower)
- 🚀 Trained with **Unsloth + TRL** for fast, memory-efficient fine-tuning
- 📚 Based on **Phi-4 (14B)**

---

## 📥 Installation
Ensure you have recent versions of the required libraries:
```bash
pip install unsloth transformers accelerate bitsandbytes
```

## 🖥️ Usage (Colab / Local GPU)

```python
import torch
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the LoRA fine-tuned model
model_name = "RobinMillford/phi-4-math-reasoning-lora"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=2048,
    dtype=torch.float16,   # fp16 recommended for GPU
    load_in_4bit=True,     # load in 4-bit quantized mode
    device_map="auto",     # automatically place layers on GPU/CPU
)

# Switch the model into inference mode
FastLanguageModel.for_inference(model)

# Example: generate a step-by-step solution, streaming tokens as they arrive
streamer = TextStreamer(tokenizer)
inputs = tokenizer(
    "Solve step by step: Q: What is 24 * 17 ? A:",
    return_tensors="pt",
).to("cuda")  # assumes a CUDA GPU is available

_ = model.generate(**inputs, streamer=streamer, max_new_tokens=500)
```
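
In practice it helps to build prompts and read off final answers programmatically. The two helpers below are hypothetical conveniences, not part of the model: they assume only the `Solve step by step: Q: ... A:` template and the `Answer:` line format shown on this card.

```python
import re
from typing import Optional

def build_math_prompt(question: str) -> str:
    """Wrap a question in the 'Solve step by step' template used above."""
    return f"Solve step by step: Q: {question} A:"

def extract_answer(generated: str) -> Optional[str]:
    """Pull the last 'Answer: ...' line out of a generated solution, if any."""
    matches = re.findall(r"Answer:\s*(.+)", generated)
    return matches[-1].strip() if matches else None

prompt = build_math_prompt("What is 24 * 17 ?")
sample = "Step 1: 24 * 17 = 24 * 10 + 24 * 7 = 240 + 168.\nAnswer: 408"
final = extract_answer(sample)   # "408"
```

Extracting the `Answer:` line this way makes it easy to compare model output against a trusted calculation, as the disclaimer below recommends.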

## 📊 Example Output

**Prompt:**

```
Solve step by step: Q: What is 45 + 67 ?
```

**Response:**

```
Step 1: Add the ones digits: 5 + 7 = 12. Write down 2 and carry over 1.
Step 2: Add the tens digits plus the carry: 4 + 6 + 1 = 11.
Step 3: Combine the results: 112.
Answer: 112
```
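
The response above walks through ordinary column addition. The same procedure can be written as a small plain-Python checker for spot-verifying generated answers (a hypothetical helper, not shipped with the model):

```python
def add_with_carry(a: int, b: int):
    """Column addition with carries; returns the sum and the worked steps."""
    da, db = str(a)[::-1], str(b)[::-1]   # digits, least-significant first
    carry, digits, steps = 0, [], []
    for i in range(max(len(da), len(db))):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        s = x + y + carry
        steps.append(f"Column {i}: {x} + {y} + carry {carry} = {s}")
        digits.append(s % 10)             # keep the ones digit
        carry = s // 10                   # carry the rest to the next column
    if carry:
        digits.append(carry)
    total = int("".join(map(str, reversed(digits))))
    return total, steps

total, steps = add_with_carry(45, 67)     # total == 112, as in the example
```

Running it on the example reproduces the model's two carry steps and the final answer 112.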

## ⚠️ Disclaimer

This model is intended for research and educational purposes only.
It may not be fully accurate on complex math reasoning tasks, so always verify critical calculations independently.

## ❤️ Made With
- [Unsloth](https://github.com/unslothai/unsloth)
- [Transformers](https://huggingface.co/docs/transformers)
- [TRL](https://huggingface.co/docs/trl)
- [Kaggle Notebook](https://www.kaggle.com/code/yaminh/finetuning-a-llm-for-math-reasoning-sbs)

---