# 🧠 Phi-4 Reasoning -- Rust Dataset LoRA (Merged)

This repository contains a fine-tuned version of **unsloth/phi-4-reasoning**, trained with **LoRA** on the **Tesslate/Rust_Dataset**. The goal of this project is to enhance the model's reasoning, explanation, and step-by-step thinking abilities specifically for **Rust-related tasks**.
## 🚀 Model Purpose

This model was fine-tuned to:

- Improve **Rust coding explanations**
- Generate **high-quality reasoning traces**
- Provide **step-by-step problem solving**
- Give **detailed and structured answers**
- Handle **`<think>`...`</think>` hidden reasoning tags**
The training format follows:

    <|user|>
    {prompt}
    <|assistant|>
    <think>
    {reasoning}
    </think>
    {response}
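As a concrete illustration, a training sample in this format can be assembled like so (a minimal sketch; `build_training_example` and its argument names mirror the template placeholders and are not part of any specific dataset schema):

```python
def build_training_example(prompt: str, reasoning: str, response: str) -> str:
    """Assemble one sample in the <|user|>/<|assistant|>/<think> format above."""
    return (
        f"<|user|>\n{prompt}\n"
        f"<|assistant|>\n<think>\n{reasoning}\n</think>\n{response}"
    )

example = build_training_example(
    "Why does `let t = s;` invalidate `s` when `s` is a String?",
    "String owns heap data, so assignment moves ownership rather than copying.",
    "Ownership moves to `t`; either clone with `s.clone()` or borrow with `&s`.",
)
```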
## 🧩 Base Model

**unsloth/phi-4-reasoning**

- 14B-parameter reasoning-optimized model
- Uses internal `<think>` reasoning
- Strong on step-by-step chain-of-thought tasks
## 🛠 Fine-Tuning Details

| Setting        | Value                                   |
|----------------|-----------------------------------------|
| Method         | LoRA (PEFT)                             |
| Rank (r)       | 16                                      |
| Alpha          | 32                                      |
| Dropout        | 0.05                                    |
| Target Modules | q/k/v/o proj, MLP (up/down/gate)        |
| Max Length     | 2048                                    |
| Precision      | 4-bit QLoRA (merged later to BF16/FP16) |
| Batch Size     | 4                                       |
| Grad Accum     | 8                                       |
| LR             | 2e-4                                    |
| Scheduler      | cosine                                  |
| Epochs         | 2                                       |
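The table above corresponds roughly to a PEFT adapter configuration along these lines (a hypothetical reconstruction, not the actual training script; the `*_proj` module names assume the standard Phi-4 layer naming):

```python
from peft import LoraConfig

# Sketch of the LoRA hyperparameters listed in the table above;
# module names are an assumption based on common Phi-4 conventions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP
    ],
    task_type="CAUSAL_LM",
)
```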
## 📚 Dataset

**Tesslate/Rust_Dataset**

Includes:

- Rust prompts
- Step-by-step reasoning
- Final answers

This dataset improves the model's ability to produce structured and accurate explanations for Rust programming tasks.
## 🔧 How to Use

### Load the model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "YOUR_USERNAME/YOUR_MODEL_NAME"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load in the checkpoint's native precision
    device_map="auto",   # place the model on available GPU(s)
)

prompt = "Explain ownership in Rust with examples."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## 🔍 Notes on Reasoning Tags

This model preserves **hidden reasoning structure**:

- `<think>` content is **internal chain-of-thought**
- The final output is **placed after the reasoning block**

⚠️ Users should NOT expect the `<think>` content to be revealed; the model is aligned to hide reasoning by default.
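If the model does emit literal `<think>` blocks during generation, they can be filtered out before display. A minimal post-processing sketch (the tag names follow the training format above; `strip_reasoning` is a hypothetical helper, not part of this repo):

```python
import re

# Matches any <think>...</think> span, including newlines inside it.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_reasoning(text: str) -> str:
    """Drop hidden chain-of-thought blocks, keeping only the final answer."""
    return THINK_RE.sub("", text).strip()

raw = "<think>\nBorrowing avoids the move.\n</think>\nUse `&s` to borrow instead of moving."
print(strip_reasoning(raw))  # → Use `&s` to borrow instead of moving.
```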
## 📦 Files Included

- `config.json`
- `generation_config.json`
- `pytorch_model.bin`
- `tokenizer.json`

If this were a LoRA-only repo (not merged), it would instead contain:

- `adapter_config.json`
- `adapter_model.bin`
## 🔒 License

This model inherits the license of the base model: **Microsoft Phi License / Reasoning Model Terms**

## ✨ Acknowledgements

- **Unsloth** for optimized model training
- The **Hugging Face Transformers & PEFT** teams
- **Tesslate** for providing the Rust dataset