SkyAsl committed
Commit dd096c4 · verified · 1 Parent(s): 684b364

Update README.md

Files changed (1):
  1. README.md +26 -26
README.md CHANGED
@@ -21,7 +21,7 @@ tags:
 
 This repository contains a fine-tuned version of
 **unsloth/phi-4-reasoning**, trained with **LoRA** on the
-**Tesslate/Rust_Dataset**.\
+**Tesslate/Rust_Dataset**.
 The goal of this project is to enhance the model's reasoning,
 explanation, and step-by-step thinking abilities specifically for
 **Rust-related tasks**.
@@ -30,10 +30,10 @@ explanation, and step-by-step thinking abilities specifically for
 
 This model was fine-tuned to:
 
-- Improve **Rust coding explanations**\
-- Generate **high-quality reasoning traces**\
-- Provide **step-by-step problem solving**\
-- Give **detailed and structured answers**\
+- Improve **Rust coding explanations**
+- Generate **high-quality reasoning traces**
+- Provide **step-by-step problem solving**
+- Give **detailed and structured answers**
 - Handle **`<think>`{=html}...`</think>`{=html} hidden reasoning
 tags**
 
@@ -88,26 +88,26 @@ print(tokenizer.decode(output[0], skip_special_tokens=False))
 
 **unsloth/phi-4-reasoning**
 
-- 14B parameter reasoning-optimized model\
-- Uses internal `<think>` reasoning\
+- 14B parameter reasoning-optimized model
+- Uses internal `<think>` reasoning
 - Strong on step-by-step chain-of-thought tasks
 
 ## 🛠 Fine-Tuning Details
 
-  Setting            Value
-  ---------------- -----------------------------------------
-  Method             LoRA (PEFT)
-  Rank (r)           16
-  Alpha              32
-  Dropout            0.05
-  Target Modules     q/k/v/o proj, mlp (up/down/gate)
-  Max Length         512
-  Precision          4-bit QLoRA
-  Batch Size         16
-  Grad Accum         8
-  LR                 2e-4
-  Scheduler          cosine
-  Epochs             1
+| Setting        | Value                            |
+|----------------|----------------------------------|
+| Method         | LoRA (PEFT)                      |
+| Rank (r)       | 16                               |
+| Alpha          | 32                               |
+| Dropout        | 0.05                             |
+| Target Modules | q/k/v/o proj, mlp (up/down/gate) |
+| Max Length     | 512                              |
+| Precision      | 4-bit QLoRA                      |
+| Batch Size     | 16                               |
+| Grad Accum     | 8                                |
+| LR             | 2e-4                             |
+| Scheduler      | cosine                           |
+| Epochs         | 1                                |
 
 ## Evaluation
 | Epoch | Training Loss | Validation Loss |
@@ -120,8 +120,8 @@ print(tokenizer.decode(output[0], skip_special_tokens=False))
 
 Includes:
 
-- Rust prompts\
-- Step-by-step reasoning\
+- Rust prompts
+- Step-by-step reasoning
 - Final answers
 
 This dataset improves the model's ability to produce structured and
@@ -131,7 +131,7 @@ accurate explanations for Rust programming tasks.
 
 This model preserves **hidden reasoning structure**:
 
-- `<think>` content is **internal chain-of-thought**\
+- `<think>` content is **internal chain-of-thought**
 - The final output is **placed after the reasoning block**
 
 ⚠️ Users should NOT expect the `<think>` content to be revealed; the
@@ -139,6 +139,6 @@ model is aligned to hide reasoning by default.
 
 ## ✨ Acknowledgements
 
-- **Unsloth** for optimized model training\
-- **HuggingFace Transformers & PEFT** team\
+- **Unsloth** for optimized model training
+- **HuggingFace Transformers & PEFT** team
 - **Tesslate** for providing the Rust dataset
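
The README above notes that `<think>...</think>` content is internal chain-of-thought and that the final answer follows the reasoning block. A minimal post-processing sketch — assuming the decoded generation contains literal `<think>` tags, and using only the standard library (`strip_think` is a hypothetical helper name, not part of this repo) — could look like:

```python
import re

def strip_think(text: str) -> str:
    """Remove the hidden <think>...</think> reasoning block, keeping
    only the final answer that follows it."""
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL)

raw = "<think>Check borrow rules first.</think>Use `&str` for read-only views."
print(strip_think(raw))  # -> Use `&str` for read-only views.
```

Since the model is aligned to hide reasoning by default, this is only a safety net for cases where the tags leak into decoded output.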