Update README.md

README.md +26 -26

@@ -21,7 +21,7 @@ tags:
 This repository contains a fine-tuned version of
 **unsloth/phi-4-reasoning**, trained with **LoRA** on the
-**Tesslate/Rust_Dataset**.\
 The goal of this project is to enhance the model's reasoning,
 explanation, and step-by-step thinking abilities specifically for
 **Rust-related tasks**.
@@ -30,10 +30,10 @@ explanation, and step-by-step thinking abilities specifically for
 This model was fine-tuned to:
--   Improve **Rust coding explanations**\
--   Generate **high-quality reasoning traces**\
--   Provide **step-by-step problem solving**\
--   Give **detailed and structured answers**\
 -   Handle **`<think>`{=html}...`</think>`{=html} hidden reasoning
     tags**
@@ -88,26 +88,26 @@ print(tokenizer.decode(output[0], skip_special_tokens=False))
 **unsloth/phi-4-reasoning**
--   14B parameter reasoning-optimized model\
--   Uses internal `<think>` reasoning\
 -   Strong on step-by-step chain-of-thought tasks
 ## 🛠 Fine-Tuning Details
- | Setting        | Value |
-  ---------------- -----------------------------------------
-  Method           LoRA (PEFT)
-  Rank (r)         16
-  Alpha            32
-  Dropout          0.05
-  Target Modules   q/k/v/o proj, mlp (up/down/gate)
-  Max Length       512
-  Precision        4-bit QLoRA
-  Batch Size       16
-  Grad Accum       8
-  LR               2e-4
-  Scheduler        cosine
-  Epochs           1
 ## Evaluation
 | Epoch | Training Loss | Validation Loss |
@@ -120,8 +120,8 @@ print(tokenizer.decode(output[0], skip_special_tokens=False))
 Includes:
--   Rust prompts\
--   Step-by-step reasoning\
 -   Final answers
 This dataset improves the model's ability to produce structured and
@@ -131,7 +131,7 @@ accurate explanations for Rust programming tasks.
 This model preserves **hidden reasoning structure**:
--   `<think>` content is **internal chain-of-thought**\
 -   The final output is **placed after the reasoning block**
 ⚠️ Users should NOT expect the `<think>` content to be revealed; the
@@ -139,6 +139,6 @@ model is aligned to hide reasoning by default.
 ## ✨ Acknowledgements
--   **Unsloth** for optimized model training\
--   **HuggingFace Transformers & PEFT** team\
 -   **Tesslate** for providing the Rust dataset

 This repository contains a fine-tuned version of
 **unsloth/phi-4-reasoning**, trained with **LoRA** on the
+**Tesslate/Rust_Dataset**.
 The goal of this project is to enhance the model's reasoning,
 explanation, and step-by-step thinking abilities specifically for
 **Rust-related tasks**.
 This model was fine-tuned to:
+-   Improve **Rust coding explanations**
+-   Generate **high-quality reasoning traces**
+-   Provide **step-by-step problem solving**
+-   Give **detailed and structured answers**
 -   Handle **`<think>`{=html}...`</think>`{=html} hidden reasoning
     tags**
 **unsloth/phi-4-reasoning**
+-   14B parameter reasoning-optimized model
+-   Uses internal `<think>` reasoning
 -   Strong on step-by-step chain-of-thought tasks
 ## 🛠 Fine-Tuning Details
+  | Setting | Value |
+  |----------------|-----------------------------------------|
+  | Method | LoRA (PEFT) |
+  | Rank (r) | 16 |
+  | Alpha | 32 |
+  | Dropout | 0.05 |
+  | Target Modules | q/k/v/o proj, mlp (up/down/gate) |
+  | Max Length | 512 |
+  | Precision | 4-bit QLoRA |
+  | Batch Size | 16 |
+  | Grad Accum | 8 |
+  | LR | 2e-4 |
+  | Scheduler | cosine |
+  | Epochs | 1 |
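
The settings in the table above imply a couple of derived quantities that are easy to overlook. As a rough sketch (not the author's training script), the snippet below collects the values in plain Python dictionaries and derives the LoRA scaling factor and the effective batch size. The key names in `lora_settings` follow the keyword arguments of `peft.LoraConfig`; the concrete module names (`q_proj`, `gate_proj`, and so on) are assumptions expanded from the shorthand "q/k/v/o proj, mlp (up/down/gate)", and the batch size is assumed to be per device on a single device.

```python
# Values copied from the fine-tuning table. Key names mirror
# peft.LoraConfig keyword arguments so the dict could be unpacked into it.
lora_settings = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    # Assumed module names; the table only gives the shorthand.
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    "task_type": "CAUSAL_LM",
}

# Hypothetical key names for the remaining table rows.
train_settings = {
    "max_length": 512,
    "batch_size": 16,        # assumed to be per device
    "grad_accum_steps": 8,
    "learning_rate": 2e-4,
    "lr_scheduler": "cosine",
    "epochs": 1,
}

# LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_settings["lora_alpha"] / lora_settings["r"]

# Sequences contributing to one optimizer step (single device assumed).
effective_batch_size = (
    train_settings["batch_size"] * train_settings["grad_accum_steps"]
)

# Upper bound on tokens per optimizer step, if every sequence hits max length.
max_tokens_per_step = effective_batch_size * train_settings["max_length"]

print(lora_scaling, effective_batch_size, max_tokens_per_step)
```

With these numbers the update is scaled by 2.0 and each optimizer step sees 128 sequences, which is worth keeping in mind when comparing the 2e-4 learning rate against runs that use a smaller effective batch.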
 ## Evaluation
 | Epoch | Training Loss | Validation Loss |
 Includes:
+-   Rust prompts
+-   Step-by-step reasoning
 -   Final answers
 This dataset improves the model's ability to produce structured and
 This model preserves **hidden reasoning structure**:
+-   `<think>` content is **internal chain-of-thought**
 -   The final output is **placed after the reasoning block**
 ⚠️ Users should NOT expect the `<think>` content to be revealed; the
 ## ✨ Acknowledgements
+-   **Unsloth** for optimized model training
+-   **HuggingFace Transformers & PEFT** team
 -   **Tesslate** for providing the Rust dataset
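
The hidden reasoning structure described earlier places the final answer after the `<think>` block. If a decoded string does contain that block (the README's example decodes with `skip_special_tokens=False`), the two parts can be separated with a small helper. This is a minimal sketch, assuming at most one well-formed `<think>...</think>` pair precedes the answer; the name `split_reasoning` is hypothetical and is not part of this repository or of Transformers.

```python
import re

# Non-greedy match so only the first <think>...</think> pair is captured;
# DOTALL lets the reasoning span multiple lines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a decoded model output.

    If no <think> block is found, reasoning is empty and the whole
    text is treated as the answer.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Made-up sample string, not real model output.
sample = "<think>The vector is moved, so borrow it instead.</think>Pass `&v` to the function."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Returning an empty reasoning string when no tags are present keeps the caller's code path the same whether or not the model emitted a `<think>` block.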