WYK committed on
Commit 19750fa · verified
1 Parent(s): f4cf341

Update Readme.md and model card

  1. README.md +40 -1
README.md CHANGED
@@ -1,7 +1,46 @@
  ---
  license: apache-2.0
  datasets:
  - nvidia/Nemotron-SFT-Math-v3
  base_model:
  - Qwen/Qwen3.5-4B
- ---
  ---
  license: apache-2.0
+ tags:
+ - math
+ - reasoning
+ - qwen
+ - llama-cpp
+ - gguf
+ - lora
+ - chain-of-thought
  datasets:
  - nvidia/Nemotron-SFT-Math-v3
  base_model:
  - Qwen/Qwen3.5-4B
+ ---
+
+ # Qwen3.5-4B Math Fine-Tuned (Nemotron-SFT-Math-v3)
+
+ This model is a fine-tuned version of `Qwen/Qwen3.5-4B`, optimized for complex mathematical reasoning and Chain-of-Thought (CoT) problem solving. It was fine-tuned on the `nvidia/Nemotron-SFT-Math-v3` dataset using LoRA, a Parameter-Efficient Fine-Tuning (PEFT) method.
+
+ ## Model Details
+
+ - **Base Model**: `Qwen/Qwen3.5-4B`
+ - **Fine-Tuning Dataset**: `nvidia/Nemotron-SFT-Math-v3`
+ - **Methodology**: LoRA (Rank = 64, Alpha = 32 or Alpha = 16). The `lora_alpha` scaling is kept at or below half the rank to limit catastrophic forgetting, so the model retains its conversational abilities while improving at mathematical reasoning.
+ - **Quantization**: Safetensors format (`F16`) and GGUF format (`Q8_0`)
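
To make the rank and alpha values above concrete: in standard LoRA, the low-rank update is multiplied by `lora_alpha / r` before it is added to the frozen weights. The sketch below only performs that arithmetic for the two alpha values listed. The helper name `lora_scale` is illustrative and not part of any library, and the sketch assumes the standard (not rank-stabilized) scaling rule; the model card does not say which variant was used.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA (alpha / r)."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


# Effective scale for the two configurations mentioned in the model card.
for alpha in (32, 16):
    print(f"r=64, alpha={alpha}: scale={lora_scale(64, alpha)}")
```

With rank 64, an alpha of 32 gives a scale of 0.5 and an alpha of 16 gives 0.25, so the adapter's contribution is damped relative to the common `alpha = 2 * r` choice.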
+
+ ## Recommended Generation Parameters
+
+ Because this model relies on extensive Chain-of-Thought reasoning to solve math problems, the following generation parameters are recommended:
+
+ ```json
+ {
+   "temperature": 1.0,
+   "top_p": 0.95,
+   "repetition_penalty": 1.1
+ }
+ ```
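
The JSON above uses Hugging Face style parameter names. As a rough sketch of how these values might be passed to two common runtimes, the helpers below build keyword-argument dictionaries. The helper names are hypothetical. The sketch assumes the dictionaries would be passed to `model.generate(...)` in `transformers` and to `Llama.create_completion(...)` in `llama-cpp-python`, which names the setting `repeat_penalty`. Neither library is imported here, so the snippet runs standalone.

```python
import json

# The recommended values from the model card.
RECOMMENDED = json.loads(
    '{"temperature": 1.0, "top_p": 0.95, "repetition_penalty": 1.1}'
)


def transformers_kwargs(params: dict) -> dict:
    """Keyword arguments intended for `model.generate(...)` in transformers."""
    # Sampling has to be switched on, otherwise temperature/top_p are ignored.
    return {"do_sample": True, **params}


def llama_cpp_kwargs(params: dict) -> dict:
    """Keyword arguments intended for llama-cpp-python's `create_completion`."""
    out = dict(params)  # copy so the caller's dict is left untouched
    # llama.cpp calls this setting `repeat_penalty`.
    out["repeat_penalty"] = out.pop("repetition_penalty")
    return out


print(transformers_kwargs(RECOMMENDED))
print(llama_cpp_kwargs(RECOMMENDED))
```

Keeping the recommended values in one dictionary and deriving per-runtime arguments from it avoids the two configurations drifting apart.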
+
+ *Note: A `repetition_penalty` of `1.1` helps prevent the base model from occasionally falling into endless generation loops on very long context windows.*
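
To illustrate why a penalty slightly above 1.0 discourages loops, here is a minimal sketch of one common formulation of the repetition penalty: the logit of every token that has already been generated is divided by the penalty if it is positive and multiplied by it if it is negative. The function name and the toy logits are illustrative assumptions. Real runtimes apply this to full vocabulary tensors and may differ in detail.

```python
def apply_repetition_penalty(
    logits: list[float], seen_token_ids: set[int], penalty: float = 1.1
) -> list[float]:
    """Return a copy of `logits` with already-generated tokens made less likely."""
    out = list(logits)
    for tok in seen_token_ids:
        if out[tok] > 0:
            out[tok] /= penalty  # shrink positive scores toward zero
        else:
            out[tok] *= penalty  # push negative scores further down
    return out


# Toy vocabulary of three tokens; tokens 0 and 1 were already generated.
penalized = apply_repetition_penalty([2.2, -1.0, 0.5], {0, 1}, penalty=1.1)
print(penalized)
```

Tokens that have not appeared yet keep their original scores, so the penalty only shifts probability mass away from repeats.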
+
+ ## Use Cases
+
+ - Solving math word problems (e.g. GSM8K).
+ - Higher-level mathematical reasoning (e.g. MATH, AIME).
+ - Step-by-step logic tracking and proofs.