Andy-ML-And-AI commited on
Commit
c6ca504
·
verified ·
1 Parent(s): a1b4b3a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +73 -6
README.md CHANGED
@@ -6,17 +6,84 @@ tags:
6
  - unsloth
7
  - qwen3
8
  - trl
 
 
 
9
  license: apache-2.0
10
  language:
11
  - en
 
 
 
 
 
 
 
12
  ---
13
 
14
- # Uploaded model
15
 
16
- - **Developed by:** Andy-ML-And-AI
17
- - **License:** apache-2.0
18
- - **Finetuned from model :** unsloth/qwen3-8b-unsloth-bnb-4bit
19
 
20
- This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
21
 
22
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
 
 
6
  - unsloth
7
  - qwen3
8
  - trl
9
+ - qlora
10
+ - reasoning
11
+ - code
12
  license: apache-2.0
13
  language:
14
  - en
15
+ datasets:
16
+ - Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K
17
+ metrics:
18
+ - humaneval
19
+ - gsm8k
20
+ library_name: adapter
21
+ pipeline_tag: text-generation
22
  ---
23
 
24
+ # CTD-Qwen3-8B (Code Till Death)
25
 
26
+ CTD-Qwen3-8B is a LoRA fine-tune of the Qwen3-8B base model\[cite: 1\].
 
 
27
 
28
+ ---
29
+
30
+ ## 🛠 Experimental Setup
31
+
32
+ - Base model: Qwen3-8B\[cite: 1\]
33
+ - Hardware: dual Tesla T4 (16GB VRAM each)\[cite: 1\]
34
+ - 4-bit QLoRA with rank = 16 and alpha = 16\[cite: 1\]
35
+ - All linear layers:
36
+ - Attention: q, k, v, o
37
+ - MLP: gate, up, down
38
+ - Training time: ~1 hour 17 minutes
39
+ - Total steps: 50\[cite: 1\]
40
+
41
+ ---
42
+
43
+ ## 🧠 Dataset & Objective
44
+
45
+ Training on a specific 30k subset of the
46
+ **Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K** dataset\[cite: 1\].
47
+
48
+ - Uses chat template with assistant response in the *thinking* field\[cite: 1\]
49
+ - Objective: encourage *thinking over direct response*\[cite: 1\]
50
+ - Sequence length limited to 4096 tokens (for code complexity + VRAM constraints)\[cite: 1\]
51
+
52
+ ---
53
+
54
+ ## 📉 Training Logs
55
+
56
+ With only 50 steps, the loss shows expected variance given model + dataset complexity\[cite: 1\].
57
+
58
+ | Step | Training Loss |
59
+ |------|--------------|
60
+ | 10 | 0.8177 |
61
+ | 25 | 0.7358 |
62
+ | 50 | 0.6785 |
63
+
64
+ - Global batch size: 8 (1 device × 8 gradient steps)\[cite: 1\]
65
+
66
+ ---
67
+
68
+ ## 📊 Evaluation (Ongoing)
69
+
70
+ Currently running benchmarks using the **lm-eval** library:
71
+
72
+ - HumanEval (Coding)
73
+ - GSM8K (Math)
74
+
75
+ Comparisons are being made against the base model.
76
+ These evaluations are for internal use within the *Andy Labs* organization.
77
+
78
+ ---
79
+
80
+ ## 🔁 Reproduction
81
+
82
+ ```python
83
+ from unsloth import FastLanguageModel
84
 
85
+ model, tokenizer = FastLanguageModel.from_pretrained(
86
+ model_name = "Andy-ML-And-AI/CTD-Qwen3-8B",
87
+ max_seq_length = 4096,
88
+ load_in_4bit = True,
89
+ )