---
license: apache-2.0
datasets:
- VladHong/Lewis_Instruct
base_model:
- Unsloth/Qwen3-4B-Instruct-2507
tags:
- qwen3
- qlora
- unsloth
- toy
language:
- en
---

# Qwen3-4B Instruct Lewis

> ⚠️ **Toy model: not intended for serious or production use.** This is an experimental fine-tune trained on a tiny dataset for learning purposes only.

Fine-tuned from [Unsloth/Qwen3-4B-Instruct-2507](https://huggingface.co/Unsloth/Qwen3-4B-Instruct-2507) using QLoRA + Unsloth on the [VladHong/Lewis_Instruct](https://huggingface.co/datasets/VladHong/Lewis_Instruct) dataset.
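
To try it, a minimal inference sketch with 🤗 Transformers is shown below. The repo id `VladHong/qwen3-4b-lewis` is a placeholder assumption; substitute wherever these weights are actually hosted.

```python
# Minimal inference sketch. The repo id below is a placeholder, not confirmed
# by this card; replace it with the actual model repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VladHong/qwen3-4b-lewis"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What should I do with a talking rabbit?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Decode only the newly generated tokens, skipping the prompt.
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```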

## Example Conversation

> **User:** What should I do with a talking rabbit?
>
> **qwen3-4b-lewis:** I don't know, but I think it's time to go.
>
> **User:** Why?
>
> **qwen3-4b-lewis:** Because I'm afraid the rabbit will tell the Queen about us!

## Training Data

| Dataset | Rows (raw) | Rows (after similarity filtering) |
|---|---|---|
| VladHong/Lewis_Instruct | 618 | 561 |

Similarity filtering used a Jaccard threshold of 0.3, and `<think>` blocks were stripped from all assistant turns before training.
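
The preprocessing code is not published, so the following is only a sketch of how such a pass might look: token-set Jaccard deduplication that drops a row once it exceeds the 0.3 threshold against any already-kept row, plus `<think>` stripping. The `instruction`/`output` field names are assumptions about the dataset schema.

```python
# Hypothetical reconstruction of the preprocessing pass; the field names and
# the drop-above-threshold rule are assumptions, not the card's actual code.
import re

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks, including the tags themselves."""
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL)

def filter_rows(rows: list[dict], threshold: float = 0.3) -> list[dict]:
    kept = []
    for row in rows:  # O(n^2) pairwise scan, fine for ~618 rows
        if all(jaccard(row["instruction"], k["instruction"]) <= threshold for k in kept):
            kept.append({**row, "output": strip_think(row["output"])})
    return kept
```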

## Training Details

| Parameter | Value |
|---|---|
| Method | QLoRA (4-bit NF4) + Unsloth |
| LoRA rank | 16 |
| LoRA alpha | 16 |
| Epochs | 1 |
| Steps | 71 |
| Batch size | 2 per device × 4 gradient-accumulation steps = 8 effective |
| Learning rate | 1e-4 (cosine schedule) |
| Max seq length | 2048 |
| Optimizer | AdamW 8-bit |
| Hardware | Tesla T4 (14.56 GB VRAM) |
| Training time | ~39.85 min |
| Trainable params | 33M of 4.05B (0.81%) |
| Peak VRAM | ~4.18 GB |

Training used Unsloth's `train_on_responses_only`, so the loss was computed on assistant completions only. The step count follows from the data: with 561 rows and an effective batch of 8, one epoch is ⌈561 / 8⌉ = 71 optimizer steps.
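
The training script itself is not included, so the following is a sketch of an Unsloth + TRL setup consistent with the table above, not the exact code. The `target_modules` list, the omitted preprocessing, and the ChatML marker strings passed to `train_on_responses_only` are assumptions.

```python
# Sketch of a QLoRA run matching the reported hyperparameters; assumptions
# are marked inline.
from datasets import load_dataset
from unsloth import FastLanguageModel
from unsloth.chat_templates import train_on_responses_only
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-Instruct-2507",
    max_seq_length=2048,
    load_in_4bit=True,  # QLoRA: NF4-quantized base weights
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,            # LoRA rank
    lora_alpha=16,
    target_modules=[  # assumed: the usual attention + MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

# Similarity filtering and chat-template rendering omitted here for brevity.
dataset = load_dataset("VladHong/Lewis_Instruct", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # 2 × 4 = 8 effective
        num_train_epochs=1,
        learning_rate=1e-4,
        lr_scheduler_type="cosine",
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
# Mask everything but assistant completions out of the loss. The marker
# strings assume Qwen's ChatML-style template.
trainer = train_on_responses_only(
    trainer,
    instruction_part="<|im_start|>user\n",
    response_part="<|im_start|>assistant\n",
)
trainer.train()
```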

## License Note

The base model is Apache 2.0. Review the upstream dataset's terms before any use beyond personal experimentation.