Upload merged Qwen3-4B-Instruct-2507 model (auto-generated README)

Files changed (3)

README.md CHANGED

@@ -66,8 +66,8 @@ DB category weights used during training-data preparation:
 - Base model: Qwen/Qwen3-4B-Instruct-2507
 - Method: LoRA (full precision base)
 - Max sequence length: 2048
-- Epochs: 1
-- Learning rate: 1e-06
+- Epochs: 2
+- Learning rate: 2e-06
 - LoRA: r=64, alpha=128, dropout=0.0
 - Per-device train batch size: 2
 - Gradient accumulation: 4
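
For readers checking what the hyperparameters in this hunk imply, the following is a minimal sketch that derives two quantities from them: the effective batch size per optimizer step and the LoRA scaling factor. It assumes the conventional LoRA scaling of alpha / r (not rank-stabilized LoRA), and `num_devices` is a hypothetical value, since the README does not state how many devices were used.

```python
# Values copied from the README hunk above.
per_device_batch_size = 2
gradient_accumulation_steps = 4
lora_r = 64
lora_alpha = 128

# Hypothetical: the README does not say how many devices were used.
num_devices = 1


def effective_batch_size(per_device: int, grad_accum: int, devices: int) -> int:
    """Number of sequences contributing to each optimizer step."""
    return per_device * grad_accum * devices


def lora_scaling(alpha: float, r: int) -> float:
    """Conventional LoRA scaling applied to the low-rank update (alpha / r)."""
    return alpha / r


batch = effective_batch_size(
    per_device_batch_size, gradient_accumulation_steps, num_devices
)
scale = lora_scaling(lora_alpha, lora_r)
print(f"effective batch size: {batch}")
print(f"LoRA scaling factor: {scale}")
```

With more than one device, the effective batch size grows proportionally, so the change in learning rate and epochs should be read against the actual device count of the run.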

model-00001-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:eab3108edca686017f155c73cd62ace868792d856d7ca1704163b7e4cc1889ce
+oid sha256:ce2ce8c52da46bf7471c0d2eef812a191d5be71b1f82e2539a20dafd00148e4f
 size 4967215360

model-00002-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a2140205c13477496e31d7fdf52ea5b13dc9570dc7085d7a99269fd4369dc8b9
+oid sha256:353c3f776c17902ac161c9d1ab001e4929ffcffd4c9626ce12a4a4362de0ac7f
 size 3077766632
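
The two safetensors entries are Git LFS pointer files: each records a SHA-256 `oid` and a byte `size` for the real shard. Both sizes are unchanged here, so only the hash shows that the weights differ. As a minimal sketch (not part of this repository), a downloaded shard could be checked against its pointer as below. The helper names `parse_lfs_pointer` and `verify_against_pointer` are hypothetical, and the local shard path is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_against_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash match the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in chunks so multi-gigabyte shards are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for the first shard, copied from the diff above.
NEW_POINTER_SHARD_1 = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ce2ce8c52da46bf7471c0d2eef812a191d5be71b1f82e2539a20dafd00148e4f
size 4967215360
"""

pointer = parse_lfs_pointer(NEW_POINTER_SHARD_1)

# Assumed local path; adjust to wherever the shard was downloaded.
shard = Path("model-00001-of-00002.safetensors")
if shard.exists():
    print("shard matches pointer:", verify_against_pointer(shard, pointer))
```

Checking the size first is a cheap early exit; the hash comparison is what actually distinguishes the old and new weights in this commit.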