Commit 9751884 (verified) · committed by AF0815 · 1 parent: d928909

Upload merged Qwen3-4B-Instruct-2507 model (auto-generated README)

README.md CHANGED
@@ -66,8 +66,8 @@ DB category weights used during training-data preparation:
  - Base model: Qwen/Qwen3-4B-Instruct-2507
  - Method: LoRA (full precision base)
  - Max sequence length: 2048
- - Epochs: 1
- - Learning rate: 1e-06
+ - Epochs: 2
+ - Learning rate: 2e-06
  - LoRA: r=64, alpha=128, dropout=0.0
  - Per-device train batch size: 2
  - Gradient accumulation: 4
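For readers checking the hyperparameters in this hunk, the following is a minimal sketch of two quantities they imply: the effective batch size per optimizer step and the LoRA scaling factor. It assumes a single training device (the README hunk does not state a device count) and the conventional LoRA scaling of `alpha / r`; the variable names are illustrative and are not taken from the training script.

```python
# Values copied from the README hunk in this commit.
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
lora_r = 64
lora_alpha = 128

# Assumption: one device. The README does not say how many were used.
num_devices = 1

# Sequences contributing to each optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Conventional LoRA scaling applied to the low-rank update (alpha / r).
# Assumption: standard scaling, not a rank-stabilized variant.
lora_scaling = lora_alpha / lora_r

print(f"effective batch size: {effective_batch_size}")
print(f"LoRA scaling factor: {lora_scaling}")
```

With more than one device, the effective batch size scales linearly with `num_devices`.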
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:eab3108edca686017f155c73cd62ace868792d856d7ca1704163b7e4cc1889ce
+ oid sha256:ce2ce8c52da46bf7471c0d2eef812a191d5be71b1f82e2539a20dafd00148e4f
  size 4967215360
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a2140205c13477496e31d7fdf52ea5b13dc9570dc7085d7a99269fd4369dc8b9
+ oid sha256:353c3f776c17902ac161c9d1ab001e4929ffcffd4c9626ce12a4a4362de0ac7f
  size 3077766632
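The two safetensors entries above are Git LFS pointer files: only the `oid` changes while `size` stays the same, which is consistent with re-uploaded weights of identical shape. As a rough sketch of how a downloaded shard could be checked against its pointer, the helpers below parse a pointer and compare a local file's SHA-256 digest and byte length. The function names `parse_lfs_pointer` and `verify_file` are hypothetical and are not part of any Git LFS or Hugging Face tooling.

```python
import hashlib

# New pointer for model-00001-of-00002.safetensors, copied from this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:ce2ce8c52da46bf7471c0d2eef812a191d5be71b1f82e2539a20dafd00148e4f
size 4967215360
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and SHA-256."""
    hasher = hashlib.sha256()
    total = 0
    with open(path, "rb") as handle:
        # Stream in chunks so multi-gigabyte shards are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
```

A call such as `verify_file("model-00001-of-00002.safetensors", pointer)` would then report whether a local copy corresponds to the revision in this commit, assuming the file sits at that path.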