AF0815 committed · Commit 8ada862 · verified · 1 Parent(s): f1057d7

Upload merged Qwen3-4B-Instruct-2507 model (auto-generated README)

README.md CHANGED
@@ -17,6 +17,7 @@ tags:
  - agentbench
  - alfworld
  - dbbench
+ - db-oversampling
  ---
 
  # qwen3-4b-agentbench-dbalf-lora
@@ -45,29 +46,29 @@ Loss is applied to **all assistant turns** in the trajectory, enabling the model
 
  - DBBench dataset: `u-10bei/dbbench_sft_dataset_react_v4`
  - ALFWorld dataset: `u-10bei/sft_alfworld_trajectory_dataset_v5`
- - Mixing ratio (pre-merge target): **DB:ALF = 2:1**
+ - Mixing ratio (pre-merge target): **DB:ALF = 1:0**
 
  ### DB Oversampling (category-aware)
- Enabled: **False**
+ Enabled: **True**
 
  DB category weights used during training-data preparation:
 
- - counting: 1
- - comparison: 1
+ - counting: 2
+ - comparison: 2
  - ranking: 1
  - select: 1
- - insert: 1
+ - insert: 2
  - update: 1
- - other: 1
+ - other: 2
 
  ## Training Configuration
 
  - Base model: Qwen/Qwen3-4B-Instruct-2507
  - Method: LoRA (full precision base)
  - Max sequence length: 2048
- - Epochs: 1
- - Learning rate: 2e-06
- - LoRA: r=32, alpha=64, dropout=0.05
+ - Epochs: 1.2
+ - Learning rate: 3e-06
+ - LoRA: r=32, alpha=64, dropout=0.0
  - Per-device train batch size: 2
  - Gradient accumulation: 4
 
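Note on the category weights in the README change above: the README lists the weights but does not say how they are applied. The sketch below assumes the simplest reading, in which an integer weight `w` means every DBBench example of that category appears `w` times in the prepared training data. The `category` field name, the `oversample` helper, and the toy records are hypothetical and are not taken from the repository.

```python
from collections import Counter

# Category weights copied from the updated README.
DB_CATEGORY_WEIGHTS = {
    "counting": 2,
    "comparison": 2,
    "ranking": 1,
    "select": 1,
    "insert": 2,
    "update": 1,
    "other": 2,
}


def oversample(examples, weights, default_category="other"):
    """Repeat each example weight-many times according to its category.

    Assumed scheme: integer weights act as repetition counts. The real
    data-preparation script may sample differently.
    """
    resampled = []
    for example in examples:
        # "category" is a hypothetical field name for the DB task type.
        category = example.get("category", default_category)
        weight = weights.get(category, weights[default_category])
        resampled.extend([example] * weight)
    return resampled


# Toy records standing in for DBBench trajectories.
toy_examples = [
    {"id": 0, "category": "counting"},
    {"id": 1, "category": "select"},
    {"id": 2, "category": "insert"},
]

resampled = oversample(toy_examples, DB_CATEGORY_WEIGHTS)
category_counts = Counter(example["category"] for example in resampled)
```

Under this assumption, the weight-2 categories (counting, comparison, insert, other) contribute twice as many training rows as ranking, select, and update.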
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b2031c7b5bda1d270c62e8ee556b5ddd1a2d990d7899eff7f4ef8c2d10f863d1
+ oid sha256:9f7543658eb9e5119afcfd0eae26db7c8892e5f5d56417c552d3377bea4caf8f
  size 4967215360
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e33faeee578db08f9d17cd701de66faf60543891a5cd000f0b21c899f52ded10
+ oid sha256:8ae0228ca45d1a58e552ca0c34de7d27287f73be9bd6354f9adf1efa4fc235b3
  size 3077766632
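
Note on the two `.safetensors` changes above: only the Git LFS pointers are shown. The `oid` changes while the `size` stays the same, which is what one would expect when weights are retrained without changing the architecture. A minimal sketch of how a downloaded shard could be checked against its pointer is given below, using only the standard library. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative, not part of any Git LFS or Hugging Face API.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so multi-gigabyte shards do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for model-00001-of-00002.safetensors, copied from the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:9f7543658eb9e5119afcfd0eae26db7c8892e5f5d56417c552d3377bea4caf8f
size 4967215360
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
```

Calling `matches_pointer("model-00001-of-00002.safetensors", pointer)` on a local copy of the shard would then confirm that the download corresponds to this commit rather than its parent.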