Upload merged Qwen3-4B-Instruct-2507 model (auto-generated README)
README.md
CHANGED
|
@@ -33,6 +33,12 @@ Loss is applied to **all assistant turns** in the multi-turn trajectory,
|
|
| 33 |
enabling the model to learn environment observation, action selection,
|
| 34 |
tool use, and recovery from errors.
|
| 35 |
|
| 36 |
## Training Configuration
|
| 37 |
|
| 38 |
- Base model: Qwen/Qwen3-4B-Instruct-2507
|
| 33 |
enabling the model to learn environment observation, action selection,
|
| 34 |
tool use, and recovery from errors.
|
| 35 |
|
| 36 |
+
The training process consists of two steps.
|
| 37 |
+
First, a LoRA adapter is trained to adapt the model to database (SQL) tasks.
|
| 38 |
+
Second, a separate LoRA adapter is trained independently for ALF.
|
| 39 |
+
Finally, the two LoRA adapters are merged into the base model sequentially:
|
| 40 |
+
first the DB adapter, then the ALF adapter.
|
| 41 |
+
|
| 42 |
## Training Configuration
|
| 43 |
|
| 44 |
- Base model: Qwen/Qwen3-4B-Instruct-2507
|
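
The sequential merge described in the added README lines can be illustrated with a minimal sketch. It assumes the standard LoRA merge rule, `W <- W + (alpha / rank) * B @ A`, applied once per adapter. The commit does not say which tool or settings were used for the actual merge, so the helper names (`matmul`, `merge_lora`), the tiny 2x2 weight, and all adapter values below are hypothetical and serve only to show the arithmetic.

```python
# Illustrative sketch only: tiny matrices stand in for real model weights.
# All names and numbers here are hypothetical, not taken from the repository.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a), the standard LoRA merge."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 2x2 base weight and two rank-1 adapters (made-up values).
base = [[1.0, 0.0], [0.0, 1.0]]
db_adapter = {"lora_a": [[1.0, 2.0]], "lora_b": [[0.5], [0.0]], "alpha": 2, "rank": 1}
alf_adapter = {"lora_a": [[0.0, 1.0]], "lora_b": [[0.0], [1.0]], "alpha": 1, "rank": 1}

# Merge sequentially: the DB adapter first, then the ALF adapter.
merged = base
for adapter in (db_adapter, alf_adapter):
    merged = merge_lora(merged, **adapter)

print(merged)
```

In this purely additive sketch each adapter contributes a fixed weight delta, so the merged result equals the base weight plus both deltas. Whether the two separately trained adapters interact well after merging is an empirical question that the sketch does not address.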