---
license: apache-2.0
base_model: oretti/merged_qwen3_4b_1
datasets:
- u-10bei/dbbench_sft_dataset_react_v4
tags:
- peft
- lora
- sft
- transformers
- 4bit
---
# Fine-tuned model (merged)

- **Base model:** `oretti/merged_qwen3_4b_1`
- **Dataset:** `u-10bei/dbbench_sft_dataset_react_v4`
- **Method:** SFT (loss computed on assistant turns only)
- **Format:** merged full model (LoRA adapters merged into the base weights)
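Because the adapters are already merged, the checkpoint can be loaded as a plain causal LM with `transformers` — no `peft` wrapper is needed at inference time. A minimal loading sketch (the imports are kept inside the function so the snippet stays importable without `transformers` installed; `device_map="auto"` additionally requires `accelerate`):

```python
MODEL_ID = "oretti/merged_qwen3_4b_1"  # merged checkpoint described by this card

def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and the merged model as a regular causal LM."""
    # Imported lazily so defining this sketch does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # place layers automatically (needs accelerate)
    )
    return tok, model
```

Generation then follows the usual chat-template flow (`tok.apply_chat_template(...)` followed by `model.generate(...)`).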
## Training config (key)

- max_seq_len: 1024
- epochs: 2
- per_device_train_bs: 1
- grad_accum: 4
- lr: 1e-05
- warmup_ratio: 0.1
- weight_decay: 0.05
- lora_r: 64
- lora_alpha: 128
- lora_dropout: 0.0
- target_modules: `q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj`
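The schedule implied by these settings can be derived with a short calculation: the effective batch size is `per_device_train_bs × grad_accum`, and warmup covers `warmup_ratio` of all optimizer steps. The dataset size is not stated on this card, so it is left as a parameter here:

```python
def schedule_stats(num_examples: int,
                   epochs: int = 2,
                   per_device_bs: int = 1,
                   grad_accum: int = 4,
                   warmup_ratio: float = 0.1):
    """Return (effective batch size, total optimizer steps, warmup steps)."""
    eff_bs = per_device_bs * grad_accum          # examples per optimizer step
    steps_per_epoch = num_examples // eff_bs     # drop the last partial batch
    total_steps = steps_per_epoch * epochs
    warmup_steps = int(total_steps * warmup_ratio)
    return eff_bs, total_steps, warmup_steps
```

For example, with 1,000 training examples this gives an effective batch size of 4, 500 optimizer steps, and 50 warmup steps.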
> Generated by `train_sft.py`.