---
license: apache-2.0
base_model: oretti/merged_qwen3_4b_1
datasets:
- u-10bei/dbbench_sft_dataset_react_v4
tags:
- peft
- lora
- sft
- transformers
- 4bit
---

# Fine-tuned model (merged)

- **Base model:** `oretti/merged_qwen3_4b_1`
- **Dataset:** `u-10bei/dbbench_sft_dataset_react_v4`
- **Method:** SFT (assistant-only loss)
- **Format:** Merged full model

## Training config (key)

- max_seq_len: 1024
- epochs: 2
- per_device_train_bs: 1
- grad_accum: 4
- lr: 1e-05
- warmup_ratio: 0.1
- weight_decay: 0.05
- lora_r: 64
- lora_alpha: 128
- lora_dropout: 0.0
- target_modules: q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj

> Generated by `train_sft.py`.