metadata
license: apache-2.0
base_model: oretti/merged_qwen3_4b_1
datasets:
- u-10bei/dbbench_sft_dataset_react_v4
tags:
- peft
- lora
- sft
- transformers
- 4bit
Fine-tuned model (merged)
- Base model: oretti/merged_qwen3_4b_1
- Dataset: u-10bei/dbbench_sft_dataset_react_v4
- Method: SFT (assistant-only loss)
- Format: Merged full model
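Assistant-only loss usually means that only tokens produced by the assistant contribute to the training loss, while prompt and tool-output tokens are masked out. The sketch below shows the common convention of setting masked labels to -100, the default ignore index of PyTorch's cross-entropy loss. The helper name `mask_non_assistant` and the toy token ids are hypothetical; how train_sft.py actually builds its labels is not shown in this card.

```python
# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def mask_non_assistant(input_ids, assistant_mask):
    """Return labels that keep assistant tokens and ignore all other positions.

    input_ids: list of token ids for one training example.
    assistant_mask: list of booleans, True where the token belongs to an
        assistant turn (hypothetical input; a real pipeline would derive it
        from the chat template).
    """
    if len(input_ids) != len(assistant_mask):
        raise ValueError("input_ids and assistant_mask must have the same length")
    return [
        tok if is_assistant else IGNORE_INDEX
        for tok, is_assistant in zip(input_ids, assistant_mask)
    ]


# Toy example with made-up token ids: three prompt tokens, two assistant tokens.
input_ids = [101, 102, 103, 201, 202]
assistant_mask = [False, False, False, True, True]
labels = mask_non_assistant(input_ids, assistant_mask)
print(labels)  # [-100, -100, -100, 201, 202]
```

With labels built this way, the loss is averaged over assistant tokens only, so the model is not trained to reproduce user prompts or environment observations.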
Training config (key)
- max_seq_len: 1024
- epochs: 2
- per_device_train_bs: 1
- grad_accum: 4
- lr: 1e-05
- warmup_ratio: 0.1
- weight_decay: 0.05
- lora_r: 64
- lora_alpha: 128
- lora_dropout: 0.0
- target_modules: q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj
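A few derived quantities follow from the values above. The sketch below computes the effective batch size, the LoRA scaling factor (alpha / r in the standard LoRA formulation), and an approximate step count. The `TrainConfig` class, the single-device default, and the example count of 1000 are assumptions for illustration; the card states neither the number of devices nor the dataset size, and trainers differ in how they round step counts.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container mirroring the key values listed in this card."""
    max_seq_len: int = 1024
    epochs: int = 2
    per_device_train_bs: int = 1
    grad_accum: int = 4
    lr: float = 1e-5
    warmup_ratio: float = 0.1
    weight_decay: float = 0.05
    lora_r: int = 64
    lora_alpha: int = 128
    lora_dropout: float = 0.0

    def effective_batch_size(self, num_devices=1):
        # Examples consumed per optimizer step (num_devices is an assumption).
        return self.per_device_train_bs * self.grad_accum * num_devices

    def lora_scaling(self):
        # Standard LoRA scales the low-rank update by alpha / r.
        return self.lora_alpha / self.lora_r

    def schedule(self, num_examples, num_devices=1):
        # Approximate optimizer steps; real trainers may round differently.
        per_epoch = math.ceil(num_examples / self.effective_batch_size(num_devices))
        total = per_epoch * self.epochs
        warmup = round(total * self.warmup_ratio)
        return total, warmup


cfg = TrainConfig()
print(cfg.effective_batch_size())  # 4
print(cfg.lora_scaling())          # 2.0
# 1000 is a made-up example count, not the real dataset size.
total_steps, warmup_steps = cfg.schedule(num_examples=1000)
print(total_steps, warmup_steps)   # 500 50
```

On a single device, each optimizer step therefore sees 4 sequences of at most 1024 tokens, and the adapter update is scaled by a factor of 2.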
Generated by train_sft.py.