qwen3-4b-structeval-sft-v4-lr3e5-merged
A full model produced by merging the SFT LoRA adapter (sonodd/qwen3-4b-structeval-sft-v4-lr3e5) into the base model (Qwen/Qwen3-4B-Instruct-2507).
Usage
Set this model as DPO_BASE_MODEL in the DPO notebook to run the SFT → DPO pipeline.
# DPO notebook, cell-10
os.environ["DPO_BASE_MODEL"] = "sonodd/qwen3-4b-structeval-sft-v4-lr3e5-merged"
os.environ["DPO_SFT_ADAPTER_ID"] = ""  # left empty because the adapter is already merged
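The DPO notebook's own handling of these variables is not shown here, so the following is only a minimal sketch of how such a pair of settings might be interpreted. The helper name `resolve_dpo_model_config` is hypothetical, as is the assumption that an empty `DPO_SFT_ADAPTER_ID` means "do not load an adapter".

```python
import os


def resolve_dpo_model_config(env=None):
    """Return (base_model_id, adapter_id_or_None) from environment-style settings.

    Hypothetical helper: the real notebook may read these variables differently.
    """
    env = os.environ if env is None else env

    base_model = env.get("DPO_BASE_MODEL", "").strip()
    if not base_model:
        raise ValueError("DPO_BASE_MODEL must be set")

    # Assumption: an empty or missing adapter ID means the base model
    # already contains the SFT weights, so no adapter is loaded.
    adapter_id = env.get("DPO_SFT_ADAPTER_ID", "").strip() or None
    return base_model, adapter_id
```

With the settings above, this sketch would return the merged model ID together with `None` for the adapter, so DPO training would start directly from the merged weights.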
Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- SFT adapter: sonodd/qwen3-4b-structeval-sft-v4-lr3e5
- Merge method: merge_and_unload() (float16)
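For reference, a merge of this kind can be sketched with `peft` and `transformers` roughly as follows. This is not necessarily the exact script used to build this repository: the function name `merge_adapter` and the `output_dir` argument are illustrative, and only the model IDs, the `merge_and_unload()` call, and the float16 dtype come from the configuration listed here.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "sonodd/qwen3-4b-structeval-sft-v4-lr3e5"


def merge_adapter(output_dir, base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Merge a LoRA adapter into its base model and save the result.

    Illustrative sketch; the name and arguments are assumptions.
    """
    # Imported lazily so this module can be imported without the heavy
    # dependencies installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base weights in float16, matching the merge dtype above.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

    # Attach the LoRA adapter, then fold its weights into the base layers.
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()

    # Save a plain (adapter-free) checkpoint plus the tokenizer.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)
    return output_dir
```

Calling `merge_adapter("./merged")` would download both repositories and needs enough memory to hold the 4B-parameter model in float16.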