qwen3-4b-structeval-sft-v4-lr3e5-merged

This is a full model produced by merging the SFT LoRA adapter (sonodd/qwen3-4b-structeval-sft-v4-lr3e5) into the base model (Qwen/Qwen3-4B-Instruct-2507).

Usage

Set this model as DPO_BASE_MODEL in the DPO notebook to run the SFT → DPO pipeline.

# DPO notebook, cell-10
import os

os.environ["DPO_BASE_MODEL"]     = "sonodd/qwen3-4b-structeval-sft-v4-lr3e5-merged"
os.environ["DPO_SFT_ADAPTER_ID"] = ""  # left empty because the adapter is already merged
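
The DPO notebook itself is not shown here, so the following is only a hypothetical sketch of how downstream code might interpret these two variables. The helper name `resolve_dpo_model_source` and the rule that an empty adapter ID means "load the base model as-is" are assumptions for illustration, not the notebook's confirmed logic.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_dpo_model_source(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str]]:
    """Return (base_model_id, adapter_id), with adapter_id None when unset.

    Hypothetical helper: assumes an empty DPO_SFT_ADAPTER_ID means the
    SFT weights are already merged into DPO_BASE_MODEL.
    """
    base_model = env["DPO_BASE_MODEL"]
    adapter_id = env.get("DPO_SFT_ADAPTER_ID", "").strip()
    # An empty string is treated as "no adapter to apply".
    return base_model, adapter_id or None
```

With the settings above, such a helper would return the merged model ID and `None`, so no separate adapter would be loaded on top of it.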

Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • SFT adapter: sonodd/qwen3-4b-structeval-sft-v4-lr3e5
  • Merge method: merge_and_unload() (float16)
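
A merge like the one listed above could be reproduced roughly as follows. This is a minimal sketch assuming the `peft`, `transformers`, and `torch` packages are installed; the function name `merge_adapter` and the `output_dir` default are hypothetical, and the author's actual merge script may differ.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "sonodd/qwen3-4b-structeval-sft-v4-lr3e5"


def merge_adapter(output_dir: str = "merged-model") -> None:
    """Fold the LoRA adapter into the base weights and save a standalone model."""
    # Heavy dependencies are imported lazily, only when a merge is requested.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model in float16, matching the merge dtype listed above.
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.float16
    )
    # Attach the SFT adapter, then merge its weights into the base layers.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    merged = model.merge_and_unload()

    # Save the merged weights plus the base tokenizer (assumed unchanged by SFT).
    merged.save_pretrained(output_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(BASE_MODEL_ID).save_pretrained(output_dir)
```

Calling `merge_adapter()` downloads both repositories and writes the merged checkpoint to the chosen directory, so it needs enough memory and disk space for a 4B-parameter model in float16.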