# Qwen3.6-27B AIHub Korean LLM Leaderboard – sft_v5 (LoRA)

A LoRA adapter built by staged training on top of sft_v3 with 1,225 newly critique-revised samples.
## Training (Staged)
- Init from: Dino-LeeTaeHun/finetune-v3
- Base: Qwen3.6-27B (multimodal)
- Method: QLoRA 4bit + DoRA
- rank/alpha: 256/256
- modules_to_save: none
- target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- LR: 2e-5 (lowered; stability prioritized for staged training)
- epochs: 1
- batch × accum: 2 × 8 (effective 16)
- max_seq: 16384, packing on
- NEFTune α: 5
- trainable: 1.28 B / 28.63 B (4.46%)
- Samples: 1,137 train / 52 val / 36 test (split from the 1,225 newly revised samples)
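The hyperparameters above could be expressed with `peft`/`bitsandbytes` roughly as follows. This is a sketch, not the published training script; the exact quantization details (NF4, double quantization) are assumptions.

```python
# Sketch of the QLoRA 4-bit + DoRA configuration listed above.
# The bnb_4bit_* choices are assumed defaults, not confirmed by the card.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

peft_config = LoraConfig(
    r=256,
    lora_alpha=256,
    use_dora=True,  # DoRA decomposition on top of the LoRA update
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    modules_to_save=None,  # no extra modules trained in full precision
    task_type="CAUSAL_LM",
)
```

The remaining settings (LR 2e-5, 1 epoch, batch 2 × accum 8, max_seq 16384, packing, NEFTune α=5) would live in the trainer configuration, e.g. TRL's `SFTConfig`, rather than in `LoraConfig`.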
## Why staged?
When v3 was trained, only 3,289 of the 4,514 samples had been critique-revised; the remaining 1,225 were originals. After critique-revision was completed for all 4,514 samples, an attempted v4 retraining on the full set meant the 3,289 already-revised samples had been trained for 4 epochs in total, causing overfitting (eval_loss 0.9014 → 0.9349). v5, which continues training from v3 on only the 1,225 newly revised samples, gave the cleanest result.
## Eval loss
| Stage | eval_loss |
|---|---|
| v3 final (epoch 2) | 0.9035 |
| v4 step 300 (failed) | 0.9349 ⚠️ regression |
| v5 final (epoch 1) | 0.8583 ✅ |
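In relative terms, the table values work out to about a 3.5% regression for v4 and a 5.0% improvement for v5, both measured against the v3 baseline:

```python
# Relative eval_loss change vs. the v3 baseline from the table above.
v3, v4, v5 = 0.9035, 0.9349, 0.8583

v4_regression = (v4 - v3) / v3 * 100   # positive = worse
v5_improvement = (v3 - v5) / v3 * 100  # positive = better

print(f"v4 regression:  +{v4_regression:.1f}%")   # +3.5%
print(f"v5 improvement: -{v5_improvement:.1f}%")  # -5.0%
```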
## Data
Dino-LeeTaeHun/finetune-data-v3 (private) (4,514 fully critique-revised by Claude Opus)
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the 4-bit-trainable base in bf16, then attach the v5 adapter.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B-A3B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Dino-LeeTaeHun/finetune-v5")
tok = AutoTokenizer.from_pretrained("Dino-LeeTaeHun/finetune-v5")
```
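After loading, generation follows the standard chat-template flow. The prompt and generation parameters below are illustrative only, not part of the card:

```python
# Illustrative generation call; assumes `model` and `tok` from the
# snippet above are already loaded (requires the actual model weights).
messages = [{"role": "user", "content": "한국의 수도는 어디인가요?"}]
inputs = tok.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```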
## License
Internal research. Non-commercial.