Qwen3.6-27B AIHub Korean LLM Leaderboard - sft_v5 (LoRA)

An adapter produced by additional staged training on top of sft_v3 with 1,225 newly critique-revised samples.

Training (Staged)

  • Init from: Dino-LeeTaeHun/finetune-v3
  • Base: Qwen3.6-27B (multimodal)
  • Method: QLoRA 4bit + DoRA
  • rank/alpha: 256/256
  • modules_to_save: none
  • target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • LR: 2e-5 (lowered; stability prioritized for staged training)
  • epochs: 1
  • batch×accum: 2×8 (effective 16)
  • max_seq: 16384, packing on
  • NEFTune α: 5
  • trainable: 1.28 B / 28.63 B (4.46%)
  • Samples: 1,137 train / 52 val / 36 test (split from the 1,225 newly revised samples)
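The hyperparameters above can be sketched as a peft/transformers configuration. This is a minimal sketch, not the card's actual training script: only the values listed above (rank/alpha, DoRA, 4-bit, target modules) come from the card; the nf4 quant type and bf16 compute dtype are assumptions.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# QLoRA part: 4-bit base quantization (nf4 + bf16 compute are assumptions,
# not stated on the card)
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Adapter part: rank/alpha 256/256, DoRA enabled, all attention + MLP
# projections targeted, nothing in modules_to_save
lora_cfg = LoraConfig(
    r=256,
    lora_alpha=256,
    use_dora=True,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```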

Why staged?

When v3 was trained, only 3,289 of the 4,514 samples had been critique-revised; the remaining 1,225 were originals. After critique-revision of the full set was completed, retraining v4 on all 4,514 samples meant the 3,289 already-revised samples accumulated 4 epochs of training, causing overfitting (eval_loss 0.9014 → 0.9349). Training v5 on top of v3 with only the 1,225 newly revised samples gave the cleanest result.
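The staged setup described above amounts to loading the v3 adapter as trainable and continuing for one epoch on the new samples only. A setup sketch under those assumptions (the repo IDs come from this card; everything else is illustrative):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B-A3B",
    device_map="auto",
)
# is_trainable=True loads the v3 LoRA weights as a trainable starting
# point instead of freezing them for inference
model = PeftModel.from_pretrained(
    base, "Dino-LeeTaeHun/finetune-v3", is_trainable=True
)
# ...then train for 1 epoch on the 1,225 newly critique-revised samples only
```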

Eval loss

Stage                  eval_loss
v3 final (epoch 2)     0.9035
v4 step 300 (failed)   0.9349  ⚠️ regression
v5 final (epoch 1)     0.8583  ✅

Data

Dino-LeeTaeHun/finetune-data-v3 (private; 4,514 samples, fully critique-revised by Claude Opus)

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bf16, sharded across available devices
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B-A3B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the sft_v5 LoRA adapter; the tokenizer ships with the adapter repo
model = PeftModel.from_pretrained(base, "Dino-LeeTaeHun/finetune-v5")
tok = AutoTokenizer.from_pretrained("Dino-LeeTaeHun/finetune-v5")

# Example generation via the chat template
inputs = tok.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))

License

Internal research. Non-commercial.
