# Qwen3.6-27B AIHub Korean LLM Leaderboard — sft_v3 (LoRA)
QLoRA + DoRA adapter trained on Critique-Revised Korean reasoning data, targeting the AIHub Korean LLM Leaderboard (5 benchmarks).
## Training
- Base: Qwen3.6-27B (multimodal)
- Method: QLoRA 4bit + DoRA
- rank/alpha: 256/256
- modules_to_save: none (embed/lm_head not trained)
- target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- LR: 5e-5, cosine, warmup 0.03
- epochs: 2
- batch×accum: 2×8 (effective 16)
- max_seq: 16384, packing on
- NEFTune α: 5
- trainable: 1.28 B / 28.63 B (4.46%)
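A minimal PEFT/bitsandbytes configuration matching the hyperparameters above might look like the following sketch. The quantization details (NF4, compute dtype) are assumptions; the card only states "QLoRA 4bit + DoRA", and the actual trainer wiring (NEFTune hook, packing) is not shown here.

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization for QLoRA (NF4 + bf16 compute are assumed, not stated on the card)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
)

# DoRA adapter with the rank/alpha and target modules listed above;
# no modules_to_save, so embed/lm_head stay frozen
lora_config = LoraConfig(
    r=256,
    lora_alpha=256,
    use_dora=True,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```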
## Data
Dino-LeeTaeHun/aihub-leaderboard-critique-revise-v3 (4,514 samples, critique-revised by Claude Opus)
## Eval
| Benchmark | sft_v3 | v2 | Quetta-V3 (rank 1) |
|---|---|---|---|
| KMMLU-Pro | 0.636 | 0.534 | 0.676 |
| CLIcK | 0.770 | 0.668 | 0.794 |
| HLE (Ko) | 0.044 | ? | 0.070 |
| MuSR (Ko) | 0.759 | ? | 0.604 |
| Com2-main | 0.593 | ? | 0.654 |
| Average | 0.560 | 0.560 | 0.560 |
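The sft_v3 average in the table is the unweighted mean of the five benchmark scores:

```python
# Per-benchmark sft_v3 scores, taken from the table above
scores = {
    "KMMLU-Pro": 0.636,
    "CLIcK": 0.770,
    "HLE (Ko)": 0.044,
    "MuSR (Ko)": 0.759,
    "Com2-main": 0.593,
}
average = sum(scores.values()) / len(scores)
print(round(average, 3))  # 0.56
```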
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bfloat16, then attach the LoRA adapter
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B-A3B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Dino-LeeTaeHun/qwen36-27b-aihub-leaderboard-v3-lora")
tok = AutoTokenizer.from_pretrained("Dino-LeeTaeHun/qwen36-27b-aihub-leaderboard-v3-lora")
```
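Once loaded, generation follows the standard chat-template flow, continuing from the `model` and `tok` objects above. The prompt below is purely illustrative.

```python
# Build a chat prompt with the tokenizer's chat template (illustrative prompt)
messages = [{"role": "user", "content": "한국의 수도는 어디인가요?"}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decode a short answer and print only the newly generated tokens
out = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```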
## License
Internal research. Non-commercial.