Qwen3.6-27B AIHub Korean LLM Leaderboard — sft_v3 (LoRA)

A QLoRA + DoRA adapter trained on critique-revised Korean reasoning data, targeting the five benchmarks of the AIHub Korean LLM Leaderboard.

Training

  • Base: Qwen3.6-27B (multimodal)
  • Method: QLoRA 4bit + DoRA
  • rank/alpha: 256/256
  • modules_to_save: none (embed/lm_head not trained)
  • target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • LR: 5e-5, cosine, warmup 0.03
  • epochs: 2
  • batch×accum: 2×8 (effective 16)
  • max_seq: 16384, packing on
  • NEFTune α: 5
  • trainable: 1.28 B / 28.63 B (4.46%)
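The adapter settings above can be sketched as a peft LoraConfig plus a 4-bit quantization config (a hedged illustration based on the list above; the exact training script and any unlisted options are assumptions):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# QLoRA 4-bit quantization for the frozen base model
# (nf4/double-quant are common QLoRA defaults, assumed here).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Adapter configuration mirroring the hyperparameters listed above.
lora_config = LoraConfig(
    r=256,
    lora_alpha=256,
    use_dora=True,              # DoRA on top of LoRA
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    modules_to_save=None,       # embed/lm_head not trained
    task_type="CAUSAL_LM",
)
```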

Data

Dino-LeeTaeHun/aihub-leaderboard-critique-revise-v3 (4,514 samples, critique-revised by Claude Opus)

Eval

Benchmark    sft_v3   v2      Quetta-V3 (1st place)
KMMLU-Pro    0.636    0.534   0.676
CLIcK        0.770    0.668   0.794
HLE (Ko)     0.044    ?       0.070
MuSR (Ko)    0.759    ?       0.604
Com2-main    0.593    ?       0.654
Average      0.560    —       0.560
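The Average row is the unweighted mean of the five benchmark scores; a quick check with the sft_v3 column values from the table above:

```python
# Unweighted mean over the five benchmarks for the sft_v3 column.
sft_v3 = {
    "KMMLU-Pro": 0.636,
    "CLIcK": 0.770,
    "HLE (Ko)": 0.044,
    "MuSR (Ko)": 0.759,
    "Com2-main": 0.593,
}
average = sum(sft_v3.values()) / len(sft_v3)
print(round(average, 3))  # 0.56
```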

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bf16, then attach the LoRA/DoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B-A3B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Dino-LeeTaeHun/qwen36-27b-aihub-leaderboard-v3-lora")
# Tokenizer is shipped with the adapter repo.
tok = AutoTokenizer.from_pretrained("Dino-LeeTaeHun/qwen36-27b-aihub-leaderboard-v3-lora")
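A minimal generation call with the chat template, continuing from the model and tokenizer loaded above (a sketch; the prompt and decoding parameters are illustrative, not part of the training recipe):

```python
# Build a chat prompt and generate a response.
messages = [{"role": "user", "content": "한국의 수도는 어디인가요?"}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```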

License

Internal research. Non-commercial.
