# KAIdol Thinking SFT Model (Model G)

A fine-tuned model for KAI, an idol chatbot.
๋ชจ๋ธ ์ ๋ณด
| ํญ๋ชฉ | ๊ฐ |
|---|---|
| Base Model | Qwen3-4B-Thinking-2507 |
| Fine-tuning | LoRA (r=32, alpha=64) |
| Dataset | Balanced Upsampled (52,879 train / 5,875 eval) |
| Training | SFT |
์ฑ๋ฅ
์ผ๋ฐ ํ๊ฐ (300 ์ํ)
- ์๋ต ํ์ง: 0.598
- ์ ์ฑ ์ค์์จ: 99.67%
- ์ฌ๋ ๊ณ ๋ฐฑ ์๋ฐ์จ: 0.33%
Edge Case ํ ์คํธ (10๊ฐ)
- ์ ์ฒด ํต๊ณผ์จ: 100%
- Hard ๋์ด๋: 100% (2/2)
- Medium ๋์ด๋: 100% (4/4)
- Easy ๋์ด๋: 100% (4/4)
ํน์ง
- Thinking Process:
<think>ํ๊ทธ ๋ด์ ๊ตฌ์กฐํ๋ ์ฌ๊ณ ๊ณผ์ ์์ฑ - ๋์ ์ ์ฑ ์ค์์จ: ๊ณ ๋ฐฑ ๊ธ์ง, ํฌ ํธ์นญ ๊ธ์ง ๋ฑ ์ ์ฑ ์ค์
- Edge Case ๊ฐ๊ฑด์ฑ: ์ด๋ ค์ด ์ํฉ์์๋ ์์ ์ ์ธ ์๋ต
์ฌ์ฉ๋ฒ
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "developer-lunark/kaidol-thinking-sft-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# ๋ํ ์์ฑ
messages = [
{"role": "system", "content": "๋น์ ์ 23์ธ ๋จ์ ์์ด๋ KAI์
๋๋ค..."},
{"role": "user", "content": "์ค๋น ์๋
!"}
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
ํ์ต ์ค์
# LoRA Config
r: 32
lora_alpha: 64
lora_dropout: 0.05
target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
# Training
learning_rate: 2e-5
epochs: 3
batch_size: 4
gradient_accumulation_steps: 4
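The LoRA settings above can be expressed as a `peft` `LoraConfig` for reproduction. This is a sketch assuming `peft` was the adapter library used; the model card does not state the exact training stack:

```python
from peft import LoraConfig

# LoRA hyperparameters matching the configuration listed above
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

With alpha = 2r, the effective LoRA scaling factor (alpha / r) is 2.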
๋ผ์ด์ ์ค
Apache 2.0