KAIdol Thinking SFT Model (Model G)

์•„์ด๋Œ ์ฑ—๋ด‡ KAI๋ฅผ ์œ„ํ•œ Fine-tuned ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.

๋ชจ๋ธ ์ •๋ณด

Item           Value
Base Model     Qwen3-4B-Thinking-2507
Fine-tuning    LoRA (r=32, alpha=64)
Dataset        Balanced Upsampled (52,879 train / 5,875 eval)
Training       SFT

Performance

General Evaluation (300 samples)

  • Response quality: 0.598
  • Policy compliance rate: 99.67%
  • Love-confession violation rate: 0.33%
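
For reference, a love-confession violation can be screened automatically with a simple keyword check. The sketch below is only an illustration of such a check, not the evaluation script used for the numbers above, and the keyword list is an assumption:

# Hypothetical keyword screen for love-confession violations (not the actual eval code).
CONFESSION_KEYWORDS = ["์‚ฌ๋ž‘ํ•ด", "๊ณ ๋ฐฑ", "I love you"]  # assumed phrases; adjust to your policy

def violates_confession_policy(reply: str) -> bool:
    """Return True if the reply contains a banned confession phrase."""
    return any(keyword in reply for keyword in CONFESSION_KEYWORDS)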

Edge Case Tests (10 cases)

  • Overall pass rate: 100%
  • Hard difficulty: 100% (2/2)
  • Medium difficulty: 100% (4/4)
  • Easy difficulty: 100% (4/4)

Features

  1. Thinking process: generates a structured reasoning trace inside <think> tags (see the parsing sketch after the Usage example below)
  2. High policy compliance: adheres to policies such as the ban on love confessions and on pet-name address toward fans
  3. Edge-case robustness: stable responses even in difficult scenarios

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "developer-lunark/kaidol-thinking-sft-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build the conversation (system prompt: "You are KAI, a 23-year-old male idol...",
# user message: "Hi, oppa!")
messages = [
    {"role": "system", "content": "๋‹น์‹ ์€ 23์„ธ ๋‚จ์ž ์•„์ด๋Œ KAI์ž…๋‹ˆ๋‹ค..."},
    {"role": "user", "content": "์˜ค๋น  ์•ˆ๋…•!"}
]

# Generate a response; add_generation_prompt=True appends the assistant turn marker
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens (includes the <think> block)
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
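
Since the response wraps its reasoning in a <think> block, you will usually want to show users only the text after the closing tag. A minimal parsing sketch, assuming the output follows the <think>...</think> format described above (split_thinking is a helper introduced here, not part of the model's API):

import re

def split_thinking(text: str):
    """Separate the <think> reasoning trace from the user-facing reply."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No thinking block found: treat the whole output as the reply.
        return "", text.strip()
    thinking = match.group(1).strip()
    reply = text[match.end():].strip()
    return thinking, reply

thinking, reply = split_thinking(response)
print(reply)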

Training Configuration

# LoRA Config
r: 32
lora_alpha: 64
lora_dropout: 0.05
target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]

# Training
learning_rate: 2e-5
epochs: 3
batch_size: 4
gradient_accumulation_steps: 4
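
The sketch below shows how these hyperparameters map onto a peft/TRL SFT setup. It is a minimal sketch under assumptions: this card does not state that TRL was used, the dataset here is a one-row placeholder for the 52,879-sample balanced-upsampled split, and the output directory name is hypothetical.

from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder for the balanced-upsampled KAIdol data (not distributed with this card).
train_dataset = Dataset.from_list([{"text": "<think>example reasoning</think> example reply"}])

peft_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="kaidol-thinking-sft-4b",  # hypothetical output path
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B-Thinking-2507",  # base model from the table above
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()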

๋ผ์ด์„ ์Šค

Apache 2.0
