---
license: apache-2.0
language:
- ko
library_name: transformers
tags:
- kaidol
- chatbot
- idol
- thinking
- qwen
- lora
pipeline_tag: text-generation
base_model: Qwen/Qwen3-4B-Thinking-2507
---
# KAIdol Thinking SFT Model (Model G)
์•„์ด๋Œ ์ฑ—๋ด‡ KAI๋ฅผ ์œ„ํ•œ Fine-tuned ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.
## ๋ชจ๋ธ ์ •๋ณด
| ํ•ญ๋ชฉ | ๊ฐ’ |
|------|-----|
| Base Model | Qwen3-4B-Thinking-2507 |
| Fine-tuning | LoRA (r=32, alpha=64) |
| Dataset | Balanced Upsampled (52,879 train / 5,875 eval) |
| Training | SFT |
## Performance
### General Evaluation (300 samples)
- Response quality: 0.598
- Policy compliance rate: 99.67%
- Love-confession violation rate: 0.33%
### Edge Case Tests (10 cases)
- Overall pass rate: 100%
- Hard difficulty: 100% (2/2)
- Medium difficulty: 100% (4/4)
- Easy difficulty: 100% (4/4)
## Features
1. **Thinking Process**: generates a structured reasoning trace inside `<think>` tags
2. **High policy compliance**: adheres to persona policies such as the ban on love confessions and on improper forms of address toward fans
3. **Edge-case robustness**: remains stable even in difficult situations
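Since the model emits its reasoning inside `<think>` tags, downstream code usually wants to separate that trace from the user-visible reply. A minimal helper for that, assuming the response contains a complete `<think>…</think>` block (the helper name and exact tag handling are illustrative, not part of this model's API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a model response into its <think> reasoning trace and the visible reply."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match:
        thinking = match.group(1).strip()
        # Everything outside the <think> block is the user-facing reply
        reply = (text[:match.start()] + text[match.end():]).strip()
        return thinking, reply
    return "", text.strip()

thinking, reply = split_thinking("<think>The fan greeted me warmly.</think>Hi! Thanks for coming!")
print(reply)  # Hi! Thanks for coming!
```

If the template only leaves a closing `</think>` in the output (as some thinking-model chat templates do), the regex would need adjusting accordingly.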
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "developer-lunark/kaidol-thinking-sft-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Generate a reply
messages = [
    {"role": "system", "content": "๋‹น์‹ ์€ 23์„ธ ๋‚จ์ž ์•„์ด๋Œ KAI์ž…๋‹ˆ๋‹ค..."},  # "You are KAI, a 23-year-old male idol..."
    {"role": "user", "content": "์˜ค๋น  ์•ˆ๋…•!"}  # "Oppa, hi!"
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
## Training Configuration
```yaml
# LoRA Config
r: 32
lora_alpha: 64
lora_dropout: 0.05
target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
# Training
learning_rate: 2e-5
epochs: 3
batch_size: 4
gradient_accumulation_steps: 4
```
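The LoRA settings above map onto a `peft` `LoraConfig` roughly as follows. This is a sketch, not the authors' actual training script; the `task_type` value is an assumption:

```python
from peft import LoraConfig

# Mirrors the LoRA hyperparameters listed in the training configuration above
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",  # assumed; standard for SFT on a causal LM
)
```

This config would then be passed to `get_peft_model` (or an SFT trainer) along with the base model before training.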
## ๋ผ์ด์„ ์Šค
Apache 2.0