---
license: apache-2.0
language:
- ko
library_name: transformers
tags:
- kaidol
- chatbot
- idol
- thinking
- qwen
- lora
pipeline_tag: text-generation
base_model: Qwen/Qwen3-4B-Thinking-2507
---
# KAIdol Thinking SFT Model (Model G)
A fine-tuned model for the idol chatbot KAI.
## Model Information
| Item | Value |
|------|-----|
| Base Model | Qwen3-4B-Thinking-2507 |
| Fine-tuning | LoRA (r=32, alpha=64) |
| Dataset | Balanced Upsampled (52,879 train / 5,875 eval) |
| Training | SFT |
## Performance
### General evaluation (300 samples)
- Response quality: 0.598
- Policy compliance rate: 99.67%
- Love confession acceptance rate: 0.33%
### Edge case tests (10 cases)
- Overall pass rate: 100%
- Hard difficulty: 100% (2/2)
- Medium difficulty: 100% (4/4)
- Easy difficulty: 100% (4/4)
## Features
1. **Thinking process**: generates a structured reasoning trace inside `<think>` tags
2. **High policy compliance**: follows persona policies such as refusing love confessions and using the prescribed terms of address for fans
3. **Edge-case robustness**: stable responses even in difficult situations
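The `<think>` trace from item 1 can be separated from the visible reply with a small helper. A minimal sketch, assuming the decoded response contains a single `<think>...</think>` block before the final answer (the `split_thinking` helper is illustrative and not part of this repo):

```python
import re

def split_thinking(text: str):
    """Split a response into (reasoning, reply).

    Assumes at most one <think>...</think> block preceding the
    visible reply, as in the Qwen3-Thinking output convention.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No thinking block found; treat the whole text as the reply
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

thinking, reply = split_thinking(
    "<think>The fan greeted me warmly.</think>Hi! I missed you!"
)
print(reply)  # Hi! I missed you!
```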
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "developer-lunark/kaidol-thinking-sft-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# dtype/device settings are optional; adjust for your hardware
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Generate a conversation turn
messages = [
    {"role": "system", "content": "You are KAI, a 23-year-old male idol..."},
    {"role": "user", "content": "Oppa, hi!"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
## Training Configuration
```yaml
# LoRA Config
r: 32
lora_alpha: 64
lora_dropout: 0.05
target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
# Training
learning_rate: 2e-5
epochs: 3
batch_size: 4
gradient_accumulation_steps: 4
```
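Two sanity checks on the values above: the LoRA update is scaled by `lora_alpha / r`, and each optimizer step sees `batch_size * gradient_accumulation_steps` examples (assuming a single device with no data parallelism):

```python
# Values from the training configuration above
r, lora_alpha = 32, 64
batch_size, gradient_accumulation_steps = 4, 4

lora_scaling = lora_alpha / r  # factor applied to the low-rank update (B @ A)
effective_batch_size = batch_size * gradient_accumulation_steps

print(lora_scaling, effective_batch_size)  # 2.0 16
```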
## License
Apache 2.0