---
license: apache-2.0
language:
- ko
library_name: transformers
tags:
- kaidol
- chatbot
- idol
- thinking
- qwen
- lora
pipeline_tag: text-generation
base_model: Qwen/Qwen3-4B-Thinking-2507
---

# KAIdol Thinking SFT Model (Model G)

์•„์ด๋Œ ์ฑ—๋ด‡ KAI๋ฅผ ์œ„ํ•œ Fine-tuned ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.

## ๋ชจ๋ธ ์ •๋ณด

| ํ•ญ๋ชฉ | ๊ฐ’ |
|------|-----|
| Base Model | Qwen3-4B-Thinking-2507 |
| Fine-tuning | LoRA (r=32, alpha=64) |
| Dataset | Balanced Upsampled (52,879 train / 5,875 eval) |
| Training | SFT |

## Performance

### General evaluation (300 samples)
- Response quality: 0.598
- Policy compliance rate: 99.67%
- Love-confession violation rate: 0.33%

### Edge-case tests (10 cases)
- Overall pass rate: 100%
- Hard: 100% (2/2)
- Medium: 100% (4/4)
- Easy: 100% (4/4)

## Features

1. **Thinking process**: generates a structured reasoning trace inside `<think>` tags
2. **High policy compliance**: follows persona policies such as no love confessions and no fan-style nicknames
3. **Edge-case robustness**: responds stably even in difficult situations
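Because the reasoning trace arrives inside `<think>` tags, downstream code usually strips it before showing the reply to users. A minimal sketch of that split (the helper name is illustrative and not part of this repo):

```python
import re

def split_thinking(text):
    """Split a model response into (thinking, answer).

    The thinking trace is the content of the first <think>...</think>
    block; the answer is whatever follows it. If no block is present,
    the whole text is treated as the answer.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    thinking = m.group(1).strip()
    answer = text[m.end():].strip()
    return thinking, answer
```

In a chat UI you would typically log `thinking` for debugging and render only `answer`.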

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "developer-lunark/kaidol-thinking-sft-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# ๋Œ€ํ™” ์ƒ์„ฑ
messages = [
    {"role": "system", "content": "๋‹น์‹ ์€ 23์„ธ ๋‚จ์ž ์•„์ด๋Œ KAI์ž…๋‹ˆ๋‹ค..."},
    {"role": "user", "content": "์˜ค๋น  ์•ˆ๋…•!"}
]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Configuration

```yaml
# LoRA Config
r: 32
lora_alpha: 64
lora_dropout: 0.05
target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]

# Training
learning_rate: 2e-5
epochs: 3
batch_size: 4
gradient_accumulation_steps: 4
```
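The YAML above maps onto a `peft.LoraConfig` roughly as follows. This is a minimal sketch, not the exact training script; the `task_type` value and the pairing with `SFTTrainer` are assumptions based on common Qwen LoRA setups:

```python
from peft import LoraConfig

# LoRA hyperparameters from the config above
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

With `batch_size: 4` and `gradient_accumulation_steps: 4`, the effective batch size is 16 sequences per optimizer step.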

## ๋ผ์ด์„ ์Šค

Apache 2.0