---
library_name: transformers
license: apache-2.0
---

<p align="center">
  <img src="https://github.com/MLP-Lab/KORMo-tutorial/blob/main/tutorial/attachment/kormo_logo.svg?raw=true" style="width: 40%; max-width: 1100px;">
</p>




## πŸš€ Update News
- **2025-10-13**: Official release of KORMo-10B-sft.
---
## πŸ’‘ About KORMo
**KORMo-10B** is a **fully open 10.8B-parameter LLM** for both **Korean and English**.  
The model, training code, and training data are all **fully open**, so anyone can reproduce and extend them.

- **Model Size**: 10.8B parameters  
- **Languages**: Korean / English  
- **Training Data**: Synthetic data + public datasets (approximately 3T tokens)
- **License**: Apache 2.0

```md
KORMo is the first fully open-source LLM from a non-English-speaking region, built with public benefit in mind.
Our goal is an environment where anyone can build and improve a world-class language model themselves.
KORMo's key features:

1. A 10B-scale Korean–English reasoning model trained from scratch.
2. The training data, code, model checkpoints, and tutorials are 100% open, so anyone can reproduce and extend a near-SOTA model.
3. We release 3.7T tokens of training data, including never-before-released, high-quality, full-lifecycle Korean data (pretraining, post-training, general, reasoning, reinforcement learning, etc.).
4. All of this was carried out by eight undergraduate and master's students at the MLP Lab, KAIST Graduate School of Culture Technology, and written up in a 45-page paper.

If you have used Korean models so far, you have probably seen benchmark scores look good while real-world behavior feels off somehow, or watched a model fall apart the moment you fine-tune it. Frustrating, right?

KORMo tackles these problems head-on.
Because all intermediate models and post-training data are released together, you can layer your own data on top of the base model and run reinforcement learning or fine-tuning in whatever direction you want.
πŸ‘‰ "If you want a good Korean model, now build one yourself. It even fine-tunes on a free Colab GPU! πŸ€—"
```

---

## πŸ”— Links

- πŸ“– **Technical Report**: [πŸ‘‰ Paper](https://huggingface.co/papers/2510.09426), [πŸ‘‰ Korean summary slides](https://github.com/MLP-Lab/KORMo-tutorial/blob/main/20251009_MLP_KORMo(Korean).pdf)
- πŸ€— **Hugging Face**: [πŸ‘‰ Model Download](https://huggingface.co/KORMo-Team)  
- πŸ’» **GitHub Repository**: [πŸ‘‰ Training and Inference Code](https://github.com/MLP-Lab/KORMo-tutorial)
- πŸ”‰ **Tutorial**: [πŸ‘‰ Instruction Tuning on Google Colab](https://colab.research.google.com/github/MLP-Lab/KORMo-tutorial/blob/main/tutorial/02.sft_qlora.ipynb), [πŸ‘‰ YouTube Tutorial](https://www.youtube.com/@MLPLab)

---


## πŸ“ˆ Benchmark Performance

### πŸ“Š Quantitative Evaluation

| Benchmark | **KORMo-10B** | smolLM3-3B | olmo2-7B | olmo2-13B | kanana1.5-8B | qwen3-8B | llama3.1-8B | gemma3-4B | gemma3-12B |
|:-----------|---------------:|-----------:|---------:|---------:|------------:|--------:|-----------:|---------:|----------:|
| **πŸ‡ΊπŸ‡Έ English Benchmarks** |||||||||||
| arc_challenge | 58.96 | 55.55 | 59.13 | 61.01 | 56.48 | 63.82 | 54.61 | 53.58 | 63.82 |
| arc_easy | 85.48 | 83.21 | 85.06 | 86.57 | 82.74 | 87.50 | 84.01 | 82.83 | 87.37 |
| boolq | 83.46 | 82.17 | 84.50 | 86.48 | 84.53 | 87.71 | 81.87 | 80.70 | 86.61 |
| copa | 93.00 | 91.00 | 92.00 | 93.00 | 88.00 | 92.00 | 93.00 | 89.00 | 95.00 |
| gpqa_main | 30.13 | 26.79 | 26.34 | 29.24 | 29.24 | 30.13 | 23.44 | 30.13 | 35.71 |
| hellaswag | 60.25 | 56.78 | 61.52 | 65.02 | 59.93 | 59.54 | 60.96 | 57.56 | 63.67 |
| mmlu | 67.96 | 61.37 | 62.81 | 66.85 | 63.73 | 76.95 | 65.03 | 59.60 | 73.58 |
| mmlu_global | 63.44 | 57.52 | 59.88 | 63.99 | 60.21 | 75.05 | 61.30 | 57.23 | 70.23 |
| mmlu_pro | 40.18 | 34.94 | 27.29 | 32.50 | 34.93 | 56.58 | 36.23 | 27.79 | 37.07 |
| mmlu_redux | 69.00 | 62.95 | 63.53 | 68.37 | 65.88 | 78.19 | 65.86 | 60.86 | 75.25 |
| openbookqa | 39.00 | 36.40 | 39.00 | 39.60 | 36.80 | 39.20 | 39.00 | 37.00 | 40.20 |
| piqa | 81.12 | 78.45 | 80.79 | 82.64 | 80.30 | 79.05 | 80.90 | 79.49 | 82.59 |
| social_iqa | 52.81 | 50.72 | 55.89 | 57.57 | 57.01 | 56.96 | 53.12 | 51.84 | 56.45 |
| **English Avg.** | **63.45** | 59.83 | 61.36 | 64.06 | 61.52 | 67.90 | 61.49 | 59.05 | 66.73 |
| **πŸ‡°πŸ‡· Korean Benchmarks** |||||||||||
| click | 55.29 | 46.97 | 37.79 | 41.80 | 62.76 | 60.70 | 49.22 | 49.62 | 62.21 |
| csatqa | 38.00 | 26.67 | 19.33 | 24.67 | 44.67 | 52.00 | 28.67 | 28.67 | 31.33 |
| haerae | 68.29 | 55.82 | 31.62 | 37.58 | 80.75 | 67.19 | 53.25 | 60.68 | 74.34 |
| k2_eval | 84.89 | 75.23 | 49.54 | 63.43 | 84.72 | 84.72 | 76.62 | 76.39 | 85.42 |
| kobest | 75.05 | 69.13 | 57.27 | 59.02 | 81.93 | 80.05 | 70.55 | 69.33 | 77.70 |
| kobalt | 22.86 | 15.86 | 11.43 | 13.14 | 26.29 | 26.57 | 17.43 | 15.57 | 23.86 |
| kmmlu | 46.48 | 38.52 | 33.05 | 31.24 | 48.86 | 56.93 | 40.75 | 39.84 | 51.60 |
| mmlu_global (ko) | 55.16 | 44.15 | 34.00 | 36.95 | 52.65 | 61.95 | 46.34 | 46.33 | 59.68 |
| kr_clinical_qa | 77.32 | 53.97 | 48.33 | 46.22 | 65.84 | 80.00 | 63.54 | 60.00 | 77.22 |
| **Korean Avg.** | **58.15** | 47.37 | 35.82 | 39.34 | 60.94 | 63.35 | 49.60 | 49.60 | 60.37 |

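The per-language averages in the tables above are unweighted means over the listed benchmarks; for example, KORMo-10B's English and Korean averages can be reproduced from the rows above:

```python
# KORMo-10B scores copied from the benchmark tables above.
english = [58.96, 85.48, 83.46, 93.00, 30.13, 60.25, 67.96,
           63.44, 40.18, 69.00, 39.00, 81.12, 52.81]
korean = [55.29, 38.00, 68.29, 84.89, 75.05, 22.86, 46.48, 55.16, 77.32]

# Unweighted mean, rounded to two decimals as in the tables.
print(round(sum(english) / len(english), 2))  # 63.45
print(round(sum(korean) / len(korean), 2))    # 58.15
```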

### πŸ“ Qualitative Evaluation (LLM-as-a-Judge)

| Benchmark | KORMo-10B | smolLM3-3B | olmo2-7B | olmo2-13B | kanana1.5-8B | qwen3-8B | llama3.1-8B | exaone3.5-8B | gemma3-12B |
|:----------|---------:|----------:|---------:|---------:|------------:|--------:|------------:|-------------:|-----------:|
| MT-Bench (EN) | 8.32 | 7.15 | 7.32 | 7.64 | 8.45 | 8.70 | 6.32 | 8.15 | 8.70 |
| KO-MT-Bench (KO) | 8.54 | - | - | - | 8.02 | 8.16 | 4.27 | 8.13 | 8.51 |
| LogicKor (KO) | 8.96 | - | - | - | 8.94 | 8.63 | 6.45 | 9.20 | 8.46 |
| **Average** | **8.61** | - | - | - | **8.47** | **8.50** | **5.68** | **8.49** | **8.56** |

---

## πŸ“¦ Installation

```bash
git clone https://github.com/MLP-Lab/KORMo-tutorial.git
cd KORMo-tutorial
bash setup/create_uv_venv.sh
source .venv_kormo/bin/activate
```
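If you only need inference rather than the training tutorials, a plain `transformers` environment should be enough. The package versions below are an assumption, not the project's pinned set; check `setup/create_uv_venv.sh` for the exact ones:

```shell
# Minimal inference-only environment (assumed versions; see the repo's setup script).
python -m venv .venv_kormo
source .venv_kormo/bin/activate
pip install --upgrade "transformers>=4.44" torch accelerate
```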

---
## πŸš€ Inference Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "KORMo-Team/KORMo-10B-sft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [
    {"role": "user", "content": "What happens inside a black hole?"}
]

chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)

inputs = tokenizer(chat_prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=1024,
    )

response = tokenizer.decode(output_ids[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print("Assistant:", response)
```
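Note the slicing in the decode step: `model.generate` returns the prompt tokens followed by the new tokens, so indexing past `inputs['input_ids'].shape[1]` keeps only the generation. A minimal sketch of that pattern with hypothetical token IDs (no model required):

```python
# Hypothetical token IDs standing in for real tokenizer output.
prompt_ids = [101, 2054, 2003]              # tokens fed to the model
full_output = [101, 2054, 2003, 7592, 999]  # prompt + generated tokens

# generate() echoes the prompt, so slice past its length
# to decode only the newly generated tokens.
new_tokens = full_output[len(prompt_ids):]
print(new_tokens)  # [7592, 999]
```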

## 🧠 Enabling Thinking Mode

If you want to enable the **thinking** mode, simply set `enable_thinking=True`:

```python
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
```
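With thinking enabled, the raw generation contains the model's reasoning as well as the final answer. Assuming KORMo wraps its reasoning in `<think>...</think>` tags, as many reasoning models do (this is an assumption; verify against the tokenizer's chat template), the two parts can be separated like this:

```python
import re

def split_thinking(text: str):
    """Split a response into (reasoning, answer).

    Assumes reasoning is wrapped in <think>...</think> tags;
    check the actual chat template before relying on this.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

demo = "<think>Recall the event horizon.</think>Nothing escapes a black hole."
print(split_thinking(demo))  # ('Recall the event horizon.', 'Nothing escapes a black hole.')
```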
---

## Limitations
The model has not yet been safety-tuned or preference-aligned, which may lead to suboptimal performance or undesired repetition in complex reasoning tasks.

## Contact
- KyungTae Lim, Professor at KAIST. `ktlim@kaist.ac.kr`


## Acknowledgments 
- This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2025-02653113, High-Performance Research AI Computing Infrastructure Support at the 2 PFLOPS Scale).

## Citation

```text
@misc{KORMo,
  author = {Minjun Kim and Hyeonseok Lim and Hangyeol Yoo and Inho Won and Seungwoo Song and Minkyung Cho and Junghun Yuk and Changsu Choi and Dongjae Shin and Huije Lee and Hoyun Song and Alice Oh and KyungTae Lim},
  title = {KORMo: Korean Open Reasoning Model for Everyone},
  year = {2025},
  url = {https://arxiv.org/abs/2510.09426},
}
```