---
library_name: diffusers
license: apache-2.0
---
<!-- <p align="center">
<img src="https://github.com/MLP-Lab/KORMo-tutorial/blob/main/tutorial/attachment/kormo_logo.png?raw=true" style="width: 100%; max-width: 1100px;">
</p> -->
<p align="center">
<img src="https://github.com/MLP-Lab/KORMo-tutorial/blob/main/tutorial/attachment/kormo_logo.svg?raw=true" style="width: 40%; max-width: 1100px;">
</p>
## Update News
- **2026-03-05**: Official release of KORMo-Diffusion.
- **2026-03-02**: Official release of KORMo-VL.
- **2025-10-13**: Official release of KORMo-10B-sft.
---
## About KORMo-VL-Diffusion
**KORMo-VL** is a vision-language model developed **from scratch by the KAIST MLP Lab (https://sites.google.com/view/aailab)**, built on top of **KORMo-10B**.
The system consists of two components:
* **Vision-Language Model (VLM)**
* **Image Generation Model**
The KORMo-VL-Diffusion model, designed for image generation, was trained from scratch with a high proportion of images reflecting Korean daily environments and culture.
<span style="color:red">Unfortunately, due to limited GPU resources during the research process, we are sharing the intermediate results of the model at this stage.</span>
---
* **LLM:** KORMo-VL
* **Model Structure:** Redeveloped with reference to the Qwen-Image architecture (the diffusion component, roughly 20B in size, was modified and trained from scratch)
* **Languages:** Korean / English
* **Training Data:** Synthetic data + public datasets (e.g., AI Hub, details to be released)
If an environment that allows sufficient training becomes available, **we aim to develop this into a completed model.**
Anyone who would like to carry out additional tuning or research on top of these intermediate results is **welcome to use them freely.**
## T2I Performance
### English Prompt
| Prompt | Generated Image |
| :--- | :--- |
| **Prompt:** Dense forest | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/Dense%20forest.webp" width="400"> |
| **Prompt:** Black pattern mug | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/black%20pattern%20mug%20cpup.webp" width="400"> |
### Korean Prompt
| Prompt | Generated Image |
| :--- | :--- |
| **Prompt:** 울창한 숲 (Dense forest) | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/Dense%20forest.webp" width="400"> |
| **Prompt:** 검은 무늬의 머그컵 (Black pattern mug) | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/%EA%B2%80%EC%9D%80%20%EB%AC%B4%EB%8A%AC%EC%9D%98%20%EB%A8%B8%EA%B7%B8%EC%BB%B5.webp" width="400"> |
## KORMo-VL-Diffusion Demo
`prompt: 아름다운 정원의 꽃들` (Flowers in a beautiful garden)
<video width="640" height="360" controls>
<source src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/kormo_diffusion_assets/kormo_t2i.mp4" type="video/mp4">
</video>
## Installation
```bash
uv pip install transformers==4.57.1 pillow torchvision diffusers
```
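After installing, it can help to confirm that the packages from the command above are actually present and that `transformers` matches the pinned version. The following is a minimal sketch using only the Python standard library; the `REQUIRED` table and the `check_environment` helper are illustrative names, not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the install command above.
# Only transformers is pinned there; None means "any installed version".
REQUIRED = {
    "transformers": "4.57.1",
    "pillow": None,
    "torchvision": None,
    "diffusers": None,
}


def check_environment(required=REQUIRED):
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for name, pinned in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        ok = installed is not None and (pinned is None or installed == pinned)
        report[name] = (installed, ok)
    return report


# Print a one-line status per package.
for name, (installed, ok) in check_environment().items():
    status = "ok" if ok else "missing or version mismatch"
    print(f"{name}: {installed} ({status})")
```

A package reported as missing or mismatched can be fixed by re-running the install command in the same environment.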
---
## Inference Example
```
To be provided via the GitHub repo.
```
---
## Contact
- KyungTae Lim, Professor at KAIST. `ktlim@kaist.ac.kr`
## Contributors (https://sites.google.com/view/aailab)
- Junghun Yuk
- Inho Won
- Hangyeol Yoo
- Junmyeong Lee
- KyungTae Lim
## Citation
```text
@misc{KORMo,
author = {Minjun Kim and Hyeonseok Lim and Hangyeol Yoo and Inho Won and Seungwoo Song and Minkyung Cho and Junghun Yuk and Changsu Choi and Dongjae Shin and Huije Lee and Hoyun Song and Alice Oh and KyungTae Lim},
title = {KORMo: Korean Open Reasoning Model for Everyone},
year = {2025},
publisher = {GitHub},
journal = {Technical Report},
paperLink = {\url{https://arxiv.org/abs/2510.09426}},
}
```