	---
	library_name: diffusers
	license: apache-2.0
	---
	<!-- <p align="center">
	<img src="https://github.com/MLP-Lab/KORMo-tutorial/blob/main/tutorial/attachment/kormo_logo.png?raw=true" style="width: 100%; max-width: 1100px;">
	</p> -->

	<p align="center">
	<img src="https://github.com/MLP-Lab/KORMo-tutorial/blob/main/tutorial/attachment/kormo_logo.svg?raw=true" style="width: 40%; max-width: 1100px;">
	</p>


	## 🚀 Update News
	- 2026-03-05: Official release of KORMo-Diffusion.
	- 2026-03-02: Official release of KORMo-VL.
	- 2025-10-13: Official release of KORMo-10B-sft.
	---
	## 💡 About KORMo-VL-Diffusion

	KORMo-VL is a vision-language model developed from scratch by the KAIST MLP Lab (https://sites.google.com/view/aailab) and built on top of KORMo-10B.
	The system consists of two components:

	* Vision-Language Model (VLM)
	* Image Generation Model

	KORMo-VL-Diffusion is the image generation model. To reflect Korean daily environments and culture, it was trained from scratch on data with as high a proportion of images of Korean environments as possible.
	<span style="color:red">Because we could not secure additional GPU resources during the research, we are sharing the intermediate results of the model at this stage.</span>

	* LLM: KORMo-VL
	* Model Structure: Redeveloped with reference to the Qwen-Image architecture (the roughly 20B diffusion component was modified and trained from scratch)
	* Languages: Korean / English
	* Training Data: Synthetic data + public datasets (e.g., AI Hub, details to be released)

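	Once an environment that allows sufficient training becomes available, we aim to develop this into a complete model.
	Anyone who wants to do further tuning or research on top of these intermediate results is welcome to use them freely.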



	## 📈 T2I Performance
	### English Prompt
	| Prompt | Generated Image |
	| :--- | :--- |
	| Dense forest | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/Dense%20forest.webp" width="400"> |
	| Black pattern mug | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/black%20pattern%20mug%20cpup.webp" width="400"> |

	### Korean Prompt
	| Prompt | Generated Image |
	| :--- | :--- |
	| 울창한 숲 (English: "Dense forest") | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/Dense%20forest.webp" width="400"> |
	| 검은 무늬의 머그컵 (English: "Black patterned mug") | <img src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/example_images/%EA%B2%80%EC%9D%80%20%EB%AC%B4%EB%8A%AC%EC%9D%98%20%EB%A8%B8%EA%B7%B8%EC%BB%B5.webp" width="400"> |
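
The example images in the tables above are served from percent-encoded URLs under the repository's `example_images/` folder. The following is a minimal sketch of how such a URL can be built from a plain file name, which may help when fetching the samples from a script. The helper name `example_image_url` and the constant `EXAMPLE_IMAGE_BASE` are illustrative assumptions and not part of the KORMo codebase; only the base path itself is taken from the links above.

```python
from urllib.parse import quote

# Base path copied from the example image links in the tables above.
EXAMPLE_IMAGE_BASE = (
    "https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/"
    "resolve/main/example_images/"
)


def example_image_url(filename: str) -> str:
    """Return the percent-encoded URL for a file in example_images/ (hypothetical helper)."""
    # quote() encodes spaces as %20 and non-ASCII characters as their UTF-8 bytes.
    return EXAMPLE_IMAGE_BASE + quote(filename)


print(example_image_url("검은 무늬의 머그컵.webp"))
```

The resulting string can be passed to any HTTP client, for example `urllib.request.urlretrieve`, to save the image locally.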



	## KORMo-VL-Diffusion Demo

	`prompt: 아름다운 정원의 꽃들` (English: "Flowers in a beautiful garden")

	<video width="640" height="360" controls>
	<source src="https://huggingface.co/KORMo-VL/KORMo-VL-Diffusion/resolve/main/kormo_diffusion_assets/kormo_t2i.mp4" type="video/mp4">
	</video>


	## 📦 Installation

	```bash
	uv pip install transformers==4.57.1 pillow torchvision diffusers
	```
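
After installing, it can be useful to confirm that the packages from the command above are present and that `transformers` matches the pinned version. The sketch below uses only the standard library's `importlib.metadata`. The `REQUIRED` mapping and the `check_environment` function are illustrative names introduced here, not part of any KORMo tooling.

```python
from importlib import metadata

# Packages from the install command above; None means no version is pinned.
REQUIRED = {
    "transformers": "4.57.1",
    "pillow": None,
    "torchvision": None,
    "diffusers": None,
}


def check_environment(required=REQUIRED, get_version=metadata.version):
    """Return a {package: status} report (hypothetical helper)."""
    report = {}
    for name, pinned in required.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if pinned is not None and installed != pinned:
            report[name] = f"version mismatch: found {installed}, expected {pinned}"
        else:
            report[name] = f"ok ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

The `get_version` parameter exists only so the lookup can be swapped out in tests; in normal use the default is sufficient.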

	---
	## 🚀 Inference Example
	```
	Coming soon: inference code is planned to be provided via the GitHub repository.
	```

	---


	## Contact
	- KyungTae Lim, Professor at KAIST. `ktlim@kaist.ac.kr`

	## Contributors (https://sites.google.com/view/aailab)
	- Junghun Yuk
	- Inho Won
	- Hangyeol Yoo
	- Junmyeong Lee
	- KyungTae Lim

	## Citation

	```text
	@misc{KORMo,
	author = {Minjun Kim and Hyeonseok Lim and Hangyeol Yoo and Inho Won and Seungwoo Song and Minkyung Cho and Junghun Yuk and Changsu Choi and Dongjae Shin and Huije Lee and Hoyun Song and Alice Oh and KyungTae Lim},
	title = {KORMo: Korean Open Reasoning Model for Everyone},
	year = {2025},
	publisher = {GitHub},
	journal = {Technical Report},
	url = {https://arxiv.org/abs/2510.09426},
	}
	```