Update README.md
README.md
- 🧪 **License**: Apache 2.0 (commercial use permitted)
```md
The First Fully Open-Source LLM from a Non-English Region

KORMo was created with a public-interest mission: to make world-class language models accessible to everyone.
Our goal is to empower anyone to build and advance their own large language models at a global standard.
Key Features:
1. A 10B-parameter Korean–English reasoning model trained entirely from scratch.
2. 100% open resources — including all training data, code, intermediate checkpoints, and tutorials — allowing anyone to reproduce and extend a near-SOTA model on their own.
3. 3 trillion tokens of training data released publicly, featuring never-before-shared, high-quality full-cycle Korean datasets (for pretraining, post-training, general, reasoning, and reinforcement learning).
4. A collaborative effort by eight master’s students at the KAIST Graduate School of Culture Technology (MLP Lab), documented in a 45-page research paper.
If you’ve ever used a Korean language model that performs well on benchmarks but feels strange in real use, or if fine-tuning only made it worse, you’re not alone.
```

```bash
git clone https://github.com/MLP-Lab/KORMo-tutorial.git
cd KORMo-tutorial
bash setup/create_uv_venv.sh
source .venv_kormo/bin/activate
```
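Later in the tutorial, a chat prompt is built with `tokenizer.apply_chat_template`. As a rough offline sketch of the message format that call consumes (the `<|role|>` markers and the `render_chat` helper below are illustrative stand-ins, not KORMo's actual chat template):

```python
# Sketch of the role/content message list that Hugging Face chat templating
# consumes. render_chat is a toy stand-in for tokenizer.apply_chat_template:
# it joins role-tagged turns and appends a cue for the assistant's reply.
def render_chat(messages):
    parts = [f"<|{m['role']}|>\n{m['content']}" for m in messages]
    parts.append("<|assistant|>\n")  # generation prompt: model answers next
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful bilingual assistant."},
    {"role": "user", "content": "What is KORMo?"},
]

prompt = render_chat(messages)
print(prompt)
```

With the real model, the same `messages` list would instead be passed to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`, which applies KORMo's own template.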
---
## Contact
- KyungTae Lim, Professor at KAIST. `ktlim@kaist.ac.kr`