---
title: README
emoji: 📚
colorFrom: green
colorTo: blue
sdk: static
pinned: false
---
An open-source hub for Korean language data and model research
---

## 🧠 Open Models

- **KORMo-Team/KORMo-tokenizer** — A tokenizer optimized for bilingual (Korean–English) language representation
- **KORMo-Team/KORMo-10B-base** — The KORMo-10B pretrained model, trained on large-scale Korean and English corpora
- **KORMo-Team/KORMo-10B-sft** — A fine-tuned model enhanced with long-context reasoning and instruction-following data
- **KORMo-Team/KORMo-10B-inst** — The final instruction-tuned model with reasoning enhancement and RL (coming soon; currently awaiting GPU availability)

> 💡 You can explore the full training history and checkpoints in each model's **`Revisions` tab** on Hugging Face.

---

## 🔗 Links

- **Technical Report** — https://arxiv.org/pdf/2510.09426
- **Technical Report (Slides, Korean)** — https://github.com/MLP-Lab/KORMo-tutorial/blob/main/20251009_MLP_KORMo(Korean).pdf
- **Tutorial on GitHub** — https://github.com/MLP-Lab/KORMo-tutorial
- **Tutorial on YouTube** — https://www.youtube.com/@MLPLab

---

### 📖 About KORMo

KORMo is an open research initiative dedicated to advancing Korean language understanding and generation through large-scale, fully open-source models and datasets. We aim to make Korean NLP research transparent, reproducible, and accessible to the global community.
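As an illustration, the checkpoints listed above can be loaded with the standard Hugging Face `transformers` Auto classes. This is a minimal sketch, not an official recipe: the repo IDs come from the model list above, but the loading arguments (dtype, `trust_remote_code`, device placement) are assumptions — check each model card for the recommended settings.

```python
# Minimal sketch of loading a KORMo checkpoint with Hugging Face transformers.
# Repo IDs are taken from the model list above; all loading kwargs are
# assumptions — consult each model card before use.

TOKENIZER_REPO = "KORMo-Team/KORMo-tokenizer"
MODEL_REPO = "KORMo-Team/KORMo-10B-sft"


def load_kormo(model_repo: str = MODEL_REPO, tokenizer_repo: str = TOKENIZER_REPO):
    """Download (or load from the local cache) the tokenizer and model."""
    # Imported lazily so this module can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_repo)
    # ~10B parameters: loading the full model requires substantial GPU memory.
    model = AutoModelForCausalLM.from_pretrained(model_repo)
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_kormo()
    prompt = "한국어로 자기소개를 해줘."  # "Introduce yourself in Korean."
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The full training history of each repo is browsable via its `Revisions` tab, so a specific intermediate checkpoint can also be loaded by passing that revision name to `from_pretrained(..., revision=...)`.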