| | --- |
| | license: mit |
| | tags: |
| | - Korean |
| | - Language Model |
| | - Autoregressive |
| | - MDLM |
| | - Diffusion |
| | - PyTorch Lightning |
| | - Huggingface |
| | --- |
| | |
| | # MDLM AR Model (Korean) - Hanbin42 |
| |
|
| | This model is an **Autoregressive Korean Language Model** built on the [MDLM (Masked Diffusion Language Model)](https://arxiv.org/abs/2406.07524) architecture. |
| | `Hanbin42/my-mdlm-ar-model` was trained with the `skt/kogpt2-base-v2` tokenizer on the `parkseongjun/psjkodata` Korean dataset. |
| |
|
| | --- |
| |
|
| | ## 🧠 Model Details |
| |
|
| | - **Backbone**: Autoregressive (AR) |
| | - **Diffusion Type**: Absorbing State |
| | - **Input Length**: 1024 tokens |
| | - **Vocab Size**: 51200 (per the KoGPT2 tokenizer) |
| | - **Training Steps**: 50,000 |
| | - **Sampling Steps**: 128 (DDPM-style) |
| | - **Precision**: bfloat16 |
| | - **EMA**: Enabled (0.9999) |
| |
|
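
The "Absorbing State" entry above refers to a forward process in which each token is independently replaced by a single mask token, with a probability that grows with the noise level. The snippet below is only a toy illustration of that idea in plain Python; it is not this model's training code. `MASK_ID`, `absorb`, and the noise levels shown are hypothetical names and values chosen for the example.

```python
import random

# Hypothetical id for the absorbing [MASK] state; not taken from this model card.
MASK_ID = -1


def absorb(tokens, mask_prob, mask_id=MASK_ID, rng=None):
    """Replace each token with mask_id independently with probability mask_prob."""
    if not 0.0 <= mask_prob <= 1.0:
        raise ValueError("mask_prob must be in [0, 1]")
    rng = rng or random.Random()
    # rng.random() is in [0, 1), so mask_prob=0 keeps every token
    # and mask_prob=1 masks every token.
    return [mask_id if rng.random() < mask_prob else tok for tok in tokens]


if __name__ == "__main__":
    toy = [101, 7, 42, 9, 350, 12]  # made-up token ids
    for p in (0.0, 0.5, 1.0):       # illustrative noise levels only
        print(p, absorb(toy, p, rng=random.Random(0)))
```

At noise level 0 the sequence is untouched and at noise level 1 every position is masked; a denoiser is trained to recover the original tokens from the partially masked sequences in between.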
| | --- |
| |
|
| | ## 📦 Files |
| |
|
| | | File | Description | |
| | |---------------|-------------------------------------------| |
| | | `best.ckpt` | PyTorch Lightning model checkpoint | |
| | | `config.yaml` | Hyperparameter settings used for training | |
| | | `README.md` | Model description document | |
| |
|
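
A minimal sketch for fetching the files listed above with `huggingface_hub`. It assumes the files sit at the root of the `Hanbin42/my-mdlm-ar-model` repository on the Hugging Face Hub, which this README does not state explicitly; `fetch_files`, `REPO_ID`, and `FILES` are hypothetical names used only for this example.

```python
REPO_ID = "Hanbin42/my-mdlm-ar-model"  # repo id as named in this README (assumed Hub location)
FILES = ("best.ckpt", "config.yaml")   # file names taken from the table above


def fetch_files(repo_id=REPO_ID, filenames=FILES):
    """Download each file from the Hub and return {filename: local_path}."""
    # Imported lazily so this module loads even if huggingface_hub is not installed.
    from huggingface_hub import hf_hub_download

    return {name: hf_hub_download(repo_id=repo_id, filename=name) for name in filenames}


if __name__ == "__main__":
    # Requires network access and `pip install huggingface_hub`.
    for name, path in fetch_files().items():
        print(name, "->", path)
```

The returned local paths point into the Hub cache, so the checkpoint path can be passed on to whatever loading code the project uses.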
| | --- |
| |
|
| | ## How to Use |
| |
|
| | ```python |
| | import torch |
| | from lightning.pytorch import LightningModule |
| | from diffusion import Diffusion  # defined in this project's codebase |
| | |
| | model = Diffusion.load_from_checkpoint("best.ckpt", config=..., tokenizer=...) |
| | model.eval() |
| | |