license: mit
tags:
  - Korean
  - Language Model
  - Autoregressive
  - MDLM
  - Diffusion
  - PyTorch Lightning
  - Huggingface

💬 MDLM AR Model (Korean) - Hanbin42

์ด ๋ชจ๋ธ์€ MDLM (Masked Diffusion Language Model) ๊ตฌ์กฐ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ Autoregressive Korean Language Model์ž…๋‹ˆ๋‹ค.
Hanbin42/my-mdlm-ar-model์€ skt/kogpt2-base-v2 ํ† ํฌ๋‚˜์ด์ €์™€ parkseongjun/psjkodata ํ•œ๊ตญ์–ด ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค.


🧠 Model Details

  • Backbone: Autoregressive (AR)
  • Diffusion Type: Absorbing State
  • Input Length: 1024 tokens
  • Vocab Size: 51200 (per the KoGPT2 tokenizer)
  • Training Steps: 50,000
  • Sampling Steps: 128 (DDPM-style)
  • Precision: bfloat16
  • EMA: Enabled (0.9999)
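
For readers unfamiliar with the "Absorbing State" entry above, here is a minimal sketch of the general idea: each token is independently replaced by a single mask token with some probability, and once masked it stays masked. This is an illustration only, not code from this repository. The function name `absorb_mask` and the value of `MASK_ID` are hypothetical, and the noise schedule this checkpoint actually uses is not stated here.

```python
import random

# Hypothetical mask id chosen for illustration: one past the 51200-entry vocabulary.
MASK_ID = 51200


def absorb_mask(token_ids, mask_prob, mask_id=MASK_ID, rng=None):
    """Replace each token with mask_id independently with probability mask_prob."""
    if not 0.0 <= mask_prob <= 1.0:
        raise ValueError("mask_prob must be in [0, 1]")
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so mask_prob=0 masks nothing and mask_prob=1 masks everything.
    return [mask_id if rng.random() < mask_prob else tok for tok in token_ids]


# Example: corrupt a short, made-up token sequence at a 50% masking rate.
tokens = [11, 22, 33, 44, 55]
noisy = absorb_mask(tokens, mask_prob=0.5, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the corruption reproducible, which is convenient when comparing runs.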

📦 Files

| File | Description |
|------|-------------|
| best.ckpt | PyTorch Lightning model checkpoint |
| config.yaml | Hyperparameter settings used for training |
| README.md | Model description document |

🚀 How to Use

import torch
from lightning.pytorch import LightningModule
from diffusion import Diffusion  # defined in this project's codebase

model = Diffusion.load_from_checkpoint("best.ckpt", config=..., tokenizer=...)
model.eval()
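
The snippet above expects `best.ckpt` to be available locally. As a hedged sketch, the files listed under Files could be fetched from the Hub with `huggingface_hub`, assuming that package is installed and the files sit at the repository root. The helper name `fetch_files` is hypothetical. Building the `config` and `tokenizer` arguments is left out because their expected types are not documented here.

```python
REPO_ID = "Hanbin42/my-mdlm-ar-model"  # repository name as given in this README
FILES = ("best.ckpt", "config.yaml")   # file names from the Files table


def fetch_files(repo_id=REPO_ID, files=FILES):
    """Download each file from the Hub and return {filename: local_path}."""
    # Imported lazily so this module loads even if huggingface_hub is not installed.
    from huggingface_hub import hf_hub_download

    return {name: hf_hub_download(repo_id=repo_id, filename=name) for name in files}
```

Calling `fetch_files()["best.ckpt"]` returns a local cache path that can be passed to `Diffusion.load_from_checkpoint` in place of the bare file name.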