---
license: mit
library_name: transformers
pipeline_tag: text-generation
---

We introduce LLaDA (Large Language Diffusion with mAsking), a diffusion model trained entirely from scratch at an unprecedented 8B scale, rivaling LLaMA3 8B in performance, as described in [the paper](https://hf.co/papers/2502.09992).

Project page: https://ml-gsai.github.io/LLaDA-demo/

For code and sample usage, see https://github.com/ML-GSAI/SMDM.
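
To give a rough intuition for the "masking" in the name, here is a toy sketch of the forward (noising) step as it is commonly formulated for masked diffusion language models: each token is independently replaced by a mask symbol with probability `t`. This is not taken from the LLaDA code base. The `MASK` string and the `mask_tokens` helper are hypothetical names chosen for illustration, and the sketch works on whitespace-split words rather than real tokenizer IDs.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, for illustration only


def mask_tokens(tokens, t, rng):
    """Independently replace each token with MASK with probability t.

    Assumption: t in [0, 1] plays the role of the diffusion time,
    so t = 0 leaves the text intact and t = 1 masks everything.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return [MASK if rng.random() < t else tok for tok in tokens]


rng = random.Random(0)  # seeded so repeated runs produce the same masks
tokens = "diffusion models can also generate text".split()
for t in (0.0, 0.5, 1.0):
    print(t, mask_tokens(tokens, t, rng))
```

In this formulation, a model would be trained to predict the original tokens at the masked positions, and generation would start from a fully masked sequence and unmask it over several steps. For the actual training and sampling procedure, refer to the paper and the linked repository.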