---
license: mit
pipeline_tag: text-generation
library_name: peft
base_model: GSAI-ML/LLaDA-8B-Instruct
---
# CDLM-LLaDA LoRA adapter for LLaDA-8B-Instruct
This repository hosts a LoRA adapter for the LLaDA-8B-Instruct diffusion LLM (dLLM), produced with the CDLM (Consistency Diffusion Language Models) method. CDLM combines consistency modeling with a block-wise causal attention mask, so the distilled student model becomes fully KV-cache compatible while retaining strong local bidirectional modeling within each block (illustrated in the sketch below). In practice, the adapter enables significantly faster inference with competitive quality.
- GitHub: https://github.com/SqueezeAILab/CDLM
- Paper: [CDLM: Consistency Diffusion Language Models For Faster Sampling](https://huggingface.co/papers/2511.19269)
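To make the block-wise causal masking concrete, here is a minimal PyTorch sketch of such a mask. This illustrates the general pattern only, not the repository's implementation, and the block size below is an arbitrary toy value (the actual block size is a CDLM training hyperparameter):

```python
import torch

def blockwise_causal_mask(seq_len: int, block_size: int) -> torch.Tensor:
    """Boolean mask, True where attention is allowed: tokens attend
    bidirectionally inside their own block and causally to all earlier
    blocks, so earlier blocks' KV entries are frozen and cacheable."""
    block_ids = torch.arange(seq_len) // block_size  # block index per token
    # Query i may attend to key j iff block(j) <= block(i).
    return block_ids.unsqueeze(1) >= block_ids.unsqueeze(0)

print(blockwise_causal_mask(6, 2).int())
# tensor([[1, 1, 0, 0, 0, 0],
#         [1, 1, 0, 0, 0, 0],
#         [1, 1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 1, 1],
#         [1, 1, 1, 1, 1, 1]])
```

Because a query's visible set never changes once later blocks are being denoised, keys and values for completed blocks can be cached exactly as in autoregressive decoding.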
## Model details
- Base model: [GSAI-ML/LLaDA-8B-Instruct](https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct)
- Method: CDLM (consistency distillation + block-wise causal masking for KV-cache compatibility)
- Format: PEFT LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`)
- Intended use: attach this adapter to the base LLaDA-8B-Instruct model for accelerated inference via the CDLM decoding path
## How to use
This is a LoRA adapter, not a full model. You must load the base model and then attach this adapter. For best speedups, use the CDLM inference path in the accompanying codebase.
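Below is a minimal loading sketch using the standard `transformers` and `peft` APIs. The adapter repo id is a placeholder (substitute this repository's actual id), and note that attaching the adapter alone does not enable the CDLM fast-sampling path; use the inference code from the GitHub repository for generation:

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base_id = "GSAI-ML/LLaDA-8B-Instruct"

# LLaDA ships custom modeling code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModel.from_pretrained(
    base_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).eval()

# Attach the CDLM LoRA adapter; "<this-repo-id>" is a placeholder.
model = PeftModel.from_pretrained(base, "<this-repo-id>")
```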
## License
This adapter is released under the MIT License. The base model is governed by its own license; please ensure compliance with the base model’s terms.
## Citation
```bibtex
@article{kim2025cdlm,
  title   = {CDLM: Consistency Diffusion Language Models for Faster Sampling},
  author  = {Kim, Minseo and Xu, Chenfeng and Hooper, Coleman and Singh, Harman and Athiwaratkun, Ben and Zhang, Ce and Keutzer, Kurt and Gholami, Amir},
  journal = {arXiv preprint arXiv:2511.19269},
  year    = {2025},
  url     = {https://arxiv.org/abs/2511.19269}
}
```