---
license: mit
pipeline_tag: text-generation
library_name: peft
---

# CDLM-LLaDA LoRA adapter for LLaDA-8B-Instruct
This repository hosts the LoRA adapter for the LLaDA-8B-Instruct diffusion LLM (dLLM), produced with the CDLM (Consistency Diffusion Language Models) method. CDLM combines consistency modeling with a block-wise causal attention mask, so the distilled student becomes fully KV-cache compatible while retaining strong local bidirectional modeling within each block. In practice, attaching the adapter enables significantly faster inference with competitive generation quality.
- GitHub: https://github.com/SqueezeAILab/CDLM
- Paper: [CDLM: Consistency Diffusion Language Models For Faster Sampling](https://huggingface.co/papers/2511.19269)
## Model details

- Base model: GSAI-ML/LLaDA-8B-Instruct
- Method: CDLM (consistency distillation + block-wise causal masking for KV-cache compatibility)
- Format: PEFT LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`)
- Intended use: attach this adapter to the base LLaDA-8B-Instruct model for accelerated inference via the CDLM decoding path
## How to use

This is a LoRA adapter, not a full model. You must load the base model and then attach this adapter. For the best speedups, use the CDLM inference path in the accompanying codebase.
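As a minimal sketch, the snippet below loads the base model and attaches this adapter with PEFT. The adapter repository id is a placeholder (replace it with this repo's id), and plain PEFT loading alone does not enable the accelerated CDLM sampler; for that, use the inference scripts in the GitHub repository linked above.

```python
# Minimal loading sketch. Assumptions: the adapter repo id below is a placeholder,
# and LLaDA is a custom architecture that requires trust_remote_code=True.
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base_id = "GSAI-ML/LLaDA-8B-Instruct"
adapter_id = "<this-adapter-repo-id>"  # placeholder: replace with this repository's id or a local path

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModel.from_pretrained(
    base_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the CDLM LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Note: a diffusion LLM is not sampled with the usual autoregressive generate() loop;
# for the fast CDLM decoding path, follow https://github.com/SqueezeAILab/CDLM.
```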
## License

This adapter is released under the MIT License. The base model is governed by its own license; please ensure compliance with the base model’s terms.
## Citation

```bibtex
@article{kim2025cdlm,
  title   = {CDLM: Consistency Diffusion Language Models for Faster Sampling},
  author  = {Kim, Minseo and Xu, Chenfeng and Hooper, Coleman and Singh, Harman and Athiwaratkun, Ben and Zhang, Ce and Keutzer, Kurt and Gholami, Amir},
  journal = {arXiv preprint arXiv:2511.19269},
  year    = {2025},
  url     = {https://arxiv.org/abs/2511.19269}
}
```