---
license: mit
pipeline_tag: text-generation
library_name: peft
base_model: GSAI-ML/LLaDA-8B-Instruct
---
# CDLM-LLaDA LoRA adapter for LLaDA-8B-Instruct
This repository hosts a LoRA adapter for the LLaDA-8B-Instruct diffusion LLM (dLLM), produced with the CDLM (Consistency Diffusion Language Models) method. CDLM combines consistency modeling with a block-wise causal attention mask, so the distilled student model becomes fully KV-cache compatible while retaining strong local bidirectional modeling within each block (illustrated in the sketch below). In practice, the adapter enables significantly faster inference with competitive quality.
- GitHub: https://github.com/SqueezeAILab/CDLM
- Paper: [CDLM: Consistency Diffusion Language Models For Faster Sampling](https://huggingface.co/papers/2511.19269)
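To make the block-wise causal masking concrete, here is a minimal PyTorch sketch of such a mask. This illustrates the general pattern only, not the repository's implementation, and the block size below is an arbitrary toy value (the actual block size is a CDLM training hyperparameter):

```python
import torch

def blockwise_causal_mask(seq_len: int, block_size: int) -> torch.Tensor:
    """Boolean mask, True where attention is allowed: tokens attend
    bidirectionally inside their own block and causally to all earlier
    blocks, so earlier blocks' KV entries are frozen and cacheable."""
    block_ids = torch.arange(seq_len) // block_size  # block index per token
    # Query i may attend to key j iff block(j) <= block(i).
    return block_ids.unsqueeze(1) >= block_ids.unsqueeze(0)

print(blockwise_causal_mask(6, 2).int())
# tensor([[1, 1, 0, 0, 0, 0],
#         [1, 1, 0, 0, 0, 0],
#         [1, 1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 1, 1],
#         [1, 1, 1, 1, 1, 1]])
```

Because a query's visible set never changes once later blocks are being denoised, keys and values for completed blocks can be cached exactly as in autoregressive decoding.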
## Model details
- Base model: [GSAI-ML/LLaDA-8B-Instruct](https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct)
- Method: CDLM (consistency distillation + block-wise causal masking for KV-cache compatibility)
- Format: PEFT LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`)
- Intended use: attach this adapter to the base LLaDA-8B-Instruct model for accelerated inference via the CDLM decoding path
## How to use
This is a LoRA adapter, not a full model. You must load the base model and then attach this adapter. For best speedups, use the CDLM inference path in the accompanying codebase.
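Below is a minimal loading sketch using the standard `transformers` and `peft` APIs. The adapter repo id is a placeholder (substitute this repository's actual id), and note that attaching the adapter alone does not enable the CDLM fast-sampling path; use the inference code from the GitHub repository for generation:

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base_id = "GSAI-ML/LLaDA-8B-Instruct"

# LLaDA ships custom modeling code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModel.from_pretrained(
    base_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).eval()

# Attach the CDLM LoRA adapter; "<this-repo-id>" is a placeholder.
model = PeftModel.from_pretrained(base, "<this-repo-id>")
```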
## License
This adapter is released under the MIT License. The base model is governed by its own license; please ensure compliance with the base model’s terms.
## Citation
```bibtex
@article{kim2025cdlm,
  title   = {CDLM: Consistency Diffusion Language Models for Faster Sampling},
  author  = {Kim, Minseo and Xu, Chenfeng and Hooper, Coleman and Singh, Harman and Athiwaratkun, Ben and Zhang, Ce and Keutzer, Kurt and Gholami, Amir},
  journal = {arXiv preprint arXiv:2511.19269},
  year    = {2025},
  url     = {https://arxiv.org/abs/2511.19269}
}
```