---
license: apple-amlr
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
pipeline_tag: text-generation
tags:
- code
- diffusion
- Dream
- diffusion language model
---

### CADD-Base-7B

CADD-Base-7B is a masked diffusion language model for code generation, augmented with **Continuously Augmented Discrete Diffusion (CADD)**: a continuous flow-matching signal that guides the discrete denoising process.

**Key idea:** At each diffusion step, a continuous embedding `z_continuous` is added to the embeddings of masked tokens. This embedding follows a linear flow-matching trajectory from noise to clean embeddings. The mechanism is orthogonal to the discrete unmasking strategy, so any masked diffusion model (MDM) algorithm can be combined with CADD.
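
The key idea can be illustrated with a small, dependency-free sketch. It is not the model's implementation: the helper names (`interpolate`, `augment_masked`), the time convention (`t=0` is pure noise, `t=1` is the clean embedding), and the toy vectors are all assumptions made for illustration. In the real model, the clean-embedding estimate would come from the network's predictions.

```python
import random


def interpolate(noise, clean, t):
    """Point on the straight line from `noise` (t=0) to `clean` (t=1).

    The direction of time is an assumption made for this sketch.
    """
    return [(1.0 - t) * n + t * c for n, c in zip(noise, clean)]


def augment_masked(token_embs, is_masked, noise, clean_est, t):
    """Add the continuous signal to masked positions only.

    token_embs, noise, clean_est: one vector (list of floats) per position.
    is_masked: one bool per position.
    """
    out = []
    for emb, masked, n, c in zip(token_embs, is_masked, noise, clean_est):
        if masked:
            z_continuous = interpolate(n, c, t)
            out.append([e + z for e, z in zip(emb, z_continuous)])
        else:
            out.append(list(emb))  # unmasked tokens pass through unchanged
    return out


rng = random.Random(0)
dim = 4
mask_emb = [0.0] * dim  # toy embedding of the mask token (hypothetical)
token_embs = [[1.0] * dim, mask_emb, mask_emb]
is_masked = [False, True, True]
noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in token_embs]
clean_est = [[0.5] * dim for _ in token_embs]  # stand-in for a model estimate

start = augment_masked(token_embs, is_masked, noise, clean_est, t=0.0)
end = augment_masked(token_embs, is_masked, noise, clean_est, t=1.0)
```

In this toy setup the masked positions carry pure noise at `t=0.0` and the clean estimate at `t=1.0`, while the unmasked first position is identical in both. Which positions get unmasked at each step is decided separately, which is why the continuous signal can be paired with any discrete unmasking rule.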

#### Usage

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the model and tokenizer; trust_remote_code is needed for the custom model code.
model_path = "apple/CADD-Base-7B"
model = AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = model.to("cuda").eval()

prompt = "def fibonacci(n):\n"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

output = model.diffusion_generate(
    input_ids,
    max_new_tokens=512,
    steps=512,                      # number of diffusion steps
    temperature=0.1,                # sampling temperature for token prediction
    alg="entropy",                  # unmasking strategy
    alg_temp=0.0,
    use_cadd=True,                  # enable CADD continuous augmentation
    cadd_sampling_mode="weighted",  # how z_0 is estimated from logits (default: "argmax")
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```

#### CADD Sampling Parameters

| Parameter | Type | Default | Description |
|:---|:---:|:---:|:---|
| `use_cadd` | bool | `True` | Enable CADD continuous augmentation |
| `cadd_sampling_mode` | str | `"argmax"` | How to estimate `z_0` from logits: `"weighted"` or `"argmax"` |
| `alg` | str | `"origin"` | Unmasking strategy: `"entropy"`, `"origin"`, `"maskgit_plus"`, `"topk_margin"` |
| `temperature` | float | `1.0` | Sampling temperature for token prediction |
| `steps` | int | `512` | Number of diffusion steps |
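
The two `cadd_sampling_mode` values can be pictured with the sketch below. The card does not spell out their exact semantics, so this rests on an assumption: `"argmax"` is read as taking the embedding of the most likely token, and `"weighted"` as a softmax-weighted average of all token embeddings. The names `softmax`, `estimate_z0`, and `embedding_table` are hypothetical and are not part of the model's API.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_z0(logits, embedding_table, mode="argmax"):
    """Estimate a clean embedding for one position (assumed semantics).

    "argmax":   embedding of the most likely token.
    "weighted": probability-weighted average of all token embeddings.
    """
    if mode == "argmax":
        # Ties resolve to the lowest token index.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return list(embedding_table[best])
    if mode == "weighted":
        probs = softmax(logits)
        dim = len(embedding_table[0])
        return [
            sum(p * row[d] for p, row in zip(probs, embedding_table))
            for d in range(dim)
        ]
    raise ValueError(f"unknown mode: {mode!r}")


# Toy vocabulary of three tokens with 2-dimensional embeddings.
embedding_table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = [2.0, 2.0, -50.0]  # tokens 0 and 1 tie; token 2 is negligible

hard = estimate_z0(logits, embedding_table, mode="argmax")
soft = estimate_z0(logits, embedding_table, mode="weighted")
```

With two equally likely tokens, the hard estimate commits to one token's embedding, while the weighted estimate lands midway between the two candidates. Under this reading, `"weighted"` keeps the model's uncertainty in the continuous signal instead of discarding it at every step.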

#### More details

- Paper: [Continuously Augmented Discrete Diffusion Model for Categorical Generative Modeling](https://arxiv.org/abs/2510.01329) (ICLR 2026)
- GitHub: https://github.com/apple/ml-CADD

#### Citation

```bibtex
@article{zheng2025continuously,
  title={Continuously augmented discrete diffusion model for categorical generative modeling},
  author={Zheng, Huangjie and Gong, Shansan and Zhang, Ruixiang and Chen, Tianrong and Gu, Jiatao and Zhou, Mingyuan and Jaitly, Navdeep and Zhang, Yizhe},
  journal={arXiv preprint arXiv:2510.01329},
  year={2025}
}
```

#### Acknowledgment

This Hugging Face model release builds upon and improves [DiffuCoder](https://github.com/apple/ml-diffucoder), reusing [Dream](https://huggingface.co/Dream-org/Dream-v0-Base-7B)'s modeling architecture and generation utilities.