Improve model card metadata and content
#1
by nielsr (HF Staff) - opened
README.md
CHANGED
@@ -1,9 +1,11 @@
 ---
-license: apache-2.0
-datasets:
-- Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories
 base_model:
 - inclusionAI/LLaDA2.0-mini
+datasets:
+- Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 
 <div align="center">
@@ -12,7 +14,7 @@ base_model:
   <a href="https://github.com/czg1225/DMax/blob/main/LICENSE">
     <img alt="Apache" src="https://img.shields.io/badge/License-Apache-4E94CE.svg">
   </a>
-  <a href="https://
+  <a href="https://huggingface.co/papers/2604.08302">
     <img src="https://img.shields.io/badge/Paper-Arxiv-darkred.svg" alt="Paper">
   </a>
   <a href="https://github.com/czg1225/DMax">
@@ -21,10 +23,7 @@
   </div>
 </div>
 
-
-> [Zigeng Chen](https://czg1225.github.io/chenzigeng99/), [Gongfan Fang](https://fangggf.github.io/), [Xinyin Ma](https://horseee.github.io/), [Ruonan Yu](https://scholar.google.com/citations?user=UHP95egAAAAJ&hl=en), [Xinchao Wang](https://sites.google.com/site/sitexinchaowang/)
-> [xML Lab](https://sites.google.com/view/xml-nus), National University of Singapore
-
+DMax is a new paradigm for efficient diffusion language models (dLLMs) that enables aggressive decoding parallelism while preserving generation quality. This repository contains the **DMax-Coder-16B** model, specialized for highly parallel code generation.
 
 ## 💪 Highlights
 
@@ -33,7 +32,7 @@
 - **Soft Parallel Decoding**: Uses interpolation between mask and token embeddings to propagate confidence priors from previous steps.
 
 <div align="center">
-<img src="assets/tradeoff.png" width="100%" />
+<img src="https://github.com/czg1225/DMax/raw/main/assets/tradeoff.png" width="100%" />
 <br>
 <em>Superior Parallelism-Accuracy Trade-off, Increased TPF with Maintained Accuracy.</em>
 </div>
@@ -66,7 +65,14 @@ model = model.to(torch.bfloat16)
 model.eval()
 tokenizer = AutoTokenizer.from_pretrained("Zigeng/DMax-Coder-16B", trust_remote_code=True)
 
-prompt = "Write a python function to find the first repeated character in a given string."
+prompt = "Write a python function to find the first repeated character in a given string." + """
+
+Please enclose your code within delimiters as follows:
+```python
+# YOUR CODE HERE
+```
+
+"""
 
 input_ids = tokenizer.apply_chat_template(
     [{"role": "user", "content": prompt}],
@@ -91,7 +97,16 @@ print(generated_answer)
 print("nfe:", nfe, "token length", len(generated_tokens[0]))
 ```
 
-## 📖
-
-
-
+## 📖 Citation
+
+```bibtex
+@misc{chen2026dmaxaggressiveparalleldecoding,
+      title={DMax: Aggressive Parallel Decoding for dLLMs},
+      author={Zigeng Chen and Gongfan Fang and Xinyin Ma and Ruonan Yu and Xinchao Wang},
+      year={2026},
+      eprint={2604.08302},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2604.08302},
+}
+```