Tags: Text Generation · Transformers · Safetensors · llada2_moe · conversational · custom_code
nielsr (HF Staff) committed · Commit f5b5fd6 · verified · 1 Parent(s): 8489383

Improve model card metadata and content


This PR improves the model card by:
- Adding the `pipeline_tag: text-generation` for better discoverability.
- Adding `library_name: transformers` metadata as the model is compatible with the library via `trust_remote_code=True`.
- Linking the model to its associated paper [DMax: Aggressive Parallel Decoding for dLLMs](https://huggingface.co/papers/2604.08302).
- Updating the README with highlights, a "Quick Start" usage example, and the BibTeX citation from the official repository.
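
Taken together, the changes above give the updated README this metadata header (reproduced from the diff for reference):

```yaml
---
base_model:
- inclusionAI/LLaDA2.0-mini
datasets:
- Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
```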

Files changed (1):
  1. README.md +29 -14
README.md CHANGED
@@ -1,9 +1,11 @@
 ---
-license: apache-2.0
-datasets:
-- Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories
 base_model:
 - inclusionAI/LLaDA2.0-mini
+datasets:
+- Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 
 <div align="center">
@@ -12,7 +14,7 @@ base_model:
   <a href="https://github.com/czg1225/DMax/blob/main/LICENSE">
     <img alt="Apache" src="https://img.shields.io/badge/License-Apache-4E94CE.svg">
   </a>
-  <a href="https://arxiv.org/pdf/2604.08302">
+  <a href="https://huggingface.co/papers/2604.08302">
     <img src="https://img.shields.io/badge/Paper-Arxiv-darkred.svg" alt="Paper">
   </a>
   <a href="https://github.com/czg1225/DMax">
@@ -21,10 +23,7 @@ base_model:
 </div>
 </div>
 
-> **DMax: Aggressive Parallel Decoding for dLLMs**
-> [Zigeng Chen](https://czg1225.github.io/chenzigeng99/), [Gongfan Fang](https://fangggf.github.io/), [Xinyin Ma](https://horseee.github.io/), [Ruonan Yu](https://scholar.google.com/citations?user=UHP95egAAAAJ&hl=en), [Xinchao Wang](https://sites.google.com/site/sitexinchaowang/)
-> [xML Lab](https://sites.google.com/view/xml-nus), National University of Singapore
-
+DMax is a new paradigm for efficient diffusion language models (dLLMs) that enables aggressive decoding parallelism while preserving generation quality. This repository contains the **DMax-Coder-16B** model, specialized for highly parallel code generation.
 
 ## 💪 Highlights
 
@@ -33,7 +32,7 @@ base_model:
 - **Soft Parallel Decoding**: Uses interpolation between mask and token embeddings to propagate confidence priors from previous steps.
 
 <div align="center">
-  <img src="assets/tradeoff.png" width="100%" />
+  <img src="https://github.com/czg1225/DMax/raw/main/assets/tradeoff.png" width="100%" />
   <br>
   <em>Superior Parallelism-Accuracy Trade-off, Increased TPF with Maintained Accuracy.</em>
 </div>
@@ -66,7 +65,14 @@ model = model.to(torch.bfloat16)
 model.eval()
 tokenizer = AutoTokenizer.from_pretrained("Zigeng/DMax-Coder-16B", trust_remote_code=True)
 
-prompt = "Write a python function to find the first repeated character in a given string." + "\n\nPlease enclose your code within delimiters as follows:\n```python\n# YOUR CODE HERE\n```\n\n"
+prompt = (
+    "Write a python function to find the first repeated character"
+    " in a given string."
+    "\n\nPlease enclose your code within delimiters as follows:\n"
+    "```python\n"
+    "# YOUR CODE HERE\n"
+    "```\n\n"
+)
 
 input_ids = tokenizer.apply_chat_template(
     [{"role": "user", "content": prompt}],
@@ -91,7 +97,16 @@ print(generated_answer)
 print("nfe:",nfe,"token length",len(generated_tokens[0]))
 ```
 
-## 📖 Experimental Results
-
-![trade-off](assets/exp.png)
-
+## 📖 Citation
+
+```bibtex
+@misc{chen2026dmaxaggressiveparalleldecoding,
+      title={DMax: Aggressive Parallel Decoding for dLLMs},
+      author={Zigeng Chen and Gongfan Fang and Xinyin Ma and Ruonan Yu and Xinchao Wang},
+      year={2026},
+      eprint={2604.08302},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2604.08302},
+}
+```
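
The "Soft Parallel Decoding" highlight in the diff above describes interpolating between the mask embedding and predicted token embeddings, weighted by confidence from the previous step. The sketch below is only an illustration of that interpolation idea, not the repository's implementation; all names (`mask_embedding`, `token_embeddings`, `confidences`, the dimensions) are hypothetical.

```python
# Illustrative sketch (NOT the DMax implementation) of soft parallel decoding:
# rather than resetting low-confidence positions to a hard [MASK] embedding,
# blend mask and predicted-token embeddings by per-position confidence, so the
# next denoising step receives the previous step's confidence prior as input.
import numpy as np

rng = np.random.default_rng(0)
embed_dim = 8                                        # hypothetical embedding size
mask_embedding = rng.normal(size=embed_dim)          # embedding of the [MASK] token
token_embeddings = rng.normal(size=(3, embed_dim))   # predicted embeddings, 3 positions
confidences = np.array([0.9, 0.4, 0.1])              # per-position confidence priors

# High confidence -> input is mostly the predicted token embedding;
# low confidence -> input stays close to the mask embedding.
soft_inputs = (confidences[:, None] * token_embeddings
               + (1.0 - confidences[:, None]) * mask_embedding)

print(soft_inputs.shape)  # one soft embedding per position: (3, 8)
```

At confidence 1.0 this reduces to committing the token outright, and at 0.0 to keeping the position fully masked, so hard remasking is a special case of the blend.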