Add model card and paper link
Hi! I'm Niels from the Hugging Face community science team. I'm opening this PR to add a model card for dUltra, which includes:
- Metadata for `pipeline_tag` and `library_name`.
- A link to the paper: [dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning](https://huggingface.co/papers/2512.21446).
- A link to the official GitHub repository.
- Sample usage instructions from the repository.
This helps users discover and use your model more easily on the Hugging Face Hub.
README.md (added):
---
pipeline_tag: text-generation
library_name: transformers
---
# dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning

dUltra is an on-policy reinforcement learning framework based on Group Relative Policy Optimization (GRPO) that learns unmasking strategies for efficient parallel decoding in masked diffusion language models (MDLMs). By jointly optimizing the base diffusion LLM and an unmasking-order planner, dUltra achieves superior accuracy-efficiency trade-offs on mathematical reasoning and code generation tasks.
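For intuition, the core of GRPO is a group-relative advantage: each sampled completion's reward is normalized against the mean and standard deviation of its own sampling group, with no learned value function. The sketch below is purely illustrative (the function name and reward values are hypothetical, not from the dUltra codebase):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each reward by the mean and
    standard deviation of its sampling group (illustrative sketch)."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mu) / sigma for r in rewards]

# Four sampled completions for one prompt: two correct (reward 1), two not.
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
# adv == [1.0, -1.0, 1.0, -1.0]
```

Because advantages are computed within each group, rewards only need to be comparable across completions of the same prompt, which suits binary correctness signals on math and code tasks.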
| 9 |
+
|
| 10 |
+
- **Paper:** [dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning](https://huggingface.co/papers/2512.21446)
|
| 11 |
+
- **GitHub Repository:** [chinsengi/dUltra-os](https://github.com/chinsengi/dUltra-os)
|
| 12 |
+
|
| 13 |
+
## Usage
|
| 14 |
+
|
| 15 |
+
To use this model, you can load it through the `transformers` library. Note that it requires `trust_remote_code=True` to load the custom model architecture.
|
| 16 |
+
|
| 17 |
+
```python
from model.llada.lladou import LLaDOUModelLM
from transformers import AutoTokenizer
import torch

model = LLaDOUModelLM.from_pretrained(
    "sengi/dUltra-math",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("sengi/dUltra-math")
```
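To see what parallel decoding means for an MDLM, a toy loop helps: every position starts masked, a planner scores the masked positions, and the top-k most confident are revealed each step, so an 8-token sequence with k=4 finishes in 2 steps instead of 8. This sketch is purely illustrative; random scores stand in for the learned planner and model:

```python
import random

random.seed(0)

MASK = "<mask>"
seq = [MASK] * 8                      # all positions start masked
vocab = ["2", "+", "3", "=", "5"]     # toy vocabulary
k = 4                                 # positions unmasked per step

steps = 0
while MASK in seq:
    masked = [i for i, t in enumerate(seq) if t == MASK]
    # Hypothetical confidence scores; a learned planner would produce these.
    scores = {i: random.random() for i in masked}
    # Unmask the k most confident positions in parallel.
    for i in sorted(masked, key=lambda i: scores[i], reverse=True)[:k]:
        seq[i] = random.choice(vocab)  # stand-in for the model's prediction
    steps += 1

print(steps)  # prints 2: 8 positions, 4 unmasked per step
```

dUltra's contribution is learning this unmasking order (and the base model) with RL so that aggressive parallel steps do not degrade accuracy.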
## Citation

```bibtex
@misc{chen2025dultraultrafastdiffusionlanguage,
      title={dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning},
      author={Shirui Chen and Jiantao Jiao and Lillian J. Ratliff and Banghua Zhu},
      year={2025},
      eprint={2512.21446},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2512.21446},
}
```