| --- |
| license: mit |
| --- |
| |
| # Introduction to TraDo |
|
|
| [Paper](https://arxiv.org/abs/2509.06949) | [Code](https://github.com/Gen-Verse/dLLM-RL) | [Blog](https://yinjjiew.github.io/projects/dllmrl/) |
|
|
| We introduce **TraDo**, a family of state-of-the-art (SOTA) diffusion language models trained with **TraceRL**. |
|
|
| * **TraDo-4B-Instruct** and **TraDo-8B-Instruct** outperform strong autoregressive (AR) models of similar size across math reasoning tasks. |
| * **TraDo-8B-Thinking** is the first long chain-of-thought (long-CoT) diffusion language model. |
|
|
|
|
|
|
| <p align="center"> |
| <img src="https://github.com/yinjjiew/Data/raw/main/dllm-rl/figure1.png" width="100%"/> |
| </p> |
|
|
|
|
| <p align="center"> |
| <img src="https://github.com/yinjjiew/Data/raw/main/dllm-rl/maintable.png" width="100%"/> |
| </p> |
|
|
|
|
|
|
|
|
| # Citation |
|
|
| ``` |
| @article{wang2025trado, |
| title={Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models}, |
| author={Wang, Yinjie and Yang, Ling and Li, Bowen and Tian, Ye and Shen, Ke and Wang, Mengdi}, |
| journal={arXiv preprint arXiv:2509.06949}, |
| year={2025} |
| } |
| ``` |
|
|
|
|
|
|