	---
	license: mit
	---

	# Introduction to TraDo

	[Paper](https://arxiv.org/abs/2509.06949) \| [Code](https://github.com/Gen-Verse/dLLM-RL) \| [Blog](https://yinjjiew.github.io/projects/dllmrl/)

	We introduce TraDo, a family of state-of-the-art (SOTA) diffusion language models trained with TraceRL.

	* TraDo-4B-Instruct and TraDo-8B-Instruct outperform strong autoregressive (AR) models of similar size across math reasoning tasks.
	* TraDo-8B-Thinking is the first long chain-of-thought (Long-CoT) diffusion language model.



	<p align="center">
	<img src="https://github.com/yinjjiew/Data/raw/main/dllm-rl/figure1.png" width="100%"/>
	</p>


	<p align="center">
	<img src="https://github.com/yinjjiew/Data/raw/main/dllm-rl/maintable.png" width="100%"/>
	</p>




	# Citation

	```
	@article{wang2025trado,
	  title={Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models},
	  author={Wang, Yinjie and Yang, Ling and Li, Bowen and Tian, Ye and Shen, Ke and Wang, Mengdi},
	  journal={arXiv preprint arXiv:2509.06949},
	  year={2025}
	}
	```