---
license: apache-2.0
library_name: peft
base_model: GSAI-ML/LLaDA-8B-Instruct
pipeline_tag: text-generation
tags:
- reinforcement-learning
- diffusion-llm
- block-r1
---

# Block-R1

This repository contains model checkpoints (LoRA adapters) for **Block-R1**, a benchmark for multi-domain reinforcement learning with block-based diffusion large language models (dLLMs).

## Description

Block-R1 is designed to enhance block-based reasoning generation in diffusion LLMs. It investigates the role of block size from a domain conflict perspective during reinforcement learning (RL) post-training. The benchmark covers diverse domains including code, mathematics, puzzles, and general knowledge.

Key components include:
- **Block-R1-41K Dataset:** A dataset constructed with optimized training block sizes for multi-domain RL.
- **b1 Method:** A dynamic-size reasoning block method for dLLMs.
- **RL Framework:** Support for multiple RL algorithms for diffusion models such as Diffusion-GRPO, WD1, GDPO, and more.

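To make "block size" concrete: block-based dLLMs typically fill the response region one block at a time, left to right, denoising the tokens inside each block before moving on. The sketch below only illustrates how a fixed block size partitions a generation budget. The names `block_spans`, `gen_length`, and `block_size` are illustrative assumptions rather than identifiers from the Block-R1 code, and the dynamic-size b1 method is not reproduced here.

```python
from typing import List, Tuple


def block_spans(gen_length: int, block_size: int) -> List[Tuple[int, int]]:
    """Split `gen_length` response positions into consecutive blocks.

    Returns half-open (start, end) index pairs. The final block is shorter
    when `block_size` does not divide `gen_length` evenly.
    """
    if gen_length <= 0 or block_size <= 0:
        raise ValueError("gen_length and block_size must be positive")
    return [
        (start, min(start + block_size, gen_length))
        for start in range(0, gen_length, block_size)
    ]


if __name__ == "__main__":
    # Compare how many blocks a 256-token budget yields at several block sizes.
    for size in (32, 64, 128):
        spans = block_spans(256, size)
        print(f"block_size={size}: {len(spans)} blocks, last={spans[-1]}")
```

A smaller block size yields more, shorter blocks and therefore a more sequential decoding order; a larger one lets more positions be denoised jointly.
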
## Resources

- **Paper:** [Block-R1: Rethinking the Role of Block Size in Multi-domain Reinforcement Learning for Diffusion Large Language Models](https://huggingface.co/papers/2605.11726)
- **Code:** [GitHub Repository](https://github.com/YanJiangJerry/Block-R1)
- **Dataset:** [Block-R1 Dataset](https://huggingface.co/datasets/dLLM-R1/Block-R1)

## Model Information

These weights are LoRA adapters trained on top of the [LLaDA-8B-Instruct](https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct) backbone. For detailed usage, training, and evaluation scripts, please refer to the official repository.

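As a rough starting point, the adapters can probably be attached to the backbone with the standard PEFT loading pattern. This is a minimal sketch under stated assumptions, not the authors' script: the `load_with_adapter` helper and the adapter path in the usage comment are hypothetical placeholders, and the bfloat16 dtype is an assumption. Generation itself is not shown, because LLaDA uses its own diffusion sampling routine; see the official repository for that.

```python
BASE_MODEL_ID = "GSAI-ML/LLaDA-8B-Instruct"


def load_with_adapter(adapter_id: str, base_model_id: str = BASE_MODEL_ID):
    """Load the LLaDA backbone and attach a LoRA adapter on top of it.

    `adapter_id` is a Hub repo id or a local directory that contains the
    adapter files (hypothetical placeholder; pick the checkpoint you need).
    """
    # Imported lazily so this module can be imported without the heavy deps.
    import torch
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer

    # LLaDA ships custom modeling code, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
    base_model = AutoModel.from_pretrained(
        base_model_id,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,  # assumption: bf16 to fit an 8B model in memory
    )

    # Wrap the backbone with the LoRA weights and switch to inference mode.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model.eval()


# Usage (downloads the 8B backbone; the adapter path is a placeholder):
# tokenizer, model = load_with_adapter("path/to/block-r1-adapter")
```

Keeping the backbone and the adapter separate makes it straightforward to swap between checkpoints without reloading the base weights from disk each time.
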
## Citation

If you use this benchmark or the associated methods, please cite the following work:

```bibtex
@article{jiang2026breakblock,
title={{Break the Block: Dynamic-size Reasoning Blocks for Diffusion Large Language Models via Monotonic Entropy Descent with Reinforcement Learning}},
author={Jiang, Yan and Qiu, Ruihong and Huang, Zi},
journal={arXiv preprint arXiv:2605.02263},
year={2026}
}
```