Add model card for Block-R1

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ base_model: GSAI-ML/LLaDA-8B-Instruct
+ pipeline_tag: text-generation
+ tags:
+ - reinforcement-learning
+ - diffusion-llm
+ - block-r1
+ ---
+
+ # Block-R1
+
+ This repository contains model checkpoints (LoRA adapters) for **Block-R1**, a benchmark for multi-domain reinforcement learning with block-based diffusion large language models (dLLMs).
+
+ ## Description
+
+ Block-R1 is designed to enhance block-based reasoning generation in diffusion LLMs. It investigates the role of block size from a domain conflict perspective during reinforcement learning (RL) post-training. The benchmark covers diverse domains including code, mathematics, puzzles, and general knowledge.
+
+ Key components include:
+ - **Block-R1-41K Dataset:** A dataset constructed with optimized training block sizes for multi-domain RL.
+ - **b1 Method:** A dynamic-size reasoning block method for dLLMs.
+ - **RL Framework:** Support for multiple RL algorithms for diffusion models such as Diffusion-GRPO, WD1, GDPO, and more.
+
+ ## Resources
+
+ - **Paper:** [Block-R1: Rethinking the Role of Block Size in Multi-domain Reinforcement Learning for Diffusion Large Language Models](https://huggingface.co/papers/2605.11726)
+ - **Code:** [GitHub Repository](https://github.com/YanJiangJerry/Block-R1)
+ - **Dataset:** [Block-R1 Dataset](https://huggingface.co/datasets/dLLM-R1/Block-R1)
+
+ ## Model Information
+
+ These weights are LoRA adapters trained on top of the [LLaDA-8B-Instruct](https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct) backbone. For detailed usage, training, and evaluation scripts, please refer to the official repository.
+
+ ## Citation
+
+ If you use this benchmark or the associated methods, please cite the following work:
+
+ ```bibtex
+ @article{jiang2026breakblock,
+ title={{Break the Block: Dynamic-size Reasoning Blocks for Diffusion Large Language Models via Monotonic Entropy Descent with Reinforcement Learning}},
+ author={Jiang, Yan and Qiu, Ruihong and Huang, Zi},
+ journal={arXiv preprint arXiv:2605.02263},
+ year={2026}
+ }
+ ```
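
Note on the Model Information section: the card says the weights are LoRA adapters on top of LLaDA-8B-Instruct but does not show how to attach them. A minimal sketch of one way this might look with `peft` and `transformers` follows. It is an assumption-laden illustration, not the authors' documented procedure: `ADAPTER_ID` is a hypothetical placeholder (the card does not name an adapter repo id or subfolder), `load_block_r1` is an illustrative helper name, and `trust_remote_code=True` is assumed to be required because LLaDA ships custom modeling code. Sampling from a dLLM needs the generation scripts in the official repository and is not covered here.

```python
"""Sketch: attach a Block-R1 LoRA adapter to the LLaDA-8B-Instruct backbone."""

# Taken from the `base_model` field of the model card metadata.
BASE_MODEL_ID = "GSAI-ML/LLaDA-8B-Instruct"

# Hypothetical placeholder: replace with the real adapter repo id or local path.
ADAPTER_ID = "path-or-repo-id-of-a-block-r1-adapter"


def load_block_r1(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Return (tokenizer, model) with the LoRA adapter applied to the backbone."""
    # Heavy dependencies are imported lazily so this module imports cheaply.
    import torch
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the backbone relies on custom code hosted in its Hub repo.
    tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
    base = AutoModel.from_pretrained(
        base_model_id,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )

    # Wrap the frozen backbone with the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model.eval()


if __name__ == "__main__":
    # Downloads roughly 8B parameters; run only on a machine with enough memory.
    tokenizer, model = load_block_r1()
    print(type(model).__name__)
```

If the adapters live in subfolders of a single repository, `PeftModel.from_pretrained` also accepts a `subfolder` argument; whether that applies here depends on how the checkpoints are laid out, which the card does not state.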