Committed by nielsr (HF Staff)
Commit 54f0d59 · verified · 1 Parent(s): 95da32d

Add model card for Block-R1

This PR adds a model card for the Block-R1 project. It includes essential metadata such as the base model, library name (PEFT), license, and pipeline tag. The content provides links to the original paper, the source code on GitHub, and the associated dataset on Hugging Face.

Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ base_model: GSAI-ML/LLaDA-8B-Instruct
+ pipeline_tag: text-generation
+ tags:
+ - reinforcement-learning
+ - diffusion-llm
+ - block-r1
+ ---
+
+ # Block-R1
+
+ This repository contains model checkpoints (LoRA adapters) for **Block-R1**, a benchmark for multi-domain reinforcement learning with block-based diffusion large language models (dLLMs).
+
+ ## Description
+
+ Block-R1 is designed to enhance block-based reasoning generation in diffusion LLMs. It investigates the role of block size from a domain conflict perspective during reinforcement learning (RL) post-training. The benchmark covers diverse domains including code, mathematics, puzzles, and general knowledge.
+
+ Key components include:
+ - **Block-R1-41K Dataset:** A dataset constructed with optimized training block sizes for multi-domain RL.
+ - **b1 Method:** A dynamic-size reasoning block method for dLLMs.
+ - **RL Framework:** Support for multiple RL algorithms for diffusion models such as Diffusion-GRPO, WD1, GDPO, and more.
+
+ ## Resources
+
+ - **Paper:** [Block-R1: Rethinking the Role of Block Size in Multi-domain Reinforcement Learning for Diffusion Large Language Models](https://huggingface.co/papers/2605.11726)
+ - **Code:** [GitHub Repository](https://github.com/YanJiangJerry/Block-R1)
+ - **Dataset:** [Block-R1 Dataset](https://huggingface.co/datasets/dLLM-R1/Block-R1)
+
+ ## Model Information
+
+ These weights are LoRA adapters trained on top of the [LLaDA-8B-Instruct](https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct) backbone. For detailed usage, training, and evaluation scripts, please refer to the official repository.
+
+ ## Citation
+
+ If you use this benchmark or the associated methods, please cite the following work:
+
+ ```bibtex
+ @article{jiang2026breakblock,
+ title={{Break the Block: Dynamic-size Reasoning Blocks for Diffusion Large Language Models via Monotonic Entropy Descent with Reinforcement Learning}},
+ author={Jiang, Yan and Qiu, Ruihong and Huang, Zi},
+ journal={arXiv preprint arXiv:2605.02263},
+ year={2026}
+ }
+ ```
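
The Model Information section of the card defers all usage to the official repository. As an illustration only, LoRA adapters of this kind are usually attached to their base model with PEFT, roughly as sketched here. The sketch assumes the adapter is stored in standard PEFT format at the root of a Hub repository. `ADAPTER_ID` is a hypothetical placeholder, because this commit does not name the adapter repository or any checkpoint subfolder, and the function name is likewise made up for the example. The `trust_remote_code=True` and bfloat16 settings are assumptions based on how LLaDA is commonly loaded, not something this card states. Sampling from a dLLM needs the generation scripts in the official repository and is not covered.

```python
"""Sketch: attach a Block-R1 LoRA adapter to the LLaDA-8B-Instruct base model."""

# Taken from the `base_model` field in the card's metadata.
BASE_MODEL_ID = "GSAI-ML/LLaDA-8B-Instruct"
# Hypothetical placeholder: the commit does not state the adapter repository id.
ADAPTER_ID = "your-namespace/your-block-r1-adapter"


def load_block_r1_adapter(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Return (model, tokenizer) with the LoRA adapter applied to the base model."""
    # Heavy dependencies are imported lazily so importing this module stays cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer

    # Assumption: LLaDA ships custom modeling code, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
    base_model = AutoModel.from_pretrained(
        base_model_id,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )

    # Wrap the base model with the adapter weights (standard PEFT loading path).
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `load_block_r1_adapter` with a real adapter repository id downloads the full 8B base weights, so it is best run on a machine with enough disk space and accelerator memory. Whether a given checkpoint needs an extra `subfolder` argument depends on how the adapters are laid out in the repository, which the card does not describe.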