YanJiangJerry/Block-R1-ckpts

Add model card for Block-R1

#1

by nielsr HF Staff - opened about 8 hours ago

base: refs/heads/main ← from: refs/pr/1

README.md +46 -0

	@@ -0,0 +1,46 @@

+---
+license: apache-2.0
+library_name: peft
+base_model: GSAI-ML/LLaDA-8B-Instruct
+pipeline_tag: text-generation
+tags:
+- reinforcement-learning
+- diffusion-llm
+- block-r1
+---
+# Block-R1
+This repository contains model checkpoints (LoRA adapters) for **Block-R1**, a benchmark for multi-domain reinforcement learning with block-based diffusion large language models (dLLMs).
+## Description
+Block-R1 is designed to improve block-based reasoning generation in diffusion LLMs. It investigates the role of block size during reinforcement learning (RL) post-training from a domain-conflict perspective. The benchmark covers diverse domains, including code, mathematics, puzzles, and general knowledge.
+Key components include:
+- **Block-R1-41K Dataset:** A dataset constructed with optimized training block sizes for multi-domain RL.
+- **b1 Method:** A dynamic-size reasoning block method for dLLMs.
+- **RL Framework:** Support for multiple RL algorithms for diffusion models, including Diffusion-GRPO, WD1, and GDPO.
+## Resources
+- **Paper:** [Block-R1: Rethinking the Role of Block Size in Multi-domain Reinforcement Learning for Diffusion Large Language Models](https://huggingface.co/papers/2605.11726)
+- **Code:** [GitHub Repository](https://github.com/YanJiangJerry/Block-R1)
+- **Dataset:** [Block-R1 Dataset](https://huggingface.co/datasets/dLLM-R1/Block-R1)
+## Model Information
+These weights are LoRA adapters trained on top of the [LLaDA-8B-Instruct](https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct) backbone. For detailed usage, training, and evaluation scripts, please refer to the official repository.
+## Citation
+If you use this benchmark or the associated methods, please cite the following work:
+```bibtex
+@article{jiang2026breakblock,
+  title={{Break the Block: Dynamic-size Reasoning Blocks for Diffusion Large Language Models via Monotonic Entropy Descent with Reinforcement Learning}},
+  author={Jiang, Yan and Qiu, Ruihong and Huang, Zi},
+  journal={arXiv preprint arXiv:2605.02263},
+  year={2026}
+}
+```
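
The model card states that the weights are LoRA adapters on top of `GSAI-ML/LLaDA-8B-Instruct` and sets `library_name: peft`, but it includes no loading snippet. A minimal sketch of how such an adapter might be attached with `transformers` and `peft` follows. It is not taken from the official repository: the function name `load_block_r1` and the `adapter_subfolder` parameter are hypothetical, and the sketch assumes the adapter files sit either at the repository root or in a subfolder whose name must be looked up under "Files and versions".

```python
BASE_MODEL_ID = "GSAI-ML/LLaDA-8B-Instruct"       # from the card's base_model field
ADAPTER_REPO_ID = "YanJiangJerry/Block-R1-ckpts"  # the repository this PR targets


def load_block_r1(adapter_subfolder=None):
    """Load the LLaDA backbone and attach one LoRA adapter from the adapter repo.

    adapter_subfolder is a hypothetical parameter: pass the name of a checkpoint
    folder if the adapters are not stored at the repository root.
    """
    # Heavy imports stay inside the function so importing this file is cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer

    # LLaDA ships custom modeling code, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)
    base = AutoModel.from_pretrained(
        BASE_MODEL_ID, trust_remote_code=True, torch_dtype=torch.bfloat16
    )

    # Only forward `subfolder` when the caller actually supplied one.
    extra = {"subfolder": adapter_subfolder} if adapter_subfolder else {}
    model = PeftModel.from_pretrained(base, ADAPTER_REPO_ID, **extra)
    return tokenizer, model.eval()
```

Calling `load_block_r1()` downloads the 8B backbone, so it needs substantial disk space and memory. The sketch covers loading only; the official repository remains the reference for generation, training, and evaluation.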