nielsr HF Staff committed on
Commit
48ee6e0
·
verified ·
1 Parent(s): 3723f46

Add model card for DICE-RL


Hi! I'm Niels from the Hugging Face community science team. I noticed this repository was missing a model card, so I've opened this PR to add one.

The model card includes:
- A link to the paper: [From Prior to Pro: Efficient Skill Mastery via Distribution Contractive RL Finetuning](https://huggingface.co/papers/2603.10263).
- Links to the project page and GitHub repository.
- A brief description of the DICE-RL framework.
- Instructions for evaluating the checkpoints using the official evaluation script.
- The `robotics` pipeline tag to improve discoverability.

Feel free to merge this or let me know if you'd like any adjustments!

Files changed (1)
  1. README.md +33 -0
README.md ADDED
@@ -0,0 +1,33 @@
---
pipeline_tag: robotics
---

# From Prior to Pro: Efficient Skill Mastery via Distribution Contractive RL Finetuning (DICE-RL)

This repository contains the checkpoints for **DICE-RL**, a framework that uses reinforcement learning (RL) as a "distribution contraction" operator to refine pretrained generative robot policies.

[**Project Website**](https://zhanyisun.github.io/dice.rl.2026/) | [**Paper**](https://huggingface.co/papers/2603.10263) | [**GitHub**](https://github.com/zhanyisun/dice-rl)

## Introduction
Distribution Contractive Reinforcement Learning (DICE-RL) turns a pretrained behavior prior into a high-performing "pro" policy by amplifying high-success behaviors from online feedback. The framework pretrains a diffusion- or flow-based policy for broad behavioral coverage, then finetunes it with a stable, sample-efficient residual off-policy RL framework that combines selective behavior regularization with value-guided action selection. It enables mastery of complex long-horizon manipulation skills directly from high-dimensional pixel inputs.
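To make the value-guided action selection step concrete, here is a minimal sketch in Python. The `prior_policy` and `q_value` callables are hypothetical stand-ins for the pretrained generative policy and the learned critic; this illustrates the general technique, not the official DICE-RL implementation.

```python
def value_guided_action_selection(prior_policy, q_value, state, num_candidates=16):
    # Draw candidate actions from the behavior prior, then keep the one
    # the critic scores highest. `prior_policy` and `q_value` are
    # hypothetical stand-ins, not the official DICE-RL API.
    candidates = [prior_policy(state) for _ in range(num_candidates)]
    return max(candidates, key=lambda action: q_value(state, action))

# Toy usage: a stand-in "prior" that cycles through fixed scalar
# proposals, and a critic that prefers actions near 0.5.
proposals = iter([0.1, 0.8, 0.45, 0.9])
best = value_guided_action_selection(
    prior_policy=lambda s: next(proposals),
    q_value=lambda s, a: -abs(a - 0.5),
    state=None,
    num_candidates=4,
)
print(best)  # 0.45
```

Selecting the best of several prior samples under a learned value function biases execution toward high-success behaviors without ever leaving the support of the pretrained policy, which is the "contraction" intuition described above.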

## Evaluation
To evaluate a finetuned RL checkpoint alongside its pretrained BC checkpoint, run the following command from the [official repository](https://github.com/zhanyisun/dice-rl):

```bash
python script/eval_rl_checkpoint.py --ckpt_path path_to_finetuned_checkpoint --num_eval_episodes 10 --eval_n_envs 10
```

The output reports the success rates of both checkpoints, along with the gain of the finetuned RL checkpoint over the pretrained BC checkpoint.

## Citation
If you find this work or the checkpoints useful, please consider citing:

```bibtex
@article{sun2026prior,
  title={From Prior to Pro: Efficient Skill Mastery via Distribution Contractive RL Finetuning},
  author={Sun, Zhanyi and Song, Shuran},
  journal={arXiv preprint arXiv:2603.10263},
  year={2026}
}
```