---
base_model:
- moojink/openvla-7b-oft-finetuned-libero-spatial
- moojink/openvla-7b-oft-finetuned-libero-10
- moojink/openvla-7b-oft-finetuned-libero-object
- moojink/openvla-7b-oft-finetuned-libero-goal
datasets:
- yifengzhu-hf/LIBERO-datasets
pipeline_tag: robotics
license: mit
---
# RIPT-VLA: Interactive Post-Training for Vision-Language-Action Models ([arXiv:2505.17016](https://arxiv.org/abs/2505.17016))
**Authors**: Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Krähenbühl
**Codebase**: [GitHub: RIPT-VLA](https://github.com/Ariostgx/ript-vla)
**Website**: [Project Page](https://ariostgx.github.io/ript_vla/)
> **RIPT-VLA** enables interactive post-training for any pretrained Vision-Language-Action (VLA) model using only **sparse binary success rewards**.
> With **K-rollout interaction**, **dynamic sampling**, and **leave-one-out advantage estimation**, RIPT-VLA achieves **state-of-the-art** performance in extremely low-data regimes.
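The two reward-processing ideas named above can be sketched in a few lines. This is my reading of the summary, not the authors' code: with K rollouts per task context and sparse 0/1 success rewards, each rollout's advantage is its reward minus the mean reward of the other K-1 rollouts, and dynamic sampling skips groups where every rollout succeeded or every rollout failed, since those yield zero advantage everywhere.

```python
def leave_one_out_advantages(rewards):
    """Leave-one-out advantage: A_i = r_i - mean of the other K-1 rewards."""
    k = len(rewards)
    total = sum(rewards)
    return [r - (total - r) / (k - 1) for r in rewards]

def is_informative(rewards):
    """Dynamic sampling filter: a group with all successes (or all
    failures) gives zero advantage for every rollout, so it carries
    no learning signal and can be skipped."""
    return 0 < sum(rewards) < len(rewards)

# Example: K = 4 rollouts of the same task context, binary success rewards.
rewards = [1, 0, 0, 1]
print(leave_one_out_advantages(rewards))  # successes get positive advantage
print(is_informative(rewards))            # mixed outcomes: keep this group
print(is_informative([1, 1, 1, 1]))       # all succeeded: skip
```

How these advantages feed into the policy-gradient update is detailed in the paper and codebase linked above.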
---
## Model Summary
RIPT-VLA takes a pretrained VLA model (e.g., QueST or OpenVLA-OFT) and improves its performance by fine-tuning it with reinforcement learning based on success/failure signals only; no dense rewards or value functions are required.
Supported models:
- ✅ QueST (small, efficient)
- ✅ OpenVLA-OFT (large-scale, high-capacity)
---
## Model Use
### Intended Use
- Research on post-training VLA models via RL
- Evaluation on LIBERO benchmarks (LIBERO-90, Goal, Object, Spatial, Long)
- Studying low-data reinforcement learning settings
---
## Checkpoints
All checkpoints are hosted in this repository.
### QueST Checkpoints

| Suite | SFT Checkpoint | RIPT Checkpoint |
|------------------|----------------|-----------------|
| LIBERO-90 | ✅ | ✅ |
| LIBERO-GOAL | ✅ | ✅ |
| LIBERO-LONG | ✅ | ✅ |
| LIBERO-OBJECT | ✅ | ✅ |
| LIBERO-SPATIAL | ✅ | ✅ |
Each QueST checkpoint is ~80MB.
### OpenVLA-OFT Checkpoints

| Suite | SFT Scale Head | RIPT LoRA Adaptor |
|------------------|----------------|--------------------|
| LIBERO-GOAL | ✅ | ✅ |
| LIBERO-LONG | ✅ | ✅ |
| LIBERO-OBJECT | ✅ | ✅ |
| LIBERO-SPATIAL | ✅ | ✅ |
OpenVLA-OFT scale heads are ~300MB; RIPT LoRA adaptors are ~1GB.
---
## How to Use
For usage instructions, see [INSTALL.md](https://github.com/Ariostgx/ript-vla/blob/main/INSTALL.md) in the main GitHub repo.