---
base_model:
- moojink/openvla-7b-oft-finetuned-libero-spatial
- moojink/openvla-7b-oft-finetuned-libero-10
- moojink/openvla-7b-oft-finetuned-libero-object
- moojink/openvla-7b-oft-finetuned-libero-goal
datasets:
- yifengzhu-hf/LIBERO-datasets
pipeline_tag: robotics
license: mit
---
# 💪 RIPT-VLA: Interactive Post-Training for Vision-Language-Action Models ([arXiv:2505.17016](https://arxiv.org/abs/2505.17016))
**Authors**: Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Krähenbühl
**Codebase**: [GitHub – RIPT-VLA](https://github.com/Ariostgx/ript-vla)
**Website**: [Project Page](https://ariostgx.github.io/ript_vla/)
> **RIPT-VLA** enables interactive post-training for any pretrained Vision-Language-Action (VLA) model using only **sparse binary success rewards**.
> With **K-rollout interaction**, **dynamic sampling**, and **leave-one-out advantage estimation**, RIPT-VLA achieves **state-of-the-art** performance in extremely low-data regimes.
---
## 🧠 Model Summary
RIPT-VLA takes a pretrained VLA model (e.g., QueST or OpenVLA-OFT) and improves it by fine-tuning with reinforcement learning driven only by binary success/failure signals; no dense rewards or value functions are required.
Supported models:
- ✅ QueST (small, efficient)
- ✅ OpenVLA-OFT (large-scale, high-capacity)
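The two reward-handling ideas named above, leave-one-out advantage estimation and dynamic sampling, can be sketched in a few lines. This is an illustrative sketch only, not the released implementation: the function names are made up here, and the actual training loop in the GitHub repo uses these quantities inside an RL update over K rollouts per task context.

```python
import numpy as np

def leave_one_out_advantages(rewards):
    """Leave-one-out advantage for K rollouts with binary success rewards.

    Each rollout is baselined against the mean reward of the other
    K-1 rollouts, so no learned value function is needed.
    """
    r = np.asarray(rewards, dtype=float)
    k = r.size
    baselines = (r.sum() - r) / (k - 1)  # mean reward of the other K-1 rollouts
    return r - baselines

def keep_context(rewards):
    """Dynamic-sampling filter: if every rollout of a context succeeds
    (or every one fails), all leave-one-out advantages are zero and the
    context carries no gradient signal, so it is skipped."""
    m = float(np.mean(rewards))
    return 0.0 < m < 1.0

# K = 4 rollouts of the same task context, two successes:
adv = leave_one_out_advantages([1, 0, 1, 0])
print(adv)                        # successes get positive advantage, failures negative
print(keep_context([1, 1, 1, 1])) # False: uniform outcome, no learning signal
```

Only contexts that pass the filter contribute rollouts to the policy update, which concentrates compute on contexts where the model sometimes succeeds and sometimes fails.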
---
## 🧪 Model Use
### ✅ Intended Use
- Research on post-training VLA models via RL
- Evaluation on LIBERO benchmarks (LIBERO-90, Goal, Object, Spatial, Long)
- Studying low-data reinforcement learning settings
---
## 📦 Checkpoints
All checkpoints are hosted in this repository.
### ✔️ QueST Checkpoints
| Suite | SFT Checkpoint | RIPT Checkpoint |
|------------------|----------------|-----------------|
| LIBERO-90        | ✅             | ✅              |
| LIBERO-GOAL      | ✅             | ✅              |
| LIBERO-LONG      | ✅             | ✅              |
| LIBERO-OBJECT    | ✅             | ✅              |
| LIBERO-SPATIAL   | ✅             | ✅              |
Each QueST checkpoint is ~80 MB.
### βœ”οΈ OpenVLA-OFT Checkpoints
| Suite | SFT Scale Head | RIPT LoRA Adaptor |
|------------------|----------------|--------------------|
| LIBERO-GOAL | βœ… | βœ… |
| LIBERO-LONG | βœ… | βœ… |
| LIBERO-OBJECT | βœ… | βœ… |
| LIBERO-SPATIAL | βœ… | βœ… |
OpenVLA-OFT scale heads are ~300MB; RIPT LoRA adaptors are ~1GB.
---
## 🛠️ How to Use
For usage, see [INSTALL.md](https://github.com/Ariostgx/ript-vla/blob/main/INSTALL.md) in the main GitHub repo.