---
base_model:
- moojink/openvla-7b-oft-finetuned-libero-spatial
- moojink/openvla-7b-oft-finetuned-libero-10
- moojink/openvla-7b-oft-finetuned-libero-object
- moojink/openvla-7b-oft-finetuned-libero-goal
datasets:
- yifengzhu-hf/LIBERO-datasets
pipeline_tag: robotics
license: mit
---

# 💪 RIPT-VLA: Interactive Post-Training for Vision-Language-Action Models

**Paper**: [arxiv.org/abs/2505.17016](https://arxiv.org/abs/2505.17016)
**Authors**: Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Krähenbühl
**Codebase**: [GitHub – RIPT-VLA](https://github.com/Ariostgx/ript-vla)
**Website**: [Project Page](https://ariostgx.github.io/ript_vla/)

> **RIPT-VLA** enables interactive post-training for any pretrained Vision-Language-Action (VLA) model using only **sparse binary success rewards**.
> With **K-rollout interaction**, **dynamic sampling**, and **leave-one-out advantage estimation**, RIPT-VLA achieves **state-of-the-art** performance in extremely low-data regimes.

---

## 🧠 Model Summary

RIPT-VLA takes a pretrained VLA model (e.g., QueST or OpenVLA-OFT) and improves its performance by fine-tuning it with reinforcement learning using only success/failure signals; no dense rewards or value functions are required.

Supported models:

- ✅ QueST (small, efficient)
- ✅ OpenVLA-OFT (large-scale, high-capacity)

---

## 🧪 Model Use

### ✅ Intended Use

- Research on post-training VLA models via RL
- Evaluation on LIBERO benchmarks (LIBERO-90, Goal, Object, Spatial, Long)
- Studying low-data reinforcement learning settings

---

## 📦 Checkpoints

All checkpoints are hosted in this repository.

### ✔️ QueST Checkpoints

| Suite          | SFT Checkpoint | RIPT Checkpoint |
|----------------|----------------|-----------------|
| LIBERO-90      | ✅             | ✅              |
| LIBERO-GOAL    | ✅             | ✅              |
| LIBERO-LONG    | ✅             | ✅              |
| LIBERO-OBJECT  | ✅             | ✅              |
| LIBERO-SPATIAL | ✅             | ✅              |

Each QueST checkpoint is ~80 MB.
### βœ”οΈ OpenVLA-OFT Checkpoints | Suite | SFT Scale Head | RIPT LoRA Adaptor | |------------------|----------------|--------------------| | LIBERO-GOAL | βœ… | βœ… | | LIBERO-LONG | βœ… | βœ… | | LIBERO-OBJECT | βœ… | βœ… | | LIBERO-SPATIAL | βœ… | βœ… | OpenVLA-OFT scale heads are ~300MB; RIPT LoRA adaptors are ~1GB. --- ## πŸ›  How to Use For usage, see [INSTALL.md](https://github.com/Ariostgx/ript-vla/blob/main/INSTALL.md) in the main GitHub repo.