---
base_model:
  - moojink/openvla-7b-oft-finetuned-libero-spatial
  - moojink/openvla-7b-oft-finetuned-libero-10
  - moojink/openvla-7b-oft-finetuned-libero-object
  - moojink/openvla-7b-oft-finetuned-libero-goal
datasets:
  - yifengzhu-hf/LIBERO-datasets
pipeline_tag: robotics
license: mit
---

# 💪 RIPT-VLA: Interactive Post-Training for Vision-Language-Action Models ([arXiv:2505.17016](https://arxiv.org/abs/2505.17016))

**Authors:** Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Krähenbühl
**Codebase:** GitHub – RIPT-VLA
**Website:** Project Page

RIPT-VLA enables interactive post-training for any pretrained Vision-Language-Action (VLA) model using only sparse binary success rewards.
With K-rollout interaction, dynamic sampling, and leave-one-out advantage estimation, RIPT-VLA achieves state-of-the-art performance in extremely low-data regimes.
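The leave-one-out advantage described above can be sketched in a few lines (a minimal illustration, not the released training code; the function name is ours). Each of the K rollouts is scored against the mean reward of the other K-1 rollouts from the same context:

```python
def leave_one_out_advantage(rewards):
    """Leave-one-out advantage: each rollout's binary success reward
    minus the mean reward of the other K-1 rollouts."""
    k = len(rewards)
    if k < 2:
        raise ValueError("need at least two rollouts per context")
    total = sum(rewards)
    return [r - (total - r) / (k - 1) for r in rewards]

# K = 4 rollouts of one context; 1 = success, 0 = failure
adv = leave_one_out_advantage([1, 0, 1, 1])  # successes get +1/3, the failure gets -1
```

Note that a context whose rollouts all succeed (or all fail) yields zero advantage for every rollout, which is what motivates dynamic sampling: such uninformative contexts can be skipped during optimization.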


## 🧠 Model Summary

RIPT-VLA takes a pretrained VLA model (e.g., QueST or OpenVLA-OFT) and improves its performance by fine-tuning it with reinforcement learning based on success/failure signals only β€” no dense rewards or value functions required.

Supported models:

- ✅ QueST (small, efficient)
- ✅ OpenVLA-OFT (large-scale, high-capacity)

## 🧪 Model Use

### ✅ Intended Use

- Research on post-training VLA models via RL
- Evaluation on LIBERO benchmarks (LIBERO-90, Goal, Object, Spatial, Long)
- Studying low-data reinforcement learning settings

## 📦 Checkpoints

All checkpoints are hosted in this repository.

βœ”οΈ QueST Checkpoints

| Suite | SFT Checkpoint | RIPT Checkpoint |
|---|---|---|
| LIBERO-90 | ✅ | ✅ |
| LIBERO-GOAL | ✅ | ✅ |
| LIBERO-LONG | ✅ | ✅ |
| LIBERO-OBJECT | ✅ | ✅ |
| LIBERO-SPATIAL | ✅ | ✅ |

Each QueST checkpoint is ~80 MB.

βœ”οΈ OpenVLA-OFT Checkpoints

| Suite | SFT Scale Head | RIPT LoRA Adapter |
|---|---|---|
| LIBERO-GOAL | ✅ | ✅ |
| LIBERO-LONG | ✅ | ✅ |
| LIBERO-OBJECT | ✅ | ✅ |
| LIBERO-SPATIAL | ✅ | ✅ |

OpenVLA-OFT scale heads are ~300 MB each; RIPT LoRA adapters are ~1 GB each.


## 🛠 How to Use

For installation and usage instructions, see `INSTALL.md` in the main GitHub repository.
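As a starting point, the checkpoint files in this repository can be fetched with `huggingface_hub` (a sketch: the repo id below is assumed from this model page, and the helper name is ours; the downloaded files are the SFT/RIPT checkpoints listed in the tables above):

```python
from huggingface_hub import snapshot_download


def fetch_checkpoints(repo_id: str = "tanshh97/RIPT_VLA") -> str:
    """Download the full checkpoint snapshot from the Hub.

    Returns the local directory containing the downloaded files.
    The repo_id default is an assumption; adjust it to the actual
    repository path shown on the Hub.
    """
    return snapshot_download(repo_id=repo_id)


if __name__ == "__main__":
    print(fetch_checkpoints())
```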