	---
	base_model:
	- moojink/openvla-7b-oft-finetuned-libero-spatial
	- moojink/openvla-7b-oft-finetuned-libero-10
	- moojink/openvla-7b-oft-finetuned-libero-object
	- moojink/openvla-7b-oft-finetuned-libero-goal
	datasets:
	- yifengzhu-hf/LIBERO-datasets
	pipeline_tag: robotics
	license: mit
	---

	# 💪 RIPT-VLA: Interactive Post-Training for Vision-Language-Action Models (arxiv.org/abs/2505.17016)

	- Authors: Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Krähenbühl
	- Codebase: [GitHub – RIPT-VLA](https://github.com/Ariostgx/ript-vla)
	- Website: [Project Page](https://ariostgx.github.io/ript_vla/)

	> RIPT-VLA enables interactive post-training for any pretrained Vision-Language-Action (VLA) model using only sparse binary success rewards.
	> With K-rollout interaction, dynamic sampling, and leave-one-out advantage estimation, RIPT-VLA achieves state-of-the-art performance in extremely low-data regimes.

	---

	## 🧠 Model Summary

	RIPT-VLA takes a pretrained VLA model (e.g., QueST or OpenVLA-OFT) and fine-tunes it with reinforcement learning, using only binary success/failure signals. No dense rewards or value functions are required.

	Supported models:
	- ✅ QueST (small, efficient)
	- ✅ OpenVLA-OFT (large-scale, high-capacity)
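
To make the success/failure-only training signal concrete, here is a minimal sketch of leave-one-out advantage estimation over K rollouts with binary rewards, plus a filter that drops rollout groups carrying no learning signal. It assumes the common formulation in which each rollout's baseline is the mean reward of the other K-1 rollouts for the same task context, and in which dynamic sampling means skipping groups whose rewards are all identical. The function names are hypothetical and are not taken from the RIPT-VLA codebase; see the paper for the exact method.

```python
def leave_one_out_advantages(rewards):
    """Advantage of rollout k = its reward minus the mean reward of the other K-1 rollouts."""
    k = len(rewards)
    if k < 2:
        raise ValueError("need at least two rollouts per context")
    total = sum(rewards)
    return [r - (total - r) / (k - 1) for r in rewards]


def keep_informative_groups(groups):
    """Assumed dynamic-sampling filter: drop groups where every rollout got the same reward."""
    return [g for g in groups if len(set(g)) > 1]


# Three task contexts, K = 4 rollouts each, binary success rewards (illustrative values).
groups = [[1, 0, 0, 1], [0, 0, 0, 0], [1, 1, 1, 1]]
kept = keep_informative_groups(groups)
advantages = [leave_one_out_advantages(g) for g in kept]
```

Under this formulation, groups that are all failures or all successes yield zero advantage for every rollout, which is why they can be skipped without losing gradient signal.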

	---

	## 🧪 Model Use

	### ✅ Intended Use

	- Research on post-training VLA models via RL
	- Evaluation on LIBERO benchmarks (LIBERO-90, Goal, Object, Spatial, Long)
	- Studying low-data reinforcement learning settings

	---

	## 📦 Checkpoints

	All checkpoints are hosted in this repository.

	### ✔️ QueST Checkpoints

	| Suite          | SFT Checkpoint | RIPT Checkpoint |
	|----------------|----------------|-----------------|
	| LIBERO-90      | ✅ | ✅ |
	| LIBERO-GOAL    | ✅ | ✅ |
	| LIBERO-LONG    | ✅ | ✅ |
	| LIBERO-OBJECT  | ✅ | ✅ |
	| LIBERO-SPATIAL | ✅ | ✅ |

	Each QueST checkpoint is ~80 MB.

	### ✔️ OpenVLA-OFT Checkpoints

	| Suite          | SFT Scale Head | RIPT LoRA Adaptor |
	|----------------|----------------|-------------------|
	| LIBERO-GOAL    | ✅ | ✅ |
	| LIBERO-LONG    | ✅ | ✅ |
	| LIBERO-OBJECT  | ✅ | ✅ |
	| LIBERO-SPATIAL | ✅ | ✅ |

	OpenVLA-OFT scale heads are ~300 MB; RIPT LoRA adaptors are ~1 GB.

	---

	## 🛠 How to Use

	For usage, see [INSTALL.md](https://github.com/Ariostgx/ript-vla/blob/main/INSTALL.md) in the main GitHub repo.
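
Before following those instructions, the checkpoints need to be fetched from this repository. The sketch below shows one way to do that with the `huggingface_hub` client. The file layout inside the repository is not documented here, so the helper filters paths by a keyword, which is an assumption. List the files first and adjust the keyword to match the actual names. `select_files` and `download_matching` are hypothetical helper names.

```python
REPO_ID = "tanshh97/RIPT_VLA"  # this model repository on the Hugging Face Hub


def select_files(all_files, keyword):
    """Keep repository paths that contain `keyword`, ignoring case."""
    needle = keyword.lower()
    return [path for path in all_files if needle in path.lower()]


def download_matching(keyword, local_dir="checkpoints"):
    """Download every file whose path matches `keyword`; returns the local paths."""
    # Imported lazily so select_files works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    wanted = select_files(list_repo_files(REPO_ID), keyword)
    return [
        hf_hub_download(repo_id=REPO_ID, filename=path, local_dir=local_dir)
        for path in wanted
    ]
```

For example, `download_matching("goal")` would fetch every file with "goal" in its path, assuming the LIBERO-GOAL checkpoints are named that way.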