---
base_model:
- moojink/openvla-7b-oft-finetuned-libero-spatial
- moojink/openvla-7b-oft-finetuned-libero-10
- moojink/openvla-7b-oft-finetuned-libero-object
- moojink/openvla-7b-oft-finetuned-libero-goal
datasets:
- yifengzhu-hf/LIBERO-datasets
pipeline_tag: robotics
license: mit
---

# 💪 RIPT-VLA: Interactive Post-Training for Vision-Language-Action Models

**Paper**: [arxiv.org/abs/2505.17016](https://arxiv.org/abs/2505.17016)
**Authors**: Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Krähenbühl
**Codebase**: [GitHub – RIPT-VLA](https://github.com/Ariostgx/ript-vla)
**Website**: [Project Page](https://ariostgx.github.io/ript_vla/)

> **RIPT-VLA** enables interactive post-training for any pretrained Vision-Language-Action (VLA) model using only **sparse binary success rewards**.
> With **K-rollout interaction**, **dynamic sampling**, and **leave-one-out advantage estimation**, RIPT-VLA achieves **state-of-the-art** performance in extremely low-data regimes.

---

## 🧠 Model Summary

RIPT-VLA takes a pretrained VLA model (e.g., QueST or OpenVLA-OFT) and improves its performance by fine-tuning it with reinforcement learning using only success/failure signals; no dense rewards or value functions are required.

Supported models:

- ✅ QueST (small, efficient)
- ✅ OpenVLA-OFT (large-scale, high-capacity)

---

## 🧪 Model Use

### ✅ Intended Use

- Research on post-training VLA models via RL
- Evaluation on LIBERO benchmarks (LIBERO-90, Goal, Object, Spatial, Long)
- Studying low-data reinforcement learning settings

---

## 📦 Checkpoints

All checkpoints are hosted in this repository.

### ✔️ QueST Checkpoints

| Suite          | SFT Checkpoint | RIPT Checkpoint |
|----------------|----------------|-----------------|
| LIBERO-90      | ✅             | ✅              |
| LIBERO-GOAL    | ✅             | ✅              |
| LIBERO-LONG    | ✅             | ✅              |
| LIBERO-OBJECT  | ✅             | ✅              |
| LIBERO-SPATIAL | ✅             | ✅              |

Each QueST checkpoint is ~80 MB.
### βœ”οΈ OpenVLA-OFT Checkpoints | Suite | SFT Scale Head | RIPT LoRA Adaptor | |------------------|----------------|--------------------| | LIBERO-GOAL | βœ… | βœ… | | LIBERO-LONG | βœ… | βœ… | | LIBERO-OBJECT | βœ… | βœ… | | LIBERO-SPATIAL | βœ… | βœ… | OpenVLA-OFT scale heads are ~300MB; RIPT LoRA adaptors are ~1GB. --- ## πŸ›  How to Use For usage, see [INSTALL.md](https://github.com/Ariostgx/ript-vla/blob/main/INSTALL.md) in the main GitHub repo.