RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training Paper • 2510.06710 • Published Oct 8, 2025 • 42
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model Paper • 2509.09372 • Published Sep 11, 2025 • 246
Just added example scripts for aligning models with GSPO, including a VLM example. GSPO is the latest RL alignment algorithm from @Alibaba_Qwen, and it is already supported in the latest TRL v0.20 release. Easy-to-get-started example scripts are below; go run them!
Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
GSPO paper: Group Sequence Policy Optimization (2507.18071)
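For readers new to GSPO, here is a minimal pure-Python sketch of the idea its name points to: the importance ratio is computed once per sequence (length-normalized) rather than once per token. This is not the TRL implementation and not the linked scripts. The function names, the toy log-probabilities, and the `epsilon` value are illustrative assumptions; see the paper and the scripts above for the actual settings.

```python
import math


def sequence_importance_ratio(new_logprobs, old_logprobs):
    """Length-normalized sequence-level ratio: exp(mean(new - old)).

    Both arguments are per-token log-probabilities of the same sampled
    response, under the current policy and the old (sampling) policy.
    """
    if len(new_logprobs) != len(old_logprobs) or not new_logprobs:
        raise ValueError("need two non-empty lists of equal length")
    mean_log_ratio = sum(
        n - o for n, o in zip(new_logprobs, old_logprobs)
    ) / len(new_logprobs)
    return math.exp(mean_log_ratio)


def clipped_sequence_objective(ratio, advantage, epsilon=0.2):
    """PPO-style clipped surrogate applied to one whole sequence.

    epsilon=0.2 is a placeholder for illustration only, not a
    recommended GSPO setting.
    """
    clipped = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Toy numbers: every token became more likely by 0.5 nats.
    ratio = sequence_importance_ratio([-1.0, -2.0], [-1.5, -2.5])
    print(ratio, clipped_sequence_objective(ratio, advantage=1.0))
```

Because the ratio is a single scalar per response, clipping keeps or drops the whole sequence's contribution at once, instead of clipping token by token.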