RLinf/RLinf-OpenVLAOFT-PPO-ManiSkill3-25ood
Reinforcement Learning • 8B • Updated • 47
WoVR: World Models as Reliable Simulators for Post-Training VLA Policies with RL
RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models