sergiopaniego
posted an update Aug 4, 2025
Want to learn how to align a Vision Language Model (VLM) for reasoning using GRPO and TRL? 🌋

🧑‍🍳 We've got you covered!!

NEW multimodal post-training recipe to align a VLM using TRL in @HuggingFace's Cookbook.

Go to the recipe 👉 https://huggingface.co/learn/cookbook/fine_tuning_vlm_grpo_trl

Powered by the latest TRL v0.20 release, this recipe shows how to teach Qwen2.5-VL-3B-Instruct to reason over images 🌋