sergiopaniego 
posted an update 14 days ago
Continuous batching just landed in TRL for GRPO!

At 64 generations, it runs faster and uses less VRAM than plain generate, with no vLLM needed.

Read how it works and when to reach for it below:

https://huggingface.co/blog/sergiopaniego/cb-trl-grpo
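
For intuition about why continuous batching can help when many completions of different lengths are sampled, here is a toy scheduling sketch. It is not TRL's implementation and calls no TRL API: the function names, the slot count, and the completion lengths are all made up for illustration, and it only counts decode steps, ignoring prefill, memory, and kernel-level effects.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each fixed batch runs until its longest sequence ends."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a finished sequence frees its slot for the next one."""
    queue = deque(lengths)  # completions still waiting for a slot
    active = []             # tokens left to decode for each in-flight completion
    steps = 0
    while queue or active:
        # Refill any free slots before the next decode step.
        while queue and len(active) < batch_size:
            active.append(queue.popleft())
        steps += 1
        # Every active completion emits one token; finished ones leave the batch.
        active = [remaining - 1 for remaining in active if remaining > 1]
    return steps


# Hypothetical completion lengths in tokens (each assumed to be at least 1).
lengths = [2, 8, 3, 8, 2, 2, 3, 3]
static_steps = static_batching_steps(lengths, batch_size=4)
continuous_steps = continuous_batching_steps(lengths, batch_size=4)
print(static_steps, continuous_steps)
```

In this toy setup the static schedule keeps short completions waiting on the longest one in their batch, while the continuous schedule backfills freed slots, so it needs fewer decode steps for the same work.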