md896
Fix GRPO batch/generation mismatch: auto-adjust num_generations; set launcher default to 2.
af54ccd
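
A minimal sketch of the kind of auto-adjustment the message describes. It is not the code from this commit. It assumes the mismatch arises because the effective batch size must be evenly divisible by num_generations, and that GRPO needs at least 2 generations per prompt to compute group-relative advantages. The helper name adjust_num_generations and its parameters are hypothetical.

```python
def adjust_num_generations(global_batch_size: int, requested: int, minimum: int = 2) -> int:
    """Return the largest value <= requested that evenly divides global_batch_size.

    Hypothetical helper: the name, signature, and fallback policy are
    illustrative assumptions, not taken from the commit.
    """
    if requested < minimum:
        raise ValueError(f"num_generations must be at least {minimum}, got {requested}")
    # Walk downward from the requested value until a divisor is found.
    for candidate in range(min(requested, global_batch_size), minimum - 1, -1):
        if global_batch_size % candidate == 0:
            return candidate
    raise ValueError(
        f"no num_generations in [{minimum}, {requested}] divides "
        f"batch size {global_batch_size}"
    )


if __name__ == "__main__":
    # Batch size 6 with 4 requested generations falls back to 3.
    print(adjust_num_generations(6, 4))
```

Searching downward keeps the result as close as possible to the requested value. With a default of 2, any even effective batch size passes unchanged.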