Article: Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies • prithivMLmods • Feb 17, 2025