FEST Collection: checkpoints for the paper "Boosting Reinforcement Learning with Verifiable Rewards via Randomly Selected Few-Shot Guidance" (3 items)
docker model run hf.co/kaiyan289/FEST-GRPO-1.5B-Math