GRPO checkpoint: model.safetensors + tokenizer (no optimizer) 3774aea verified abharadwaj123 committed on Apr 7