Qwen2.5-7B-Instruct-GRPO-Math / trainer_state.json

Commit History

Upload GRPO fine-tuned Qwen2.5-7B-Instruct model
bc4cc58
verified

FutureMa committed on