grpo_math / trainer_state.json

Commit History

Backup current best GRPO math checkpoint
7909e36
verified

mmm128 committed on