Qwen3_1.7B-GRPO-math-reasoning / generation_config.json

Commit History

Trained with Unsloth
0e763b8
verified

Afaf committed on