Developer Guide
===============

.. toctree::
   :maxdepth: 1

   loss_types.md
   multi_turn.md
   multi_task.md
   reward_function.md
   reward_model.md
   gym_env.md