| # Reward Functions | |
| This module contains some useful reward functions, primarily intended for use with the [`GRPOTrainer`] and [`RLOOTrainer`]. | |
| ## accuracy_reward | |
| [[autodoc]] rewards.accuracy_reward | |
| ## reasoning_accuracy_reward | |
| [[autodoc]] rewards.reasoning_accuracy_reward | |
| ## think_format_reward | |
| [[autodoc]] rewards.think_format_reward | |
| ## get_soft_overlong_punishment | |
| [[autodoc]] rewards.get_soft_overlong_punishment | |