# Reward Functions
This module provides ready-made reward functions, primarily intended for use with the [`GRPOTrainer`] and [`RLOOTrainer`].
## accuracy_reward
[[autodoc]] rewards.accuracy_reward
## reasoning_accuracy_reward
[[autodoc]] rewards.reasoning_accuracy_reward
## think_format_reward
[[autodoc]] rewards.think_format_reward
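To illustrate the kind of check a think-format reward performs, here is a minimal sketch. It is not the library's implementation: the name `sketch_think_format_reward`, the regular expression, and the assumption that each completion is a plain string are all hypothetical choices made for this example. The documented signature and behavior are in the autodoc entry above.

```python
import re

# Hypothetical pattern (an assumption for this sketch): the completion must
# start with exactly one <think>...</think> block; anything may follow it.
_THINK_PATTERN = re.compile(r"^<think>(?!.*<think>)(.*?)</think>.*$", re.DOTALL)


def sketch_think_format_reward(completions, **kwargs):
    """Return 1.0 for each completion that follows the think format, else 0.0.

    `completions` is assumed to be a list of plain strings. Extra keyword
    arguments are accepted and ignored so the function tolerates additional
    per-batch columns.
    """
    return [1.0 if _THINK_PATTERN.match(text) else 0.0 for text in completions]


if __name__ == "__main__":
    batch = [
        "<think>2 + 2 is 4.</think>The answer is 4.",
        "The answer is 4.",
    ]
    print(sketch_think_format_reward(batch))
```

The sketch returns one score per completion, in input order, so a well-formed completion and a malformed one in the same batch are scored independently.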
## get_soft_overlong_punishment
[[autodoc]] rewards.get_soft_overlong_punishment
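A soft overlong punishment can be pictured as a piecewise-linear length penalty. The sketch below assumes the formulation popularized by DAPO: no penalty up to a soft limit, a penalty that falls linearly to -1 across a buffer zone, and a flat -1 beyond the hard limit. The names `make_soft_overlong_penalty`, `max_len`, and `cache_len` are hypothetical, and this is not the library's code. Refer to the autodoc entry above for the real parameters.

```python
def make_soft_overlong_penalty(max_len, cache_len):
    """Build a length-penalty function (hypothetical sketch, not the TRL API).

    max_len:   hard limit on completion length, in tokens.
    cache_len: width of the buffer zone just below max_len in which the
               penalty ramps linearly from 0 down to -1.
    """
    if not 0 < cache_len <= max_len:
        raise ValueError("cache_len must satisfy 0 < cache_len <= max_len")

    soft_limit = max_len - cache_len

    def penalty(completion_ids, **kwargs):
        # completion_ids is assumed to be a list of token-id lists.
        rewards = []
        for ids in completion_ids:
            length = len(ids)
            if length <= soft_limit:
                rewards.append(0.0)  # short enough: no penalty
            elif length <= max_len:
                # Inside the buffer zone: linear ramp from 0 to -1.
                rewards.append((soft_limit - length) / cache_len)
            else:
                rewards.append(-1.0)  # past the hard limit: full penalty
        return rewards

    return penalty


if __name__ == "__main__":
    penalty = make_soft_overlong_penalty(max_len=10, cache_len=4)
    print(penalty([[0] * 5, [0] * 8, [0] * 12]))
```

Returning a closure keeps the length limits out of the per-batch call, so the resulting function takes only the batch data.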