File size: 445 Bytes
1fa3c6c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# Reward Functions

This module contains some useful reward functions, primarily intended for use with the [`GRPOTrainer`] and [`RLOOTrainer`].

## accuracy_reward



[[autodoc]] rewards.accuracy_reward

## reasoning_accuracy_reward

[[autodoc]] rewards.reasoning_accuracy_reward

## think_format_reward

[[autodoc]] rewards.think_format_reward

## get_soft_overlong_punishment



[[autodoc]] rewards.get_soft_overlong_punishment