# Qwen2.5 Math PRM Student (1.5B)

A custom pairwise reward-model head (2 logits) on top of `Qwen/Qwen2.5-Math-1.5B`.
The head pools the hidden state at the last occurrence of the `</think>` token.
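
The pooling rule can be illustrated with a small framework-free sketch. This is not the repo's actual implementation: the function names, the fallback to the final token when the marker is missing, and the token id `7` standing in for `</think>` are all assumptions made for illustration.

```python
def last_token_position(input_ids, token_id):
    """Return the index of the last occurrence of token_id, or None if absent."""
    for pos in range(len(input_ids) - 1, -1, -1):
        if input_ids[pos] == token_id:
            return pos
    return None


def pool_at_last_token(hidden_states, input_ids, token_id):
    """Select one hidden vector per sequence at the last occurrence of token_id.

    hidden_states: list of sequences, each a list of per-token vectors.
    input_ids: list of token-id sequences, aligned with hidden_states.
    """
    pooled = []
    for states, ids in zip(hidden_states, input_ids):
        pos = last_token_position(ids, token_id)
        if pos is None:
            # Assumption: fall back to the final token when the marker is missing.
            pos = len(ids) - 1
        pooled.append(states[pos])
    return pooled


# Toy data: THINK_END_ID is a made-up id standing in for `</think>`.
THINK_END_ID = 7
ids = [[1, 7, 3, 7, 4], [5, 6, 7, 2, 2]]
# Fake 2-dimensional "hidden states": [token id, position] per token.
hidden = [[[float(t), float(i)] for i, t in enumerate(seq)] for seq in ids]

pooled = pool_at_last_token(hidden, ids, THINK_END_ID)
print(pooled)  # one vector per sequence
```

With real tensors the same selection would typically be done with batched indexing. In practice, the id for `</think>` would come from the tokenizer rather than being hard-coded.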

## Usage
```python
from transformers import AutoTokenizer, AutoModel

repo = 'omrisap/Qwen2.5-Math-PRM-1.5B'

# trust_remote_code=True lets transformers load the custom head code shipped with the repo.
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModel.from_pretrained(repo, trust_remote_code=True)

# The model produces logits of shape (batch, 2).
```
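
As a rough sketch of how `(batch, 2)` logits might be turned into probabilities, the snippet below applies a softmax to each row. The logit values are made up, and this card does not state which of the two indices corresponds to which outcome, so treat any interpretation of the columns as an assumption to check against the model's code.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a batch of two inputs, shape (batch, 2).
logits = [[0.0, 0.0], [2.0, -1.0]]
probs = [softmax(row) for row in logits]

for row in probs:
    print([round(p, 4) for p in row])
```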