---
language:
- en
base_model:
- Qwen/Qwen3-8B
---

This is a downstream policy model, based on Qwen/Qwen3-8B and trained via PPO using GenRM-R-Align-14B.