---
language:
- en
base_model:
- Qwen/Qwen3-8B
---
Downstream policy trained using GenRM-R-Align-14B via PPO.