---
language:
  - en
base_model:
  - Qwen/Qwen3-8B
---

This model is a downstream policy fine-tuned from Qwen/Qwen3-8B with PPO, using GenRM-R-Align-14B as the reward model.
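
For readers unfamiliar with PPO, the following is a minimal sketch of its standard clipped surrogate loss in plain Python. It is a generic illustration, not the training code used for this model: the function name `ppo_clipped_loss`, the default `clip_eps=0.2`, and the assumption that per-token advantages have already been derived from reward-model scores are all hypothetical choices made for the example.

```python
import math

def ppo_clipped_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a value to minimize).

    logp_new / logp_old: per-token log-probabilities under the current
    policy and the policy that generated the samples.
    advantages: per-token advantage estimates (assumed precomputed,
    e.g. from reward-model scores and a value baseline).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps] to limit the update size.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because PPO maximizes the surrogate objective.
    return -total / len(advantages)
```

When the two policies agree on every token, the ratio is 1 and the loss reduces to the negated mean advantage; clipping only takes effect once the current policy drifts from the sampling policy.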