This repository hosts a [Star] model trained with GCPO. The reward model is HPS.
The training code is available at GCPO.