Collection: Pref-GRPO & UniGenBench (10 items)
This model was trained on the UniGenBench training dataset, with UnifiedReward-Flex as the reward model.
The inference code is available on GitHub.
For further details, please refer to the following resources:
Base model: Wan-AI/Wan2.1-T2V-14B