verl GRPO trained model at step 60
5de68a1 verified
---
base_model: Qwen/Qwen3-8B
library_name: peft
tags:
  - lora
  - peft
pipeline_tag: text-generation
---