verl GRPO trained model at step 125
---
base_model: thejaminator/qwen-hook-layer-9-posneg-merged
library_name: peft
tags:
- lora
- peft
pipeline_tag: text-generation
---
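
The metadata above describes a LoRA adapter (`library_name: peft`) for the base model `thejaminator/qwen-hook-layer-9-posneg-merged`. A minimal sketch of how such an adapter might be loaded for text generation follows. It rests on several assumptions: `ADAPTER_ID` is a hypothetical placeholder because this card does not state the adapter's repo id, the helper names `load_adapter_model` and `generate` are illustrative only, and the tokenizer is assumed to ship with the base model repo.

```python
# Base checkpoint taken from the `base_model` field of this card.
BASE_MODEL_ID = "thejaminator/qwen-hook-layer-9-posneg-merged"

# Hypothetical placeholder: the adapter's repo id or local path is not given
# in this card, so replace this value before running.
ADAPTER_ID = "path/to/this-adapter"


def load_adapter_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter weights on top of it."""
    # Imported lazily so this module can be imported without the heavy
    # dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the tokenizer is published alongside the base model.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Run one generation call and return the decoded text (prompt included)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_adapter_model()` downloads the base weights, so it needs network access and enough memory for the base model. The returned pair can then be passed to `generate(model, tokenizer, "your prompt")`.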