This is a collection of humor-generation models based on Qwen3-8B. It includes an SFT adapter, a reward model, and a GRPO adapter.