lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step50 Text Generation • 196k • Updated Apr 12 • 12 • 1
lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step100 Text Generation • 196k • Updated about 1 month ago • 43
lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-step50 Text Generation • 196k • Updated about 1 month ago • 42
lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-resume-step100 Text Generation • 196k • Updated 30 days ago • 194
lihaoxin2020/qwen3-4b-refiner-gpt54-instance-rubric-gpt54-grpo-step50 Text Generation • 196k • Updated 24 days ago • 318