phuongntc/qwen3_06b_grpo_noSFT_multievalsumviet2_nopenalty • Text Generation • Updated 13 days ago • 16
phuongntc/qwen3_0.6b_ppo_penalty_multievalsumviet2_fix1000 • Text Generation • 0.6B • Updated 16 days ago • 13
phuongntc/qwen3_0.6b_ppo_penalty_multievalsumviet2_final • Text Generation • 0.6B • Updated 17 days ago • 20