phuongntc/qwen3_0.6b_ppo_penalty_multievalsumviet2_fix1000 • Text Generation • 0.6B • Updated Jan 16 • 4
phuongntc/llama32_1b_ppo_multievalsumviet2_penalty_improved • Text Generation • Updated Dec 26, 2025 • 1
phuongntc/llama32_1b_ppo_multievalsumviet2_penaltyfull • Text Generation • 1B • Updated Dec 25, 2025 • 2