violetxi/clbench-exploitable-poker-wm-sft-opponent-dynamics-thinking • Updated May 23 • 49 • 15
violetxi/clbench-exploitable-poker-wm_summar-gemini-flash-3-1-lite • Updated May 18 • 2.74k • 10
violetxi/single-turn-eval-int_qwen3-4b_distill_teacher_reverse_kl_lr1e-7-n32 • Updated May 15 • 566 • 11
violetxi/single-turn-eval-meta_feedback_qwen3-4b_step2_gpt-5-nano_gepa-n32 • Updated May 5 • 1.01k • 22
violetxi/single-turn-eval-meta_feedback_qwen3-4b_step2_gpt-5.4_gepa-n32 • Updated May 5 • 1.01k • 13
violetxi/stage1_proof-qwen3-4b-grpo-imoproofbench-summary-reasoning-graded • Updated Apr 30 • 960 • 7
violetxi/stage1_proof-qwen3-4b-dense-process-rubric-imoproofbench-summary-graded • Updated Apr 30 • 960 • 6