clarify-rl / training

Commit History

Run 6 results + training fixes + all plots regenerated
aae07d0

Anurag Agarwal committed on

plots: add training progression + diagnostics, drop W&B links
099bec8
verified

agarwalanu3103 committed on

rewrite training notebook with cleaner cell-by-cell structure
a45e7e7

Anurag Agarwal committed on

Add training/train_grpo.ipynb — GRPO training notebook (TRL + vLLM + ClarifyEnv)
5e8f794

Anurag Agarwal committed on