Semantic run names: Probe/Drift/Anchor/Restrain/Champion + regen all plots 84fbeda Anurag Agarwal committed on Apr 26
plots: add training progression + diagnostics, drop W&B links 099bec8 agarwalanu3103 committed on Apr 26
eval: enforce one-tool-call response format on every turn a22fcfd agarwalanu3103 committed on Apr 25
feat: add run_eval.py to Space (needed by eval_with_vllm.py for trained-model evals) 6473a24 agarwalanu3103 committed on Apr 25