docs: strengthen Chess960 thesis — why it's the right self-improvement benchmark 7b15ef1 qtzx06 committed on Mar 8
docs: expand architecture doc with full search stack and training pipeline details 7109aa9 qtzx06 committed on Mar 8
docs: rewrite demo script with concrete before/after metrics and full results 5a8e942 qtzx06 committed on Mar 8
feat: add thesis section + Codex agent swarm narrative + 9B scaling probe + rewrite process log 4ed9a84 qtzx06 committed on Mar 8
docs: sharpen demo script with concrete Elo gains and before/after metrics 219232e qtzx06 committed on Mar 8
docs: add GRPO deep-dive — environment-grounded RL over bounded tool use 55b59f4 qtzx06 committed on Mar 8
feat: rewrite training to use TRL rollout_func + OpenEnv multi-turn pattern 93f58fd qtzx06 committed on Mar 8