feat: reward verifier alignment, notebook hardening, model name fix cdc237b CreativeEngineer Claude Opus 4.6 committed on 6 days ago
fix: align llm_agent auto-submit and reward handling with notebook 5f2da5f CreativeEngineer Claude Opus 4.6 committed on 7 days ago
fix: robust JSON array extraction and notebook GRPO fixes e826e11 CreativeEngineer Claude Opus 4.6 committed on 7 days ago
feat: add llm rollout contract and simplify ppo smoke ebd0ff3 CreativeEngineer committed on 7 days ago
refactor: align p1 runtime contract and baseline reporting 6deaccc CreativeEngineer committed on 7 days ago