feat: reward verifier alignment, notebook hardening, model name fix cdc237b CreativeEngineer Claude Opus 4.6 committed 9 days ago
feat: add llm rollout contract and simplify ppo smoke ebd0ff3 CreativeEngineer committed 10 days ago