nervousystem-env / scripts /evaluate.py

Commit History

docs(evidence): add multi-agent eval, long-horizon trace, and constrained eval logs
9731ebe

vx7sh commited on

feat(eval): add benchmark harness and reward integrity artifacts
ee2f27b

vx7sh commited on