agentbench/tests/evaluation/fixtures/rubrics_valid_binary.md

Commit History

feat(judges): Rubric markdown loader with aggressive validation
7b72b2c

Nomearod and Claude Opus 4.7 (1M context) committed on