agentbench/scripts/_dev/probe_4a_gpt4o_full.py

Commit History

calibrate(jury): 4A characterizes v1.1.1 residual as model-class-specific
504a35c

Nomearod Claude Opus 4.7 (1M context) committed on