israaaML
Claude Sonnet 4.6
v3: benchmark results, final report, agent/eval improvements, smoke test fixes
b3fc5ee