Uhh did Opus 4.8 cheat on PostTrainBench??
It found an API key in the PostTrainBench environment that let it generate synthetic training data without using GPU hours, boosting the base model's score by 0.4913.
Source: https://posttrainbench.com/traces/run.html?id=claude_non_api_max_claude-opus-4-8_10h_run1__healthbench_Qwen_Qwen3-4B-Base_17315102#tab=trace