# Opus token and wall-time performance deep dive

Date: 2026-05-30

Scope: every report row whose model name contains `opus`: `opus47` from the publish suite, plus `opus46`, `opus48`, and the two Opus 4.8 task-budget experiments from `new-model-day`.

## Headline

- Fastest wall time: `opus?task_budget=50000` at 325.2s, though it generated only 4 of 5 artifacts successfully.
- Lowest total tokens: `opus?task_budget=50000` at 516,007 tokens.
- Best 100-quality Opus row by quality-efficiency: `opus47` (872.9s; 2,041,367 tokens).
- `task_budget=50000` cut wall time by 66.8% and total tokens by 78.1% vs `opus48`, but quality fell from 100.0 to 87.0 and one artifact failed generation.
- `task_budget=200000` was slower (+21.5%) and used more tokens (+56.4%) than plain `opus48`, with lower quality (97.4 vs 100.0).

## Overall Opus rows

| model | suite | quality | gen | wall time | total tokens | input | output | effective input | cache % | tok/s | out tok/s | turns | tools | det | VLM | QE rank |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| opus47 | publish | 100.0 | 5/5 | 872.9s | 2,041,367 | 1,980,822 | 60,545 | 228,388 | 88.5% | 2338.7 | 69.4 | 67 | 83 | 0 | 0F/0W | 2 |
| opus46 | new-model-day | 98.8 | 5/5 | 997.5s | 1,851,922 | 1,793,023 | 58,899 | 191,646 | 89.3% | 1856.5 | 59.0 | 73 | 98 | 2 | 0F/0W | 7 |
| opus48 | new-model-day | 100.0 | 5/5 | 979.4s | 2,354,120 | 2,286,911 | 67,209 | 267,975 | 88.3% | 2403.7 | 68.6 | 71 | 91 | 0 | 0F/0W | 6 |
| opus?task_budget=50000 | new-model-day | 87.0 | 4/5 | 325.2s | 516,007 | 488,908 | 27,099 | 184,283 | 62.3% | 1586.7 | 83.3 | 28 | 29 | 4 | 1F/1W | 8 |
| opus?task_budget=200000 | new-model-day | 97.4 | 5/5 | 1,189.6s | 3,680,965 | 3,584,777 | 96,188 | 311,493 | 91.3% | 3094.2 | 80.9 | 88 | 105 | 4 | 0F/2W | 10 |
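
The derived columns in this table appear to follow from the raw counts. The sketch below is an assumption inferred from the rows, not a documented definition: it treats total tokens as input plus output, cache % as the share of input tokens not counted as effective input, and the two throughput columns as token counts divided by wall time. The function name `derived_metrics` and its parameter names are hypothetical.

```python
def derived_metrics(wall_s, input_tokens, output_tokens, effective_input):
    """Recompute the derived columns from raw counts.

    Assumed formulas (inferred from the table rows, not from a spec):
      total tokens = input + output
      cache %      = (1 - effective input / input) * 100
      tok/s        = total tokens / wall time
      out tok/s    = output tokens / wall time
    """
    total = input_tokens + output_tokens
    return {
        "total_tokens": total,
        "cache_pct": round((1 - effective_input / input_tokens) * 100, 1),
        "tok_per_s": round(total / wall_s, 1),
        "out_tok_per_s": round(output_tokens / wall_s, 1),
    }


# Raw values copied from the opus47 and task_budget=50000 rows above.
opus47 = derived_metrics(872.9, 1_980_822, 60_545, 228_388)
budget_50k = derived_metrics(325.2, 488_908, 27_099, 184_283)

print(opus47)
print(budget_50k)
```

Recomputed throughput can differ from the table in the last digit (2338.6 vs 2338.7 tok/s for `opus47`), presumably because the wall time shown in the table is itself rounded.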

## Relative to plain `opus48`

| model | wall-time ratio | token ratio | output-token ratio | quality delta | notes |
| --- | ---: | ---: | ---: | ---: | --- |
| opus47 | 0.89× | 0.87× | 0.90× | +0.0 | clean |
| opus46 | 1.02× | 0.79× | 0.88× | -1.2 | 2 det failures |
| opus48 | 1.00× | 1.00× | 1.00× | +0.0 | clean |
| opus?task_budget=50000 | 0.33× | 0.22× | 0.40× | -13.0 | 4/5 gen, 4 det failures, 1F/1W VLM |
| opus?task_budget=200000 | 1.21× | 1.56× | 1.43× | -2.6 | 4 det failures, 0F/2W VLM |
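
The ratios and quality deltas above can be reproduced from the overall rows. The following is a minimal sketch, assuming each ratio is the row's value divided by the plain `opus48` value and the quality delta is a simple difference; the helper name `relative_to` and the tuple layout are hypothetical.

```python
# Overall-row values copied from the "Overall Opus rows" table:
# (wall time in seconds, total tokens, output tokens, quality).
ROWS = {
    "opus47": (872.9, 2_041_367, 60_545, 100.0),
    "opus46": (997.5, 1_851_922, 58_899, 98.8),
    "opus48": (979.4, 2_354_120, 67_209, 100.0),
    "opus?task_budget=50000": (325.2, 516_007, 27_099, 87.0),
    "opus?task_budget=200000": (1189.6, 3_680_965, 96_188, 97.4),
}


def relative_to(baseline_name, rows):
    """Express every row as ratios (and a quality delta) against one baseline row."""
    base_wall, base_tokens, base_out, base_quality = rows[baseline_name]
    table = {}
    for name, (wall, tokens, out_tokens, quality) in rows.items():
        table[name] = {
            "wall_ratio": round(wall / base_wall, 2),
            "token_ratio": round(tokens / base_tokens, 2),
            "output_ratio": round(out_tokens / base_out, 2),
            "quality_delta": round(quality - base_quality, 1),
        }
    return table


relative = relative_to("opus48", ROWS)
for name, r in relative.items():
    print(
        f"{name}: {r['wall_ratio']:.2f}x wall, {r['token_ratio']:.2f}x tokens, "
        f"{r['output_ratio']:.2f}x output, {r['quality_delta']:+.1f} quality"
    )
```

A ratio converts to a percent change as `(ratio - 1) * 100`, so a wall-time ratio near 0.33× corresponds to a reduction of roughly two thirds.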

## Per-artifact wall time and tokens

### `benchmark-comparison`

| model | ok | quality | wall time | total tokens | input | output | effective input | tok/s | turns | tools | det | VLM |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| opus47 | True | 100.0 | 150.0s | 397,948 | 388,331 | 9,617 | 26,486 | 2652.2 | 19 | 22 | 0 | 0F/0W |
| opus46 | True | 100.0 | 272.0s | 371,021 | 351,900 | 19,121 | 56,694 | 1364.3 | 14 | 18 | 0 | 0F/0W |
| opus48 | True | 100.0 | 258.3s | 704,433 | 685,790 | 18,643 | 55,911 | 2727.1 | 21 | 26 | 0 | 0F/0W |
| opus?task_budget=50000 | True | 100.0 | 76.8s | 111,775 | 105,163 | 6,612 | 20,498 | 1454.5 | 7 | 7 | 0 | 0F/0W |
| opus?task_budget=200000 | True | 100.0 | 281.1s | 1,036,764 | 1,012,407 | 24,357 | 100,128 | 3688.1 | 22 | 28 | 0 | 0F/0W |

### `code-review`

| model | ok | quality | wall time | total tokens | input | output | effective input | tok/s | turns | tools | det | VLM |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| opus47 | True | 100.0 | 268.4s | 588,373 | 571,314 | 17,059 | 73,388 | 2192.5 | 14 | 18 | 0 | 0F/0W |
| opus46 | True | 100.0 | 237.0s | 540,085 | 528,342 | 11,743 | 40,896 | 2278.4 | 17 | 29 | 0 | 0F/0W |
| opus48 | True | 100.0 | 197.0s | 474,233 | 459,662 | 14,571 | 72,302 | 2406.7 | 12 | 15 | 0 | 0F/0W |
| opus?task_budget=50000 | True | 100.0 | 63.3s | 109,587 | 104,544 | 5,043 | 56,128 | 1730.6 | 4 | 5 | 0 | 0F/0W |
| opus?task_budget=200000 | True | 87.0 | 176.7s | 425,417 | 411,266 | 14,151 | 58,001 | 2407.0 | 11 | 13 | 4 | 0F/2W |

### `implementation-plan`

| model | ok | quality | wall time | total tokens | input | output | effective input | tok/s | turns | tools | det | VLM |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| opus47 | True | 100.0 | 141.6s | 215,600 | 206,186 | 9,414 | 22,107 | 1522.3 | 11 | 12 | 0 | 0F/0W |
| opus46 | True | 94.0 | 130.3s | 167,161 | 159,833 | 7,328 | 22,835 | 1283.2 | 11 | 12 | 2 | 0F/0W |
| opus48 | True | 100.0 | 196.4s | 264,333 | 252,260 | 12,073 | 39,929 | 1345.9 | 12 | 13 | 0 | 0F/0W |
| opus?task_budget=50000 | True | 100.0 | 62.2s | 111,821 | 106,572 | 5,249 | 22,221 | 1797.7 | 7 | 7 | 0 | 0F/0W |
| opus?task_budget=200000 | True | 100.0 | 132.8s | 343,763 | 332,156 | 11,607 | 42,016 | 2589.2 | 16 | 17 | 0 | 0F/0W |

### `module-explainer`

| model | ok | quality | wall time | total tokens | input | output | effective input | tok/s | turns | tools | det | VLM |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| opus47 | True | 100.0 | 206.7s | 669,243 | 653,611 | 15,632 | 85,438 | 3237.0 | 13 | 19 | 0 | 0F/0W |
| opus46 | True | 100.0 | 192.8s | 417,791 | 406,724 | 11,067 | 44,687 | 2167.1 | 11 | 18 | 0 | 0F/0W |
| opus48 | True | 100.0 | 218.6s | 633,137 | 618,129 | 15,008 | 72,109 | 2896.4 | 12 | 21 | 0 | 0F/0W |
| opus?task_budget=50000 | False | 35.0 | 56.1s | 87,378 | 82,544 | 4,834 | 68,845 | 1558.1 | 3 | 3 | 4 | 1F/1W |
| opus?task_budget=200000 | True | 100.0 | 460.5s | 1,534,617 | 1,500,017 | 34,600 | 84,706 | 3332.5 | 23 | 30 | 0 | 0F/0W |

### `numeric-data`

| model | ok | quality | wall time | total tokens | input | output | effective input | tok/s | turns | tools | det | VLM |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| opus47 | True | 100.0 | 106.1s | 170,203 | 161,380 | 8,823 | 20,969 | 1604.4 | 10 | 12 | 0 | 0F/0W |
| opus46 | True | 100.0 | 165.4s | 355,864 | 346,224 | 9,640 | 26,534 | 2150.9 | 20 | 21 | 0 | 0F/0W |
| opus48 | True | 100.0 | 109.0s | 277,984 | 271,070 | 6,914 | 27,724 | 2549.2 | 14 | 16 | 0 | 0F/0W |
| opus?task_budget=50000 | True | 100.0 | 66.8s | 95,446 | 90,085 | 5,361 | 16,591 | 1429.6 | 7 | 7 | 0 | 0F/0W |
| opus?task_budget=200000 | True | 100.0 | 138.5s | 340,404 | 328,931 | 11,473 | 26,642 | 2457.6 | 16 | 17 | 0 | 0F/0W |
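
As a consistency check, the per-artifact rows can be rolled up and compared with the overall rows. The sketch below assumes the overall total tokens is the sum of the five artifact totals and the overall quality is their unweighted mean; both assumptions are inferred from the numbers in this report rather than stated anywhere in it, and the names `PER_ARTIFACT`, `OVERALL`, and `roll_up` are hypothetical.

```python
# (total tokens, quality) per artifact, copied from the tables above in the order:
# benchmark-comparison, code-review, implementation-plan, module-explainer, numeric-data.
PER_ARTIFACT = {
    "opus47": [(397_948, 100.0), (588_373, 100.0), (215_600, 100.0),
               (669_243, 100.0), (170_203, 100.0)],
    "opus46": [(371_021, 100.0), (540_085, 100.0), (167_161, 94.0),
               (417_791, 100.0), (355_864, 100.0)],
    "opus48": [(704_433, 100.0), (474_233, 100.0), (264_333, 100.0),
               (633_137, 100.0), (277_984, 100.0)],
    "opus?task_budget=50000": [(111_775, 100.0), (109_587, 100.0), (111_821, 100.0),
                               (87_378, 35.0), (95_446, 100.0)],
    "opus?task_budget=200000": [(1_036_764, 100.0), (425_417, 87.0), (343_763, 100.0),
                                (1_534_617, 100.0), (340_404, 100.0)],
}

# (total tokens, quality) from the "Overall Opus rows" table.
OVERALL = {
    "opus47": (2_041_367, 100.0),
    "opus46": (1_851_922, 98.8),
    "opus48": (2_354_120, 100.0),
    "opus?task_budget=50000": (516_007, 87.0),
    "opus?task_budget=200000": (3_680_965, 97.4),
}


def roll_up(artifacts):
    """Sum tokens and average quality across one model's artifact rows."""
    total_tokens = sum(tokens for tokens, _ in artifacts)
    mean_quality = round(sum(q for _, q in artifacts) / len(artifacts), 1)
    return total_tokens, mean_quality


mismatches = []
for name, artifacts in PER_ARTIFACT.items():
    tokens, quality = roll_up(artifacts)
    expected_tokens, expected_quality = OVERALL[name]
    if tokens != expected_tokens or abs(quality - expected_quality) > 1e-9:
        mismatches.append(name)

print("mismatched models:", mismatches)
```

Wall times roll up the same way, but because each per-artifact value is rounded to 0.1s, a comparison against the overall wall time would need a small tolerance instead of exact equality.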

## Files

- CSV summary: `analysis/deep-dives/opus-performance-summary.csv`
- CSV by artifact: `analysis/deep-dives/opus-performance-by-artifact.csv`
- Source metrics: `analysis/data/model-summary.json`, `analysis/data/artifact-summary.json`