deprecated/ablations_t0.95/ + deprecated/ablations_t0.95_source/
These were the canonical ablation evaluations until 2026-05-05, when the user
re-ran all 6 ablation cells at the sweep-optimal RL temperature T=1.15
(rather than T=0.95). The T=1.15 results are now canonical at
evaluation/eight_prompt/ablations/.
What's archived here
- ablations_t0.95/: server-side copy of the T=0.95 cell contents that previously occupied evaluation/eight_prompt/ablations/ in this bucket (mostly bucket-staged copies of the McClain/plasmidgpt-rl-* model evaluations at T=0.95)
- ablations_t0.95_source/: the local strict-QC re-run of all 6 ablations at T=0.95 performed on g6-big (/opt/dlami/nvme/strict_qc_ablations/); per-cell outputs.csv + qc/ artifacts + metadata.json. Manifest at ablations_t0.95_source/manifest.json carries seed, sha256, and pass counts.
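Because the manifest records a sha256, a downloaded copy of the archive can in principle be checked against it. The following is only a minimal sketch: the manifest's schema is not documented here, so the top-level `files` list with `path` and `sha256` keys, and the `verify_manifest` helper itself, are assumptions for illustration that would need adapting to the real keys.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path):
    """Return the listed paths that are missing or whose digest differs.

    ASSUMPTION: the manifest has a top-level "files" list of
    {"path": <path relative to the manifest>, "sha256": <hex digest>}
    entries. The real manifest.json may be laid out differently.
    """
    manifest_path = Path(manifest_path)
    root = manifest_path.parent
    entries = json.loads(manifest_path.read_text())["files"]
    mismatches = []
    for entry in entries:
        target = root / entry["path"]
        if not target.is_file() or sha256_of(target) != entry["sha256"]:
            mismatches.append(entry["path"])
    return mismatches
```

An empty list from `verify_manifest("ablations_t0.95_source/manifest.json")` would mean every listed file is present and matches its recorded digest, under the schema assumed above.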
Numerical comparison
| Ablation | T=0.95 | T=1.15 (canonical) |
|---|---|---|
| full_reward | 66.88% | 78.35% |
| no_repeat_penalty | 72.17% | 75.15% |
| no_length_prior | 71.38% | 72.15% |
| no_cassette_bonus | 19.80% | 44.52% |
| length_only | 34.73% | 37.90% |
| cds_only | 2.40% | 1.73% |
Removing the cassette bonus remains the largest single-component drop relative to full_reward in both settings (47.1 pp at T=0.95, 33.8 pp at T=1.15). All ablations except cds_only improve at the higher temperature, consistent with the rejection-sampling sweep showing T=1.15 as the GRPO peak.
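The percentage-point figures quoted above can be recomputed from the table. The sketch below copies the table values verbatim. The helper names are illustrative, and treating only the `no_*` cells as single-component removals is an assumption inferred from the cell names.

```python
# Values copied from the comparison table (percent): (T=0.95, T=1.15).
RESULTS = {
    "full_reward": (66.88, 78.35),
    "no_repeat_penalty": (72.17, 75.15),
    "no_length_prior": (71.38, 72.15),
    "no_cassette_bonus": (19.80, 44.52),
    "length_only": (34.73, 37.90),
    "cds_only": (2.40, 1.73),
}


def drop_vs_full(results, column):
    """Percentage-point drop from full_reward for each `no_*` ablation.

    ASSUMPTION: cells named `no_*` each remove exactly one reward component.
    A negative value means the ablation scored above full_reward.
    """
    full = results["full_reward"][column]
    return {
        name: round(full - values[column], 2)
        for name, values in results.items()
        if name.startswith("no_")
    }


def temperature_gain(results):
    """Change in percentage points when moving from T=0.95 to T=1.15."""
    return {name: round(hi - lo, 2) for name, (lo, hi) in results.items()}


drops_t095 = drop_vs_full(RESULTS, 0)
drops_t115 = drop_vs_full(RESULTS, 1)
gains = temperature_gain(RESULTS)

# The ablation with the largest drop in each setting, and any cell that
# got worse at the higher temperature.
largest_t095 = max(drops_t095, key=drops_t095.get)
largest_t115 = max(drops_t115, key=drops_t115.get)
regressions = [name for name, delta in gains.items() if delta < 0]
```

Rounded to one decimal, the `no_cassette_bonus` entries in `drops_t095` and `drops_t115` reproduce the 47.1 pp and 33.8 pp figures, and `regressions` contains only `cds_only`.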