deprecated/ablations_t0.95/ + deprecated/ablations_t0.95_source/

These were the canonical ablation evaluations until 2026-05-05, when the user re-ran all 6 ablation cells at the sweep-optimal RL temperature T=1.15 (rather than T=0.95). The T=1.15 results are now canonical at evaluation/eight_prompt/ablations/.

What's archived here

  • ablations_t0.95/: server-side copy of the T=0.95 cell contents that previously occupied evaluation/eight_prompt/ablations/ in this bucket (mostly bucket-staged copies of the McClain/plasmidgpt-rl-* model evaluations at T=0.95).
  • ablations_t0.95_source/: the local strict-QC re-run of all 6 ablations at T=0.95, performed on g6-big (/opt/dlami/nvme/strict_qc_ablations/). Each cell contains outputs.csv, qc/ artifacts, and metadata.json. The manifest at ablations_t0.95_source/manifest.json records the seed, sha256, and pass counts.
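
Because the manifest records a sha256, archived files can be checked against it after download. The sketch below is a minimal illustration only: the manifest's actual schema is not described here, so the `expected` mapping (relative path to hex digest) and the helper names `sha256_of_file` and `find_mismatches` are hypothetical and would need adapting to the real layout of manifest.json.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected, root):
    """List relative paths whose digest differs from the expected one.

    `expected` is a HYPOTHETICAL mapping {relative_path: hex_digest};
    the real manifest.json may be structured differently.
    Missing files are reported as mismatches too.
    """
    root = Path(root)
    bad = []
    for rel_path, want in expected.items():
        target = root / rel_path
        if not target.is_file() or sha256_of_file(target) != want.lower():
            bad.append(rel_path)
    return bad
```

An empty list from `find_mismatches` would indicate that every listed file matches its recorded digest.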

Numerical comparison

Ablation             T=0.95    T=1.15 (canonical)
full_reward          66.88%    78.35%
no_repeat_penalty    72.17%    75.15%
no_length_prior      71.38%    72.15%
no_cassette_bonus    19.80%    44.52%
length_only          34.73%    37.90%
cds_only              2.40%     1.73%

Cassette-bonus removal remains the largest single-component drop in both settings (47.1pp at T=0.95, 33.8pp at T=1.15). All ablations except cds_only improve at the higher temperature, consistent with the rejection-sampling sweep showing T=1.15 as the GRPO peak.
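
The percentage-point figures quoted above can be reproduced from the table with a few lines of Python. This is only a sketch of the arithmetic: the numbers are copied from the table, and treating no_repeat_penalty, no_length_prior, and no_cassette_bonus as the three single-component ablations is an assumption based on their names.

```python
# Percentages copied from the comparison table: (T=0.95, T=1.15).
TABLE = {
    "full_reward":       (66.88, 78.35),
    "no_repeat_penalty": (72.17, 75.15),
    "no_length_prior":   (71.38, 72.15),
    "no_cassette_bonus": (19.80, 44.52),
    "length_only":       (34.73, 37.90),
    "cds_only":          (2.40, 1.73),
}
T095, T115 = 0, 1  # column indices into the tuples above

# Assumption: each of these cells removes exactly one reward component.
SINGLE_COMPONENT = ("no_repeat_penalty", "no_length_prior", "no_cassette_bonus")


def drop_vs_full(name, col):
    """Percentage-point drop of an ablation relative to full_reward."""
    return round(TABLE["full_reward"][col] - TABLE[name][col], 2)


def largest_single_drop(col):
    """Name of the single-component ablation with the largest drop."""
    return max(SINGLE_COMPONENT, key=lambda n: drop_vs_full(n, col))


def improved_at_higher_temperature():
    """Ablations whose T=1.15 value exceeds their T=0.95 value."""
    return [name for name, (low, high) in TABLE.items() if high > low]


if __name__ == "__main__":
    for label, col in (("T=0.95", T095), ("T=1.15", T115)):
        name = largest_single_drop(col)
        print(f"{label}: {name} drop = {drop_vs_full(name, col):.1f}pp")
    print("improved:", improved_at_higher_temperature())
```

Run as a script, this reports no_cassette_bonus as the largest single-component drop at both temperatures and lists every ablation except cds_only as improved.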
