deprecated/early_sft_checkpoint/

Files in this folder were generated with the broken SFT checkpoint (model.safetensors) that was the default in UCL-CSSB/PlasmidGPT-SFT until the cleanup commit daeaabf on 2026-05-02. Before that commit the HF repo contained two safetensors files, and AutoModelForCausalLM.from_pretrained loaded by default the one that was a duplicate of the Base model. The corrected checkpoint generates substantively different outputs (see evaluation/eight_prompt/SFT/ for the post-fix data).
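The failure mode described above (two weight files in one repo, one of them a clone of Base) can be guarded against with a small pre-flight check. The following is a minimal sketch, not the check that was actually used for the cleanup: the helper names are hypothetical, the shard-name pattern is an assumption about the usual sharded-checkpoint naming, and a byte-level hash comparison only catches exact clones (it would miss a duplicate whose file metadata differs).

```python
import hashlib
import re
from pathlib import Path

# Assumed naming for sharded checkpoints (e.g. model-00001-of-00002.safetensors).
# Shards legitimately come in groups, so they are not counted as ambiguous.
_SHARD_RE = re.compile(r"-\d{5}-of-\d{5}\.safetensors$")


def standalone_weight_files(filenames):
    """Return the non-sharded .safetensors files from a repo file listing.

    More than one entry means a loader has to pick one by default,
    which is the situation this folder's outputs were produced under.
    """
    return sorted(
        name for name in filenames
        if name.endswith(".safetensors") and not _SHARD_RE.search(name)
    )


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_byte_identical(path_a, path_b):
    """True when two local files have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

In use, one would pass a repo's file listing to standalone_weight_files and treat any result longer than one entry as a reason to inspect the repo before generating, then run is_byte_identical on local copies of the SFT and Base weight files to rule out an exact clone.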

Original locations on the bucket (before move):

  • evaluation/temperature_sweep/SFT_t0.95/ — T=0.95 SFT generations + qc
  • rejection_sampling_v2/direct/SFT/ — original Table 4 direct SFT cell (was reported as 7.15%)
  • rejection_sampling_v2/best_of_16/SFT/ — original Table 4 best-of-16 SFT cell (was reported as 32.4%)

The Base and GRPO cells in rejection_sampling_v2/ are unaffected and remain in their original locations, so Table 4 stays reproducible under the original paper protocol.

For the camera-ready, the canonical SFT analyses live under:

  • evaluation/eight_prompt/SFT/ — main 8-prompt eval, T=1.0, pass rate 10.975%
  • mfe/SFT_real/ — MFE on corrected-checkpoint generations
  • continuation_benchmark/eval_set_656/ — held-out continuation/surprisal
  • rejection_v3/SFT/ — 8-prompt rejection sampling with corrected SFT (10.87%)
