deprecated/early_sft_checkpoint/
Files in this folder were generated using the broken SFT model.safetensors checkpoint
that was the default in UCL-CSSB/PlasmidGPT-SFT until the 2026-05-02 cleanup commit
daeaabf. The HF repo had two safetensors files, and AutoModelForCausalLM.from_pretrained
defaulted to a Base-clone duplicate. The corrected checkpoint generates substantively
different outputs (see evaluation/eight_prompt/SFT/ for the post-fix data).
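One way to catch this kind of mix-up is to check whether any weights file in a downloaded SFT checkpoint directory is byte-identical to a Base weights file. The following is a minimal sketch, not the procedure used for the cleanup commit: it assumes both checkpoints have already been downloaded to local directories, and the function names `sha256_of` and `find_duplicate_weights` are hypothetical. A byte-level comparison only catches exact copies; a clone saved with different metadata would need a tensor-by-tensor comparison instead.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_weights(checkpoint_dir: Path, reference_dir: Path) -> list[str]:
    """Name the *.safetensors files in checkpoint_dir that are byte-identical
    to any *.safetensors file in reference_dir (hypothetical helper)."""
    reference_hashes = {sha256_of(p) for p in reference_dir.glob("*.safetensors")}
    return sorted(
        p.name
        for p in checkpoint_dir.glob("*.safetensors")
        if sha256_of(p) in reference_hashes
    )
```

Calling `find_duplicate_weights(sft_dir, base_dir)` on local copies of the SFT and Base checkpoints returns the names of any SFT-side files that are exact copies of Base weights; an empty list means no exact duplicate was found.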
Original locations on the bucket (before move):
- evaluation/temperature_sweep/SFT_t0.95/: T=0.95 SFT generations + qc
- rejection_sampling_v2/direct/SFT/: original Table 4 direct SFT cell (was reported as 7.15%)
- rejection_sampling_v2/best_of_16/SFT/: original Table 4 best-of-16 SFT cell (was reported as 32.4%)
The Base + GRPO cells in rejection_sampling_v2/ are unaffected and remain in their
original locations to preserve Table 4 reproducibility for the original paper protocol.
For the camera-ready, the canonical SFT analyses live under:
- evaluation/eight_prompt/SFT/: main 8-prompt eval, T=1.0, pass rate 10.975%
- mfe/SFT_real/: MFE on corrected-checkpoint generations
- continuation_benchmark/eval_set_656/: held-out continuation/surprisal
- rejection_v3/SFT/: 8-prompt rejection sampling with corrected SFT (10.87%)