# deprecated/early_sft_checkpoint/

Files in this folder were generated using the **broken SFT model.safetensors checkpoint**
that was the default in `UCL-CSSB/PlasmidGPT-SFT` until the 2026-05-02 cleanup commit
`daeaabf`. The HF repo had two safetensors files, and `AutoModelForCausalLM.from_pretrained`
loaded by default the one that was a duplicate of the Base weights rather than the SFT
weights. The corrected checkpoint generates substantively different outputs
(see `evaluation/eight_prompt/SFT/` for the post-fix data).

Original locations on the bucket (before the move):

- `evaluation/temperature_sweep/SFT_t0.95/`: T=0.95 SFT generations and QC
- `rejection_sampling_v2/direct/SFT/`: original Table 4 direct SFT cell (was reported as 7.15%)
- `rejection_sampling_v2/best_of_16/SFT/`: original Table 4 best-of-16 SFT cell (was reported as 32.4%)

The Base and GRPO cells in `rejection_sampling_v2/` are unaffected and remain in their
original locations to preserve Table 4 reproducibility for the original paper protocol.

For the camera-ready, the canonical SFT analyses live under:

- `evaluation/eight_prompt/SFT/`: main 8-prompt eval, T=1.0, pass rate 10.975%
- `mfe/SFT_real/`: MFE on corrected-checkpoint generations
- `continuation_benchmark/eval_set_656/`: held-out continuation/surprisal
- `rejection_v3/SFT/`: 8-prompt rejection sampling with corrected SFT (10.87%)
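
The failure described at the top of this note (two safetensors files in one repo, with the loader picking up a copy of the Base weights) can be checked for before running generations. The following is a minimal sketch, not the procedure actually used for the cleanup. It assumes both checkpoints have already been downloaded to local directories, and the function names `sha256_of` and `audit_checkpoint_dir` are hypothetical. It only detects byte-identical copies, so a clone that was re-saved with different header metadata would not be flagged.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large checkpoints are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_checkpoint_dir(sft_dir: Path, base_weights: Path) -> dict:
    """List the .safetensors files in sft_dir and flag any that are byte-identical to base_weights.

    Both arguments are local paths chosen by the caller (an assumption of this
    sketch); nothing is fetched from the Hub here.
    """
    base_hash = sha256_of(base_weights)
    weight_files = sorted(sft_dir.glob("*.safetensors"))
    clones = [p.name for p in weight_files if sha256_of(p) == base_hash]
    return {
        "weight_files": [p.name for p in weight_files],
        # More than one weights file means the default load target is ambiguous.
        "ambiguous": len(weight_files) > 1,
        "base_clones": clones,
    }
```

A report with `ambiguous` set to true or a non-empty `base_clones` list suggests that outputs from that directory should be treated like the files in this folder until the checkpoint is confirmed.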