# Bucket data status — what's verified, what's pending

**Issue (root-caused 2026-05-02)**: The HF repo `UCL-CSSB/PlasmidGPT-SFT` contained two safetensors files, and `AutoModelForCausalLM.from_pretrained` loaded a duplicate of the Base weights by default. Cleanup commit `daeaabf` swapped them, so the repo SHA changed from `5748e6f9` to `daeaabf0`. `models/pinned_shas.csv` has been updated.

## Verified clean (post-2026-05-04 audit)

- `evaluation/eight_prompt/{Base,SFT,RL}/` — analysis2 strict QC; 4.275 / 10.975 / 71.575
- `evaluation/eight_prompt/ablations/full_reward/` — GRPO @ T=0.95 = 66.875%
- `evaluation/eight_prompt/ablations/{cds_only,length_only,no_cassette_bonus,no_length_prior,no_repeat_penalty}/` — Table 7 rows 2-6
- `analysis/distribution_metrics.csv` + `analysis/distribution/per_seq_*.csv` — Table 6 source
- `continuation_benchmark/eval_set_656/` — 656 plasmids × 5 splits (PRIMARY Table 5 source)
- `continuation_benchmark/heldout_eng_r3/` — PLSDB-style F1–F6 NCBI queries
- `continuation_benchmark/{both_metric_eval, validation_eval, holdout30_non_addgene}/` — additional held-out evals
- `mfe/SFT_real/` — replaces the stale SFT data; mean −0.148 (consistent with the paper's −0.149)
- `mfe/{SFT_circ10k_subset, SFT_temp_sweep, RL_t1.15_8prompt, RL_temp_sweep_2prompt}/` — additional MFE coverage
- `rejection_v3/{Base,SFT,GRPO}/` — 8-prompt × 1250 = 10K, analysis2 strict QC, sweep-optimal T
- `rejection_topK/` — M=50 attempts × K∈{1,4,16,64} success rates
- `plannotate/{RL, Base_t0.95, SFT_t0.95}/` — Table 8 sources
- `novelty_blastn/summary.csv` — Table 2
- `reference/addgene_500/`, `original_paper/`, `models/`, `code_snapshots/` — auxiliary

## SFT-stale files NOT yet replaced

- `rejection_sampling_v2/direct/SFT/` and `rejection_sampling_v2/best_of_16/SFT/` — the original Table 4 SFT cells. Kept in place for paper reproducibility but **superseded by `rejection_v3/SFT/`** if the camera-ready uses the new 8-prompt protocol. The old numbers (7.15% / 32.4%) used the pre-fix SFT checkpoint; the new ones (10.87% / —) use the corrected checkpoint.
- `evaluation/temperature_sweep/SFT_t0.95/` — generations from the broken checkpoint; appendix material only

The original `continuation_benchmark/{completion,surprisal}_benchmark.csv` (small 11-plasmid set) is kept as legacy data. Its numbers reproduce paper Table 5 (Base −12.449, RL −10.966), but the new `eval_set_656/` (656 plasmids × 5 splits) is the primary Table 5 source.

## Unaffected — known good throughout

- `evaluation/eight_prompt/Base/`, `mfe/Base/`, `mfe/RL/` (= old `GRPO_temp1.0`), `mfe/ablations/*/`
- `rejection_sampling_v2/direct/{Base,GRPO}/`, `rejection_sampling_v2/best_of_16/{Base,GRPO}/`
- `evaluation/eight_prompt/ablations/*/` (ablation models from McClain/plasmidgpt-rl-*)
- `plannotate/RL/`, `novelty_blastn/`, `reference/`, `original_paper/`
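
The SFT checkpoint issue described at the top came from loading the repo without a fixed revision. As a minimal sketch of how a pinned load could look, the snippet below reads a SHA from a CSV and passes it to `from_pretrained` via `revision`. The column names `repo_id` and `sha` and the helper names are assumptions for illustration; the actual layout of `models/pinned_shas.csv` is not shown here. The Hub may also require the full commit hash rather than the 8-character prefix quoted above.

```python
import csv
import io


def read_pinned_sha(csv_text: str, repo_id: str) -> str:
    """Return the pinned commit SHA for `repo_id` from CSV text.

    Assumes hypothetical columns `repo_id` and `sha`; adjust to the
    real header of models/pinned_shas.csv.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["repo_id"] == repo_id:
            return row["sha"]
    raise KeyError(f"no pinned SHA for {repo_id}")


def load_pinned(repo_id: str, csv_path: str = "models/pinned_shas.csv"):
    """Load a causal LM at its pinned revision (hypothetical helper)."""
    # Imported lazily so the CSV helper works without transformers installed.
    from transformers import AutoModelForCausalLM

    with open(csv_path, newline="") as fh:
        sha = read_pinned_sha(fh.read(), repo_id)
    # `revision` fixes the exact commit, so later repo changes cannot
    # silently alter which weights are loaded.
    return AutoModelForCausalLM.from_pretrained(repo_id, revision=sha)
```

Calling `load_pinned("UCL-CSSB/PlasmidGPT-SFT")` would then fail loudly with a `KeyError` if the repo has no pinned entry, rather than falling back to whatever the default branch currently holds.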