| # PlasmidRL — ICML Revision Data | |
| Experimental data for: **Effects of Structural Reward Shaping on Biophysical Properties in RL-Trained Plasmid Generators** | |
| ## Best Model Configuration | |
| **UCL-CSSB/PlasmidGPT-GRPO at temperature=1.0** achieves the best quality-diversity tradeoff: | |
| - **71.6% QC pass rate** (ORI + AMR + no repeats) | |
| - **0.573 diversity** (1 − mean pairwise Jaccard of 21-mer MinHash) | |
| - **−0.149 kcal/mol/nt MFE density** (DNA parameters) | |
| - Mean sequence length: 6,517 bp | |
| A separately trained `McClain/PlasmidGPT-RL` run at temperature=0.95 underperforms | |
| (53.7% pass rate, 0.132 diversity) and is retained only as an ablation baseline. | |
| It is **not** the model used for any headline result in the paper. | |
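The diversity figure above is defined as 1 − mean pairwise Jaccard similarity over 21-mer MinHash sketches. The block below is a minimal sketch of that definition, not the authors' implementation. The sketch size (`NUM_HASHES`), the choice of BLAKE2b as the hash family, and the treatment of sequences as linear, single-stranded strings are all assumptions; none of them is stated in this README.

```python
import hashlib
from itertools import combinations

K = 21            # k-mer length, as stated in the README
NUM_HASHES = 128  # sketch size: an assumption, not stated in the README


def kmers(seq, k=K):
    """Return the set of k-mers of a sequence treated as linear."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_signature(seq, num_hashes=NUM_HASHES, k=K):
    """For each seeded hash function, keep the minimum hash over all k-mers."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(km.encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for km in ks
        ))
    return signature


def jaccard_estimate(sig_a, sig_b):
    """The fraction of matching minima estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def diversity(seqs):
    """1 - mean pairwise estimated Jaccard similarity over all sequence pairs."""
    sigs = [minhash_signature(s) for s in seqs]
    pairs = list(combinations(sigs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    mean_jaccard = sum(jaccard_estimate(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_jaccard
```

Under this definition a set of sequences that share no 21-mers scores 1.0 and a set of identical sequences scores 0.0, so lower values indicate that generations overlap heavily.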
| ## Bucket Structure | |
| ``` | |
| analysis/ Summary metric CSVs | |
| ├── full_ablation_metrics.csv 8-model ablation comparison | |
| ├── baselines_qc_metrics.csv rejection sampling QC results | |
| ├── rl_per_prompt_metrics.csv RL per-prompt breakdown | |
| ├── rl_temp_sweep_final.csv temperature sweep (RL) | |
| └── mfe_summary_all.csv MFE across all models | |
| baselines/ Rejection sampling raw sequences | |
| ├── rejection_sampling/{Base,SFT,GRPO}/ 10K samples each + metadata | |
| └── best_of_16/{Base,SFT,GRPO}/ 16K samples each + metadata | |
| eval_8prompt/ Main 8-prompt evaluation, T=1.0 | |
| └── {Base,SFT}/ | |
| ├── *_metrics.csv per-sequence metrics (4,000 rows) | |
| └── *_summary.json aggregate stats | |
| generations/ Multi-temperature generation | |
| ├── temp_0.8/{Base,RL,RL_cds_only,RL_length_only, | |
| │ RL_no_cassette,RL_no_length,RL_no_repeat}/ | |
| ├── temp_0.95/{Base,SFT,RL,RL_cds_only,RL_length_only, | |
| │ RL_no_cassette,RL_no_length,RL_no_repeat}/ | |
| └── temp_1.1/{Base,RL,RL_cds_only,RL_length_only, | |
| RL_no_cassette,RL_no_length,RL_no_repeat}/ | |
| NOTE: SFT generations exist only at temp_0.95 | |
| (resampled 2026-04-21). Earlier uploads at temp_0.8 | |
| and temp_1.1 were byte-identical duplicates of the | |
| corresponding Base outputs and have been deleted. | |
| generations_sweep/ Temperature sweep (RL + GRPO) | |
| ├── temp_{0.3,0.5,0.7}/{RL,GRPO}/ | |
| ├── temp_{0.9}/{GRPO}/ | |
| └── temp_{1.0}/{RL,GRPO}/ | |
| mfe/ ViennaRNA MFE density (DNA + RNA params) | |
| └── {Base,SFT,RL,RL_cds_only,RL_length_only,RL_no_cassette, | |
| RL_no_length,RL_no_repeat,GRPO_temp1.0,GRPO_temp0.9}/ | |
| ├── mfe_results.csv per-sequence MFE (both RNA and DNA params) | |
| └── mfe_summary.json mean ± std | |
| qc_results/ QC for ablation study (temp=0.95, 8 models) | |
| └── {Base,SFT,RL,...}/passed.csv, failed.csv, aggregate_*.csv, repeats.csv | |
| qc_baselines/ QC for rejection sampling | |
| └── {rejection_sampling,best_of_16}/{Base,SFT,GRPO}/ | |
| qc_sweep/ QC for temperature sweep | |
| └── temp_{0.3,...,1.0}/{RL_vllm,GRPO}/ | |
| original_paper/ Frozen snapshots from the pre-revision draft | |
| ``` | |
| ## Key Results | |
| ### MFE Density (DNA parameters, kcal/mol/nt) | |
| All MFE values are mean ± std, computed with ViennaRNA 2.7.2 and the Mathews 2004 | |
| DNA energy parameters. Sequences of ≤3 kb were folded as circular molecules; longer | |
| sequences were folded in 500-bp sliding windows (stride 250 bp). | |
| | Model | DNA MFE density | n | Notes | | |
| |---|---:|---:|---| | |
| | Base | −0.1055 ± 0.0756 | 4000 | baseline, T=0.95 | | |
| | SFT | −0.1492 ± 0.0284 | 4000 | T=0.95 (resampled 2026-04-21; previous upload was a Base duplicate) | | |
| | **GRPO (temp=1.0)** | **−0.1491 ± 0.0322** | 4000 | main paper model | | |
| | RL (full, McClain/PlasmidGPT-RL) | −0.1546 ± 0.0233 | 4000 | ablation-control run, T=0.95 | | |
| | RL (no repeat) | −0.141 ± 0.031 | 4000 | T=0.95 | | |
| | RL (no cassette) | −0.134 ± 0.048 | 4000 | T=0.95 | | |
| | RL (no length) | −0.131 ± 0.025 | 4000 | T=0.95 | | |
| | RL (length only) | −0.126 ± 0.021 | 4000 | T=0.95 | | |
| | RL (CDS only) | −0.103 ± 0.022 | 4000 | T=0.95 | | |
| Addgene reference panel (n=500): **−0.151 ± 0.027**. | |
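The folding scheme described at the top of this section (circular folding up to 3 kb, 500-bp windows with a 250-bp stride beyond that) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used to produce the table: averaging the per-window densities, adding one extra window to cover the tail, and ignoring windows that wrap across the plasmid origin are all guesses. The `vienna_dna_fold` helper uses the ViennaRNA Python bindings as I understand them (`RNA.params_load_DNA_Mathews2004`, `RNA.md`, `RNA.fold_compound`); check those names against the installed version before relying on them.

```python
def mfe_density(seq, fold, circular_max=3000, window=500, stride=250):
    """Return the MFE per nucleotide (kcal/mol/nt).

    `fold(subseq, circular)` must return the MFE of `subseq` in kcal/mol.
    """
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    if n <= circular_max:
        # Short sequences: a single circular fold of the whole molecule.
        return fold(seq, True) / n
    # Long sequences: linear folds over overlapping windows.
    starts = list(range(0, n - window + 1, stride))
    if starts[-1] + window < n:
        starts.append(n - window)  # assumption: add a final window for the tail
    densities = [fold(seq[s:s + window], False) / window for s in starts]
    # Assumption: the per-sequence value is the mean of the window densities.
    return sum(densities) / len(densities)


def vienna_dna_fold(subseq, circular):
    """Hypothetical adapter around the ViennaRNA Python bindings."""
    import RNA  # imported lazily so the windowing logic runs without ViennaRNA

    RNA.params_load_DNA_Mathews2004()  # switch to DNA energy parameters
    md = RNA.md()
    md.circ = int(circular)            # 1 = treat the sequence as circular
    fc = RNA.fold_compound(subseq, md)
    _structure, mfe = fc.mfe()
    return mfe
```

With ViennaRNA installed, `mfe_density(sequence, vienna_dna_fold)` would return one density per sequence; the mean and standard deviation over a model's samples then correspond to the kind of entry reported in the table.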
| ### Ablation Study (QC pass rate, temp=0.95) | |
| | Model | Pass rate | Diversity | | |
| |-------|----------:|----------:| | |
| | Base | 3.6 % | 1.000 | | |
| | SFT | 19.7 % | — | | |
| | RL (full) | 53.7 % | 0.132 | | |
| | RL (no repeat) | 72.2 % | 0.446 | | |
| | RL (no length) | 71.4 % | 0.446 | | |
| | RL (length only) | 34.7 % | 0.837 | | |
| | RL (no cassette) | 19.8 % | 0.183 | | |
| | RL (CDS only) | 2.4 % | 1.000 | | |
| ### Main 8-prompt evaluation (temp=1.0) — paper Table 1 | |
| | Model | Pass rate | | |
| |-------|----------:| | |
| | Base (UCL-CSSB/PlasmidGPT) | 27.0 % | | |
| | SFT (UCL-CSSB/PlasmidGPT-SFT) | 27.2 % | | |
| | RL / GRPO (UCL-CSSB/PlasmidGPT-GRPO) | 71.6 % | | |
| ### Rejection sampling baselines | |
| | Model | Rejection 10K | Best-of-16 | Diversity | | |
| |-------|--------------:|-----------:|----------:| | |
| | Base | 2.8 % | 2.9 % | 1.000 | | |
| | SFT | 2.5 % | 2.8 % | 1.000 | | |
| | GRPO | 64.6 % | 64.6 % | 0.549–0.581 | | |
| ### Sampling parameters | |
| | Experiment | Temperature | | |
| |---|---:| | |
| | Training rollouts (GRPO) | 1.229 | | |
| | Main 8-prompt evaluation (`eval_8prompt/`) | 1.0 | | |
| | Ablation evaluation (`generations/temp_0.95/`) | 0.95 | | |
| | Rejection sampling / best-of-16 (`baselines/`) | 0.95 | | |
| Other decoder settings are constant: top-p 0.90, repetition penalty 1.0, | |
| max 256 BPE tokens (≈ 5–15 kb of DNA), stop token id 2. | |
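For reproduction, the settings above can be collected into one mapping per experiment. The sketch below uses the keyword names accepted by Hugging Face `transformers` `generate()` (`temperature`, `top_p`, `repetition_penalty`, `max_new_tokens`, `eos_token_id`). Whether the original runs used `transformers` or another backend is not stated here, and reading "max 256 BPE tokens" as `max_new_tokens` rather than a total-length cap is an assumption. The experiment keys are hypothetical labels.

```python
# Decoder settings shared by all experiments (values from the README).
SHARED = {
    "do_sample": True,          # required for temperature / top-p sampling
    "top_p": 0.90,
    "repetition_penalty": 1.0,
    "max_new_tokens": 256,      # assumption: the cap applies to generated tokens
    "eos_token_id": 2,          # stop token id
}

# Temperature per experiment (hypothetical keys, values from the table).
TEMPERATURE = {
    "train_rollouts": 1.229,
    "eval_8prompt": 1.0,
    "ablation": 0.95,
    "baselines": 0.95,
}


def decoding_kwargs(experiment):
    """Return the full set of sampling keyword arguments for one experiment."""
    return {**SHARED, "temperature": TEMPERATURE[experiment]}
```

For example, `decoding_kwargs("ablation")` yields the T=0.95 configuration used for `generations/temp_0.95/`.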
| ## Models | |
| | Label | HF repo | | |
| |---|---| | |
| | Base | UCL-CSSB/PlasmidGPT | | |
| | SFT | UCL-CSSB/PlasmidGPT-SFT | | |
| | RL (main paper model) | UCL-CSSB/PlasmidGPT-GRPO | | |
| | RL (ablation control) | McClain/PlasmidGPT-RL | | |
| | Ablations | McClain/plasmidgpt-rl-{cds_only,no_repeat_penalty,no_length_prior,no_cassette_bonus,length_only} | | |
| ## QC Pipeline | |
| - BLAST (dc-megablast) against the OriDB reference for ORI detection | |
| - AMRFinderPlus 4.2.7 for antimicrobial resistance (AMR) gene detection | |
| - Prodigal 2.6.3 for gene prediction | |
| - Suffix-array repeat detection (≥50 bp) | |
| - Two-stage filter: ORI ≥99 % identity and coverage; AMR 100 % identity and coverage | |
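The thresholds in the list above combine into a pass/fail decision roughly as sketched below. The `Hit` record, its field names, and the function names are hypothetical. The sketch also collapses the two filter stages into a single predicate, and it assumes a sequence passes only if it has a qualifying ORI hit, a qualifying AMR hit, and no repeat of 50 bp or longer, which matches the "ORI + AMR + no repeats" definition used for the pass rate.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST or AMRFinderPlus hit (hypothetical record layout)."""
    identity: float  # percent identity, 0-100
    coverage: float  # percent of the reference covered, 0-100


ORI_MIN = 99.0      # ORI: at least 99 % identity and coverage
AMR_MIN = 100.0     # AMR: full identity and coverage
REPEAT_MIN_BP = 50  # repeats of this length or longer fail QC


def has_hit(hits, threshold):
    """True if any hit meets the threshold on both identity and coverage."""
    return any(h.identity >= threshold and h.coverage >= threshold for h in hits)


def passes_qc(ori_hits, amr_hits, longest_repeat_bp):
    """Combine the three criteria into a single pass/fail decision."""
    return (
        has_hit(ori_hits, ORI_MIN)
        and has_hit(amr_hits, AMR_MIN)
        and longest_repeat_bp < REPEAT_MIN_BP
    )
```

The pass rate for a model is then the fraction of its sequences for which `passes_qc` returns `True`.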
| ## Data history | |
| - **2026-04-21** SFT resampled at T=0.95 (4000 seqs) and MFE recomputed. | |
| Previous uploads at `generations/temp_{0.8,0.95,1.1}/SFT/` were | |
| byte-identical to the corresponding Base outputs (a pipeline bug during | |
| the March upload). The T=0.95 SFT directory now holds the resampled data; | |
| the 0.8 and 1.1 copies were deleted. | |
| - **2026-04-20** `eval_8prompt/SFT/SFT_metrics.csv` briefly held T=0.95 | |
| data during the SFT resample; restored to the original T=1.0 values | |
| (SHA-verified against an older local copy). | |