PlasmidRL — ICML Revision Data
Experimental data for: Effects of Structural Reward Shaping on Biophysical Properties in RL-Trained Plasmid Generators
Best Model Configuration
UCL-CSSB/PlasmidGPT-GRPO at temperature=1.0 achieves the best quality-diversity tradeoff:
- 71.6% QC pass rate (ORI + AMR + no repeats)
- 0.573 diversity (1 − mean pairwise Jaccard of 21-mer MinHash)
- −0.149 kcal/mol/nt MFE density (DNA parameters)
- Mean sequence length: 6,517 bp
A separately trained McClain/PlasmidGPT-RL run at temperature=0.95 underperforms
(53.7% pass rate, 0.132 diversity) and is retained only as an ablation baseline.
It is not the model used for any headline result in the paper.
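The diversity figure quoted above (1 − mean pairwise Jaccard over 21-mer MinHash sketches) can be illustrated with a small standard-library sketch. This is not the pipeline's code: the bottom-k sketch variant, the sketch size of 1000, the BLAKE2b hash, and the helper names `kmer_set`, `sketch`, `jaccard`, and `diversity` are all assumptions made for illustration.

```python
import hashlib
from itertools import combinations


def kmer_set(seq, k=21):
    """Distinct k-mers of a linear sequence (no circular wrap-around)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(kmers, size=1000):
    """Bottom-`size` MinHash sketch: the smallest 64-bit hashes of the k-mers."""
    hashes = {
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmers
    }
    return sorted(hashes)[:size]


def jaccard(sk_a, sk_b, size=1000):
    """Estimate Jaccard similarity from two bottom-`size` sketches."""
    a, b = set(sk_a), set(sk_b)
    union = sorted(a | b)[:size]  # smallest hashes of the combined sketch
    if not union:
        return 0.0  # convention: no k-mers means no measurable overlap
    return sum(1 for h in union if h in a and h in b) / len(union)


def diversity(seqs, k=21, size=1000):
    """1 - mean pairwise Jaccard estimate over all sequence pairs."""
    sketches = [sketch(kmer_set(s, k), size) for s in seqs]
    pairs = list(combinations(sketches, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return 1.0 - sum(jaccard(a, b, size) for a, b in pairs) / len(pairs)
```

Under this definition a set of identical sequences scores 0 and a set with no shared 21-mers scores 1, which matches the 1.000 reported for the Base model in the tables below. When a sequence has fewer distinct k-mers than the sketch size, the estimate equals the exact Jaccard similarity.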
Bucket Structure
analysis/ Summary metric CSVs
├── full_ablation_metrics.csv 8-model ablation comparison
├── baselines_qc_metrics.csv rejection sampling QC results
├── rl_per_prompt_metrics.csv RL per-prompt breakdown
├── rl_temp_sweep_final.csv temperature sweep (RL)
└── mfe_summary_all.csv MFE across all models
baselines/ Rejection sampling raw sequences
├── rejection_sampling/{Base,SFT,GRPO}/ 10K samples each + metadata
└── best_of_16/{Base,SFT,GRPO}/ 16K samples each + metadata
eval_8prompt/ Main 8-prompt evaluation, T=1.0
└── {Base,SFT}/
    ├── *_metrics.csv per-sequence metrics (4,000 rows)
    └── *_summary.json aggregate stats
generations/ Multi-temperature generation
├── temp_0.8/{Base,RL,RL_cds_only,RL_length_only,
│ RL_no_cassette,RL_no_length,RL_no_repeat}/
├── temp_0.95/{Base,SFT,RL,RL_cds_only,RL_length_only,
│ RL_no_cassette,RL_no_length,RL_no_repeat}/
└── temp_1.1/{Base,RL,RL_cds_only,RL_length_only,
RL_no_cassette,RL_no_length,RL_no_repeat}/
NOTE: SFT generations exist only at temp_0.95
(resampled 2026-04-21). Earlier uploads at temp_0.8
and temp_1.1 were byte-identical duplicates of the
corresponding Base outputs and have been deleted.
generations_sweep/ Temperature sweep (RL + GRPO)
├── temp_{0.3,0.5,0.7}/{RL,GRPO}/
├── temp_{0.9}/{GRPO}/
└── temp_{1.0}/{RL,GRPO}/
mfe/ ViennaRNA MFE density (DNA + RNA params)
└── {Base,SFT,RL,RL_cds_only,RL_length_only,RL_no_cassette,RL_no_length,RL_no_repeat,GRPO_temp1.0,GRPO_temp0.9}/
    ├── mfe_results.csv per-sequence MFE (both RNA and DNA params)
    └── mfe_summary.json mean ± std
qc_results/ QC for ablation study (temp=0.95, 8 models)
└── {Base,SFT,RL,...}/passed.csv, failed.csv, aggregate_*.csv, repeats.csv
qc_baselines/ QC for rejection sampling
└── {rejection_sampling,best_of_16}/{Base,SFT,GRPO}/
qc_sweep/ QC for temperature sweep
└── temp_{0.3,...,1.0}/{RL_vllm,GRPO}/
original_paper/ Frozen snapshots from the pre-revision draft
Key Results
MFE Density (DNA parameters, kcal/mol/nt)
All MFE numbers were computed with ViennaRNA 2.7.2 using the Mathews 2004 DNA energy parameters. Short sequences (≤3 kb) are folded as circular molecules; longer sequences are folded in 500-bp sliding windows (stride 250 bp).
| Model | DNA MFE density | n | Notes |
|---|---|---|---|
| Base | −0.1055 ± 0.0756 | 4000 | baseline, T=0.95 |
| SFT | −0.1492 ± 0.0284 | 4000 | T=0.95 (resampled 2026-04-21; previous upload was a Base duplicate) |
| GRPO (temp=1.0) | −0.1491 ± 0.0322 | 4000 | main paper model |
| RL (full, McClain/PlasmidGPT-RL) | −0.1546 ± 0.0233 | 4000 | ablation-control run, T=0.95 |
| RL (no repeat) | −0.141 ± 0.031 | 4000 | T=0.95 |
| RL (no cassette) | −0.134 ± 0.048 | 4000 | T=0.95 |
| RL (no length) | −0.131 ± 0.025 | 4000 | T=0.95 |
| RL (length only) | −0.126 ± 0.021 | 4000 | T=0.95 |
| RL (CDS only) | −0.103 ± 0.022 | 4000 | T=0.95 |
Addgene reference panel (n=500): −0.151 ± 0.027.
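The folding scheme used for the MFE table (circular folding up to 3 kb, otherwise 500-bp windows with a 250-bp stride) can be sketched as follows. The folding step is deliberately left as a caller-supplied function `fold(seq, circular)` returning kcal/mol, so the example does not depend on a ViennaRNA installation. Two details are assumptions, since the README does not state them: a final window is added to cover any uncovered tail, and the per-sequence density is taken as the mean of the per-window densities.

```python
from statistics import mean


def windows(seq_len, size=500, stride=250):
    """(start, end) pairs for sliding windows over a linear sequence."""
    if seq_len <= size:
        return [(0, seq_len)]
    starts = list(range(0, seq_len - size + 1, stride))
    # Assumption: add one last window so the 3' tail is always covered.
    if starts[-1] + size < seq_len:
        starts.append(seq_len - size)
    return [(s, s + size) for s in starts]


def mfe_density(seq, fold, circular_max=3000, size=500, stride=250):
    """MFE per nucleotide (kcal/mol/nt).

    `fold(seq, circular)` is a hypothetical callable that returns the MFE
    in kcal/mol, for example a thin wrapper around ViennaRNA.
    """
    if len(seq) <= circular_max:
        return fold(seq, True) / len(seq)
    # Assumption: average the per-window densities for long sequences.
    return mean(
        fold(seq[s:e], False) / (e - s)
        for s, e in windows(len(seq), size, stride)
    )
```

With the ViennaRNA Python bindings, `fold` could wrap a `RNA.fold_compound(...)` object and its `mfe()` method after loading the DNA parameter set; the exact calls should be checked against the ViennaRNA documentation for the installed version.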
Ablation Study (QC pass rate, temp=0.95)
| Model | Pass rate | Diversity |
|---|---|---|
| Base | 3.6 % | 1.000 |
| SFT | 19.7 % | — |
| RL (full) | 53.7 % | 0.132 |
| RL (no repeat) | 72.2 % | 0.446 |
| RL (no length) | 71.4 % | 0.446 |
| RL (length only) | 34.7 % | 0.837 |
| RL (no cassette) | 19.8 % | 0.183 |
| RL (CDS only) | 2.4 % | 1.000 |
Main 8-prompt evaluation (temp=1.0) — paper Table 1
| Model | Pass rate |
|---|---|
| Base (UCL-CSSB/PlasmidGPT) | 27.0 % |
| SFT (UCL-CSSB/PlasmidGPT-SFT) | 27.2 % |
| RL (UCL-CSSB/PlasmidGPT-GRPO) | 71.6 % |
Rejection sampling baselines
| Model | Rejection 10K | Best-of-16 | Diversity |
|---|---|---|---|
| Base | 2.8 % | 2.9 % | 1.000 |
| SFT | 2.5 % | 2.8 % | 1.000 |
| GRPO | 64.6 % | 64.6 % | 0.549–0.581 |
Sampling parameters
| Experiment | Temperature |
|---|---|
| Training rollouts (GRPO) | 1.229 |
| Main 8-prompt evaluation (eval_8prompt/) | 1.0 |
| Ablation evaluation (generations/temp_0.95/) | 0.95 |
| Rejection sampling / best-of-16 (baselines/) | 0.95 |
Other decoder settings are constant: top-p 0.90, repetition penalty 1.0, max 256 BPE tokens (≈ 5–15 kb of DNA), stop token id 2.
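For readers who want to reproduce the sampling setup, the settings above can be collected into one place. The sketch below uses keyword names from the Hugging Face `transformers` `generate` API (`do_sample`, `temperature`, `top_p`, `repetition_penalty`, `max_new_tokens`, `eos_token_id`). Which inference library was actually used is not stated here, and mapping "max 256 BPE tokens" to `max_new_tokens` and "stop token id 2" to `eos_token_id` is an assumption. The experiment keys are hypothetical labels.

```python
# Decoder settings shared by every experiment (values from this README).
DECODER_DEFAULTS = {
    "do_sample": True,
    "top_p": 0.90,
    "repetition_penalty": 1.0,
    "max_new_tokens": 256,  # assumption: the 256-token cap excludes the prompt
    "eos_token_id": 2,      # assumption: "stop token id 2" is the EOS token
}

# Temperature per experiment; the keys are hypothetical labels.
TEMPERATURES = {
    "training_rollouts": 1.229,
    "eval_8prompt": 1.0,
    "ablation": 0.95,
    "baselines": 0.95,
}


def generation_kwargs(experiment):
    """Merge the shared decoder settings with an experiment's temperature."""
    return {**DECODER_DEFAULTS, "temperature": TEMPERATURES[experiment]}
```

A call such as `model.generate(**inputs, **generation_kwargs("eval_8prompt"))` would then apply the T=1.0 settings, assuming `model` and `inputs` come from the usual `transformers` loading code.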
Models
| Label | HF repo |
|---|---|
| Base | UCL-CSSB/PlasmidGPT |
| SFT | UCL-CSSB/PlasmidGPT-SFT |
| RL (main paper model) | UCL-CSSB/PlasmidGPT-GRPO |
| RL (ablation control) | McClain/PlasmidGPT-RL |
| Ablations | McClain/plasmidgpt-rl-{cds_only,no_repeat_penalty,no_length_prior,no_cassette_bonus,length_only} |
QC Pipeline
- BLAST (dc-megablast) against the OriDB reference for ORI detection
- AMRFinderPlus 4.2.7 for ARG detection
- Prodigal 2.6.3 for gene prediction
- Suffix-array repeat detection (≥50 bp)
- Two-stage filter: ORI ≥99 % identity and coverage; AMR ≥100 % identity and coverage
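The pass/fail decision implied by the list above (ORI found, AMR gene found, no long repeat) can be sketched as below. This is an illustrative reading, not the pipeline's code. The `Hit` record and function names are hypothetical, the example assumes one qualifying hit per category is enough, it does not model the Prodigal step, and it replaces the suffix-array repeat finder with a simpler hash-set scan that only detects exact direct repeats on a linear sequence.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment hit; both fields are percentages (hypothetical record)."""
    identity: float
    coverage: float


def has_exact_repeat(seq, min_len=50):
    """True if any substring of length >= min_len occurs at two positions.

    Any longer repeat contains a repeated min_len-mer, so scanning
    min_len-mers is sufficient. Overlapping occurrences count, so a
    homopolymer run of min_len + 1 bases is reported as a repeat.
    """
    seen = set()
    for i in range(len(seq) - min_len + 1):
        kmer = seq[i:i + min_len]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def passes_qc(seq, ori_hits, amr_hits, ori_min=99.0, amr_min=100.0, repeat_len=50):
    """ORI + AMR + no-repeat check with the thresholds listed above."""
    has_ori = any(h.identity >= ori_min and h.coverage >= ori_min for h in ori_hits)
    has_amr = any(h.identity >= amr_min and h.coverage >= amr_min for h in amr_hits)
    return has_ori and has_amr and not has_exact_repeat(seq, repeat_len)
```

The hash-set scan uses memory proportional to the sequence length times `min_len`, which is acceptable for plasmid-sized inputs; a suffix array scales better and can also report repeat positions and lengths.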
Data history
- 2026-04-21: SFT resampled at T=0.95 (4000 seqs) and MFE recomputed. Previous uploads at generations/temp_{0.8,0.95,1.1}/SFT/ were byte-identical to the corresponding Base outputs (pipeline bug during the March upload). The T=0.95 SFT data is correct; the 0.8 and 1.1 copies were deleted.
- 2026-04-20: eval_8prompt/SFT/SFT_metrics.csv briefly held T=0.95 data during the SFT resample; restored to the original T=1.0 values (SHA-verified against an older local copy).
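Byte-identical uploads of the kind described above can be caught by hashing files and grouping them by digest. The following is a minimal standard-library sketch; the helper names are hypothetical, and SHA-256 is an assumption, since the entry above only says "SHA-verified".

```python
import hashlib
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Groups of paths (sorted) whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in paths:
        by_digest[sha256_of(path)].append(str(path))
    return sorted(sorted(group) for group in by_digest.values() if len(group) > 1)
```

Running `find_duplicates` over a local copy of the Base and SFT output files for each temperature would flag any pair that shares a digest, and comparing `sha256_of` against a recorded digest gives the same kind of check used to confirm a restored file.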