UCL-CSSB/PlasmidRL-ICML / deprecated /README_v1.md

McClain

15 days ago


PlasmidRL — ICML Revision Data

Experimental data for: Effects of Structural Reward Shaping on Biophysical Properties in RL-Trained Plasmid Generators

Best Model Configuration

UCL-CSSB/PlasmidGPT-GRPO at temperature=1.0 achieves the best quality-diversity tradeoff:

71.6% QC pass rate (ORI + AMR + no repeats)
0.573 diversity (1 − mean pairwise Jaccard similarity, estimated from 21-mer MinHash sketches)
−0.149 kcal/mol/nt MFE density (DNA parameters)
Mean sequence length: 6,517 bp
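The diversity figure above can be illustrated with a small sketch. This is not the pipeline's code: the bottom-s MinHash variant, the sketch size of 128, the BLAKE2b hash, and the absence of reverse-complement canonicalization are all assumptions made for illustration; only the 21-mer size and the "1 − mean pairwise Jaccard" definition come from this README.

```python
import hashlib
from itertools import combinations


def kmers(seq, k=21):
    """Set of all k-mers in a sequence (linear, no reverse-complement merging)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_sketch(seq, k=21, sketch_size=128):
    """Bottom-s sketch: the s smallest 64-bit hashes of the k-mer set.

    sketch_size=128 and BLAKE2b are illustrative choices, not from the README.
    """
    hashes = {
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmers(seq, k)
    }
    return sorted(hashes)[:sketch_size]


def jaccard_estimate(sketch_a, sketch_b, sketch_size=128):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    union = sorted(set_a | set_b)[:sketch_size]
    if not union:
        return 0.0
    shared = sum(1 for h in union if h in set_a and h in set_b)
    return shared / len(union)


def diversity(seqs, k=21, sketch_size=128):
    """1 - mean pairwise Jaccard similarity over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("diversity needs at least two sequences")
    sketches = [minhash_sketch(s, k, sketch_size) for s in seqs]
    pairs = list(combinations(sketches, 2))
    mean_jaccard = sum(jaccard_estimate(a, b, sketch_size) for a, b in pairs) / len(pairs)
    return 1.0 - mean_jaccard
```

Under this definition, a set of identical sequences scores 0 and a set with no shared 21-mers scores 1, which matches the 1.000 values reported for the Base model in the tables below.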

A separately trained McClain/PlasmidGPT-RL run at temperature=0.95 underperforms (53.7% pass rate, 0.132 diversity) and is retained only as an ablation baseline. It is not the model used for any headline result in the paper.

Bucket Structure

analysis/                              Summary metric CSVs
├── full_ablation_metrics.csv            8-model ablation comparison
├── baselines_qc_metrics.csv             rejection sampling QC results
├── rl_per_prompt_metrics.csv            RL per-prompt breakdown
├── rl_temp_sweep_final.csv              temperature sweep (RL)
└── mfe_summary_all.csv                  MFE across all models

baselines/                             Rejection sampling raw sequences
├── rejection_sampling/{Base,SFT,GRPO}/  10K samples each + metadata
└── best_of_16/{Base,SFT,GRPO}/          16K samples each + metadata

eval_8prompt/                          Main 8-prompt evaluation, T=1.0
└── {Base,SFT}/
    ├── *_metrics.csv                    per-sequence metrics (4,000 rows)
    └── *_summary.json                   aggregate stats

generations/                           Multi-temperature generation
├── temp_0.8/{Base,RL,RL_cds_only,RL_length_only,
│              RL_no_cassette,RL_no_length,RL_no_repeat}/
├── temp_0.95/{Base,SFT,RL,RL_cds_only,RL_length_only,
│              RL_no_cassette,RL_no_length,RL_no_repeat}/
└── temp_1.1/{Base,RL,RL_cds_only,RL_length_only,
              RL_no_cassette,RL_no_length,RL_no_repeat}/

                  NOTE: SFT generations exist only at temp_0.95
                  (resampled 2026-04-21). Earlier uploads at temp_0.8
                  and temp_1.1 were byte-identical duplicates of the
                  corresponding Base outputs and have been deleted.

generations_sweep/                     Temperature sweep (RL + GRPO)
├── temp_{0.3,0.5,0.7}/{RL,GRPO}/
├── temp_{0.9}/{GRPO}/
└── temp_{1.0}/{RL,GRPO}/

mfe/                                   ViennaRNA MFE density (DNA + RNA params)
└── {Base,SFT,RL,RL_cds_only,RL_length_only,RL_no_cassette,
     RL_no_length,RL_no_repeat,GRPO_temp1.0,GRPO_temp0.9}/
    ├── mfe_results.csv                  per-sequence MFE (both RNA and DNA params)
    └── mfe_summary.json                 mean ± std

qc_results/                            QC for ablation study (temp=0.95, 8 models)
└── {Base,SFT,RL,...}/passed.csv, failed.csv, aggregate_*.csv, repeats.csv

qc_baselines/                          QC for rejection sampling
└── {rejection_sampling,best_of_16}/{Base,SFT,GRPO}/

qc_sweep/                              QC for temperature sweep
└── temp_{0.3,...,1.0}/{RL_vllm,GRPO}/

original_paper/                        Frozen snapshots from the pre-revision draft
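As a convenience, the `generations/` portion of the tree above can be enumerated programmatically. The sketch below simply transcribes the listing; the names `GENERATION_MODELS`, `generation_dirs`, and `has_generations` are hypothetical helpers, not part of the bucket.

```python
# Ablation variants that appear at every temperature in generations/.
ABLATIONS = [
    "RL_cds_only",
    "RL_length_only",
    "RL_no_cassette",
    "RL_no_length",
    "RL_no_repeat",
]

# Models available per temperature, transcribed from the tree listing.
# SFT exists only at temp_0.95.
GENERATION_MODELS = {
    "0.8": ["Base", "RL", *ABLATIONS],
    "0.95": ["Base", "SFT", "RL", *ABLATIONS],
    "1.1": ["Base", "RL", *ABLATIONS],
}


def generation_dirs():
    """All expected generations/temp_*/<model>/ directories."""
    return [
        f"generations/temp_{temp}/{model}/"
        for temp, models in GENERATION_MODELS.items()
        for model in models
    ]


def has_generations(model, temperature):
    """True if the listing shows generations for this model and temperature."""
    return model in GENERATION_MODELS.get(str(temperature), [])
```

A check such as `has_generations("SFT", 0.8)` returns `False`, which guards against requesting the deleted SFT duplicates.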

Key Results

MFE Density (DNA parameters, kcal/mol/nt)

All MFE values were computed with ViennaRNA 2.7.2 using the Mathews 2004 DNA energy parameters. Sequences of ≤3 kb were folded as circular molecules; longer sequences were folded in 500-bp sliding windows (stride 250 bp).
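The folding scheme can be sketched as follows. The thresholds (3 kb, 500 bp, 250 bp) come from the text; everything else is an assumption: `fold` is a hypothetical callback standing in for a ViennaRNA call, the final window is anchored at the sequence end when the stride leaves a tail uncovered, and per-window densities are averaged with equal weight. The README does not say how the pipeline handles either of the last two points.

```python
from typing import Callable, List

CIRCULAR_MAX_BP = 3000  # sequences up to this length are folded circularly
WINDOW_BP = 500         # sliding-window size for longer sequences
STRIDE_BP = 250         # sliding-window stride


def windows(seq: str, window: int = WINDOW_BP, stride: int = STRIDE_BP) -> List[str]:
    """Linear windows over seq; adds an end-anchored window if a tail remains.

    The tail handling is an assumption, not documented pipeline behaviour.
    """
    starts = list(range(0, len(seq) - window + 1, stride))
    out = [seq[s:s + window] for s in starts]
    if starts and starts[-1] + window < len(seq):
        out.append(seq[-window:])
    return out


def mfe_density(seq: str, fold: Callable[[str, bool], float]) -> float:
    """MFE density in kcal/mol/nt.

    fold(subseq, circular) is a hypothetical callback that returns the MFE in
    kcal/mol, e.g. a thin wrapper around ViennaRNA with DNA parameters loaded.
    """
    if not seq:
        raise ValueError("empty sequence")
    if len(seq) <= CIRCULAR_MAX_BP:
        return fold(seq, True) / len(seq)
    # Assumption: equal-weight mean of the per-window densities.
    densities = [fold(w, False) / len(w) for w in windows(seq)]
    return sum(densities) / len(densities)
```

Passing the folding routine in as a callback keeps the windowing logic testable without ViennaRNA installed.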

Model	DNA MFE density	n	Notes
Base	−0.1055 ± 0.0756	4000	baseline, T=0.95
SFT	−0.1492 ± 0.0284	4000	T=0.95 (resampled 2026-04-21; previous upload was a Base duplicate)
GRPO (temp=1.0)	−0.1491 ± 0.0322	4000	main paper model
RL (full, McClain/PlasmidGPT-RL)	−0.1546 ± 0.0233	4000	ablation-control run, T=0.95
RL (no repeat)	−0.141 ± 0.031	4000	T=0.95
RL (no cassette)	−0.134 ± 0.048	4000	T=0.95
RL (no length)	−0.131 ± 0.025	4000	T=0.95
RL (length only)	−0.126 ± 0.021	4000	T=0.95
RL (CDS only)	−0.103 ± 0.022	4000	T=0.95

Addgene reference panel (n=500): −0.151 ± 0.027.

Ablation Study (QC pass rate, temp=0.95)

Model	Pass rate	Diversity
Base	3.6 %	1.000
SFT	19.7 %	—
RL (full)	53.7 %	0.132
RL (no repeat)	72.2 %	0.446
RL (no length)	71.4 %	0.446
RL (length only)	34.7 %	0.837
RL (no cassette)	19.8 %	0.183
RL (CDS only)	2.4 %	1.000

Main 8-prompt evaluation (temp=1.0) — paper Table 1

Model	Pass rate
Base (UCL-CSSB/PlasmidGPT)	27.0 %
SFT (UCL-CSSB/PlasmidGPT-SFT)	27.2 %
RL (UCL-CSSB/PlasmidGPT-GRPO)	71.6 %

Rejection sampling baselines

Model	Pass rate (rejection, 10K)	Pass rate (best-of-16)	Diversity
Base	2.8 %	2.9 %	1.000
SFT	2.5 %	2.8 %	1.000
GRPO	64.6 %	64.6 %	0.549–0.581

Sampling parameters

Experiment	Temperature
Training rollouts (GRPO)	1.229
Main 8-prompt evaluation (`eval_8prompt/`)	1.0
Ablation evaluation (`generations/temp_0.95/`)	0.95
Rejection sampling / best-of-16 (`baselines/`)	0.95

Other decoder settings are constant across all experiments: top-p 0.90, repetition penalty 1.0, a maximum of 256 BPE tokens (≈ 5–15 kb of DNA), and stop token id 2.
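For readers reproducing a run, the settings above might map onto Hugging Face `transformers` `generate()` keyword arguments roughly as below. This is a sketch under assumptions: the README does not state which inference stack was used, `do_sample=True` is inferred from the use of temperature and top-p, and reading "max 256 BPE tokens" as `max_new_tokens` (rather than `max_length`) is a guess. The experiment keys are hypothetical labels.

```python
# Temperatures from the sampling-parameters table; keys are illustrative labels.
EXPERIMENT_TEMPERATURE = {
    "grpo_training_rollouts": 1.229,
    "eval_8prompt": 1.0,
    "ablation": 0.95,
    "baselines": 0.95,
}


def generation_kwargs(experiment):
    """Keyword arguments for a transformers-style generate() call."""
    return {
        "do_sample": True,            # assumption: implied by temperature/top-p
        "temperature": EXPERIMENT_TEMPERATURE[experiment],
        "top_p": 0.90,
        "repetition_penalty": 1.0,
        "max_new_tokens": 256,        # assumption: could instead be max_length
        "eos_token_id": 2,            # stop token id from the README
    }
```

With a loaded model and tokenized prompt, these would be passed as `model.generate(**inputs, **generation_kwargs("eval_8prompt"))`.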

Models

Label	HF repo
Base	UCL-CSSB/PlasmidGPT
SFT	UCL-CSSB/PlasmidGPT-SFT
RL (main paper model)	UCL-CSSB/PlasmidGPT-GRPO
RL (ablation control)	McClain/PlasmidGPT-RL
Ablations	McClain/plasmidgpt-rl-{cds_only,no_repeat_penalty,no_length_prior,no_cassette_bonus,length_only}

QC Pipeline

BLAST (dc-megablast) against the OriDB reference for ORI detection
AMRFinderPlus 4.2.7 for antimicrobial resistance gene (ARG) detection
Prodigal 2.6.3 for gene prediction
Suffix-array repeat detection (≥50 bp)
Two-stage filter: ORI hits require ≥99 % identity and coverage; AMR hits require 100 % identity and coverage
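The pass criterion (ORI + AMR + no repeats) can be sketched as a small predicate. The thresholds come from the list above. The `Hit` record is a hypothetical stand-in for parsed BLAST and AMRFinderPlus output, and the repeat check uses a plain hash set of 50-mers in place of the suffix array the pipeline uses. The hash-set version detects the same exact direct repeats but does not report their positions and ignores reverse-complement repeats.

```python
from dataclasses import dataclass
from typing import Iterable

ORI_MIN_PCT = 99.0    # ORI: identity and coverage must both be >= 99 %
AMR_MIN_PCT = 100.0   # AMR: identity and coverage must both be 100 %
MIN_REPEAT_BP = 50    # repeats of this length or longer fail QC


@dataclass
class Hit:
    """Hypothetical parsed hit; both fields are percentages."""
    identity: float
    coverage: float


def has_long_repeat(seq: str, min_len: int = MIN_REPEAT_BP) -> bool:
    """True if any exact substring of length min_len occurs at least twice.

    Any repeat of length >= min_len contains a repeated min_len-mer, so
    checking min_len-mers is sufficient. Overlapping occurrences count.
    """
    seen = set()
    for i in range(len(seq) - min_len + 1):
        kmer = seq[i:i + min_len]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def passes_qc(seq: str, ori_hits: Iterable[Hit], amr_hits: Iterable[Hit]) -> bool:
    """ORI stage, then AMR stage, then the repeat check."""
    has_ori = any(
        h.identity >= ORI_MIN_PCT and h.coverage >= ORI_MIN_PCT for h in ori_hits
    )
    has_amr = any(
        h.identity >= AMR_MIN_PCT and h.coverage >= AMR_MIN_PCT for h in amr_hits
    )
    return has_ori and has_amr and not has_long_repeat(seq)
```

Prodigal gene predictions are not part of this predicate, since the pass rate is described only in terms of ORI, AMR, and repeats.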

Data history

2026-04-21: SFT resampled at T=0.95 (4,000 sequences) and MFE recomputed. The previous uploads at generations/temp_{0.8,0.95,1.1}/SFT/ were byte-identical to the corresponding Base outputs because of a pipeline bug during the March upload. The T=0.95 SFT data is now correct; the 0.8 and 1.1 copies were deleted.
2026-04-20: eval_8prompt/SFT/SFT_metrics.csv briefly held T=0.95 data during the SFT resample; it has been restored to the original T=1.0 values (SHA-verified against an older local copy).
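A check for byte-identical uploads of the kind described above could look like the sketch below. It is illustrative only: the README does not say which SHA variant or tooling was used, so SHA-256 and the helper names are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def byte_identical_files(dir_a, dir_b):
    """Pairs (file in dir_a, file in dir_b) with identical contents.

    Files are matched by content hash, not by name, so a renamed copy is
    still reported. Paths are returned relative to their directory.
    """
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    hashes_b = {}
    for path in sorted(dir_b.rglob("*")):
        if path.is_file():
            hashes_b.setdefault(sha256_of(path), path.relative_to(dir_b).as_posix())
    pairs = []
    for path in sorted(dir_a.rglob("*")):
        if path.is_file():
            match = hashes_b.get(sha256_of(path))
            if match is not None:
                pairs.append((path.relative_to(dir_a).as_posix(), match))
    return pairs
```

Run against local copies of an SFT and a Base directory, a non-empty result would flag the duplication before upload.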
