UCL-CSSB/PlasmidRL-ICML/rejection_sampling_v2/best_of_16

1.79 GB

497 files

Updated 20 days ago

| Name | Size | Uploaded | Xet hash |
|---|---|---|---|
| Base | | 24 days ago | 8 items |
| GRPO | | 24 days ago | 8 items |
| README.md | 1.01 kB xet | 24 days ago | 7e38a556 |
| manifest.json | 4.23 kB xet | 24 days ago | c00a7ac7 |

README.md

best_of_16_v2

Best-of-16 selection at each model's optimal temperature. These results replace the best-of-16 numbers in the paper's tab:baselines, which used the suboptimal T=0.95 for all three models. The protocol is the same as for v1 baselines/best_of_16/: generate 16K candidates per cell (8K per prompt × 2 prompts), keep the argmax(reward) candidate from each 16-tuple, and score the 1000 selected candidates with the analysis2 strict QC pipeline.
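The selection step can be illustrated with a minimal sketch. It assumes that the rewards are stored as one flat list and that each 16-tuple is a run of 16 consecutive candidates; the function name `best_of_n` and this grouping are illustrative assumptions, not taken from the repository.

```python
from typing import List, Sequence


def best_of_n(rewards: Sequence[float], n: int = 16) -> List[int]:
    """Return the index of the highest-reward candidate in each consecutive n-tuple.

    Assumption: candidates are grouped in order, n at a time (hypothetical layout).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if len(rewards) % n != 0:
        raise ValueError("number of candidates must be a multiple of n")

    selected = []
    for start in range(0, len(rewards), n):
        group = rewards[start:start + n]
        # argmax within the group; ties go to the earliest candidate
        best_offset = max(range(n), key=lambda i: group[i])
        selected.append(start + best_offset)
    return selected
```

With 16,000 rewards and `n=16`, this returns 1,000 indices, which matches the 1000 selected candidates per cell.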

Generated: 2026-05-01T10:12:26.691423Z

Per-cell results (analysis2 strict QC)

| Cell | Model | T | n_selected | Pass rate | sha256(outputs.csv) |
|---|---|---|---|---|---|
| Base_t1 | UCL-CSSB/PlasmidGPT | 1.0 | 1000 | 34.9% | `93e620f4dd7179ab…` |
| GRPO_t1.15 | UCL-CSSB/PlasmidGPT-GRPO | 1.15 | 1000 | 99.6% | `c82587be8b5c564e…` |
| SFT_t1 | UCL-CSSB/PlasmidGPT-SFT | 1.0 | 1000 | 32.4% | `5a8e9f1502b786cb…` |
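A downloaded outputs.csv can be checked against the truncated digests in the table with a short script along these lines. This is a sketch under assumptions: the helper names are hypothetical, and the listed values are treated as the leading hex characters of the full SHA-256 digest.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_prefix(hex_digest, listed):
    """Check a full digest against a truncated value such as '93e620f4dd7179ab…'."""
    prefix = listed.rstrip("…").lower()  # drop the trailing ellipsis, if any
    return hex_digest.lower().startswith(prefix)
```

For example, `matches_prefix(sha256_of_file("outputs.csv"), "93e620f4dd7179ab…")` would be expected to hold for the Base_t1 file, where the local path is a placeholder.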

v1 cross-check

All v2 outputs.csv SHA-256 hashes differ from those of v1 baselines/best_of_16/{Base,SFT,GRPO}. The v1 hashes are:

Base: `9dc5e7aab90a15ca…`
SFT: `c96bdf4af04ce868…`
GRPO: `99527f852fcbe3f1…`
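The cross-check can be reproduced from the truncated digests alone, since two digests with different prefixes cannot be equal. The sketch below assumes that the three values listed in this section are the v1 hashes and that the per-cell results table holds the v2 hashes; the dictionary and function names are illustrative.

```python
# Truncated SHA-256 prefixes copied from this README.
V2_PREFIXES = {
    "Base_t1": "93e620f4dd7179ab",
    "GRPO_t1.15": "c82587be8b5c564e",
    "SFT_t1": "5a8e9f1502b786cb",
}
# Assumption: these are the v1 baselines/best_of_16 hashes.
V1_PREFIXES = {
    "Base": "9dc5e7aab90a15ca",
    "SFT": "c96bdf4af04ce868",
    "GRPO": "99527f852fcbe3f1",
}


def shared_prefixes(a, b):
    """Return the set of digest prefixes that appear in both mappings."""
    return set(a.values()) & set(b.values())


# No overlap means no v2 file is byte-identical to a v1 file.
assert not shared_prefixes(V2_PREFIXES, V1_PREFIXES)
```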

Last updated: May 5
