	# rejection_sampling_v2

	Rerun of the v1 `baselines/rejection_sampling/` protocol with each model
	sampled at its optimal temperature (taken from this session's temperature
	sweep). Everything else follows v1: the same two prompts (ATG and the
	`cfg.default_query` GFP cassette), the same 10K samples per cell, and the
	same in-process scorer for the reward column, plus a strict QC pass that
	mirrors the analysis2 thresholds (ORI ≥99% identity, AMR ≥100% identity,
	no ≥50bp direct repeats).
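
The repeat criterion can be illustrated with a short sketch. This is not the analysis2 implementation: the function name `has_direct_repeat` is hypothetical, and it assumes that a "direct repeat" means the same substring occurring twice on the same strand, that overlapping occurrences count, and that the sequence is treated as linear.

```python
def has_direct_repeat(seq: str, min_len: int = 50) -> bool:
    """Return True if any substring of length min_len occurs at least twice.

    Any direct repeat of length >= min_len contains a repeated window of
    exactly min_len bases, so scanning fixed-size windows is sufficient.
    Assumptions (not from the source): same-strand matches only,
    overlapping occurrences count, sequence treated as linear.
    """
    seq = seq.upper()
    seen = set()
    for i in range(len(seq) - min_len + 1):
        window = seq[i:i + min_len]
        if window in seen:
            return True
        seen.add(window)
    return False
```

Under these assumptions, a sequence would fail strict QC on this criterion whenever `has_direct_repeat(seq)` returns `True`. Sequences shorter than `min_len` trivially pass.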

	Generated: 2026-04-30T19:08:34.790388Z

	## Per-cell results

	| Cell | Model | T | n | strict-QC pass rate | sha256(outputs.csv) |
	|---|---|---:|---:|---:|---|
	| Base_t1 | UCL-CSSB/PlasmidGPT | 1.0 | 10000 | 5.82% | `1a326eef8578e653…` |
	| GRPO_t1.15 | UCL-CSSB/PlasmidGPT-GRPO | 1.15 | 10000 | 19.55% | `8a8738b1485a2043…` |
	| SFT_t1 | UCL-CSSB/PlasmidGPT-SFT | 1.0 | 10000 | 5.75% | `fa586c04962e4552…` |
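
Because n = 10000 in every cell, each pass rate corresponds to a whole number of sequences. A minimal sketch of that conversion, using only the values in the table (the `cells` and `pass_counts` names are illustrative, not from the repository):

```python
# Pass rates (percent) and sample counts copied from the table above.
cells = {
    "Base_t1": (5.82, 10000),
    "GRPO_t1.15": (19.55, 10000),
    "SFT_t1": (5.75, 10000),
}

# Convert each percentage back to a count of sequences passing strict QC.
pass_counts = {name: round(rate / 100 * n) for name, (rate, n) in cells.items()}

for name, count in pass_counts.items():
    print(f"{name}: {count} of {cells[name][1]} pass strict QC")
```

This gives 582, 1955, and 575 passing sequences for Base, GRPO, and SFT respectively.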

	## v1 cross-check

	All v2 `outputs.csv` SHA-256 digests were verified to differ from the
	digests of the v1 `baselines/rejection_sampling/{Base,SFT,GRPO}/outputs.csv`
	files:

	- Base: `363e89d716c87e8a…`
	- SFT: `0921fb93c60b2fac…`
	- GRPO: `404d3cb55215ad70…`
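
A check of this kind can be reproduced with the standard library. The sketch below is an assumption about how such a comparison might be done, not the code in `scripts/launch_rejection_v2.sh`; the helper names are hypothetical, and because the digests listed here are truncated, the comparison is by prefix only.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differs_from_v1(v2_path, v1_digest_prefix):
    """Return True if the v2 file's digest does not start with the v1 prefix.

    A trailing ellipsis on the prefix (as printed in this README) is ignored.
    """
    prefix = v1_digest_prefix.rstrip("…").lower()
    return not sha256_file(v2_path).startswith(prefix)
```

For example, `differs_from_v1("Base_t1/outputs.csv", "363e89d716c87e8a…")` would be expected to return `True` for the Base cell; the `Base_t1/outputs.csv` path is a placeholder, since the v2 directory layout is not stated here.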

	Generated by `scripts/launch_rejection_v2.sh`.
