metadata
name: Lab implementing CtrlDNA baseline on mllm-integrate
description: >-
  Lab server is training CtrlDNA on our dataset; pushes will land on
  origin/mllm-integrate
type: project
originSessionId: 4037f43b-2133-46c6-84bd-02f7d454ec8b
The lab server is implementing CtrlDNA, trained on our dataset, and will push to
origin/mllm-integrate over the next while.
Why: CtrlDNA is one of the §5c baselines (alongside TACO and ATGC-Gen) that the paper acknowledges. Having CtrlDNA trained on the same data we use, rather than relying on paper-reported numbers, makes the comparison apples-to-apples.
How to apply:
- Periodically `git fetch origin` and check `origin/mllm-integrate` for new commits. Once CtrlDNA results land, integrate them into EXPERIMENTS.md §5c (replace "acknowledged" with actual numbers on our test set).
- Don't duplicate the work on H100. Lab owns this.
- Branch hygiene: when lab pushes, FF `origin/mllm-integrate` into our `mllm-integrate-server3`, then PR to `main` (sandbox blocks direct pushes to main now).
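
The fetch, check, and fast-forward steps above could be sketched roughly as follows. This is a minimal sketch, not an established script in the repo: the helper names `sync_commands` and `run_sync` are hypothetical, and it assumes `mllm-integrate-server3` exists locally and that `origin` is the lab's remote. The PR to `main` is left as a manual step.

```python
import subprocess

REMOTE_BRANCH = "origin/mllm-integrate"   # branch the lab pushes to (from the note)
LOCAL_BRANCH = "mllm-integrate-server3"   # our integration branch (from the note)


def sync_commands(local=LOCAL_BRANCH, remote=REMOTE_BRANCH):
    """Return the git commands for one fetch / inspect / fast-forward pass.

    Hypothetical helper: it only builds the command lists, it runs nothing.
    """
    return [
        ["git", "fetch", "origin"],
        # List commits that are on the remote branch but not on the local one.
        ["git", "log", "--oneline", f"{local}..{remote}"],
        ["git", "checkout", local],
        # --ff-only refuses to create a merge commit if the branches diverged.
        ["git", "merge", "--ff-only", remote],
    ]


def run_sync(repo_dir=".", dry_run=True):
    """Print each command; when dry_run is False, also execute it in repo_dir.

    check=True makes the pass stop at the first failing command.
    """
    for cmd in sync_commands():
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Dry run by default: shows the commands without touching the repository.
    run_sync(dry_run=True)
```

Running it with `dry_run=True` only prints the commands, which makes it safe to inspect before letting it touch the working tree. If the fast-forward is refused, the branches have diverged and need a manual look rather than a forced merge.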