metadata
name: Lab implementing CtrlDNA baseline on mllm-integrate
description: >-
  Lab server is training CtrlDNA on our dataset; pushes will land on
  origin/mllm-integrate
type: project
originSessionId: 4037f43b-2133-46c6-84bd-02f7d454ec8b

The lab server is implementing CtrlDNA trained on our dataset and will push to origin/mllm-integrate as the work progresses.

Why: CtrlDNA is one of the §5c baselines (alongside TACO, ATGC-Gen) that the paper acknowledges. Having CtrlDNA trained on the same data we use (not just paper-reported numbers) makes the comparison apples-to-apples.

How to apply:

  • Periodically git fetch origin and check origin/mllm-integrate for new commits — once CtrlDNA results land, integrate into EXPERIMENTS.md §5c (replace "acknowledged" with actual numbers on our test set).
  • Don't duplicate the work on H100. Lab owns this.
  • Branch hygiene: when the lab pushes, fast-forward (FF) origin/mllm-integrate into our mllm-integrate-server3, then open a PR to main (the sandbox now blocks direct pushes to main).
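
The fetch-and-check step above could be scripted roughly as follows. This is a minimal sketch, not an existing tool in the repo: the helper names (`run_git`, `parse_left_right`, `classify`, `ff_status`) are hypothetical, and it assumes the local branch is `mllm-integrate-server3` and the remote is named `origin`, as in the notes above.

```python
import subprocess


def run_git(*args, cwd="."):
    """Run a git command in `cwd` and return its stripped stdout."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def parse_left_right(output):
    """Parse `git rev-list --left-right --count A...B` output.

    Git prints two counts: commits only on A, then commits only on B.
    """
    left, right = output.split()
    return int(left), int(right)


def classify(ahead, behind):
    """Decide what to do, given local-only (ahead) and remote-only (behind) counts."""
    if behind == 0:
        return "nothing-to-merge"  # lab has pushed nothing new
    if ahead == 0:
        return "fast-forward"      # safe to run: git merge --ff-only
    return "diverged"              # both sides have commits; resolve by hand


def ff_status(local="mllm-integrate-server3",
              remote="origin/mllm-integrate", cwd="."):
    """Fetch origin, then report whether `remote` can be fast-forwarded into `local`."""
    run_git("fetch", "origin", cwd=cwd)
    counts = run_git("rev-list", "--left-right", "--count",
                     f"{local}...{remote}", cwd=cwd)
    return classify(*parse_left_right(counts))
```

Calling `ff_status()` from inside the repository returns one of the three strings. When it returns `"fast-forward"`, running `git merge --ff-only origin/mllm-integrate` on `mllm-integrate-server3` brings in the lab's commits without creating a merge commit, after which the PR to main can be opened.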