# Fusion Design Lab TODO
This is the execution tracker for the hackathon repo.
Use this file for day-of build progress. Use the linked docs for rationale, contract truth, and submission framing:
- [Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md)
- [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [P1 Parameterization Deep-Dive](docs/P1_PARAMETERIZATION_DEEPDIVE.md)
- [Repo Guardrails](AGENTS.md)
Archived legacy references:
- [P1 Pivot Record](docs/archive/PIVOT_P1_ROTATING_ELLIPSE.md)
- [Deliverables Map](docs/archive/FUSION_DELIVERABLES_MAP.md)
- [Next 12 Hours Checklist](docs/archive/FUSION_NEXT_12_HOURS_CHECKLIST.md)
Priority source:
- [Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md) is the planning single source of truth (SSOT)
- [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md) is the technical contract SSOT
- [P1 Parameterization Deep-Dive](docs/P1_PARAMETERIZATION_DEEPDIVE.md) is the evidence and rationale record
- this file should track execution progress only
## Current State
- [x] `P1` strategy is locked
- [x] shared models reflect the repaired low-dimensional `P1` contract
- [x] environment loop reflects the repaired low-dimensional `P1` contract
- [x] API/task surface reflects `P1`
- [x] baselines reflect the `P1` contract
- [x] repo docs call out the low-fi/high-fi `constellaration` split honestly
- [x] post-terminal guard in `step()`
- [x] `constellaration` verifier wiring
- [x] verify the current 3-knob family against the real low-fidelity verifier
- [x] repair the low-dimensional parameterization so triangularity is controllable
- [x] split boundary building from boundary evaluation
- [x] update the action schema from 3 knobs to the repaired low-dimensional family
- [x] add explicit VMEC failure semantics
- [x] label low-fi vs high-fi truth in the observation/task surface
- [x] separate high-fi submit scoring/reporting from low-fi rollout score state
- [x] tracked `P1` fixtures
- [x] manual playtest log
- [x] settle the non-submit terminal reward policy
- [x] baseline comparison has been re-run on the `constellaration` branch state
- [x] tiny low-fi PPO smoke run exists
Note:
`training/ppo_smoke.py` now runs a diagnostic-only low-fidelity PPO smoke pass and the first artifact is summarized in `docs/P1_PPO_SMOKE_NOTE.md`
- [x] refresh the heuristic baseline for the real verifier path
Note:
the refreshed heuristic now uses the measured `rotational_transform -> triangularity_scale -> elongation -> submit` path; a fresh `uv run python baselines/compare.py 5` rerun finished at `5/5` feasible high-fidelity finals and `5/5` wins over random
## Execution Graph
```mermaid
flowchart TD
A["Northflank Smoke Test"] --> E["Fixture Checks"]
B["P1 Contract Lock"] --> D["P1 Models + Environment"]
C["constellaration Physics Wiring"] --> D
D --> P["Parameterization Repair"]
P --> F["Tiny PPO Smoke"]
F --> E["Fixture Checks"]
E --> G["Submit-side Manual Playtest"]
G --> H["Reward V2"]
H --> I["Baselines"]
I --> J["HF Space Deploy"]
J --> K["Colab Notebook"]
K --> L["Demo + README"]
```
## Hour 0-2
- [x] Lock the exact `P1` environment contract
Goal:
freeze observation schema, action schema, episode loop, terminal conditions, and the live reward contract
Related:
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md),
[Next 12 Hours Checklist](docs/archive/FUSION_NEXT_12_HOURS_CHECKLIST.md)
- [x] Pass the Northflank smoke test
Related:
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md),
[Next 12 Hours Checklist](docs/archive/FUSION_NEXT_12_HOURS_CHECKLIST.md),
[training/notebooks/README.md](training/notebooks/README.md)
- [x] Verify whether the current 3-knob family can approach P1 feasibility
Goal:
resolve the historical gating question about whether parameterization repair was required before more reward work
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md),
[P1 Pivot Record](docs/archive/PIVOT_P1_ROTATING_ELLIPSE.md)
## Fresh Wiring
- [x] Rewrite the shared models to the locked `P1` contract
Files:
[fusion_lab/models.py](fusion_lab/models.py),
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md)
- [x] Rewrite the environment loop to the locked `P1` contract
Files:
[server/environment.py](server/environment.py),
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md),
[P1 Pivot Record](docs/archive/PIVOT_P1_ROTATING_ELLIPSE.md)
- [x] Add a post-terminal guard to the environment loop
Files:
[server/environment.py](server/environment.py)
Goal:
reject or no-op any `step()` call after terminal state so budget and step count do not drift past episode end
- [x] Replace the synthetic physics path with `constellaration` wiring
Files:
[server/physics.py](server/physics.py),
[Dockerfile](Dockerfile),
[pyproject.toml](pyproject.toml)
- [x] Update the API/task surface to match `P1`
Files:
[server/app.py](server/app.py),
[README.md](README.md)
- [x] Repair the low-dimensional boundary family
Goal:
add an explicit triangularity control knob or equivalent low-dimensional control so the environment can actually approach P1 feasibility
Files:
[server/physics.py](server/physics.py),
[fusion_lab/models.py](fusion_lab/models.py),
[server/environment.py](server/environment.py),
[server/app.py](server/app.py)
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [x] Split boundary construction from boundary evaluation
Goal:
make the verifier boundary-based and keep parameterization-specific logic in the environment adapter layer
Files:
[server/physics.py](server/physics.py)
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [x] Add explicit VMEC failure semantics
Goal:
failed evaluations must cost budget, return a visible failure observation, and apply a documented penalty without silent fallback
Files:
[server/physics.py](server/physics.py),
[server/environment.py](server/environment.py)
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [x] Label low-fi vs high-fi truth in the observation/task surface
Goal:
make it obvious whether a metric came from a low-fidelity `run` step or a high-fidelity `submit`
Files:
[fusion_lab/models.py](fusion_lab/models.py),
[server/environment.py](server/environment.py),
[server/app.py](server/app.py)
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [x] Separate high-fi submit scoring/reporting from low-fi rollout score state
Completed:
submit-time reward now uses a high-fidelity initial reference, and submit summaries / displayed best score use high-fidelity state instead of low-fidelity rollout state
Files:
[server/environment.py](server/environment.py),
[fusion_lab/models.py](fusion_lab/models.py)
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
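
The post-terminal guard and the explicit VMEC failure semantics listed above can be illustrated with a minimal sketch. This is a toy loop, not the code in `server/environment.py`: `ToyEnv`, `StepResult`, `EvaluationFailed`, `flaky_evaluator`, and the `FAILURE_PENALTY` value are all hypothetical names and numbers chosen for illustration. It assumes the guard is a no-op rather than a rejection.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical penalty value; the real one belongs to the environment contract.
FAILURE_PENALTY = -1.0


class EvaluationFailed(Exception):
    """Stand-in for a failed VMEC evaluation (hypothetical name)."""


@dataclass
class StepResult:
    reward: float
    done: bool
    evaluation_failed: bool
    budget_remaining: int
    message: str


class ToyEnv:
    """Toy episode loop used only to show the two behaviours."""

    def __init__(self, evaluate: Callable[[float], float], budget: int = 3) -> None:
        self._evaluate = evaluate
        self.budget_remaining = budget
        self.step_count = 0
        self.done = False

    def step(self, action: float) -> StepResult:
        # Post-terminal guard: no-op so budget and step count cannot drift.
        if self.done:
            return StepResult(0.0, True, False, self.budget_remaining,
                              "episode already terminated; call reset()")

        # Every attempted evaluation costs budget, including failed ones.
        self.budget_remaining -= 1
        self.step_count += 1
        try:
            reward = self._evaluate(action)
            failed = False
            message = "ok"
        except EvaluationFailed as exc:
            # Visible failure plus a documented penalty, no silent fallback.
            reward = FAILURE_PENALTY
            failed = True
            message = f"evaluation failed: {exc}"

        self.done = self.budget_remaining <= 0
        return StepResult(reward, self.done, failed, self.budget_remaining, message)


def flaky_evaluator(action: float) -> float:
    """Fails for negative actions to exercise the failure path."""
    if action < 0:
        raise EvaluationFailed("solver did not converge")
    return action * 0.5
```

With `budget=2`, a failed first step still consumes one unit of budget and returns the penalty, and a third `step()` call after the episode ends changes neither `budget_remaining` nor `step_count`.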
## Validation and Reward
- [x] Run a small measured sweep on the repaired low-dimensional family
Goal:
choose useful parameter ranges, step deltas, and reset seeds from the repaired action family instead of guessing them from prose
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [x] Clarify or split fidelity-dependent best-state observation fields
Goal:
replace ambiguous mixed best-state reporting with explicit low-fidelity and high-fidelity best-state fields before fixture evidence or baseline comparisons
Related:
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [x] Add 1-2 tracked `P1` fixtures
Files:
[server/data/p1/README.md](server/data/p1/README.md),
[P1 Pivot Record](docs/archive/PIVOT_P1_ROTATING_ELLIPSE.md)
Note:
paired high-fidelity submit checks are now written into each tracked fixture and summarized in `baselines/fixture_high_fidelity_pairs.json`
- [x] Run fixture sanity checks
Goal:
confirm paired low-fi/high-fi verifier outputs, objective direction, and reward ordering
Related:
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md),
[Next 12 Hours Checklist](docs/archive/FUSION_NEXT_12_HOURS_CHECKLIST.md)
- [x] Run a tiny low-fi PPO smoke pass
Goal:
fail quickly on learnability, reward exploits, and action-space problems before investing in longer training
Note:
treat this as a smoke test, not as proof that the terminal `submit` contract is already validated;
stop after a few readable trajectories or one clear failure mode;
paired high-fidelity fixture checks must happen immediately after this smoke pass
Status:
first smoke artifact exists; next use of this step should only happen if a follow-up reward or observation change needs re-checking;
high-fidelity VMEC-backed `submit` should stay out of the normal RL inner loop
- [ ] Manual-playtest 5-10 episodes
Goal:
start with one submit-side trace, then expand the initial low-fidelity playtest note into 5-10 episodes and surface at least one pathology or ambiguity
Related:
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md),
[Deliverables Map](docs/archive/FUSION_DELIVERABLES_MAP.md)
- [x] Update reward from `V0` to `V1` after playtesting exposed a real repair-path pathology
Goal:
keep a short exploit -> fix -> behavior improvement story
Related:
[AGENTS.md](AGENTS.md),
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md)
- [x] Update reward from `V1` to `V2` after the verifier-native shaping exposed short-horizon gaps
Goal:
add bounded new-best, near-feasible, and anti-stagnation terms without breaking the verifier-native reward story
Related:
[AGENTS.md](AGENTS.md),
[P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
- [x] Write down why `Reward V0` did not survive unchanged
Goal:
document the concrete pathology: pure `Δ official_feasibility` hid useful non-dominant repairs because official feasibility is a max over normalized constraint violations
Related:
[README.md](README.md),
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md)
- [x] Decide the non-submit terminal reward policy
Goal:
budget exhaustion now yields a smaller end-of-episode reward than `submit`, so non-submitting agents still get terminal feedback without outranking explicit submit behavior
Files:
[server/environment.py](server/environment.py),
[README.md](README.md)
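
The `Reward V0` pathology recorded above can be reproduced with a few numbers. The sketch below only assumes what this file states, that official feasibility is a max over normalized constraint violations. The constraint names, the violation values, and the sign convention of `delta_reward` are illustrative assumptions, not values from the verifier.

```python
def official_feasibility(violations: dict[str, float]) -> float:
    """Max over normalized constraint violations (lower is better)."""
    return max(violations.values())


def delta_reward(before: dict[str, float], after: dict[str, float]) -> float:
    """V0-style signal: reward equals the drop in the max violation only."""
    return official_feasibility(before) - official_feasibility(after)


# Illustrative numbers: the first constraint dominates the max.
before = {"aspect_ratio": 0.40, "triangularity": 0.25, "rotational_transform": 0.10}

# A useful repair of a non-dominant constraint leaves the max untouched.
after = {**before, "triangularity": 0.05}

reward = delta_reward(before, after)
total_before = sum(before.values())
total_after = sum(after.values())
```

Here `reward` is exactly `0.0` even though the total violation fell, which is the kind of hidden non-dominant repair the note describes.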
## Baselines
- [x] Implement the random baseline
Files:
[baselines/random_agent.py](baselines/random_agent.py),
[baselines/compare.py](baselines/compare.py)
- [x] Implement the heuristic baseline
Files:
[baselines/heuristic_agent.py](baselines/heuristic_agent.py),
[baselines/compare.py](baselines/compare.py)
- [x] Run the baseline comparison on the current `constellaration` branch state
Files:
[baselines/compare.py](baselines/compare.py)
- [x] Refresh the heuristic baseline after the `constellaration` rerun
Goal:
replace the old synthetic-path heuristic, which no longer gave a useful anchor on the real verifier path, with one redesigned after manual playtesting
- [ ] Save one comparison trace that is presentation-ready
Goal:
show at least one stable trajectory and one heuristic-vs-random comparison
## Submission Surfaces
- [ ] Deploy the environment to HF Space
Related:
[Deliverables Map](docs/archive/FUSION_DELIVERABLES_MAP.md),
[README.md](README.md)
- [ ] Create the thin public Colab notebook
Files:
[training/notebooks/README.md](training/notebooks/README.md)
- [ ] Record the 1-minute demo
Goal:
explain `P1`, show one trajectory, show reward iteration, show baseline evidence
- [ ] Finalize the public README
Files:
[README.md](README.md)
- [ ] Only treat training evidence as submission-ready if low-fidelity gains survive sparse high-fidelity evaluation
Related:
[Plan V2](docs/FUSION_DESIGN_LAB_PLAN_V2.md),
[Next 12 Hours Checklist](docs/archive/FUSION_NEXT_12_HOURS_CHECKLIST.md)
## Guardrails
- [ ] Do not reopen `P1 + rotating-ellipse` strategy without a real blocker
- [ ] Do not pretend the current 3-knob family is sufficient for P1 after the verified triangularity blocker
- [ ] Do not guess repaired-family ranges, deltas, or budget changes without measurement
- [ ] Do not port the old `ai-sci-feasible-designs` harness
- [ ] Do not let notebook or demo work outrun environment evidence
- [ ] Do not let tiny low-fi smoke training replace paired high-fidelity checks or submit-side manual playtesting
- [ ] Do not move high-fidelity VMEC-backed `submit` into the normal RL inner loop
- [ ] Do not describe low-fidelity `run` metrics as equivalent to high-fidelity `submit` results
- [x] Do not compare high-fidelity submit scores against low-fidelity best/initial score state in the final story
- [ ] Do not describe the current baseline reset state as feasible or near-feasible
- [x] Do not force a new reward-version story until the previous reward version shows a real pathology
Note:
completed by recording the concrete `Reward V0` pathology before `Reward V1`, then recording the concrete short-horizon `Reward V1` gaps before `Reward V2`
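
The two fidelity guardrails above (no low-fi/high-fi equivalence, no cross-fidelity score comparison) could also be enforced at the type level. The following is a minimal sketch under that assumption; `Fidelity`, `LabeledScore`, and `improvement` are hypothetical names and do not describe the fields in `fusion_lab/models.py`.

```python
from dataclasses import dataclass
from enum import Enum


class Fidelity(Enum):
    LOW = "low"    # produced by a low-fidelity `run` step
    HIGH = "high"  # produced by a high-fidelity `submit`


@dataclass(frozen=True)
class LabeledScore:
    """A score that always carries the fidelity it was measured at."""
    value: float
    fidelity: Fidelity


def improvement(current: LabeledScore, reference: LabeledScore) -> float:
    """Return the score change, refusing to mix fidelities."""
    if current.fidelity is not reference.fidelity:
        raise ValueError(
            f"cannot compare {current.fidelity.value}-fidelity score "
            f"against {reference.fidelity.value}-fidelity reference"
        )
    return current.value - reference.value
```

Comparing a `HIGH` submit score against a `HIGH` initial reference works as usual, while passing a `LOW` rollout score as the reference raises `ValueError` instead of producing a misleading number.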