Fusion Design Lab TODO
This is the execution tracker for the hackathon repo.
Use this file for day-of build progress. Use the linked docs for rationale, contract truth, and submission framing:
Archived legacy references:
Priority source:
- Plan V2 is the planning SSOT
- P1 Environment Contract is the technical contract SSOT
- P1 Parameterization Deep-Dive is the evidence and rationale record
- this file should track execution progress only
Current State

- P1 strategy is locked
- shared models reflect the repaired low-dimensional P1 contract
- environment loop reflects the repaired low-dimensional P1 contract
- API/task surface reflects the P1 contract
- baselines reflect the P1 contract
- repo docs call out the low-fi/high-fi `constellaration` split honestly
- post-terminal guard in `step()`
- `constellaration` verifier wiring
- verify the current 3-knob family against the real low-fidelity verifier
- repair the low-dimensional parameterization so triangularity is controllable
- split boundary building from boundary evaluation
- update the action schema from 3 knobs to the repaired low-dimensional family
- add explicit VMEC failure semantics
- label low-fi vs high-fi truth in the observation/task surface
- separate high-fi submit scoring/reporting from low-fi rollout score state
- tracked P1 fixtures
- manual playtest log
- settle the non-submit terminal reward policy
- baseline comparison has been re-run on the `constellaration` branch state
- tiny low-fi PPO smoke run exists
  - Note: `training/ppo_smoke.py` now runs a diagnostic-only low-fidelity PPO smoke pass, and the first artifact is summarized in `docs/P1_PPO_SMOKE_NOTE.md`
- refresh the heuristic baseline for the real verifier path
  - Note: the refreshed heuristic now uses the measured `rotational_transform -> triangularity_scale -> elongation -> submit` path; a fresh `uv run python baselines/compare.py 5` rerun finished at 5/5 feasible high-fidelity finals and 5/5 wins over random
Execution Graph

```mermaid
flowchart TD
  A["Northflank Smoke Test"] --> E["Fixture Checks"]
  B["P1 Contract Lock"] --> D["P1 Models + Environment"]
  C["constellaration Physics Wiring"] --> D
  D --> P["Parameterization Repair"]
  P --> F["Tiny PPO Smoke"]
  F --> E
  E --> G["Submit-side Manual Playtest"]
  G --> H["Reward V2"]
  H --> I["Baselines"]
  I --> J["HF Space Deploy"]
  J --> K["Colab Notebook"]
  K --> L["Demo + README"]
```
Hour 0-2

- Lock the exact P1 environment contract. Goal: freeze observation schema, action schema, episode loop, terminal conditions, and the live reward contract. Related: Plan V2, Next 12 Hours Checklist
- Pass the Northflank smoke test. Related: Plan V2, Next 12 Hours Checklist, `training/notebooks/README.md`
- Verify that the current 3-knob family can or cannot approach P1 feasibility. Goal: resolve the historical gating question about whether parameterization repair was required before more reward work. Related: P1 Environment Contract, P1 Pivot Record
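The frozen observation/action contract above can be sketched as frozen dataclasses. This is a minimal illustration only; all field names here are hypothetical stand-ins, and the real shapes live in `fusion_lab/models.py` and the P1 Environment Contract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    # Hypothetical repaired low-dimensional family: one knob per
    # controllable quantity, including an explicit triangularity control.
    rotational_transform_delta: float
    triangularity_scale_delta: float
    elongation_delta: float

@dataclass(frozen=True)
class Observation:
    metrics: dict          # latest verifier metrics for the current boundary
    fidelity: str          # "low" during run steps, "high" after submit
    budget_remaining: int
    terminal: bool = False

def initial_observation(budget: int) -> Observation:
    # Reset state: no metrics yet, low-fidelity truth, full budget.
    return Observation(metrics={}, fidelity="low", budget_remaining=budget)
```

Freezing the dataclasses mirrors the "contract lock": once the schema is frozen, downstream code can rely on the fields not mutating mid-episode.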
Fresh Wiring

- Rewrite the shared models to the locked P1 contract. Files: `fusion_lab/models.py`. Related: Plan V2
- Rewrite the environment loop to the locked P1 contract. Files: `server/environment.py`. Related: Plan V2, P1 Pivot Record
- Add a post-terminal guard to the environment loop. Files: `server/environment.py`. Goal: reject or no-op any `step()` call after the terminal state so budget and step count do not drift past episode end
- Replace the synthetic physics path with `constellaration` wiring. Files: `server/physics.py`, `Dockerfile`, `pyproject.toml`
- Update the API/task surface to match P1. Files: `server/app.py`, `README.md`
- Repair the low-dimensional boundary family. Goal: add an explicit triangularity control knob or equivalent low-dimensional control so the environment can actually approach P1 feasibility. Files: `server/physics.py`, `fusion_lab/models.py`, `server/environment.py`, `server/app.py`. Related: P1 Environment Contract
- Split boundary construction from boundary evaluation. Goal: make the verifier boundary-based and keep parameterization-specific logic in the environment adapter layer. Files: `server/physics.py`. Related: P1 Environment Contract
- Add explicit VMEC failure semantics. Goal: failed evaluations must cost budget, return a visible failure observation, and apply a documented penalty without silent fallback. Files: `server/physics.py`, `server/environment.py`. Related: P1 Environment Contract
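The post-terminal guard and the explicit VMEC failure semantics described above can be combined in one `step()` sketch. Everything here is a hypothetical stand-in (`Env`, `evaluate_boundary`, `FAILURE_PENALTY` are illustrative names, not the real `server/environment.py` / `server/physics.py` API); the point is the two invariants: no state drift after terminal, and failures that cost budget, stay visible, and carry a documented penalty.

```python
FAILURE_PENALTY = -1.0  # documented, fixed penalty; no silent fallback

class VmecError(RuntimeError):
    """Raised when a low-fidelity evaluation does not converge."""

def evaluate_boundary(action: dict) -> dict:
    # Stand-in for the low-fidelity verifier call.
    if action.get("force_failure"):
        raise VmecError("VMEC did not converge")
    return {"feasibility": 0.5}

class Env:
    def __init__(self, budget: int):
        self.budget = budget
        self.done = False

    def step(self, action: dict) -> dict:
        # Post-terminal guard: no-op after terminal state so budget and
        # step count cannot drift past episode end.
        if self.done:
            return {"error": "episode_over", "budget": self.budget}
        self.budget -= 1  # failed evaluations still cost budget
        try:
            metrics = evaluate_boundary(action)
            obs = {"metrics": metrics, "failed": False, "reward": 0.0}
        except VmecError as exc:
            # Visible failure observation plus the documented penalty.
            obs = {"metrics": None, "failed": True,
                   "reward": FAILURE_PENALTY, "detail": str(exc)}
        if self.budget <= 0:
            self.done = True
        obs["budget"] = self.budget
        return obs
```

Note that the failure path and the success path share the same budget accounting, so an agent cannot probe failing boundaries for free.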
- Label low-fi vs high-fi truth in the observation/task surface. Goal: make it obvious whether a metric came from a low-fidelity `run` step or a high-fidelity `submit`. Files: `fusion_lab/models.py`, `server/environment.py`, `server/app.py`. Related: P1 Environment Contract
- Separate high-fi submit scoring/reporting from low-fi rollout score state. Completed: submit-time reward now uses a high-fidelity initial reference, and submit summaries / displayed best score use high-fidelity state instead of low-fidelity rollout state. Files: `server/environment.py`, `fusion_lab/models.py`. Related: P1 Environment Contract
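The fidelity-separation rule above can be sketched as a small score-state holder that never mixes the two tracks. Field and class names are hypothetical, not the real `fusion_lab/models.py` definitions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScoreState:
    best_low_fidelity: Optional[float] = None   # updated on low-fi run steps
    best_high_fidelity: Optional[float] = None  # updated only on submit

    def record(self, score: float, fidelity: str) -> None:
        # Route each score to its own track; the two are never compared.
        if fidelity == "high":
            if self.best_high_fidelity is None or score > self.best_high_fidelity:
                self.best_high_fidelity = score
        else:
            if self.best_low_fidelity is None or score > self.best_low_fidelity:
                self.best_low_fidelity = score

    def submit_summary(self) -> dict:
        # Submit summaries report high-fidelity state only.
        return {"best_score": self.best_high_fidelity, "fidelity": "high"}
```

Keeping the two bests in separate fields makes the "do not compare high-fi submit scores against low-fi score state" guardrail structural rather than a convention.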
Validation and Reward

- Run a small measured sweep on the repaired low-dimensional family. Goal: choose useful parameter ranges, step deltas, and reset seeds from the repaired action family instead of guessing them from prose. Related: P1 Environment Contract
- Clarify or split fidelity-dependent best-state observation fields. Goal: replace ambiguous mixed best-state reporting with explicit low-fidelity and high-fidelity best-state fields before fixture evidence or baseline comparisons. Related: P1 Environment Contract
- Add 1-2 tracked P1 fixtures. Files: `server/data/p1/README.md`, P1 Pivot Record. Note: paired high-fidelity submit checks are now written into each tracked fixture and summarized in `baselines/fixture_high_fidelity_pairs.json`
- Run fixture sanity checks. Goal: confirm paired low-fi/high-fi verifier outputs, objective direction, and reward ordering. Related: Plan V2, Next 12 Hours Checklist
- Run a tiny low-fi PPO smoke pass. Goal: fail quickly on learnability, reward exploits, and action-space problems before investing in longer training. Note: treat this as a smoke test, not as proof that the terminal `submit` contract is already validated; stop after a few readable trajectories or one clear failure mode; paired high-fidelity fixture checks must happen immediately after this smoke pass. Status: the first smoke artifact exists; rerun this step only if a follow-up reward or observation change needs re-checking. High-fidelity VMEC-backed `submit` should stay out of the normal RL inner loop
- Manually playtest 5-10 episodes. Goal: start with one submit-side trace, then expand the initial low-fidelity playtest note into 5-10 episodes and surface at least one pathology or ambiguity. Related: Plan V2, Deliverables Map
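The fixture sanity checks above can be sketched as a loop over paired low-fi/high-fi records. The fixture dictionary shape here is invented for illustration; the real pairs live in `baselines/fixture_high_fidelity_pairs.json` with whatever schema that file actually uses.

```python
def check_fixture_pairs(fixtures: list) -> list:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    for fx in fixtures:
        low, high = fx["low_fi_score"], fx["high_fi_score"]
        # Objective direction: a boundary marked as improved should score
        # above its predecessor on the low-fidelity track.
        if fx["improved"] and low <= fx["prev_low_fi_score"]:
            problems.append(f"{fx['name']}: low-fi score did not improve")
        # Paired output check: low-fi should not wildly overstate high-fi truth.
        if low - high > fx.get("tolerance", 0.2):
            problems.append(
                f"{fx['name']}: low-fi overstates high-fi by {low - high:.3f}")
    return problems
```

Returning a problem list instead of asserting inline keeps the check usable both as a CI gate and as a quick notebook diagnostic.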
- Update reward from V0 to V1 after playtesting exposed a real repair-path pathology. Goal: keep a short exploit -> fix -> behavior improvement story. Related: AGENTS.md, Plan V2
- Update reward from V1 to V2 after the verifier-native shaping exposed short-horizon gaps. Goal: add bounded new-best, near-feasible, and anti-stagnation terms without breaking the verifier-native reward story. Related: AGENTS.md, P1 Environment Contract
- Write down why Reward V0 did not survive unchanged. Goal: document the concrete pathology: pure Δ `official_feasibility` hid useful non-dominant repairs because official feasibility is a max over normalized constraint violations. Related: README.md, Plan V2
- Decide the non-submit terminal reward policy. Goal: budget exhaustion now yields a smaller end-of-episode reward than `submit`, so non-submitting agents still get terminal feedback without outranking explicit submit behavior. Files: `server/environment.py`, README.md
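The Reward V2 terms and the non-submit terminal policy above can be sketched in a few lines. All coefficients here are illustrative placeholders, not the tuned values from `server/environment.py`.

```python
# Illustrative shaping coefficients (hypothetical values).
NEW_BEST_BONUS = 0.2        # bounded: fixed-size bonus, not an unbounded delta
NEAR_FEASIBLE_BONUS = 0.1
STAGNATION_PENALTY = -0.05
STAGNATION_WINDOW = 5       # steps without improvement before the penalty
SUBMIT_TERMINAL = 1.0
NON_SUBMIT_TERMINAL = 0.25  # must stay strictly below SUBMIT_TERMINAL

def shaped_reward(delta_feasibility: float, is_new_best: bool,
                  near_feasible: bool, stagnant_steps: int) -> float:
    # Verifier-native base term plus the three bounded V2 additions.
    r = delta_feasibility
    if is_new_best:
        r += NEW_BEST_BONUS
    if near_feasible:
        r += NEAR_FEASIBLE_BONUS
    if stagnant_steps >= STAGNATION_WINDOW:
        r += STAGNATION_PENALTY
    return r

def terminal_reward(submitted: bool, final_score: float) -> float:
    # Budget exhaustion still gives terminal feedback, but an explicit
    # submit always outranks it at equal final score.
    base = SUBMIT_TERMINAL if submitted else NON_SUBMIT_TERMINAL
    return base * final_score
```

Keeping each bonus bounded preserves the property that the verifier delta, not the shaping, dominates long-run return.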
Baselines

- Implement the random baseline. Files: `baselines/random_agent.py`, `baselines/compare.py`
- Implement the heuristic baseline. Files: `baselines/heuristic_agent.py`, `baselines/compare.py`
- Run the baseline comparison on the current `constellaration` branch state. Files: `baselines/compare.py`
- Refresh the heuristic baseline after the `constellaration` rerun. Goal: the old synthetic-path heuristic no longer gives a useful anchor on the real verifier path; redesign it after manual playtesting
- Save one comparison trace that is presentation-ready. Goal: show at least one stable trajectory and one heuristic-vs-random comparison
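The comparison loop above can be sketched as follows. The policy and environment interfaces are toy stand-ins, not the real `baselines/compare.py` API; the shape to keep is seed-matched episodes and a win count per policy.

```python
import random

def run_episode(policy, seed: int) -> float:
    # Seed-matched episodes: both policies see the same RNG stream.
    rng = random.Random(seed)
    return sum(policy(rng) for _ in range(10))

def random_policy(rng) -> float:
    return rng.uniform(-1, 1)

def heuristic_policy(rng) -> float:
    # Toy biased policy standing in for the measured
    # rotational_transform -> triangularity_scale -> elongation -> submit path.
    return abs(rng.uniform(-1, 1))

def compare(n_episodes: int = 5) -> dict:
    wins = 0
    for seed in range(n_episodes):
        if run_episode(heuristic_policy, seed) > run_episode(random_policy, seed):
            wins += 1
    return {"episodes": n_episodes, "heuristic_wins": wins}
```

Pinning the seed per episode is what makes a "5/5 wins over random" claim meaningful: both policies face identical resets.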
Submission Surfaces

- Deploy the environment to HF Space. Related: Deliverables Map, README.md
- Create the thin public Colab notebook. Files: `training/notebooks/README.md`
- Record the 1-minute demo. Goal: explain P1, show one trajectory, show reward iteration, show baseline evidence
- Finalize the public README. Files: README.md
- Only treat training evidence as submission-ready if low-fidelity gains survive sparse high-fidelity evaluation. Related: Plan V2, Next 12 Hours Checklist
Guardrails

- Do not reopen the `P1 + rotating-ellipse` strategy without a real blocker
- Do not pretend the current 3-knob family is sufficient for P1 after the verified triangularity blocker
- Do not guess repaired-family ranges, deltas, or budget changes without measurement
- Do not port the old `ai-sci-feasible-designs` harness
- Do not let notebook or demo work outrun environment evidence
- Do not let tiny low-fi smoke training replace paired high-fidelity checks or submit-side manual playtesting
- Do not move high-fidelity VMEC-backed `submit` into the normal RL inner loop
- Do not describe low-fidelity `run` metrics as equivalent to high-fidelity `submit` results
- Do not compare high-fidelity submit scores against low-fidelity best/initial score state in the final story
- Do not describe the current baseline reset state as feasible or near-feasible
- Do not force a new reward-version story until the previous reward version shows a real pathology
  - Note: completed by recording the concrete Reward V0 pathology before Reward V1, then recording the concrete short-horizon Reward V1 gaps before Reward V2