CreativeEngineer committed on
Commit
fe3a41d
·
1 Parent(s): d22b376

feat: implement repaired p1 parameterization

README.md CHANGED
@@ -24,8 +24,8 @@ Implementation status:
24
  - docs are aligned to fresh `P1` wiring in this repo
25
  - shared models, baselines, and server/client entry points now reflect the locked `P1` contract
26
  - the current environment uses `constellaration` for low-fidelity `run` steps and high-fidelity `submit` evaluation
27
- - the current 3-knob parameterization has been verified as blocked on P1 triangularity under the real verifier path
28
- - the next runtime work is parameterization repair, then fixtures, manual playtesting, heuristic refresh, and deployment evidence
29
 
30
  ## Execution Status
31
 
@@ -40,11 +40,11 @@ Implementation status:
40
  - [x] Add a runnable Northflank smoke workflow and note
41
  - [x] Pass the Northflank smoke test on the H100 workspace
42
  - [x] Verify the current 3-knob family against the real low-fidelity verifier
43
- - [ ] Add a custom low-dimensional boundary builder with an explicit triangularity control knob
44
- - [ ] Split boundary construction from boundary evaluation in `server/physics.py`
45
- - [ ] Update the action contract from 3 knobs to the repaired low-dimensional family
46
- - [ ] Add explicit VMEC failure semantics to the environment contract
47
- - [ ] Label low-fi `run` truth vs high-fi `submit` truth in observations and task docs
48
  - [ ] Add tracked `P1` fixtures under `server/data/p1/`
49
  - [ ] Run manual playtesting and record the first reward pathology
50
  - [ ] Refresh the heuristic baseline for the real verifier path
@@ -53,10 +53,10 @@ Implementation status:
53
  ## Known Gaps
54
 
55
  - The current 3-knob family is structurally blocked on P1 triangularity with the real verifier path. A sampled low-fidelity sweep kept `average_triangularity` at roughly `+0.004975` and `p1_feasibility` at roughly `1.00995`, with zero feasible samples. That means reward tuning is secondary until the parameterization is repaired.
56
- - `BASELINE_PARAMS` is not a near-feasible anchor on the real verifier path. The current low-fidelity measurement is roughly `p1_feasibility=1.01`, `average_triangularity=+0.005`, and `edge_iota_over_nfp=0.059`, so fixture discovery has to happen after parameterization repair, not before.
57
  - The repaired low-dimensional family still needs measured ranges and deltas. Do not narrate guessed `rotational_transform` bounds, `triangularity_scale` deltas, or a larger budget as validated facts until they are measured on the repaired environment.
58
  - `run` uses low-fidelity `constellaration` metrics, while `submit` re-evaluates the current design with high-fidelity `skip_qi`; do not present step-time metrics as final submission metrics.
59
- - The environment still needs explicit VMEC failure semantics. Failed evaluations should cost budget, produce a visible failure observation, and apply a documented penalty; they should not be silently swallowed.
60
  - Budget exhaustion now returns a smaller terminal reward than explicit `submit`; keep that asymmetry when tuning reward so agents still prefer deliberate submission.
61
  - The real-verifier baseline rerun showed the old heuristic is no longer useful as-is: over 5 seeded episodes, both agents stayed at `0.0` mean best score and the heuristic underperformed random on reward. The heuristic needs redesign after the repaired parameterization and manual playtesting.
62
 
@@ -117,17 +117,13 @@ uv sync --extra notebooks
117
 
118
  ## Immediate Next Steps
119
 
120
- 1. Repair the low-dimensional boundary parameterization so it can actually move P1 triangularity.
121
- 2. Split boundary construction from boundary evaluation in `server/physics.py`.
122
- 3. Add explicit VMEC failure semantics to the environment loop.
123
- 4. Update the environment contract to the repaired low-dimensional family and label low-fi vs high-fi truth clearly in observations.
124
- 5. Run a small measured sweep on the repaired family to choose useful ranges, deltas, and reset seeds.
125
- 6. Add tracked `P1` fixtures under `server/data/p1`.
126
- 7. Run manual playtest episodes and record the first real reward pathology, if any.
127
- 8. Refresh the heuristic baseline using manual playtest evidence, then save one comparison trace.
128
- 9. Use the passing Northflank H100 setup to produce remote traces and comparisons from the real verifier path.
129
- 10. Deploy the environment to HF Space.
130
- 11. Add the Colab notebook under `training/notebooks`.
131
 
132
  These are implementation steps, not another planning phase.
133
 
 
24
  - docs are aligned to fresh `P1` wiring in this repo
25
  - shared models, baselines, and server/client entry points now reflect the locked `P1` contract
26
  - the current environment uses `constellaration` for low-fidelity `run` steps and high-fidelity `submit` evaluation
27
+ - the repaired 4-knob low-dimensional family is now wired into the runtime path
28
+ - the next runtime work is measured sweep validation, fixtures, manual playtesting, heuristic refresh, and deployment evidence
29
 
30
  ## Execution Status
31
 
 
40
  - [x] Add a runnable Northflank smoke workflow and note
41
  - [x] Pass the Northflank smoke test on the H100 workspace
42
  - [x] Verify the current 3-knob family against the real low-fidelity verifier
43
+ - [x] Add a custom low-dimensional boundary builder with an explicit triangularity control knob
44
+ - [x] Split boundary construction from boundary evaluation in `server/physics.py`
45
+ - [x] Update the action contract from 3 knobs to the repaired low-dimensional family
46
+ - [x] Add explicit VMEC failure semantics to the environment contract
47
+ - [x] Label low-fi `run` truth vs high-fi `submit` truth in observations and task docs
48
  - [ ] Add tracked `P1` fixtures under `server/data/p1/`
49
  - [ ] Run manual playtesting and record the first reward pathology
50
  - [ ] Refresh the heuristic baseline for the real verifier path
 
53
  ## Known Gaps
54
 
55
  - The current 3-knob family is structurally blocked on P1 triangularity with the real verifier path. A sampled low-fidelity sweep kept `average_triangularity` at roughly `+0.004975` and `p1_feasibility` at roughly `1.00995`, with zero feasible samples. That means reward tuning is secondary until the parameterization is repaired.
56
+ - The repaired family now uses frozen exact seeds with explicit triangularity control. Those seeds are near-boundary references, not yet tracked fixtures.
57
  - The repaired low-dimensional family still needs measured ranges and deltas. Do not narrate guessed `rotational_transform` bounds, `triangularity_scale` deltas, or a larger budget as validated facts until they are measured on the repaired environment.
58
  - `run` uses low-fidelity `constellaration` metrics, while `submit` re-evaluates the current design with high-fidelity `skip_qi`; do not present step-time metrics as final submission metrics.
59
+ - VMEC failure semantics are now explicit in the runtime path. Failed evaluations cost budget, produce a visible failure observation, and apply a penalty.
60
  - Budget exhaustion now returns a smaller terminal reward than explicit `submit`; keep that asymmetry when tuning reward so agents still prefer deliberate submission.
61
  - The real-verifier baseline rerun showed the old heuristic is no longer useful as-is: over 5 seeded episodes, both agents stayed at `0.0` mean best score and the heuristic underperformed random on reward. The heuristic needs redesign after the repaired parameterization and manual playtesting.
62
 
 
117
 
118
  ## Immediate Next Steps
119
 
120
+ 1. Run a small measured sweep on the repaired family to choose useful ranges, deltas, and reset seeds.
121
+ 2. Add tracked `P1` fixtures under `server/data/p1`.
122
+ 3. Run manual playtest episodes and record the first real reward pathology, if any.
123
+ 4. Refresh the heuristic baseline using manual playtest evidence, then save one comparison trace.
124
+ 5. Use the passing Northflank H100 setup to produce remote traces and comparisons from the real verifier path.
125
+ 6. Deploy the environment to HF Space.
126
+ 7. Add the Colab notebook under `training/notebooks`.
 
 
 
 
127
 
128
  These are implementation steps, not another planning phase.
129
 
TODO.md CHANGED
@@ -28,11 +28,11 @@ Priority source:
28
  - [x] post-terminal guard in `step()`
29
  - [x] `constellaration` verifier wiring
30
  - [x] verify the current 3-knob family against the real low-fidelity verifier
31
- - [ ] repair the low-dimensional parameterization so triangularity is controllable
32
- - [ ] split boundary building from boundary evaluation
33
- - [ ] update the action schema from 3 knobs to the repaired low-dimensional family
34
- - [ ] add explicit VMEC failure semantics
35
- - [ ] label low-fi vs high-fi truth in the observation/task surface
36
  - [ ] tracked `P1` fixtures
37
  - [ ] manual playtest log
38
  - [x] settle the non-submit terminal reward policy
@@ -108,7 +108,7 @@ flowchart TD
108
  [server/app.py](server/app.py),
109
  [README.md](README.md)
110
 
111
- - [ ] Repair the low-dimensional boundary family
112
  Goal:
113
  add an explicit triangularity control knob or equivalent low-dimensional control so the environment can actually approach P1 feasibility
114
  Files:
@@ -119,7 +119,7 @@ flowchart TD
119
  Related:
120
  [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
121
 
122
- - [ ] Split boundary construction from boundary evaluation
123
  Goal:
124
  make the verifier boundary-based and keep parameterization-specific logic in the environment adapter layer
125
  Files:
@@ -127,7 +127,7 @@ flowchart TD
127
  Related:
128
  [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
129
 
130
- - [ ] Add explicit VMEC failure semantics
131
  Goal:
132
  failed evaluations must cost budget, return a visible failure observation, and apply a documented penalty without silent fallback
133
  Files:
@@ -136,7 +136,7 @@ flowchart TD
136
  Related:
137
  [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
138
 
139
- - [ ] Label low-fi vs high-fi truth in the observation/task surface
140
  Goal:
141
  make it obvious whether a metric came from a low-fidelity `run` step or a high-fidelity `submit`
142
  Files:
 
28
  - [x] post-terminal guard in `step()`
29
  - [x] `constellaration` verifier wiring
30
  - [x] verify the current 3-knob family against the real low-fidelity verifier
31
+ - [x] repair the low-dimensional parameterization so triangularity is controllable
32
+ - [x] split boundary building from boundary evaluation
33
+ - [x] update the action schema from 3 knobs to the repaired low-dimensional family
34
+ - [x] add explicit VMEC failure semantics
35
+ - [x] label low-fi vs high-fi truth in the observation/task surface
36
  - [ ] tracked `P1` fixtures
37
  - [ ] manual playtest log
38
  - [x] settle the non-submit terminal reward policy
 
108
  [server/app.py](server/app.py),
109
  [README.md](README.md)
110
 
111
+ - [x] Repair the low-dimensional boundary family
112
  Goal:
113
  add an explicit triangularity control knob or equivalent low-dimensional control so the environment can actually approach P1 feasibility
114
  Files:
 
119
  Related:
120
  [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
121
 
122
+ - [x] Split boundary construction from boundary evaluation
123
  Goal:
124
  make the verifier boundary-based and keep parameterization-specific logic in the environment adapter layer
125
  Files:
 
127
  Related:
128
  [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
129
 
130
+ - [x] Add explicit VMEC failure semantics
131
  Goal:
132
  failed evaluations must cost budget, return a visible failure observation, and apply a documented penalty without silent fallback
133
  Files:
 
136
  Related:
137
  [P1 Environment Contract](docs/P1_ENV_CONTRACT_V1.md)
138
 
139
+ - [x] Label low-fi vs high-fi truth in the observation/task surface
140
  Goal:
141
  make it obvious whether a metric came from a low-fidelity `run` step or a high-fidelity `submit`
142
  Files:
baselines/README.md CHANGED
@@ -7,7 +7,7 @@ Random and heuristic baselines will live here.
7
  - [x] baseline comparison script exists
8
  - [x] baseline comparison rerun completed on the real verifier path
9
  - [x] verified that the current 3-knob family is blocked on P1 triangularity under the real verifier path
10
- - [ ] repair the low-dimensional parameterization before further heuristic work
11
  - [ ] wait for measured repaired-family ranges and reset seeds before retuning the heuristic
12
  - [ ] heuristic refreshed after the real-verifier rerun
13
  - [ ] near-boundary fixture-backed baseline start chosen for manual playtesting
 
7
  - [x] baseline comparison script exists
8
  - [x] baseline comparison rerun completed on the real verifier path
9
  - [x] verified that the current 3-knob family is blocked on P1 triangularity under the real verifier path
10
+ - [x] repair the low-dimensional parameterization before further heuristic work
11
  - [ ] wait for measured repaired-family ranges and reset seeds before retuning the heuristic
12
  - [ ] heuristic refreshed after the real-verifier rerun
13
  - [ ] near-boundary fixture-backed baseline start chosen for manual playtesting
baselines/heuristic_agent.py CHANGED
@@ -1,26 +1,12 @@
1
- """Heuristic baseline agent for the stellarator design environment.
2
-
3
- Strategy: guided perturbations informed by domain knowledge.
4
- 1. Push elongation upward to improve triangularity.
5
- 2. Nudge rotational transform upward to stay on the iota side of feasibility.
6
- 3. Submit before exhausting budget.
7
- """
8
 
9
  from __future__ import annotations
10
 
11
  import sys
12
 
13
- from fusion_lab.models import StellaratorAction
14
  from server.environment import StellaratorEnvironment
15
 
16
- STRATEGY: list[tuple[str, str, str]] = [
17
- ("elongation", "increase", "medium"),
18
- ("elongation", "increase", "small"),
19
- ("rotational_transform", "increase", "small"),
20
- ("aspect_ratio", "decrease", "small"),
21
- ("rotational_transform", "increase", "small"),
22
- ]
23
-
24
 
25
  def heuristic_episode(
26
  env: StellaratorEnvironment, seed: int | None = None
@@ -29,43 +15,74 @@ def heuristic_episode(
29
  total_reward = 0.0
30
  trace: list[dict[str, object]] = [{"step": 0, "score": obs.p1_score}]
31
 
32
- for parameter, direction, magnitude in STRATEGY:
33
- if obs.done or obs.budget_remaining <= 1:
34
- break
35
-
36
- action = StellaratorAction(
37
- intent="run",
38
- parameter=parameter,
39
- direction=direction,
40
- magnitude=magnitude,
41
- )
42
  obs = env.step(action)
43
  total_reward += obs.reward or 0.0
44
  trace.append(
45
  {
46
  "step": len(trace),
47
- "action": f"{parameter} {direction} {magnitude}",
48
  "score": obs.p1_score,
49
  "best_score": obs.best_score,
50
  "reward": obs.reward,
 
51
  }
52
  )
53
 
54
- if not obs.done:
55
- submit = StellaratorAction(intent="submit")
56
- obs = env.step(submit)
57
- total_reward += obs.reward or 0.0
58
- trace.append(
59
- {
60
- "step": len(trace),
61
- "action": "submit",
62
- "score": obs.p1_score,
63
- "best_score": obs.best_score,
64
- "reward": obs.reward,
65
- }
66
  )
67
 
68
- return total_reward, trace
69
 
70
 
71
  def main(n_episodes: int = 20) -> None:
 
1
+ """Heuristic baseline agent for the stellarator design environment."""
2
 
3
  from __future__ import annotations
4
 
5
  import sys
6
 
7
+ from fusion_lab.models import StellaratorAction, StellaratorObservation
8
  from server.environment import StellaratorEnvironment
9
 
 
 
 
 
 
 
 
 
10
 
11
  def heuristic_episode(
12
  env: StellaratorEnvironment, seed: int | None = None
 
15
  total_reward = 0.0
16
  trace: list[dict[str, object]] = [{"step": 0, "score": obs.p1_score}]
17
 
18
+ while not obs.done:
19
+ action = _choose_action(obs)
20
  obs = env.step(action)
21
  total_reward += obs.reward or 0.0
22
  trace.append(
23
  {
24
  "step": len(trace),
25
+ "action": _action_label(action),
26
  "score": obs.p1_score,
27
  "best_score": obs.best_score,
28
  "reward": obs.reward,
29
+ "failure": obs.evaluation_failed,
30
  }
31
  )
32
 
33
+ return total_reward, trace
34
+
35
+
+ def _choose_action(obs: StellaratorObservation) -> StellaratorAction:
+     if obs.constraints_satisfied:
+         if obs.budget_remaining <= 2:
+             return StellaratorAction(intent="submit")
+         return StellaratorAction(
+             intent="run",
+             parameter="elongation",
+             direction="decrease",
+             magnitude="small",
          )
 
+     if obs.evaluation_failed:
+         return StellaratorAction(intent="restore_best")
+
+     if obs.average_triangularity > -0.5:
+         return StellaratorAction(
+             intent="run",
+             parameter="triangularity_scale",
+             direction="increase",
+             magnitude="small",
+         )
+
+     if obs.edge_iota_over_nfp < 0.3:
+         return StellaratorAction(
+             intent="run",
+             parameter="rotational_transform",
+             direction="increase",
+             magnitude="small",
+         )
+
+     if obs.aspect_ratio > 4.0:
+         return StellaratorAction(
+             intent="run",
+             parameter="aspect_ratio",
+             direction="decrease",
+             magnitude="small",
+         )
+
+     return StellaratorAction(
+         intent="run",
+         parameter="elongation",
+         direction="decrease",
+         magnitude="small",
+     )
+
+
+ def _action_label(action: StellaratorAction) -> str:
+     if action.intent != "run":
+         return action.intent
+     return f"{action.parameter} {action.direction} {action.magnitude}"
86
 
87
 
88
  def main(n_episodes: int = 20) -> None:
baselines/random_agent.py CHANGED
@@ -8,7 +8,12 @@ import sys
8
  from fusion_lab.models import StellaratorAction
9
  from server.environment import StellaratorEnvironment
10
 
11
- PARAMETERS = ["aspect_ratio", "elongation", "rotational_transform"]
 
 
 
 
 
12
  DIRECTIONS = ["increase", "decrease"]
13
  MAGNITUDES = ["small", "medium", "large"]
14
 
 
8
  from fusion_lab.models import StellaratorAction
9
  from server.environment import StellaratorEnvironment
10
 
11
+ PARAMETERS = [
12
+ "aspect_ratio",
13
+ "elongation",
14
+ "rotational_transform",
15
+ "triangularity_scale",
16
+ ]
17
  DIRECTIONS = ["increase", "decrease"]
18
  MAGNITUDES = ["small", "medium", "large"]
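Adding `triangularity_scale` grows the space of distinct `run` actions these lists can express from 3 × 2 × 3 = 18 to 4 × 2 × 3 = 24. The sketch below enumerates that space and draws one triple uniformly. The three lists are copied from the diff; `RUN_ACTIONS` and `sample_run_action` are hypothetical names used only for illustration, and the real agent may sample differently.

```python
from __future__ import annotations

import itertools
import random

# Copied from the new side of the diff.
PARAMETERS = [
    "aspect_ratio",
    "elongation",
    "rotational_transform",
    "triangularity_scale",
]
DIRECTIONS = ["increase", "decrease"]
MAGNITUDES = ["small", "medium", "large"]

# Every distinct (parameter, direction, magnitude) triple for a `run` action.
RUN_ACTIONS = list(itertools.product(PARAMETERS, DIRECTIONS, MAGNITUDES))


def sample_run_action(rng: random.Random) -> tuple[str, str, str]:
    """Hypothetical helper: draw one triple uniformly from the three lists."""
    return rng.choice(PARAMETERS), rng.choice(DIRECTIONS), rng.choice(MAGNITUDES)
```

Passing a seeded `random.Random` keeps episodes reproducible, which matters when comparing the random and heuristic baselines over the same seeds.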
19
 
docs/FUSION_DELIVERABLES_MAP.md CHANGED
@@ -14,9 +14,9 @@ Use this map to sequence execution, not to reopen already-locked task choices.
14
  - [x] Northflank smoke workflow and note are committed
15
  - [x] Northflank smoke test has passed on the team H100
16
  - [x] current 3-knob family has been verified as blocked on P1 triangularity
17
- - [ ] repaired low-dimensional boundary builder is implemented
18
- - [ ] explicit VMEC failure semantics are implemented
19
- - [ ] low-fi `run` truth vs high-fi `submit` truth is labeled clearly
20
  - [ ] tracked fixtures are checked in
21
  - [ ] manual playtest evidence exists
22
  - [ ] heuristic baseline has been refreshed for the real verifier path
 
14
  - [x] Northflank smoke workflow and note are committed
15
  - [x] Northflank smoke test has passed on the team H100
16
  - [x] current 3-knob family has been verified as blocked on P1 triangularity
17
+ - [x] repaired low-dimensional boundary builder is implemented
18
+ - [x] explicit VMEC failure semantics are implemented
19
+ - [x] low-fi `run` truth vs high-fi `submit` truth is labeled clearly
20
  - [ ] tracked fixtures are checked in
21
  - [ ] manual playtest evidence exists
22
  - [ ] heuristic baseline has been refreshed for the real verifier path
docs/FUSION_DESIGN_LAB_PLAN_V2.md CHANGED
@@ -7,7 +7,7 @@
7
  ## 0. Current Branch Status
8
 
9
  - [x] `P1` task family is locked
10
- - [x] 3-knob rotating-ellipse `P1` contract is implemented in code
11
  - [x] real `constellaration` verifier wiring is in place
12
  - [x] low-fidelity `run` plus high-fidelity `submit` split is documented
13
  - [x] post-terminal `step()` guard is in place
@@ -15,9 +15,9 @@
15
  - [x] Northflank smoke workflow and note are committed
16
  - [x] Northflank smoke test has passed on the team H100
17
  - [x] current 3-knob family has been checked against the real low-fidelity verifier
18
- - [ ] parameterization repair is implemented so triangularity is controllable
19
- - [ ] explicit VMEC failure semantics are implemented
20
- - [ ] low-fi `run` truth vs high-fi `submit` truth is labeled clearly in the environment surface
21
  - [ ] tracked `P1` fixtures are added
22
  - [ ] manual playtest evidence is recorded
23
  - [ ] heuristic baseline is refreshed for the real verifier path
@@ -25,7 +25,7 @@
25
 
26
  Current caution:
27
 
28
- - the current 3-knob family is structurally blocked on the official triangularity constraint under the real verifier path, so parameterization repair is now the first blocker before fixture discovery or manual playtesting
29
 
30
  ## 1. Submission Thesis
31
 
 
7
  ## 0. Current Branch Status
8
 
9
  - [x] `P1` task family is locked
10
+ - [x] repaired 4-knob low-dimensional `P1` contract is implemented in code
11
  - [x] real `constellaration` verifier wiring is in place
12
  - [x] low-fidelity `run` plus high-fidelity `submit` split is documented
13
  - [x] post-terminal `step()` guard is in place
 
15
  - [x] Northflank smoke workflow and note are committed
16
  - [x] Northflank smoke test has passed on the team H100
17
  - [x] current 3-knob family has been checked against the real low-fidelity verifier
18
+ - [x] parameterization repair is implemented so triangularity is controllable
19
+ - [x] explicit VMEC failure semantics are implemented
20
+ - [x] low-fi `run` truth vs high-fi `submit` truth is labeled clearly in the environment surface
21
  - [ ] tracked `P1` fixtures are added
22
  - [ ] manual playtest evidence is recorded
23
  - [ ] heuristic baseline is refreshed for the real verifier path
 
25
 
26
  Current caution:
27
 
28
+ - the repaired family is now live, but the exact ranges, deltas, and reset seeds still need a measured sweep before they should be treated as stable defaults
29
 
30
  ## 1. Submission Thesis
31
 
docs/FUSION_NEXT_12_HOURS_CHECKLIST.md CHANGED
@@ -9,7 +9,7 @@ Do not expand scope beyond one stable task. Training is supporting evidence, not
9
  ## Current Branch Status
10
 
11
  - [x] `P1` task is locked
12
- - [x] 3-knob rotating-ellipse `P1` contract is implemented in the working tree
13
  - [x] baselines and API surface have been moved to the `P1` contract
14
  - [x] add a post-terminal guard in `step()`
15
  - [x] replace the synthetic evaluator with `constellaration`
@@ -17,15 +17,15 @@ Do not expand scope beyond one stable task. Training is supporting evidence, not
17
  - [x] commit the Northflank smoke workflow and note
18
  - [x] pass the Northflank smoke test on the team H100
19
  - [x] verify that the current 3-knob family is blocked on P1 triangularity under the real verifier path
20
- - [ ] repair the low-dimensional parameterization
21
- - [ ] add explicit VMEC failure semantics
22
- - [ ] label low-fi `run` truth vs high-fi `submit` truth in the task surface
23
  - [ ] add tracked fixtures and manual playtest evidence
24
  - [ ] refresh the heuristic baseline after the real-verifier rerun
25
 
26
  Current caution:
27
 
28
- - do not assume the current 3-knob family is a viable playtest start; parameterization repair comes before fixture discovery, manual playtesting, and heuristic refresh
29
 
30
  ## Plan V2 Inheritance
31
 
 
9
  ## Current Branch Status
10
 
11
  - [x] `P1` task is locked
12
+ - [x] repaired 4-knob low-dimensional `P1` contract is implemented in the working tree
13
  - [x] baselines and API surface have been moved to the `P1` contract
14
  - [x] add a post-terminal guard in `step()`
15
  - [x] replace the synthetic evaluator with `constellaration`
 
17
  - [x] commit the Northflank smoke workflow and note
18
  - [x] pass the Northflank smoke test on the team H100
19
  - [x] verify that the current 3-knob family is blocked on P1 triangularity under the real verifier path
20
+ - [x] repair the low-dimensional parameterization
21
+ - [x] add explicit VMEC failure semantics
22
+ - [x] label low-fi `run` truth vs high-fi `submit` truth in the task surface
23
  - [ ] add tracked fixtures and manual playtest evidence
24
  - [ ] refresh the heuristic baseline after the real-verifier rerun
25
 
26
  Current caution:
27
 
28
+ - do not assume the first repaired defaults are final; run a measured sweep before treating ranges, deltas, or reset seeds as stable
29
 
30
  ## Plan V2 Inheritance
31
 
docs/P1_ENV_CONTRACT_V1.md CHANGED
@@ -1,6 +1,6 @@
1
  # P1 Environment Contract V1
2
 
3
- **Status:** Technical revision plan over a partial implementation
4
  **Role:** Supporting spec for the `P1` environment contract
5
  **SSOT relationship:** This file refines [FUSION_DESIGN_LAB_PLAN_V2.md](FUSION_DESIGN_LAB_PLAN_V2.md). If this file conflicts with the planning SSOT, update both in the same task.
6
 
@@ -17,7 +17,7 @@ The central change is now explicit:
17
 
18
  - the current upstream 3-knob rotating-ellipse family is blocked on P1 triangularity under the real verifier path
19
  - the next environment contract must repair parameterization before more reward iteration or heuristic work
20
- - the current repo still exposes the old 3-knob surface and needs to be revised to this 4-knob target
21
 
22
  ## Verified Blocker
23
 
@@ -62,9 +62,9 @@ Keep three layers separate:
62
 
63
  Current repo state:
64
 
65
- - the live code still exposes `evaluate_params(...)`
66
- - boundary construction and evaluation are not yet split cleanly
67
- - the verifier rewrite in this file is still pending
68
 
69
  Target functions:
70
 
@@ -129,11 +129,11 @@ This keeps the environment human-playable and aligned with the historical low-di
129
 
130
  Current repo state:
131
 
132
- - the live action schema still exposes only:
133
  - `aspect_ratio`
134
  - `elongation`
135
  - `rotational_transform`
136
- - the fourth low-dimensional control is still pending
137
 
138
  ## Observation Contract
139
 
@@ -167,8 +167,8 @@ The minimum requirement is that a reader can tell whether a metric came from low
167
 
168
  Current repo state:
169
 
170
- - the live observation surface still presents a single `p1_score` / `p1_feasibility` view
171
- - the environment and `/task` surface still need an explicit low-fi vs high-fi distinction
172
 
173
  ## Reward V0
174
 
 
1
  # P1 Environment Contract V1
2
 
3
+ **Status:** Technical contract with partial implementation now landed
4
  **Role:** Supporting spec for the `P1` environment contract
5
  **SSOT relationship:** This file refines [FUSION_DESIGN_LAB_PLAN_V2.md](FUSION_DESIGN_LAB_PLAN_V2.md). If this file conflicts with the planning SSOT, update both in the same task.
6
 
 
17
 
18
  - the current upstream 3-knob rotating-ellipse family is blocked on P1 triangularity under the real verifier path
19
  - the next environment contract must repair parameterization before more reward iteration or heuristic work
20
+ - the runtime now exposes the repaired 4-knob target, but measured sweep validation and fixture calibration are still pending
21
 
22
  ## Verified Blocker
23
 
 
62
 
63
  Current repo state:
64
 
65
+ - the live code now exposes a boundary builder plus boundary-based evaluator
66
+ - explicit failure results are returned when VMEC evaluation fails
67
+ - measured sweep validation is still pending
68
 
69
  Target functions:
70
 
 
129
 
130
  Current repo state:
131
 
132
+ - the live action schema now exposes:
133
  - `aspect_ratio`
134
  - `elongation`
135
  - `rotational_transform`
136
+ - `triangularity_scale`
137
 
138
  ## Observation Contract
139
 
 
167
 
168
  Current repo state:
169
 
170
+ - the live observation surface now exposes evaluation fidelity and failure state
171
+ - the exact naming can still be refined after playtesting, but low-fi vs high-fi is no longer implicit
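With fidelity and failure state on the observation, a consumer can label where a metric came from before reporting it. The sketch below assumes only the field names shown in the `fusion_lab/models.py` diff (`p1_score`, `evaluation_fidelity`, `evaluation_failed`, `failure_reason`); `ObservationView` and `describe_score` are hypothetical helpers, not part of the repo.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal


@dataclass
class ObservationView:
    """Hypothetical stand-in carrying only the observation fields used below."""

    p1_score: float = 0.0
    evaluation_fidelity: Literal["low", "high"] = "low"
    evaluation_failed: bool = False
    failure_reason: str = ""


def describe_score(obs: ObservationView) -> str:
    """Render a score together with the fidelity that produced it."""
    if obs.evaluation_failed:
        return (
            f"evaluation failed ({obs.evaluation_fidelity}-fidelity): "
            f"{obs.failure_reason}"
        )
    # Per the /task summary, `run` is low-fidelity and `submit` is high-fidelity.
    source = "submit" if obs.evaluation_fidelity == "high" else "run"
    return (
        f"p1_score={obs.p1_score:.3f} "
        f"({obs.evaluation_fidelity}-fidelity, from {source})"
    )
```

Labeling at the reporting layer keeps step-time `run` metrics from being read as final `submit` metrics.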
172
 
173
  ## Reward V0
174
 
docs/P1_PARAMETERIZATION_DEEPDIVE.md CHANGED
@@ -1,7 +1,7 @@
1
  # P1 Parameterization Deep-Dive
2
 
3
  **Date:** 2026-03-07
4
- **Status:** Findings complete. Partial implementation exists; parameterization repair pending.
5
 
6
  This document records the investigation into why the current 3-knob rotating-ellipse
7
  environment cannot produce P1-feasible designs, what the original winning session
 
1
  # P1 Parameterization Deep-Dive
2
 
3
  **Date:** 2026-03-07
4
+ **Status:** Findings complete. Parameterization repair is implemented; measured sweep follow-up is pending.
5
 
6
  This document records the investigation into why the current 3-knob rotating-ellipse
7
  environment cannot produce P1-feasible designs, what the original winning session
fusion_lab/models.py CHANGED
@@ -6,15 +6,22 @@ from openenv.core import Action, Observation, State
6
  from pydantic import BaseModel, Field
7
 
8
  ActionIntent = Literal["run", "submit", "restore_best"]
9
- ParameterName = Literal["aspect_ratio", "elongation", "rotational_transform"]
 
 
 
 
 
10
  DirectionName = Literal["increase", "decrease"]
11
  MagnitudeName = Literal["small", "medium", "large"]
 
12
 
13
 
14
- class RotatingEllipseParams(BaseModel):
15
  aspect_ratio: float
16
  elongation: float
17
  rotational_transform: float
 
18
 
19
 
20
  class StellaratorAction(Action):
@@ -34,6 +41,9 @@ class StellaratorObservation(Observation):
34
  p1_score: float = 0.0
35
  p1_feasibility: float = 0.0
36
  vacuum_well: float = 0.0
 
 
 
37
  step_number: int = 0
38
  budget_remaining: int = 6
39
  best_score: float = 0.0
@@ -43,18 +53,20 @@ class StellaratorObservation(Observation):
43
 
44
 
45
  class StellaratorState(State):
46
- current_params: RotatingEllipseParams = Field(
47
- default_factory=lambda: RotatingEllipseParams(
48
- aspect_ratio=3.5,
49
- elongation=1.5,
50
- rotational_transform=0.4,
 
51
  )
52
  )
53
- best_params: RotatingEllipseParams = Field(
54
- default_factory=lambda: RotatingEllipseParams(
55
- aspect_ratio=3.5,
56
- elongation=1.5,
57
- rotational_transform=0.4,
 
58
  )
59
  )
60
  initial_score: float = 0.0
 
6
  from pydantic import BaseModel, Field
7
 
8
  ActionIntent = Literal["run", "submit", "restore_best"]
9
+ ParameterName = Literal[
10
+ "aspect_ratio",
11
+ "elongation",
12
+ "rotational_transform",
13
+ "triangularity_scale",
14
+ ]
15
  DirectionName = Literal["increase", "decrease"]
16
  MagnitudeName = Literal["small", "medium", "large"]
17
+ EvaluationFidelityName = Literal["low", "high"]
18
 
19
 
20
+ class LowDimBoundaryParams(BaseModel):
21
  aspect_ratio: float
22
  elongation: float
23
  rotational_transform: float
24
+ triangularity_scale: float
25
 
26
 
27
  class StellaratorAction(Action):
 
41
  p1_score: float = 0.0
42
  p1_feasibility: float = 0.0
43
  vacuum_well: float = 0.0
44
+ evaluation_fidelity: EvaluationFidelityName = "low"
45
+ evaluation_failed: bool = False
46
+ failure_reason: str = ""
47
  step_number: int = 0
48
  budget_remaining: int = 6
49
  best_score: float = 0.0
 
53
 
54
 
55
  class StellaratorState(State):
56
+ current_params: LowDimBoundaryParams = Field(
57
+ default_factory=lambda: LowDimBoundaryParams(
58
+ aspect_ratio=3.6,
59
+ elongation=1.4,
60
+ rotational_transform=1.6,
61
+ triangularity_scale=0.55,
62
  )
63
  )
64
+ best_params: LowDimBoundaryParams = Field(
65
+ default_factory=lambda: LowDimBoundaryParams(
66
+ aspect_ratio=3.6,
67
+ elongation=1.4,
68
+ rotational_transform=1.6,
69
+ triangularity_scale=0.55,
70
  )
71
  )
72
  initial_score: float = 0.0
server/app.py CHANGED
@@ -16,7 +16,10 @@ app = create_fastapi_app(
16
  @app.get("/task")
17
  def task_summary() -> dict[str, object]:
18
  return {
19
- "description": "Optimize the P1 benchmark with a rotating-ellipse parameterization.",
 
 
 
20
  "constraints": {
21
  "aspect_ratio_max": ASPECT_RATIO_MAX,
22
  "average_triangularity_max": AVERAGE_TRIANGULARITY_MAX,
@@ -25,9 +28,18 @@ def task_summary() -> dict[str, object]:
25
  "n_field_periods": N_FIELD_PERIODS,
26
  "budget": BUDGET,
27
  "actions": ["run", "submit", "restore_best"],
28
- "parameters": ["aspect_ratio", "elongation", "rotational_transform"],
 
 
 
 
 
29
  "directions": ["increase", "decrease"],
30
  "magnitudes": ["small", "medium", "large"],
 
 
 
 
31
  }
32
 
33
 
 
16
  @app.get("/task")
17
  def task_summary() -> dict[str, object]:
18
  return {
19
+ "description": (
20
+ "Optimize the P1 benchmark with a custom low-dimensional boundary family "
21
+ "derived from a rotating-ellipse seed."
22
+ ),
23
  "constraints": {
24
  "aspect_ratio_max": ASPECT_RATIO_MAX,
25
  "average_triangularity_max": AVERAGE_TRIANGULARITY_MAX,
 
28
  "n_field_periods": N_FIELD_PERIODS,
29
  "budget": BUDGET,
30
  "actions": ["run", "submit", "restore_best"],
31
+ "parameters": [
32
+ "aspect_ratio",
33
+ "elongation",
34
+ "rotational_transform",
35
+ "triangularity_scale",
36
+ ],
37
  "directions": ["increase", "decrease"],
38
  "magnitudes": ["small", "medium", "large"],
39
+ "evaluation_modes": {
40
+ "run": "low-fidelity constellaration evaluation",
41
+ "submit": "high-fidelity constellaration evaluation",
42
+ },
43
  }
44
 
45
 
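Reviewer note: because the `/task` payload now lists all four knobs, a client could use it to reject malformed actions before spending budget. The sketch below is hypothetical. `TASK_SUMMARY` copies only fields visible in the handler above, `validate_action` is an illustrative helper that does not exist in this repo, and the `intent` key is an assumed field name for the action type.

```python
# Hypothetical client-side pre-check. TASK_SUMMARY copies only fields
# visible in the /task handler; validate_action is an illustrative helper.
TASK_SUMMARY = {
    "actions": ["run", "submit", "restore_best"],
    "parameters": [
        "aspect_ratio",
        "elongation",
        "rotational_transform",
        "triangularity_scale",
    ],
    "directions": ["increase", "decrease"],
    "magnitudes": ["small", "medium", "large"],
}


def validate_action(action: dict, task: dict = TASK_SUMMARY) -> list:
    """Return a list of problems; an empty list means the action looks well-formed."""
    intent = action.get("intent")  # "intent" is an assumed key name
    if intent not in task["actions"]:
        return [f"unknown intent: {intent!r}"]
    if intent != "run":
        return []  # only run actions carry parameter/direction/magnitude
    problems = []
    for field, allowed in (
        ("parameter", task["parameters"]),
        ("direction", task["directions"]),
        ("magnitude", task["magnitudes"]),
    ):
        if action.get(field) not in allowed:
            problems.append(f"{field} must be one of {allowed}")
    return problems
```

In this sketch a `run` action with an unknown knob name would be caught locally. On the server side, the diff shows that a `run` with missing fields goes through `_handle_invalid_run` and still consumes budget.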
server/environment.py CHANGED
@@ -1,12 +1,11 @@
1
  from __future__ import annotations
2
 
3
- from random import Random
4
  from typing import Any, Final, Optional
5
 
6
  from openenv.core import Environment as BaseEnvironment
7
 
8
  from fusion_lab.models import (
9
- RotatingEllipseParams,
10
  StellaratorAction,
11
  StellaratorObservation,
12
  StellaratorState,
@@ -17,37 +16,58 @@ from server.physics import (
17
  EDGE_IOTA_OVER_NFP_MIN,
18
  FEASIBILITY_TOLERANCE,
19
  EvaluationMetrics,
20
- evaluate_params,
 
21
  )
22
 
23
  BUDGET: Final[int] = 6
24
  N_FIELD_PERIODS: Final[int] = 3
25
 
26
  PARAMETER_RANGES: Final[dict[str, tuple[float, float]]] = {
27
- "aspect_ratio": (2.0, 8.0),
28
- "elongation": (1.0, 5.0),
29
- "rotational_transform": (0.1, 1.0),
 
30
  }
31
 
32
  PARAMETER_DELTAS: Final[dict[str, dict[str, float]]] = {
33
- "aspect_ratio": {"small": 0.1, "medium": 0.3, "large": 0.8},
34
- "elongation": {"small": 0.1, "medium": 0.3, "large": 0.8},
35
- "rotational_transform": {"small": 0.02, "medium": 0.05, "large": 0.15},
 
36
  }
37
 
38
- BASELINE_PARAMS: Final[RotatingEllipseParams] = RotatingEllipseParams(
39
- aspect_ratio=3.5,
40
- elongation=1.5,
41
- rotational_transform=0.4,
42
  )
43
 
44
  TARGET_SPEC: Final[str] = (
45
- "Optimize the P1 benchmark using a rotating-ellipse parameterization. "
46
- "Constraints: aspect ratio <= 4.0, average triangularity <= -0.5, "
47
- "edge rotational transform / n_field_periods >= 0.3. "
 
48
  "Budget: 6 evaluations."
49
  )
50
 
 
 
51
 
52
  class StellaratorEnvironment(
53
  BaseEnvironment[StellaratorAction, StellaratorObservation, StellaratorState]
@@ -56,6 +76,7 @@ class StellaratorEnvironment(
56
  super().__init__()
57
  self._state = StellaratorState()
58
  self._last_metrics: EvaluationMetrics | None = None
 
59
 
60
  def reset(
61
  self,
@@ -64,11 +85,7 @@ class StellaratorEnvironment(
64
  **kwargs: Any,
65
  ) -> StellaratorObservation:
66
  params = self._initial_params(seed)
67
- metrics = evaluate_params(
68
- params,
69
- n_field_periods=N_FIELD_PERIODS,
70
- fidelity="low",
71
- )
72
  self._state = StellaratorState(
73
  episode_id=episode_id,
74
  step_count=0,
@@ -83,9 +100,10 @@ class StellaratorEnvironment(
83
  constraints_satisfied=metrics.constraints_satisfied,
84
  )
85
  self._last_metrics = metrics
 
86
  return self._build_observation(
87
  metrics,
88
- action_summary="Episode started from the rotating-ellipse baseline.",
89
  )
90
 
91
  def step(
@@ -95,14 +113,13 @@ class StellaratorEnvironment(
95
  **kwargs: Any,
96
  ) -> StellaratorObservation:
97
  if self._state.episode_done or self._state.budget_remaining <= 0:
98
- metrics = self._last_metrics or evaluate_params(
99
  self._state.current_params,
100
- n_field_periods=N_FIELD_PERIODS,
101
  fidelity="low",
102
  )
103
  return self._build_observation(
104
  metrics,
105
- action_summary=("Episode already ended. Call reset() before sending more actions."),
106
  reward=0.0,
107
  done=True,
108
  )
@@ -119,10 +136,6 @@ class StellaratorEnvironment(
119
  def state(self) -> StellaratorState:
120
  return self._state
121
 
122
- # ------------------------------------------------------------------
123
- # Action handlers
124
- # ------------------------------------------------------------------
125
-
126
  def _handle_run(self, action: StellaratorAction) -> StellaratorObservation:
127
  if not all([action.parameter, action.direction, action.magnitude]):
128
  return self._handle_invalid_run()
@@ -134,11 +147,7 @@ class StellaratorEnvironment(
134
  direction=action.direction,
135
  magnitude=action.magnitude,
136
  )
137
- metrics = evaluate_params(
138
- params,
139
- n_field_periods=N_FIELD_PERIODS,
140
- fidelity="low",
141
- )
142
  self._state.current_params = params
143
  self._state.constraints_satisfied = metrics.constraints_satisfied
144
  self._update_best(params, metrics)
@@ -148,6 +157,8 @@ class StellaratorEnvironment(
148
  summary = self._summary_run(action, metrics)
149
  self._state.history.append(summary)
150
  self._last_metrics = metrics
 
 
151
  self._state.episode_done = done
152
 
153
  return self._build_observation(
@@ -158,16 +169,14 @@ class StellaratorEnvironment(
158
  )
159
 
160
  def _handle_submit(self) -> StellaratorObservation:
161
- metrics = evaluate_params(
162
- self._state.current_params,
163
- n_field_periods=N_FIELD_PERIODS,
164
- fidelity="high",
165
- )
166
  reward = self._compute_reward(metrics, "submit", done=True)
167
  summary = self._summary_submit(metrics)
168
  self._state.history.append(summary)
169
  self._state.episode_done = True
170
  self._last_metrics = metrics
 
 
171
 
172
  return self._build_observation(
173
  metrics,
@@ -179,21 +188,16 @@ class StellaratorEnvironment(
179
  def _handle_restore(self) -> StellaratorObservation:
180
  self._state.budget_remaining -= 1
181
  self._state.current_params = self._state.best_params
182
- metrics = evaluate_params(
183
- self._state.current_params,
184
- n_field_periods=N_FIELD_PERIODS,
185
- fidelity="low",
186
- )
187
  self._state.constraints_satisfied = metrics.constraints_satisfied
188
 
189
  done = self._state.budget_remaining <= 0
190
  reward = self._compute_reward(metrics, "restore_best", done)
191
- summary = (
192
- "Restored the best-known design. "
193
- f"Score={metrics.p1_score:.6f}, feasibility={metrics.p1_feasibility:.6f}."
194
- )
195
  self._state.history.append(summary)
196
  self._last_metrics = metrics
 
 
197
  self._state.episode_done = done
198
 
199
  return self._build_observation(
@@ -205,9 +209,8 @@ class StellaratorEnvironment(
205
 
206
  def _handle_invalid_run(self) -> StellaratorObservation:
207
  self._state.budget_remaining -= 1
208
- metrics = self._last_metrics or evaluate_params(
209
  self._state.current_params,
210
- n_field_periods=N_FIELD_PERIODS,
211
  fidelity="low",
212
  )
213
  done = self._state.budget_remaining <= 0
@@ -221,17 +224,23 @@ class StellaratorEnvironment(
221
  done=done,
222
  )
223
 
224
- # ------------------------------------------------------------------
225
- # Reward V0
226
- # ------------------------------------------------------------------
227
-
228
  def _compute_reward(
229
  self,
230
  metrics: EvaluationMetrics,
231
  intent: str,
232
  done: bool,
233
  ) -> float:
234
- previous_metrics = self._last_metrics or metrics
235
  reward = 0.0
236
 
237
  if metrics.constraints_satisfied and not previous_metrics.constraints_satisfied:
@@ -264,10 +273,6 @@ class StellaratorEnvironment(
264
 
265
  return round(reward, 4)
266
 
267
- # ------------------------------------------------------------------
268
- # Observation builders
269
- # ------------------------------------------------------------------
270
-
271
  def _build_observation(
272
  self,
273
  metrics: EvaluationMetrics,
@@ -278,15 +283,23 @@ class StellaratorEnvironment(
278
  text_lines = [
279
  action_summary,
280
  "",
281
- f"max_elongation={metrics.max_elongation:.4f} | best_score={self._state.best_score:.6f}",
282
- f"aspect_ratio={metrics.aspect_ratio:.4f} (<= {ASPECT_RATIO_MAX:.1f})",
283
- f"average_triangularity={metrics.average_triangularity:.4f} (<= {AVERAGE_TRIANGULARITY_MAX:.1f})",
284
- f"edge_iota_over_nfp={metrics.edge_iota_over_nfp:.4f} (>= {EDGE_IOTA_OVER_NFP_MIN:.1f})",
285
- f"feasibility={metrics.p1_feasibility:.6f} | best_feasibility={self._state.best_feasibility:.6f}",
286
- f"vacuum_well={metrics.vacuum_well:.4f}",
287
- f"constraints={'SATISFIED' if metrics.constraints_satisfied else 'VIOLATED'}",
288
- f"step={self._state.step_count} | budget={self._state.budget_remaining}/{self._state.budget_total}",
289
  ]
290
 
291
  return StellaratorObservation(
292
  diagnostics_text="\n".join(text_lines),
@@ -297,6 +310,9 @@ class StellaratorEnvironment(
297
  p1_score=metrics.p1_score,
298
  p1_feasibility=metrics.p1_feasibility,
299
  vacuum_well=metrics.vacuum_well,
 
 
 
300
  step_number=self._state.step_count,
301
  budget_remaining=self._state.budget_remaining,
302
  best_score=self._state.best_score,
@@ -307,16 +323,18 @@ class StellaratorEnvironment(
307
  done=done,
308
  )
309
 
310
- # ------------------------------------------------------------------
311
- # Action summaries
312
- # ------------------------------------------------------------------
313
-
314
  def _summary_run(self, action: StellaratorAction, metrics: EvaluationMetrics) -> str:
315
  assert action.parameter is not None
316
  assert action.direction is not None
317
  assert action.magnitude is not None
318
- previous_metrics = self._last_metrics or metrics
319
- if metrics.constraints_satisfied:
 
 
 
 
 
 
320
  delta = previous_metrics.max_elongation - metrics.max_elongation
321
  objective_summary = (
322
  f"max_elongation changed by {delta:+.4f} to {metrics.max_elongation:.4f}."
@@ -327,10 +345,13 @@ class StellaratorEnvironment(
327
  f"feasibility changed by {delta:+.6f} to {metrics.p1_feasibility:.6f}."
328
  )
329
  return (
330
- f"Applied {action.parameter} {action.direction} {action.magnitude}. {objective_summary}"
 
331
  )
332
 
333
  def _summary_submit(self, metrics: EvaluationMetrics) -> str:
 
 
334
  return (
335
  f"Submitted current_score={metrics.p1_score:.6f}, "
336
  f"best_seen_score={self._state.best_score:.6f}, "
@@ -338,32 +359,26 @@ class StellaratorEnvironment(
338
  f"constraints={'SATISFIED' if metrics.constraints_satisfied else 'VIOLATED'}."
339
  )
340
 
341
- def _initial_params(self, seed: int | None) -> RotatingEllipseParams:
342
- if seed is None:
343
- return BASELINE_PARAMS
344
- rng = Random(seed)
345
- return RotatingEllipseParams(
346
- aspect_ratio=self._clamp(
347
- BASELINE_PARAMS.aspect_ratio + rng.uniform(-0.1, 0.1),
348
- parameter="aspect_ratio",
349
- ),
350
- elongation=self._clamp(
351
- BASELINE_PARAMS.elongation + rng.uniform(-0.1, 0.1),
352
- parameter="elongation",
353
- ),
354
- rotational_transform=self._clamp(
355
- BASELINE_PARAMS.rotational_transform + rng.uniform(-0.015, 0.015),
356
- parameter="rotational_transform",
357
- ),
358
  )
359
 
 
 
 
 
 
360
  def _apply_action(
361
  self,
362
- params: RotatingEllipseParams,
363
  parameter: str,
364
  direction: str,
365
  magnitude: str,
366
- ) -> RotatingEllipseParams:
367
  delta = PARAMETER_DELTAS[parameter][magnitude]
368
  signed_delta = delta if direction == "increase" else -delta
369
 
@@ -372,13 +387,35 @@ class StellaratorEnvironment(
372
  next_values[parameter] + signed_delta,
373
  parameter=parameter,
374
  )
375
- return RotatingEllipseParams.model_validate(next_values)
376
 
377
  def _clamp(self, value: float, *, parameter: str) -> float:
378
  lower, upper = PARAMETER_RANGES[parameter]
379
  return min(max(value, lower), upper)
380
 
381
- def _update_best(self, params: RotatingEllipseParams, metrics: EvaluationMetrics) -> None:
382
  current = (
383
  (1, metrics.p1_score) if metrics.constraints_satisfied else (0, -metrics.p1_feasibility)
384
  )
 
1
  from __future__ import annotations
2
 
 
3
  from typing import Any, Final, Optional
4
 
5
  from openenv.core import Environment as BaseEnvironment
6
 
7
  from fusion_lab.models import (
8
+ LowDimBoundaryParams,
9
  StellaratorAction,
10
  StellaratorObservation,
11
  StellaratorState,
 
16
  EDGE_IOTA_OVER_NFP_MIN,
17
  FEASIBILITY_TOLERANCE,
18
  EvaluationMetrics,
19
+ build_boundary_from_params,
20
+ evaluate_boundary,
21
  )
22
 
23
  BUDGET: Final[int] = 6
24
  N_FIELD_PERIODS: Final[int] = 3
25
 
26
  PARAMETER_RANGES: Final[dict[str, tuple[float, float]]] = {
27
+ "aspect_ratio": (3.2, 3.8),
28
+ "elongation": (1.2, 1.8),
29
+ "rotational_transform": (1.2, 1.9),
30
+ "triangularity_scale": (0.4, 0.7),
31
  }
32
 
33
  PARAMETER_DELTAS: Final[dict[str, dict[str, float]]] = {
34
+ "aspect_ratio": {"small": 0.05, "medium": 0.1, "large": 0.2},
35
+ "elongation": {"small": 0.05, "medium": 0.1, "large": 0.2},
36
+ "rotational_transform": {"small": 0.05, "medium": 0.1, "large": 0.2},
37
+ "triangularity_scale": {"small": 0.02, "medium": 0.05, "large": 0.1},
38
  }
39
 
40
+ RESET_SEEDS: Final[tuple[LowDimBoundaryParams, ...]] = (
41
+ LowDimBoundaryParams(
42
+ aspect_ratio=3.6,
43
+ elongation=1.4,
44
+ rotational_transform=1.5,
45
+ triangularity_scale=0.55,
46
+ ),
47
+ LowDimBoundaryParams(
48
+ aspect_ratio=3.4,
49
+ elongation=1.4,
50
+ rotational_transform=1.6,
51
+ triangularity_scale=0.55,
52
+ ),
53
+ LowDimBoundaryParams(
54
+ aspect_ratio=3.8,
55
+ elongation=1.4,
56
+ rotational_transform=1.5,
57
+ triangularity_scale=0.55,
58
+ ),
59
  )
60
 
61
  TARGET_SPEC: Final[str] = (
62
+ "Optimize the P1 benchmark using a custom low-dimensional boundary family derived "
63
+ "from a rotating-ellipse seed. Constraints: aspect ratio <= 4.0, average "
64
+ "triangularity <= -0.5, edge rotational transform / n_field_periods >= 0.3. "
65
+ "Run actions use low-fidelity verification. Submit uses high-fidelity verification. "
66
  "Budget: 6 evaluations."
67
  )
68
 
69
+ FAILURE_PENALTY: Final[float] = -2.0
70
+
71
 
72
  class StellaratorEnvironment(
73
  BaseEnvironment[StellaratorAction, StellaratorObservation, StellaratorState]
 
76
  super().__init__()
77
  self._state = StellaratorState()
78
  self._last_metrics: EvaluationMetrics | None = None
79
+ self._last_successful_metrics: EvaluationMetrics | None = None
80
 
81
  def reset(
82
  self,
 
85
  **kwargs: Any,
86
  ) -> StellaratorObservation:
87
  params = self._initial_params(seed)
88
+ metrics = self._evaluate_params(params, fidelity="low")
 
 
 
 
89
  self._state = StellaratorState(
90
  episode_id=episode_id,
91
  step_count=0,
 
100
  constraints_satisfied=metrics.constraints_satisfied,
101
  )
102
  self._last_metrics = metrics
103
+ self._last_successful_metrics = None if metrics.evaluation_failed else metrics
104
  return self._build_observation(
105
  metrics,
106
+ action_summary="Episode started from a frozen low-dimensional seed.",
107
  )
108
 
109
  def step(
 
113
  **kwargs: Any,
114
  ) -> StellaratorObservation:
115
  if self._state.episode_done or self._state.budget_remaining <= 0:
116
+ metrics = self._last_metrics or self._evaluate_params(
117
  self._state.current_params,
 
118
  fidelity="low",
119
  )
120
  return self._build_observation(
121
  metrics,
122
+ action_summary="Episode already ended. Call reset() before sending more actions.",
123
  reward=0.0,
124
  done=True,
125
  )
 
136
  def state(self) -> StellaratorState:
137
  return self._state
138
 
 
 
 
 
139
  def _handle_run(self, action: StellaratorAction) -> StellaratorObservation:
140
  if not all([action.parameter, action.direction, action.magnitude]):
141
  return self._handle_invalid_run()
 
147
  direction=action.direction,
148
  magnitude=action.magnitude,
149
  )
150
+ metrics = self._evaluate_params(params, fidelity="low")
 
 
 
 
151
  self._state.current_params = params
152
  self._state.constraints_satisfied = metrics.constraints_satisfied
153
  self._update_best(params, metrics)
 
157
  summary = self._summary_run(action, metrics)
158
  self._state.history.append(summary)
159
  self._last_metrics = metrics
160
+ if not metrics.evaluation_failed:
161
+ self._last_successful_metrics = metrics
162
  self._state.episode_done = done
163
 
164
  return self._build_observation(
 
169
  )
170
 
171
  def _handle_submit(self) -> StellaratorObservation:
172
+ metrics = self._evaluate_params(self._state.current_params, fidelity="high")
 
 
 
 
173
  reward = self._compute_reward(metrics, "submit", done=True)
174
  summary = self._summary_submit(metrics)
175
  self._state.history.append(summary)
176
  self._state.episode_done = True
177
  self._last_metrics = metrics
178
+ if not metrics.evaluation_failed:
179
+ self._last_successful_metrics = metrics
180
 
181
  return self._build_observation(
182
  metrics,
 
188
  def _handle_restore(self) -> StellaratorObservation:
189
  self._state.budget_remaining -= 1
190
  self._state.current_params = self._state.best_params
191
+ metrics = self._evaluate_params(self._state.current_params, fidelity="low")
 
 
 
 
192
  self._state.constraints_satisfied = metrics.constraints_satisfied
193
 
194
  done = self._state.budget_remaining <= 0
195
  reward = self._compute_reward(metrics, "restore_best", done)
196
+ summary = self._summary_restore(metrics)
 
 
 
197
  self._state.history.append(summary)
198
  self._last_metrics = metrics
199
+ if not metrics.evaluation_failed:
200
+ self._last_successful_metrics = metrics
201
  self._state.episode_done = done
202
 
203
  return self._build_observation(
 
209
 
210
  def _handle_invalid_run(self) -> StellaratorObservation:
211
  self._state.budget_remaining -= 1
212
+ metrics = self._last_metrics or self._evaluate_params(
213
  self._state.current_params,
 
214
  fidelity="low",
215
  )
216
  done = self._state.budget_remaining <= 0
 
224
  done=done,
225
  )
226
 
 
 
 
 
227
  def _compute_reward(
228
  self,
229
  metrics: EvaluationMetrics,
230
  intent: str,
231
  done: bool,
232
  ) -> float:
233
+ previous_metrics = self._reference_metrics(metrics)
234
+ if metrics.evaluation_failed:
235
+ reward = FAILURE_PENALTY
236
+ if intent != "submit":
237
+ reward -= 0.1
238
+ if intent == "submit":
239
+ reward -= 1.0
240
+ elif done:
241
+ reward -= 0.5
242
+ return round(reward, 4)
243
+
244
  reward = 0.0
245
 
246
  if metrics.constraints_satisfied and not previous_metrics.constraints_satisfied:
 
273
 
274
  return round(reward, 4)
275
 
 
 
 
 
276
  def _build_observation(
277
  self,
278
  metrics: EvaluationMetrics,
 
283
  text_lines = [
284
  action_summary,
285
  "",
286
+ f"evaluation_fidelity={metrics.evaluation_fidelity}",
287
+ f"evaluation_status={'FAILED' if metrics.evaluation_failed else 'OK'}",
 
 
 
 
 
 
288
  ]
289
+ if metrics.evaluation_failed:
290
+ text_lines.append(f"failure_reason={metrics.failure_reason}")
291
+ text_lines.extend(
292
+ [
293
+ f"max_elongation={metrics.max_elongation:.4f} | best_score={self._state.best_score:.6f}",
294
+ f"aspect_ratio={metrics.aspect_ratio:.4f} (<= {ASPECT_RATIO_MAX:.1f})",
295
+ f"average_triangularity={metrics.average_triangularity:.4f} (<= {AVERAGE_TRIANGULARITY_MAX:.1f})",
296
+ f"edge_iota_over_nfp={metrics.edge_iota_over_nfp:.4f} (>= {EDGE_IOTA_OVER_NFP_MIN:.1f})",
297
+ f"feasibility={metrics.p1_feasibility:.6f} | best_feasibility={self._state.best_feasibility:.6f}",
298
+ f"vacuum_well={metrics.vacuum_well:.4f}",
299
+ f"constraints={'SATISFIED' if metrics.constraints_satisfied else 'VIOLATED'}",
300
+ f"step={self._state.step_count} | budget={self._state.budget_remaining}/{self._state.budget_total}",
301
+ ]
302
+ )
303
 
304
  return StellaratorObservation(
305
  diagnostics_text="\n".join(text_lines),
 
310
  p1_score=metrics.p1_score,
311
  p1_feasibility=metrics.p1_feasibility,
312
  vacuum_well=metrics.vacuum_well,
313
+ evaluation_fidelity=metrics.evaluation_fidelity,
314
+ evaluation_failed=metrics.evaluation_failed,
315
+ failure_reason=metrics.failure_reason,
316
  step_number=self._state.step_count,
317
  budget_remaining=self._state.budget_remaining,
318
  best_score=self._state.best_score,
 
323
  done=done,
324
  )
325
 
 
 
 
 
326
  def _summary_run(self, action: StellaratorAction, metrics: EvaluationMetrics) -> str:
327
  assert action.parameter is not None
328
  assert action.direction is not None
329
  assert action.magnitude is not None
330
+ if metrics.evaluation_failed:
331
+ return (
332
+ f"Applied {action.parameter} {action.direction} {action.magnitude}. "
333
+ f"Low-fidelity evaluation failed: {metrics.failure_reason}"
334
+ )
335
+
336
+ previous_metrics = self._reference_metrics(metrics)
337
+ if metrics.constraints_satisfied and previous_metrics.constraints_satisfied:
338
  delta = previous_metrics.max_elongation - metrics.max_elongation
339
  objective_summary = (
340
  f"max_elongation changed by {delta:+.4f} to {metrics.max_elongation:.4f}."
 
345
  f"feasibility changed by {delta:+.6f} to {metrics.p1_feasibility:.6f}."
346
  )
347
  return (
348
+ f"Applied {action.parameter} {action.direction} {action.magnitude}. "
349
+ f"Low-fidelity evaluation. {objective_summary}"
350
  )
351
 
352
  def _summary_submit(self, metrics: EvaluationMetrics) -> str:
353
+ if metrics.evaluation_failed:
354
+ return f"Submit failed during high-fidelity evaluation: {metrics.failure_reason}"
355
  return (
356
  f"Submitted current_score={metrics.p1_score:.6f}, "
357
  f"best_seen_score={self._state.best_score:.6f}, "
 
359
  f"constraints={'SATISFIED' if metrics.constraints_satisfied else 'VIOLATED'}."
360
  )
361
 
362
+ def _summary_restore(self, metrics: EvaluationMetrics) -> str:
363
+ if metrics.evaluation_failed:
364
+ return f"Restore-best failed during low-fidelity evaluation: {metrics.failure_reason}"
365
+ return (
366
+ "Restored the best-known design. "
367
+ f"Score={metrics.p1_score:.6f}, feasibility={metrics.p1_feasibility:.6f}."
368
  )
369
 
370
+ def _initial_params(self, seed: int | None) -> LowDimBoundaryParams:
371
+ if seed is None:
372
+ return RESET_SEEDS[0]
373
+ return RESET_SEEDS[seed % len(RESET_SEEDS)]
374
+
375
  def _apply_action(
376
  self,
377
+ params: LowDimBoundaryParams,
378
  parameter: str,
379
  direction: str,
380
  magnitude: str,
381
+ ) -> LowDimBoundaryParams:
382
  delta = PARAMETER_DELTAS[parameter][magnitude]
383
  signed_delta = delta if direction == "increase" else -delta
384
 
 
387
  next_values[parameter] + signed_delta,
388
  parameter=parameter,
389
  )
390
+ return LowDimBoundaryParams.model_validate(next_values)
391
 
392
  def _clamp(self, value: float, *, parameter: str) -> float:
393
  lower, upper = PARAMETER_RANGES[parameter]
394
  return min(max(value, lower), upper)
395
 
396
+ def _evaluate_params(
397
+ self,
398
+ params: LowDimBoundaryParams,
399
+ *,
400
+ fidelity: str,
401
+ ) -> EvaluationMetrics:
402
+ boundary = build_boundary_from_params(
403
+ params,
404
+ n_field_periods=N_FIELD_PERIODS,
405
+ )
406
+ return evaluate_boundary(boundary, fidelity=fidelity)
407
+
408
+ def _reference_metrics(self, fallback: EvaluationMetrics) -> EvaluationMetrics:
409
+ if self._last_metrics is not None and not self._last_metrics.evaluation_failed:
410
+ return self._last_metrics
411
+ if self._last_successful_metrics is not None:
412
+ return self._last_successful_metrics
413
+ return fallback
414
+
415
+ def _update_best(self, params: LowDimBoundaryParams, metrics: EvaluationMetrics) -> None:
416
+ if metrics.evaluation_failed:
417
+ return
418
+
419
  current = (
420
  (1, metrics.p1_score) if metrics.constraints_satisfied else (0, -metrics.p1_feasibility)
421
  )
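Reviewer note: the failure branch added to `_compute_reward` can be read in isolation. The sketch below restates it as a standalone function so the resulting penalties are easy to check. `failure_reward` is a hypothetical name. The constants and branch order are copied from the diff above.

```python
FAILURE_PENALTY = -2.0  # value copied from the diff


def failure_reward(intent: str, done: bool) -> float:
    """Reward for a step whose evaluation failed (mirrors the branch in the diff)."""
    reward = FAILURE_PENALTY
    if intent != "submit":
        reward -= 0.1  # extra deduction for run and restore_best
    if intent == "submit":
        reward -= 1.0  # a failed submit is penalized most heavily
    elif done:
        reward -= 0.5  # non-submit step that also ended the episode
    return round(reward, 4)


if __name__ == "__main__":
    for intent, done in [("run", False), ("run", True), ("submit", True)]:
        print(intent, done, failure_reward(intent, done))
```

With these constants a failed `run` costs -2.1, a failed `run` that exhausts the budget costs -2.6, and a failed `submit` costs -3.0. The early return means none of the feasibility or score terms apply to a failed evaluation.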
server/physics.py CHANGED
@@ -3,20 +3,27 @@ from __future__ import annotations
3
  from dataclasses import dataclass
4
  from typing import Final, Literal
5
 
 
6
  from constellaration.forward_model import (
7
  ConstellarationMetrics,
8
  ConstellarationSettings,
9
  forward_model,
10
  )
 
 
11
  from constellaration.initial_guess import generate_rotating_ellipse
12
  from constellaration.problems import GeometricalProblem
13
 
14
- from fusion_lab.models import RotatingEllipseParams
15
 
16
  ASPECT_RATIO_MAX: Final[float] = 4.0
17
  AVERAGE_TRIANGULARITY_MAX: Final[float] = -0.5
18
  EDGE_IOTA_OVER_NFP_MIN: Final[float] = 0.3
19
  FEASIBILITY_TOLERANCE: Final[float] = 0.01
 
 
 
 
20
 
21
  EvaluationFidelity = Literal["low", "high"]
22
 
@@ -31,23 +38,57 @@ class EvaluationMetrics:
31
  p1_feasibility: float
32
  constraints_satisfied: bool
33
  vacuum_well: float
 
 
 
34
 
35
 
36
- def evaluate_params(
37
- params: RotatingEllipseParams,
38
  *,
39
  n_field_periods: int = 3,
40
- fidelity: EvaluationFidelity = "low",
41
- ) -> EvaluationMetrics:
42
- boundary = generate_rotating_ellipse(
 
43
  aspect_ratio=params.aspect_ratio,
44
  elongation=params.elongation,
45
  rotational_transform=params.rotational_transform,
46
  n_field_periods=n_field_periods,
47
  )
48
  settings = _settings_for_fidelity(fidelity)
49
- metrics, _ = forward_model(boundary, settings=settings)
50
- return _to_evaluation_metrics(metrics)
 
 
 
51
 
52
 
53
  def _settings_for_fidelity(fidelity: EvaluationFidelity) -> ConstellarationSettings:
@@ -65,7 +106,11 @@ def _settings_for_fidelity(fidelity: EvaluationFidelity) -> ConstellarationSetti
65
  )
66
 
67
 
68
- def _to_evaluation_metrics(metrics: ConstellarationMetrics) -> EvaluationMetrics:
 
 
 
 
69
  problem = GeometricalProblem()
70
  constraints_satisfied = problem.is_feasible(metrics)
71
  p1_feasibility = float(problem.compute_feasibility(metrics))
@@ -83,6 +128,29 @@ def _to_evaluation_metrics(metrics: ConstellarationMetrics) -> EvaluationMetrics
83
  p1_feasibility=p1_feasibility,
84
  constraints_satisfied=constraints_satisfied,
85
  vacuum_well=float(metrics.vacuum_well),
86
  )
87
 
88
 
 
3
  from dataclasses import dataclass
4
  from typing import Final, Literal
5
 
6
+ import numpy as np
7
  from constellaration.forward_model import (
8
  ConstellarationMetrics,
9
  ConstellarationSettings,
10
  forward_model,
11
  )
12
+ from constellaration.geometry import surface_rz_fourier
13
+ from constellaration.geometry.surface_rz_fourier import SurfaceRZFourier
14
  from constellaration.initial_guess import generate_rotating_ellipse
15
  from constellaration.problems import GeometricalProblem
16
 
17
+ from fusion_lab.models import LowDimBoundaryParams
18
 
19
  ASPECT_RATIO_MAX: Final[float] = 4.0
20
  AVERAGE_TRIANGULARITY_MAX: Final[float] = -0.5
21
  EDGE_IOTA_OVER_NFP_MIN: Final[float] = 0.3
22
  FEASIBILITY_TOLERANCE: Final[float] = 0.01
23
+ MAX_POLOIDAL_MODE: Final[int] = 3
24
+ MAX_TOROIDAL_MODE: Final[int] = 3
25
+ FAILED_FEASIBILITY: Final[float] = 1_000_000.0
26
+ FAILED_ELONGATION: Final[float] = 10.0
27
 
28
  EvaluationFidelity = Literal["low", "high"]
29
 
 
38
  p1_feasibility: float
39
  constraints_satisfied: bool
40
  vacuum_well: float
41
+ evaluation_fidelity: EvaluationFidelity
42
+ evaluation_failed: bool
43
+ failure_reason: str
44
 
45
 
46
+ def build_boundary_from_params(
47
+ params: LowDimBoundaryParams,
48
  *,
49
  n_field_periods: int = 3,
50
+ max_poloidal_mode: int = MAX_POLOIDAL_MODE,
51
+ max_toroidal_mode: int = MAX_TOROIDAL_MODE,
52
+ ) -> SurfaceRZFourier:
53
+ surface = generate_rotating_ellipse(
54
  aspect_ratio=params.aspect_ratio,
55
  elongation=params.elongation,
56
  rotational_transform=params.rotational_transform,
57
  n_field_periods=n_field_periods,
58
  )
59
+ expanded_surface = surface_rz_fourier.set_max_mode_numbers(
60
+ surface,
61
+ max_poloidal_mode=max_poloidal_mode,
62
+ max_toroidal_mode=max_toroidal_mode,
63
+ )
64
+ r_cos = np.asarray(expanded_surface.r_cos, dtype=float).copy()
65
+ z_sin = np.asarray(expanded_surface.z_sin, dtype=float).copy()
66
+ center = r_cos.shape[1] // 2
67
+ minor_radius = float(r_cos[1, center])
68
+
69
+ r_cos[2, center] = -params.triangularity_scale * minor_radius
70
+ r_cos[0, :center] = 0.0
71
+ z_sin[0, : center + 1] = 0.0
72
+
73
+ return SurfaceRZFourier(
74
+ r_cos=r_cos,
75
+ z_sin=z_sin,
76
+ n_field_periods=n_field_periods,
77
+ is_stellarator_symmetric=True,
78
+ )
79
+
80
+
81
+ def evaluate_boundary(
82
+ boundary: SurfaceRZFourier,
83
+ *,
84
+ fidelity: EvaluationFidelity = "low",
85
+ ) -> EvaluationMetrics:
86
  settings = _settings_for_fidelity(fidelity)
87
+ try:
88
+ metrics, _ = forward_model(boundary, settings=settings)
89
+ except RuntimeError as error:
90
+ return _failure_metrics(fidelity=fidelity, failure_reason=str(error))
91
+ return _to_evaluation_metrics(metrics, fidelity=fidelity)
92
 
93
 
94
  def _settings_for_fidelity(fidelity: EvaluationFidelity) -> ConstellarationSettings:
 
106
  )
107
 
108
 
109
+ def _to_evaluation_metrics(
110
+ metrics: ConstellarationMetrics,
111
+ *,
112
+ fidelity: EvaluationFidelity,
113
+ ) -> EvaluationMetrics:
114
  problem = GeometricalProblem()
115
  constraints_satisfied = problem.is_feasible(metrics)
116
  p1_feasibility = float(problem.compute_feasibility(metrics))
 
128
  p1_feasibility=p1_feasibility,
129
  constraints_satisfied=constraints_satisfied,
130
  vacuum_well=float(metrics.vacuum_well),
131
+ evaluation_fidelity=fidelity,
132
+ evaluation_failed=False,
133
+ failure_reason="",
134
+ )
135
+
136
+
137
+ def _failure_metrics(
138
+ *,
139
+ fidelity: EvaluationFidelity,
140
+ failure_reason: str,
141
+ ) -> EvaluationMetrics:
142
+ return EvaluationMetrics(
143
+ max_elongation=FAILED_ELONGATION,
144
+ aspect_ratio=0.0,
145
+ average_triangularity=0.0,
146
+ edge_iota_over_nfp=0.0,
147
+ p1_score=0.0,
148
+ p1_feasibility=FAILED_FEASIBILITY,
149
+ constraints_satisfied=False,
150
+ vacuum_well=0.0,
151
+ evaluation_fidelity=fidelity,
152
+ evaluation_failed=True,
153
+ failure_reason=failure_reason,
154
  )
155
 
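Reviewer note: `evaluate_boundary` converts a solver `RuntimeError` into sentinel metrics instead of letting the exception escape. The sketch below shows that pattern with a stub solver so it runs without `constellaration`. `Metrics` is a trimmed, hypothetical stand-in for `EvaluationMetrics`, and `evaluate`, `converging_solver`, and `diverging_solver` are illustrative names. Only the two sentinel constants are copied from the diff.

```python
from dataclasses import dataclass
from typing import Callable

FAILED_FEASIBILITY = 1_000_000.0  # sentinel values copied from the diff
FAILED_ELONGATION = 10.0


@dataclass(frozen=True)
class Metrics:  # trimmed, hypothetical stand-in for EvaluationMetrics
    max_elongation: float
    p1_feasibility: float
    constraints_satisfied: bool
    evaluation_failed: bool
    failure_reason: str


def evaluate(solver: Callable[[], tuple]) -> Metrics:
    """Run the solver; map RuntimeError to sentinel metrics, as the diff does."""
    try:
        max_elongation, feasibility, satisfied = solver()
    except RuntimeError as error:
        return Metrics(FAILED_ELONGATION, FAILED_FEASIBILITY, False, True, str(error))
    return Metrics(max_elongation, feasibility, satisfied, False, "")


def converging_solver() -> tuple:
    return 1.8, 0.005, True  # made-up numbers for illustration only


def diverging_solver() -> tuple:
    raise RuntimeError("solver did not converge")  # made-up message
```

Only `RuntimeError` is caught, both here and in the diff, so any other exception type would still propagate to the caller. The large feasibility sentinel keeps a failed design from ever ranking as an improvement, and `_update_best` additionally returns early on failure.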
156