CallMeDaniel Claude Opus 4.6 (1M context) committed on
Commit 1923201 · 1 Parent(s): 46fb80d

docs: add pydantic unification implementation plan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

docs/superpowers/plans/2026-04-13-pydantic-unification.md ADDED
@@ -0,0 +1,1828 @@
+ # Pydantic Unification Implementation Plan
+
+ > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+ **Goal:** Replace all raw dicts, dataclasses, and untyped state passing with Pydantic models across the NeuralCAD codebase.
+
+ **Architecture:** Bottom-up migration: define new models first, convert existing dataclasses, then update function signatures and consumers from core -> agents -> server. Each task produces a working, test-passing codebase.
+
+ **Tech Stack:** Pydantic v2 (BaseModel), pytest, FastAPI
+
+ ---
+
+ ## File Structure
+
+ | File | Changes |
+ |------|---------|
+ | `core/cam.py` | Add `ToolConfig` model; update `CAMResult.tool_config`, `CAMPlan.to_tool_config()`, `generate_gcode()` |
+ | `core/types.py` | Remove `AgentResponse` and `ChatResult` dataclasses; keep enums + `LLMBackend` ABC |
+ | `core/pipeline.py` | Convert `PipelineResult` from dataclass to Pydantic |
+ | `agents/definitions.py` | Convert `AgentDef` from dataclass to Pydantic |
+ | `agents/agent_flow.py` | Add `PreviewData` and `ChatTurnResponse` models |
+ | `agents/gap_analyzer.py` | Change `analyze_gaps()` to accept `list[AgentResponse]` |
+ | `agents/design_state.py` | Change `update_from_messages()` / `extract_decisions()` to accept `list[AgentResponse]` |
+ | `agents/tools.py` | Change `set/get_design_state()` to use `DesignState`; update `GenerateGcodeTool` |
+ | `agents/base.py` | Update `BaseOrchestrator.chat_turn()` signature |
+ | `agents/orchestrator.py` | Replace `_format_response()` with `AgentResponse.from_agent()`; update `MockChatBackend` |
+ | `agents/crew_orchestrator.py` | Return `ChatTurnResponse` throughout; stop `.model_dump()` serialization of intermediate models |
+ | `server/routes.py` | Type `design_state` and `plan` fields in request models; return `ChatTurnResponse` |
+ | `server/mcp.py` | Use typed models in MCP tool responses |
+ | `tests/test_cam.py` | Update for `ToolConfig` |
+ | `tests/test_types.py` | Remove `AgentResponse`/`ChatResult` tests; keep enum + ABC tests |
+ | `tests/test_gap_analyzer.py` | Use `AgentResponse` objects instead of dicts |
+ | `tests/test_design_state.py` | Use `AgentResponse` objects instead of dicts |
+ | `tests/test_mock_orchestrator.py` | Assert on `ChatTurnResponse` attributes instead of dict keys |
+ | `tests/test_crew_orchestrator.py` | Assert on `ChatTurnResponse` attributes instead of dict keys |
+ | `tests/test_api_routes.py` | Verify JSON shape still matches (routes serialize for HTTP) |
+ | `tests/conftest.py` | Update `populated_design_state` fixture to return `DesignState` |
+
+ ---
+
+ ### Task 1: Add ToolConfig Model to core/cam.py
+
+ **Files:**
+ - Modify: `core/cam.py:12-19` (CAMResult), `core/cam.py:32-39` (CAMPlan.to_tool_config), `core/cam.py:45-50` (_get_default_tool_config), `core/cam.py:63-68` (generate_gcode)
+ - Modify: `agents/tools.py:99-109` (GenerateGcodeTool._run)
+ - Test: `tests/test_cam.py`
+
+ - [ ] **Step 1: Write failing test for ToolConfig**
+
+ In `tests/test_cam.py`, add after `TestCAMPlan`:
+
+ ```python
+ from core.cam import ToolConfig
+
+
+ class TestToolConfig:
+     def test_default_values(self):
+         tc = ToolConfig()
+         assert tc.diameter == 6.0
+         assert tc.h_feed == 800
+         assert tc.v_feed == 200
+         assert tc.speed == 18000
+
+     def test_custom_values(self):
+         tc = ToolConfig(diameter=3.0, h_feed=400, v_feed=100, speed=24000)
+         assert tc.diameter == 3.0
+         assert tc.h_feed == 400
+
+     def test_model_dump(self):
+         tc = ToolConfig()
+         d = tc.model_dump()
+         assert d == {"diameter": 6.0, "h_feed": 800, "v_feed": 200, "speed": 18000}
+ ```
+
+ - [ ] **Step 2: Run test to verify it fails**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_cam.py::TestToolConfig -v`
+ Expected: FAIL with `ImportError: cannot import name 'ToolConfig'`
+
+ - [ ] **Step 3: Create ToolConfig model and update CAMResult, CAMPlan, generate_gcode**
+
+ In `core/cam.py`, add the `ToolConfig` model after the imports, before `CAMResult`:
+
+ ```python
+ class ToolConfig(BaseModel):
+     """CNC tool configuration for G-code generation."""
+     diameter: float = 6.0
+     h_feed: float = 800
+     v_feed: float = 200
+     speed: float = 18000
+ ```
+
+ Change `CAMResult.tool_config` from `dict` to `ToolConfig`:
+
+ ```python
+ class CAMResult(BaseModel):
+     """Result of G-code generation from a CadQuery shape."""
+     success: bool
+     gcode: str | None = None
+     operations: list[str] = Field(default_factory=list)
+     tool_config: ToolConfig = Field(default_factory=ToolConfig)
+     post_processor: str = "grbl"
+     error: str | None = None
+ ```
+
+ Change `CAMPlan.to_tool_config()` to return `ToolConfig`:
+
+ ```python
+     def to_tool_config(self) -> ToolConfig:
+         """Convert to ToolConfig for generate_gcode()."""
+         return ToolConfig(
+             diameter=self.tool_diameter,
+             h_feed=self.tool_h_feed,
+             v_feed=self.tool_v_feed,
+             speed=self.tool_speed,
+         )
+ ```
+
+ Change `_get_default_tool_config()` to return `ToolConfig`:
+
+ ```python
+ def _get_default_tool_config() -> ToolConfig:
+     """Load default roughing tool config from config.yaml cam section."""
+     roughing = settings.cam.tools.get("roughing")
+     if roughing:
+         return ToolConfig(**roughing.model_dump())
+     return ToolConfig()
+ ```
+
+ Change `generate_gcode()` signature:
+
+ ```python
+ def generate_gcode(
+     shape,
+     operations: list[str],
+     tool_config: ToolConfig | None = None,
+     post_processor: str | None = None,
+     stock_offset_mm: float | None = None,
+ ) -> CAMResult:
+ ```
+
+ And update the body to access `tool_config.diameter` etc. instead of `.get()`:
+
+ ```python
+     tool = Endmill(
+         diameter=tool_config.diameter,
+         h_feed=tool_config.h_feed,
+         v_feed=tool_config.v_feed,
+         speed=tool_config.speed,
+     )
+ ```
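As a rough illustration of what the switch from `.get()` to attribute access buys, here is a minimal, self-contained sketch. It redefines `ToolConfig` locally with the same fields as above so it runs on its own; the `legacy` dict and its bad value are made up for the example, and Pydantic v2 is assumed to be installed.

```python
from pydantic import BaseModel, ValidationError


class ToolConfig(BaseModel):
    """Local stand-in mirroring the ToolConfig fields from the plan."""
    diameter: float = 6.0
    h_feed: float = 800
    v_feed: float = 200
    speed: float = 18000


# A raw dict accepts any value; a bad type would only surface later,
# somewhere inside the CAM code that consumes it.
legacy = {"diameter": "six"}  # made-up bad input, for illustration only
assert legacy.get("diameter", 6.0) == "six"

# The model coerces compatible input and exposes typed attributes.
tc = ToolConfig(diameter="3", h_feed=400)
assert tc.diameter == 3.0 and isinstance(tc.diameter, float)

# Input that cannot be coerced is rejected at construction time.
try:
    ToolConfig(diameter="six")
    rejected = False
except ValidationError:
    rejected = True
```

With the model, a malformed tool configuration fails where it is built rather than deep inside `generate_gcode()`.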
+
+ - [ ] **Step 4: Update GenerateGcodeTool in agents/tools.py**
+
+ Change `GenerateGcodeTool._run()` (line 99-109):
+
+ ```python
+     def _run(self, operations: list[str], tool_diameter: float = 6.0, post_processor: str = "grbl") -> str:
+         from core.cam import generate_gcode, ToolConfig
+         shape = get_last_shape()
+         if shape is None:
+             return json.dumps({"success": False, "error": "No shape available. Run Execute CadQuery Code first."})
+         tool_config = ToolConfig(diameter=tool_diameter, h_feed=800, v_feed=200, speed=18000)
+         result = generate_gcode(
+             shape=shape, operations=operations,
+             tool_config=tool_config, post_processor=post_processor,
+         )
+         return json.dumps(result.model_dump(), indent=2)
+ ```
+
+ - [ ] **Step 5: Update existing CAM tests for ToolConfig type**
+
+ In `tests/test_cam.py`, update `TestCAMResult`:
+
+ ```python
+ class TestCAMResult:
+     def test_success_result(self):
+         r = CAMResult(
+             success=True,
+             gcode="G21 G90\nG00 X0 Y0 Z10\nM30",
+             operations=["pocket", "profile"],
+             tool_config=ToolConfig(diameter=6, h_feed=800),
+             post_processor="grbl",
+         )
+         assert r.success is True
+         assert "G21" in r.gcode
+         assert r.operations == ["pocket", "profile"]
+         assert r.error is None
+
+     def test_failure_result(self):
+         r = CAMResult(success=False, error="ocp-freecad-cam not available")
+         assert r.success is False
+         assert r.gcode is None
+         assert r.error == "ocp-freecad-cam not available"
+
+     def test_model_dump(self):
+         r = CAMResult(
+             success=True, gcode="G21 G90\nM30",
+             operations=["pocket"], tool_config=ToolConfig(diameter=6),
+         )
+         d = r.model_dump()
+         assert d["success"] is True
+         assert d["gcode"] == "G21 G90\nM30"
+         assert d["operations"] == ["pocket"]
+         assert d["tool_config"]["diameter"] == 6
+         assert d["post_processor"] == "grbl"
+         assert d["error"] is None
+ ```
+
+ Update `TestGenerateGcode`:
+
+ ```python
+ class TestGenerateGcode:
+     def test_returns_failure_when_ocp_not_available(self):
+         mock_shape = MagicMock()
+         result = generate_gcode(
+             shape=mock_shape,
+             operations=["pocket"],
+             tool_config=ToolConfig(diameter=6, h_feed=800, v_feed=200, speed=18000),
+             post_processor="grbl",
+         )
+         assert result.success is False
+         assert "not available" in result.error.lower() or "not installed" in result.error.lower()
+
+     def test_returns_failure_on_empty_operations(self):
+         mock_shape = MagicMock()
+         result = generate_gcode(shape=mock_shape, operations=[])
+         assert result.success is False
+         assert "no operations" in result.error.lower()
+
+     def test_uses_default_tool_config_when_none(self):
+         mock_shape = MagicMock()
+         result = generate_gcode(shape=mock_shape, operations=["pocket"], tool_config=None)
+         assert result.tool_config is not None
+         assert result.tool_config.diameter > 0
+
+     def test_uses_default_post_processor(self):
+         mock_shape = MagicMock()
+         result = generate_gcode(shape=mock_shape, operations=["pocket"])
+         assert result.post_processor == "grbl"
+ ```
+
+ Update `TestCAMPlan.test_to_tool_config`:
+
+ ```python
+     def test_to_tool_config(self):
+         plan = CAMPlan(operations=["pocket"])
+         config = plan.to_tool_config()
+         assert isinstance(config, ToolConfig)
+         assert config.diameter == 6.0
+         assert config.h_feed == 800
+         assert config.v_feed == 200
+         assert config.speed == 18000
+ ```
+
+ - [ ] **Step 6: Run all CAM and tool tests**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_cam.py tests/test_tools.py -v`
+ Expected: ALL PASS
+
+ - [ ] **Step 7: Commit**
+
+ ```bash
+ git add core/cam.py agents/tools.py tests/test_cam.py
+ git commit -m "refactor: add ToolConfig pydantic model, replace tool_config dicts"
+ ```
+
+ ---
+
+ ### Task 2: Convert AgentDef from Dataclass to Pydantic
+
+ **Files:**
+ - Modify: `agents/definitions.py`
+ - Test: `tests/test_types.py` (verify no breakage from import changes)
+
+ - [ ] **Step 1: Write failing test for AgentDef as Pydantic model**
+
+ In `tests/test_types.py`, add:
+
+ ```python
+ from agents.definitions import AgentDef
+
+ class TestAgentDefModel:
+     def test_create(self):
+         ad = AgentDef(id="design", name="Design", role="Designer", color="#fff", avatar="D", goal="g", backstory="b")
+         assert ad.id == "design"
+         assert ad.name == "Design"
+
+     def test_model_dump(self):
+         ad = AgentDef(id="cad", name="CAD", role="Coder", color="#000", avatar="C", goal="g", backstory="b")
+         d = ad.model_dump()
+         assert d["id"] == "cad"
+         assert "role" in d
+ ```
+
+ - [ ] **Step 2: Run test to verify it fails**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_types.py::TestAgentDefModel -v`
+ Expected: FAIL with `AttributeError: 'AgentDef' object has no attribute 'model_dump'`
+
+ - [ ] **Step 3: Convert AgentDef to Pydantic BaseModel**
+
+ In `agents/definitions.py`, replace:
+
+ ```python
+ from dataclasses import dataclass
+ from config.settings import settings
+
+
+ @dataclass
+ class AgentDef:
+     """Definition of a chat agent."""
+     id: str
+     name: str
+     role: str
+     color: str
+     avatar: str
+     goal: str
+     backstory: str
+ ```
+
+ With:
+
+ ```python
+ from pydantic import BaseModel
+
+ from config.settings import settings
+
+
+ class AgentDef(BaseModel):
+     """Definition of a chat agent."""
+     id: str
+     name: str
+     role: str
+     color: str
+     avatar: str
+     goal: str
+     backstory: str
+ ```
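One behavioral difference is worth checking before converting: a dataclass accepts positional arguments, while a Pydantic `BaseModel` is keyword-only. The sketch below uses two cut-down stand-ins (`AgentDefDC` and `AgentDefPD` are names made up here, with only three of the seven fields) to show this and to compare `dataclasses.asdict()` with `model_dump()`. Whether NeuralCAD has any positional `AgentDef(...)` call sites is not known from this plan, so treat it as something to grep for.

```python
from dataclasses import asdict, dataclass

from pydantic import BaseModel


@dataclass
class AgentDefDC:
    """Cut-down dataclass stand-in (hypothetical name)."""
    id: str
    name: str
    role: str


class AgentDefPD(BaseModel):
    """Cut-down Pydantic stand-in (hypothetical name)."""
    id: str
    name: str
    role: str


# Dataclasses allow positional construction.
dc = AgentDefDC("cad", "CAD", "Coder")

# BaseModel.__init__ takes keyword arguments only.
try:
    AgentDefPD("cad", "CAD", "Coder")
    positional_ok = True
except TypeError:
    positional_ok = False

pd = AgentDefPD(id="cad", name="CAD", role="Coder")

# Both serialize to the same plain dict.
same_dict = asdict(dc) == pd.model_dump()
```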
+
+ - [ ] **Step 4: Run tests**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_types.py::TestAgentDefModel tests/test_agent_flow.py -v`
+ Expected: ALL PASS
+
+ - [ ] **Step 5: Commit**
+
+ ```bash
+ git add agents/definitions.py tests/test_types.py
+ git commit -m "refactor: convert AgentDef from dataclass to Pydantic BaseModel"
+ ```
+
+ ---
+
+ ### Task 3: Convert PipelineResult from Dataclass to Pydantic
+
+ **Files:**
+ - Modify: `core/pipeline.py:28-56`
+ - Test: `tests/test_pipeline.py`
+
+ - [ ] **Step 1: Write failing test for PipelineResult.model_dump**
+
+ In `tests/test_pipeline.py`, add (or modify existing):
+
+ ```python
+ from core.pipeline import PipelineResult
+ from core.executor import ExecutionResult
+
+
+ class TestPipelineResultModel:
+     def test_model_dump(self):
+         exec_result = ExecutionResult(success=True, volume=1000.0, bounding_box=(10, 10, 10), face_count=6, edge_count=12)
+         pr = PipelineResult(
+             prompt="test",
+             generated_code="code",
+             execution=exec_result,
+             retry_count=0,
+         )
+         d = pr.model_dump()
+         assert d["prompt"] == "test"
+         assert d["retry_count"] == 0
+
+     def test_default_exported_files(self):
+         exec_result = ExecutionResult(success=False, error="fail")
+         pr = PipelineResult(prompt="test", generated_code="code", execution=exec_result)
+         assert pr.exported_files == {}
+         assert pr.validation is None
+ ```
+
+ - [ ] **Step 2: Run test to verify it fails**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_pipeline.py::TestPipelineResultModel -v`
+ Expected: FAIL with `AttributeError: 'PipelineResult' object has no attribute 'model_dump'`
+
+ - [ ] **Step 3: Convert PipelineResult to Pydantic**
+
+ In `core/pipeline.py`, replace the dataclass imports:
+
+ ```python
+ from dataclasses import dataclass
+ from pathlib import Path
+ from typing import Optional
+ ```
+
+ With:
+
+ ```python
+ from pathlib import Path
+ from typing import Optional
+
+ from pydantic import BaseModel, Field
+ ```
+
+ Replace the `PipelineResult` class:
+
+ ```python
+ class PipelineResult(BaseModel):
+     model_config = {"arbitrary_types_allowed": True}
+
+     prompt: str
+     generated_code: str
+     execution: ExecutionResult
+     validation: Optional[CNCValidationResult] = None
+     exported_files: dict[str, Path] = Field(default_factory=dict)
+     retry_count: int = 0
+
+     def summary(self) -> str:
+         lines = [
+             "=" * 60,
+             "TEXT-TO-CNC PIPELINE RESULT",
+             "=" * 60,
+             f"Prompt: {self.prompt}",
+             f"Retries: {self.retry_count}",
+             "",
+             "-- Execution --",
+             self.execution.summary(),
+             "",
+         ]
+         if self.validation:
+             lines += ["-- CNC Validation --", self.validation.summary(), ""]
+         if self.exported_files:
+             lines += ["-- Exported Files --"]
+             for fmt, path in self.exported_files.items():
+                 lines.append(f" {fmt.upper()}: {path}")
+         lines.append("=" * 60)
+         return "\n".join(lines)
+ ```
+
+ Note: `arbitrary_types_allowed` is needed because `ExecutionResult` contains `cq.Workplane`.
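To make the effect of `arbitrary_types_allowed` concrete, here is a small sketch under stated assumptions: `Workplane` is a plain-Python stand-in for `cq.Workplane` so that CadQuery is not required, and `Strict` and `Relaxed` are hypothetical model names. It uses `ConfigDict`, which is the typed equivalent of the plain dict used in the plan.

```python
from pydantic import BaseModel, ConfigDict, ValidationError
from pydantic.errors import PydanticSchemaGenerationError


class Workplane:
    """Plain-Python stand-in for a CAD object such as cq.Workplane."""


# Without the flag, Pydantic cannot build a schema for the field type,
# and the failure happens when the class is defined.
try:
    class Strict(BaseModel):
        shape: Workplane
    defined = True
except PydanticSchemaGenerationError:
    defined = False


class Relaxed(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    shape: Workplane


ok = Relaxed(shape=Workplane())  # passes the isinstance check

# The flag does not disable validation: wrong types are still rejected.
try:
    Relaxed(shape="not a workplane")
    wrong_type_accepted = True
except ValidationError:
    wrong_type_accepted = False
```

With the flag set, validation of such a field is reduced to an `isinstance` check, so no coercion or deep validation of the CAD object takes place.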
+
+ - [ ] **Step 4: Run tests**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_pipeline.py -v`
+ Expected: ALL PASS
+
+ - [ ] **Step 5: Commit**
+
+ ```bash
+ git add core/pipeline.py tests/test_pipeline.py
+ git commit -m "refactor: convert PipelineResult from dataclass to Pydantic BaseModel"
+ ```
+
+ ---
+
+ ### Task 4: Remove Duplicate Dataclasses from core/types.py
+
+ **Files:**
+ - Modify: `core/types.py:1-87`
+ - Modify: `tests/test_types.py`
+
+ The `AgentResponse` dataclass in `core/types.py` is duplicated by the Pydantic version in `agents/agent_flow.py`. The `ChatResult` dataclass will be replaced by `ChatTurnResponse` in Task 5. No non-test code imports `AgentResponse` or `ChatResult` from `core/types.py`; the orchestrators import from `agents/agent_flow.py`.
+
+ - [ ] **Step 1: Verify no imports of AgentResponse/ChatResult from core/types**
+
+ Run: `cd /home/daniel/NeuralCAD && grep -rn "from core.types import.*AgentResponse\|from core.types import.*ChatResult" --include="*.py" | grep -v test | grep -v __pycache__`
+ Expected: No matches (only test files reference them)
+
+ - [ ] **Step 2: Remove AgentResponse and ChatResult from core/types.py**
+
+ Replace `core/types.py` with:
+
+ ```python
+ """Shared types, enums, and ABCs for NeuralCAD."""
+
+ from __future__ import annotations
+
+ from abc import ABC, abstractmethod
+ from enum import Enum
+ from pathlib import Path
+
+
+ class BackendName(str, Enum):
+     MOCK = "mock"
+     ANTHROPIC = "anthropic"
+     OPENAI = "openai"
+     GEMINI = "gemini"
+
+
+ class AgentId(str, Enum):
+     DESIGN = "design"
+     ENGINEERING = "engineering"
+     CNC = "cnc"
+     CAD = "cad"
+
+
+ class LLMBackend(ABC):
+     """Abstract base class for LLM code generation backends."""
+
+     @abstractmethod
+     def generate(self, messages: list[dict]) -> str:
+         """Generate text from a list of messages."""
+         ...
+
+     def generate_with_image(self, messages: list[dict], image_path: str | Path) -> str:
+         """Generate text from messages that include an image."""
+         raise NotImplementedError(
+             f"{type(self).__name__} does not support image input"
+         )
+
+     @staticmethod
+     def split_system_message(messages: list[dict]) -> tuple[str, list[dict]]:
+         """Extract system message from a message list."""
+         system_msg = ""
+         user_messages = []
+         for m in messages:
+             if m["role"] == "system":
+                 system_msg = m["content"]
+             else:
+                 user_messages.append(m)
+         return system_msg, user_messages
+ ```
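The enums kept above mix in `str`, which is why they can stay as plain enums while the rest of the codebase moves to Pydantic: members compare equal to their string values and serialize without a custom encoder. A minimal standard-library sketch, with `BackendName` cut down to two members for brevity:

```python
import json
from enum import Enum


class BackendName(str, Enum):
    MOCK = "mock"
    ANTHROPIC = "anthropic"


# Members compare equal to, and hash like, their string values.
equal_to_str = BackendName.MOCK == "mock"
in_str_set = BackendName.MOCK in {"mock", "anthropic"}

# Lookup by value returns the singleton member.
looked_up = BackendName("anthropic")

# Because members are real str instances, json.dumps needs no custom encoder.
payload = json.dumps({"backend": BackendName.MOCK})
```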
+
+ - [ ] **Step 3: Update test_types.py**
+
+ Remove `TestAgentResponse` and `TestChatResult` classes. Keep `TestEnums`, `TestLLMBackendABC`, and the new `TestAgentDefModel`. Update imports:
+
+ ```python
+ """Tests for core/types.py: enums and ABC."""
+ import pytest
+ from core.types import BackendName, AgentId, LLMBackend
+ from agents.definitions import AgentDef
+
+
+ class TestEnums:
+     def test_backend_names(self):
+         assert BackendName.MOCK == "mock"
+         assert BackendName.ANTHROPIC == "anthropic"
+         assert BackendName.OPENAI == "openai"
+         assert BackendName.GEMINI == "gemini"
+
+     def test_agent_ids(self):
+         assert AgentId.DESIGN == "design"
+         assert AgentId.ENGINEERING == "engineering"
+         assert AgentId.CNC == "cnc"
+         assert AgentId.CAD == "cad"
+
+     def test_backend_name_is_string(self):
+         assert isinstance(BackendName.MOCK, str)
+         assert BackendName.MOCK in {"mock", "anthropic"}
+
+
+ class TestAgentDefModel:
+     def test_create(self):
+         ad = AgentDef(id="design", name="Design", role="Designer", color="#fff", avatar="D", goal="g", backstory="b")
+         assert ad.id == "design"
+         assert ad.name == "Design"
+
+     def test_model_dump(self):
+         ad = AgentDef(id="cad", name="CAD", role="Coder", color="#000", avatar="C", goal="g", backstory="b")
+         d = ad.model_dump()
+         assert d["id"] == "cad"
+         assert "role" in d
+
+
+ class TestLLMBackendABC:
+     def test_cannot_instantiate(self):
+         with pytest.raises(TypeError):
+             LLMBackend()
+
+     def test_subclass_must_implement_generate(self):
+         class Incomplete(LLMBackend):
+             pass
+         with pytest.raises(TypeError):
+             Incomplete()
+
+     def test_subclass_with_generate(self):
+         class Complete(LLMBackend):
+             def generate(self, messages):
+                 return "ok"
+         b = Complete()
+         assert b.generate([]) == "ok"
+
+     def test_split_system_message(self):
+         msgs = [
+             {"role": "system", "content": "You are a bot"},
+             {"role": "user", "content": "hello"},
+         ]
+         system, rest = LLMBackend.split_system_message(msgs)
+         assert system == "You are a bot"
+         assert len(rest) == 1
+         assert rest[0]["role"] == "user"
+
+     def test_split_system_message_no_system(self):
+         msgs = [{"role": "user", "content": "hello"}]
+         system, rest = LLMBackend.split_system_message(msgs)
+         assert system == ""
+         assert len(rest) == 1
+ ```
+
+ - [ ] **Step 4: Run tests**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_types.py -v`
+ Expected: ALL PASS
+
+ - [ ] **Step 5: Commit**
+
+ ```bash
+ git add core/types.py tests/test_types.py
+ git commit -m "refactor: remove duplicate AgentResponse/ChatResult dataclasses from core/types"
+ ```
+
+ ---
+
+ ### Task 5: Add PreviewData and ChatTurnResponse Models
+
+ **Files:**
+ - Modify: `agents/agent_flow.py` (add models after `AgentFlowState`)
+ - Test: `tests/test_agent_flow.py`
+
+ - [ ] **Step 1: Write failing tests for PreviewData and ChatTurnResponse**
+
+ In `tests/test_agent_flow.py`, add:
+
+ ```python
+ from agents.agent_flow import AgentResponse, PreviewData, ChatTurnResponse
+ from agents.design_state import DesignState
+ from agents.gap_analyzer import QuestionCard
+
+
+ class TestPreviewData:
+     def test_success_preview(self):
+         p = PreviewData(
+             success=True,
+             part_name="bracket",
+             stl_url="/api/models/bracket.stl",
+             step_url="/api/models/bracket.step",
+             execution={"success": True, "volume_mm3": 1000.0},
+             validation={"machinable": True, "axis_recommendation": "3-axis"},
+         )
+         assert p.success is True
+         assert p.part_name == "bracket"
+
+     def test_failure_preview(self):
+         p = PreviewData(success=False, error="Execution failed")
+         assert p.success is False
+         assert p.error == "Execution failed"
+
+     def test_model_dump(self):
+         p = PreviewData(success=True, part_name="gear")
+         d = p.model_dump()
+         assert d["success"] is True
+         assert d["cam"] is None
+         assert d["gcode_url"] is None
+
+
+ class TestChatTurnResponse:
+     def test_minimal(self):
+         r = ChatTurnResponse(design_state=DesignState())
+         assert r.responses == []
+         assert r.preview is None
+         assert r.question_cards == []
+
+     def test_full(self):
+         resp = AgentResponse(agent_id="design", agent_name="D", message="hi", color="#fff", avatar="D")
+         preview = PreviewData(success=True, part_name="test")
+         state = DesignState(material="aluminum")
+         card = QuestionCard(category="material", question="What material?", responsible_agent="engineering", agent_name="Eng", agent_color="#00e676")
+         r = ChatTurnResponse(responses=[resp], preview=preview, design_state=state, question_cards=[card])
+         assert len(r.responses) == 1
+         assert r.preview.part_name == "test"
+         assert r.design_state.material == "aluminum"
+         assert len(r.question_cards) == 1
+
+     def test_model_dump_roundtrip(self):
+         state = DesignState(part_name="bracket", material="steel")
+         r = ChatTurnResponse(design_state=state)
+         d = r.model_dump()
+         assert d["design_state"]["part_name"] == "bracket"
+         assert d["responses"] == []
+         assert d["preview"] is None
+ ```
+
+ - [ ] **Step 2: Run test to verify it fails**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_agent_flow.py::TestPreviewData -v`
+ Expected: FAIL with `ImportError: cannot import name 'PreviewData'`
+
+ - [ ] **Step 3: Add PreviewData and ChatTurnResponse to agents/agent_flow.py**
+
+ After the `AgentFlowState.model_rebuild()` line (line 89), add:
+
+ ```python
+ from agents.gap_analyzer import QuestionCard
+
+
+ class PreviewData(BaseModel):
+     """Preview data for a generated CAD model, sent to the frontend."""
+     success: bool
+     part_name: str = ""
+     stl_url: str = ""
+     step_url: str = ""
+     threemf_url: str = ""
+     execution: dict = Field(default_factory=dict)
+     validation: dict = Field(default_factory=dict)
+     cam: dict | None = None
+     gcode_url: str | None = None
+     error: str | None = None
+
+
+ class ChatTurnResponse(BaseModel):
+     """Unified response envelope from all orchestrator chat_turn() methods."""
+     responses: list[AgentResponse] = Field(default_factory=list)
+     preview: PreviewData | None = None
+     design_state: "DesignState"
+     question_cards: list[QuestionCard] = Field(default_factory=list)
+ ```
+
+ Add the forward-ref resolution after the class (similar to CAMPlan pattern):
+
+ ```python
+ from agents.design_state import DesignState  # noqa: E402
+ ChatTurnResponse.model_rebuild()
+ ```
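The forward-reference pattern above can be tried in isolation. The following sketch assumes Pydantic v2 and uses stand-in classes: `Envelope` plays the role of `ChatTurnResponse`, and the cut-down `DesignState` is defined later in the same file instead of being imported at the bottom of the module.

```python
from pydantic import BaseModel, Field


class Envelope(BaseModel):
    """Stand-in for ChatTurnResponse: refers to a class defined further down."""
    design_state: "DesignState"
    responses: list[str] = Field(default_factory=list)


class DesignState(BaseModel):
    """Stand-in for agents.design_state.DesignState (only one field kept)."""
    material: str = ""


# Resolve the "DesignState" string now that the class exists.
Envelope.model_rebuild()

env = Envelope(design_state=DesignState(material="aluminum"))
dumped = env.model_dump()
```

Calling `model_rebuild()` explicitly right after the late import keeps the resolution step visible and independent of which module happens to be imported first.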
+
+ - [ ] **Step 4: Run tests**
+
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_agent_flow.py -v`
+ Expected: ALL PASS
+
+ - [ ] **Step 5: Commit**
+
+ ```bash
+ git add agents/agent_flow.py tests/test_agent_flow.py
+ git commit -m "feat: add PreviewData and ChatTurnResponse pydantic models"
+ ```
+
+ ---
+
+ ### Task 6: Update analyze_gaps and extract_decisions to Accept AgentResponse
+
+ **Files:**
+ - Modify: `agents/gap_analyzer.py:88` (`analyze_gaps` signature)
+ - Modify: `agents/design_state.py:136-140` (`update_from_messages` signature), `agents/design_state.py:275-281` (`extract_decisions` signature)
+ - Test: `tests/test_gap_analyzer.py`, `tests/test_design_state.py`
+
+ - [ ] **Step 1: Update test_gap_analyzer.py to use AgentResponse objects**
+
+ Replace all dict literals with `AgentResponse` objects. Import at top:
+
+ ```python
+ from agents.agent_flow import AgentResponse
+ ```
+
+ Replace every test's response dicts. For example, in `test_no_gaps_when_no_not_ready`:
+
+ ```python
+     def test_no_gaps_when_no_not_ready(self):
+         responses = [
+             AgentResponse(agent_id="design", agent_name="Design", message="I suggest an L-bracket design.", color="#7c3aed", avatar="DA"),
+             AgentResponse(agent_id="engineering", agent_name="Engineering", message="Aluminum 6061 would work well.", color="#2979ff", avatar="EA"),
+         ]
+         result = analyze_gaps(responses)
+         assert not result.has_gaps
+         assert result.missing_items == []
+ ```
+
+ For `test_detects_not_ready_from_cad`:
+
+ ```python
+     def test_detects_not_ready_from_cad(self):
+         responses = [
+             AgentResponse(agent_id="cad", agent_name="CAD", message="NOT READY: Need dimensions (width, height) and material selection.", color="#ffab40", avatar="CC"),
+         ]
+         result = analyze_gaps(responses)
+         assert result.has_gaps
+         categories = [item.category for item in result.missing_items]
+         assert "dimension" in categories
+         assert "material" in categories
+ ```
+
+ Apply the same pattern to all other test methods: `test_detects_not_ready_from_cnc`, `test_detects_not_ready_from_cam`, `test_deduplicates_across_agents`, `test_case_insensitive_not_ready`, `test_no_false_positive_on_regular_message`. Each dict `{"agent_id": ..., "message": ...}` becomes `AgentResponse(agent_id=..., agent_name="X", message=..., color="#000", avatar="X")`.
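Since only `agent_id` and `message` vary between these tests, a small factory can cut the repetition. The sketch below is an optional convenience, not part of the plan: `make_response` is a hypothetical helper, and the local `AgentResponse` class only mirrors the five fields used in the examples above so that the snippet runs standalone. The real model in `agents/agent_flow.py` may carry more fields.

```python
from pydantic import BaseModel


class AgentResponse(BaseModel):
    """Local stand-in with the fields used in the test examples."""
    agent_id: str
    agent_name: str
    message: str
    color: str
    avatar: str


def make_response(agent_id: str, message: str) -> AgentResponse:
    """Hypothetical test helper: fill display-only fields with placeholders."""
    return AgentResponse(
        agent_id=agent_id,
        agent_name="X",
        message=message,
        color="#000",
        avatar="X",
    )


# Converting an existing dict-style fixture in one pass.
legacy = [{"agent_id": "cad", "message": "NOT READY: Need dimensions."}]
responses = [make_response(d["agent_id"], d["message"]) for d in legacy]
```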
792
+
793
+ - [ ] **Step 2: Update test_design_state.py to use AgentResponse objects**
794
+
795
+ Import at top:
796
+
797
+ ```python
798
+ from agents.agent_flow import AgentResponse
799
+ ```
800
+
801
+ Replace every response dict in `TestExtractDecisions`. For example:
802
+
803
+ ```python
804
+ def test_extracts_material(self):
+     responses = [
+         AgentResponse(agent_id="engineering", agent_name="Engineering", message="I recommend aluminum 6061 for this application.", color="#2979ff", avatar="EA"),
+     ]
+     state = extract_decisions(responses, DesignState())
+     assert "aluminum" in state.material.lower()
810
+ ```
811
+
812
+ Apply to all: `test_extracts_fastener_features`, `test_extracts_axis_recommendation`, `test_preserves_existing_state`, `test_extracts_decisions_from_agreement`, `test_no_duplicate_features`. Each dict becomes an `AgentResponse(...)`.
813
+
814
+ - [ ] **Step 3: Run tests to verify they fail**
815
+
816
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_gap_analyzer.py tests/test_design_state.py -v`
817
+ Expected: FAIL, most likely with `AttributeError`, because `analyze_gaps` and `extract_decisions` still call `.get()` on each response as if it were a dict
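
To see why the tests fail at this point, here is a minimal sketch. It assumes the current function bodies read fields with `response.get(...)`, as Step 4 describes. `AgentResponseStandIn`, `old_style_messages`, and `new_style_messages` are hypothetical names for illustration, not the project's code.

```python
from dataclasses import dataclass


@dataclass
class AgentResponseStandIn:
    """Hypothetical stand-in for AgentResponse (the real one is a Pydantic model)."""
    agent_id: str
    message: str


def old_style_messages(responses):
    """Reads each response the way dict-based code does."""
    return [r.get("message", "") for r in responses]


def new_style_messages(responses):
    """Reads each response through attributes, as the typed code will."""
    return [r.message for r in responses]


responses = [AgentResponseStandIn(agent_id="cad", message="NOT READY: Need dimensions.")]

try:
    old_style_messages(responses)
    failure = None
except AttributeError as exc:
    # Objects without a .get() method break dict-style access.
    failure = str(exc)

print(failure)
print(new_style_messages(responses))
```

The dict-style reader still works on plain dicts, which is why the failure only appears once the tests start passing typed objects.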
818
+
819
+ - [ ] **Step 4: Update analyze_gaps signature in gap_analyzer.py**
820
+
821
+ Change the function signature and body. In `agents/gap_analyzer.py`:
822
+
823
+ A direct import of `AgentResponse` is safe here. `agents/agent_flow.py` imports from `agents/definitions.py` and `config/settings.py`, and `agents/gap_analyzer.py` imports from the same two modules. Neither currently imports the other, so importing `agent_flow` from `gap_analyzer` does not create a cycle and no `TYPE_CHECKING` guard is needed.
834
+
835
+ Change `analyze_gaps`:
836
+
837
+ ```python
838
+ def analyze_gaps(responses: list[AgentResponse]) -> GapAnalysis:
839
+ ```
840
+
841
+ Add the import at the top of the file:
842
+
843
+ ```python
844
+ from agents.agent_flow import AgentResponse
845
+ ```
846
+
847
+ Update the body — change `response.get("message", "")` to `response.message` and `response.get("agent_id", "")` to `response.agent_id`:
848
+
849
+ ```python
850
+ for response in responses:
+     message: str = response.message
+     agent_id: str = response.agent_id
853
+ ```
854
+
855
+ - [ ] **Step 5: Update update_from_messages and extract_decisions in design_state.py**
856
+
857
+ In `agents/design_state.py`, guard the import with `TYPE_CHECKING` to avoid a circular import (`agent_flow` already imports `design_state`):
858
+
859
+ ```python
860
+ from __future__ import annotations
+ from typing import TYPE_CHECKING
+
+ if TYPE_CHECKING:
+     from agents.agent_flow import AgentResponse
865
+ ```
866
+
867
+ Change `update_from_messages` signature:
868
+
869
+ ```python
870
+ def update_from_messages(
+     self,
+     agent_responses: list[AgentResponse],
+     user_message: str = "",
+ ) -> DesignState:
875
+ ```
876
+
877
+ Update body — change `r.get("message", "")` to `r.message`:
878
+
879
+ ```python
880
+ all_text = user_message + " " + " ".join(r.message for r in agent_responses)
881
+ ```
882
+
883
+ And in the decisions extraction loop:
884
+
885
+ ```python
886
+ for resp in agent_responses:
+     msg = resp.message
888
+ ```
889
+
890
+ Change `extract_decisions` wrapper:
891
+
892
+ ```python
893
+ def extract_decisions(
+     agent_responses: list[AgentResponse],
+     current_state: DesignState,
+     user_message: str = "",
+ ) -> DesignState:
898
+ ```
899
+
900
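+ Note: `design_state.py` already has `from __future__ import annotations` at line 1, so do not add that line a second time; only the `TYPE_CHECKING` block is new. Because annotations are not evaluated at runtime, the guarded import is enough for the type hints, and `agent_flow` is never imported when `design_state` loads.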
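
The following sketch illustrates the pattern in isolation. The module name `hypothetical_agent_flow` is deliberately fake and does not exist: the point is that the guarded import never runs, so the script still executes. `count_messages` is an illustrative function, not part of the project.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers follow this branch; at runtime it is skipped,
    # so the nonexistent module below is never imported.
    from hypothetical_agent_flow import AgentResponse


def count_messages(agent_responses: list[AgentResponse]) -> int:
    """Counts responses; the annotation is stored as text, not evaluated."""
    return len(agent_responses)


# Under postponed evaluation, annotations are kept as plain strings.
annotation = count_messages.__annotations__["agent_responses"]
print(annotation)
print("AgentResponse" in globals())
```

One consequence worth keeping in mind: because the name is absent at runtime, code in `design_state.py` cannot call `isinstance(x, AgentResponse)` without a real import.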
901
+
902
+ - [ ] **Step 6: Run tests**
903
+
904
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_gap_analyzer.py tests/test_design_state.py -v`
905
+ Expected: ALL PASS
906
+
907
+ - [ ] **Step 7: Commit**
908
+
909
+ ```bash
910
+ git add agents/gap_analyzer.py agents/design_state.py tests/test_gap_analyzer.py tests/test_design_state.py
911
+ git commit -m "refactor: type analyze_gaps and extract_decisions with AgentResponse"
912
+ ```
913
+
914
+ ---
915
+
916
+ ### Task 7: Update ContextVar State to Use DesignState
917
+
918
+ **Files:**
919
+ - Modify: `agents/tools.py:29-43` (ContextVar and accessors)
920
+ - Modify: `agents/tools.py:112-178` (QueryDesignStateTool)
921
+ - Test: `tests/test_tools.py`
922
+
923
+ - [ ] **Step 1: Update ContextVar and accessors in agents/tools.py**
924
+
925
+ Change lines 29-43:
926
+
927
+ ```python
928
+ from agents.design_state import DesignState
929
+
930
+ _last_shape_var: ContextVar[object | None] = ContextVar("last_shape", default=None)
931
+ _design_state_var: ContextVar[DesignState | None] = ContextVar("design_state", default=None)
932
+
933
+
934
+ def set_last_shape(shape):
+     _last_shape_var.set(shape)
+
+ def get_last_shape():
+     return _last_shape_var.get()
+
+ def set_design_state(state: DesignState):
+     _design_state_var.set(state)
+
+ def get_design_state() -> DesignState | None:
+     return _design_state_var.get()
945
+ ```
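
As a rough illustration of how typed `ContextVar` accessors behave, the sketch below uses a hypothetical `DesignStateStandIn` dataclass in place of the real `DesignState`. It shows the `None` default and that a value set inside a copied context does not leak back to the caller.

```python
import contextvars
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Optional


@dataclass
class DesignStateStandIn:
    """Hypothetical stand-in for DesignState, reduced to one field."""
    material: str = ""


_design_state_var: ContextVar[Optional[DesignStateStandIn]] = ContextVar(
    "design_state", default=None
)


def set_design_state(state: DesignStateStandIn) -> None:
    _design_state_var.set(state)


def get_design_state() -> Optional[DesignStateStandIn]:
    return _design_state_var.get()


def worker() -> str:
    # Runs inside a copied context: this set() stays local to that copy.
    set_design_state(DesignStateStandIn(material="steel"))
    return get_design_state().material


before = get_design_state()                        # default, nothing set yet
set_design_state(DesignStateStandIn(material="aluminum"))
inside = contextvars.copy_context().run(worker)    # sees and overrides a copy
after = get_design_state().material                # caller's value is untouched
```

Callers of `get_design_state()` still need the `None` check that `QueryDesignStateTool._run()` performs, since the default is `None` until something sets the variable.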
946
+
947
+ - [ ] **Step 2: Update QueryDesignStateTool to use DesignState directly**
948
+
949
+ Change `QueryDesignStateTool._run()` to stop reconstructing from dict:
950
+
951
+ ```python
952
+ def _run(self, check: str = "all") -> str:
+     from agents.design_state import compute_score
+     from config.settings import settings
+
+     if check not in VALID_CHECKS:
+         return json.dumps({"error": f"Invalid check: {check!r}. Valid: {sorted(VALID_CHECKS)}"})
+
+     state = get_design_state()
+     if state is None:
+         return json.dumps({"error": "No design state available."})
+
+     score = compute_score(state)
+     threshold = settings.planning.threshold
+
+     known = {}
+     missing = []
+     # ... rest unchanged — state is already a DesignState ...
969
+ ```
970
+
971
+ Remove the `from agents.design_state import DesignState` line inside `_run()` (the import is now at module level) and remove the `state = DesignState(**state_dict)` line, since `state` is already a `DesignState`.
972
+
973
+ - [ ] **Step 3: Run tests**
974
+
975
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_tools.py -v`
976
+ Expected: ALL PASS
977
+
978
+ - [ ] **Step 4: Commit**
979
+
980
+ ```bash
981
+ git add agents/tools.py
982
+ git commit -m "refactor: type ContextVar design state as DesignState"
983
+ ```
984
+
985
+ ---
986
+
987
+ ### Task 8: Update BaseOrchestrator and MockChatBackend
988
+
989
+ **Files:**
990
+ - Modify: `agents/base.py`
991
+ - Modify: `agents/orchestrator.py`
992
+ - Test: `tests/test_mock_orchestrator.py`, `tests/test_base_orchestrator.py`
993
+
994
+ - [ ] **Step 1: Update BaseOrchestrator.chat_turn signature**
995
+
996
+ In `agents/base.py`:
997
+
998
+ ```python
999
+ """Base orchestrator — abstract interface for all chat orchestrators."""
1000
+
1001
+ from __future__ import annotations
1002
+
1003
+ from abc import ABC, abstractmethod
1004
+ from pathlib import Path
1005
+ from typing import TYPE_CHECKING
1006
+
1007
+ from config.settings import settings
1008
+
1009
+ if TYPE_CHECKING:
+     from agents.agent_flow import ChatTurnResponse
+     from agents.design_state import DesignState
+
+
+ class BaseOrchestrator(ABC):
+     """Abstract base for MockChatBackend and CrewOrchestrator."""
+
+     def __init__(self, output_dir: Path | str | None = None):
+         self.output_dir = Path(output_dir) if output_dir else settings.output_dir
+         self.output_dir.mkdir(parents=True, exist_ok=True)
+
+     @abstractmethod
+     def chat_turn(
+         self,
+         message: str,
+         history: list[dict],
+         mentions: list[str] | None = None,
+         design_state: DesignState | None = None,
+         plan_context: bool = False,
+     ) -> ChatTurnResponse:
+         """Run one chat turn. Returns ChatTurnResponse."""
+         ...
1032
+ ```
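
For readers less familiar with `abc`, this small sketch shows what the `@abstractmethod` decorator enforces. `OrchestratorSketch`, `Incomplete`, and `Echo` are hypothetical classes with a simplified signature, not the project's orchestrators.

```python
from abc import ABC, abstractmethod


class OrchestratorSketch(ABC):
    """Simplified stand-in for an abstract orchestrator base."""

    @abstractmethod
    def chat_turn(self, message: str, history: list) -> str:
        ...


class Incomplete(OrchestratorSketch):
    """Forgets to implement chat_turn."""


class Echo(OrchestratorSketch):
    def chat_turn(self, message: str, history: list) -> str:
        return f"echo: {message}"


try:
    Incomplete()
    error = None
except TypeError as exc:
    # Instantiation fails because the abstract method is not overridden.
    error = str(exc)

reply = Echo().chat_turn("hi", [])
```

Note that `abc` only checks that the method exists; it does not verify that a subclass keeps the same parameters or return type, so signature drift between `MockChatBackend` and `CrewOrchestrator` would have to be caught by a type checker or tests.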
1033
+
1034
+ - [ ] **Step 2: Update _format_response and MockChatBackend in orchestrator.py**
1035
+
1036
+ Replace `_format_response()` with usage of `AgentResponse.from_agent()`. In `agents/orchestrator.py`:
1037
+
1038
+ Remove the `_format_response` function entirely.
1039
+
1040
+ Add import:
1041
+
1042
+ ```python
1043
+ from agents.agent_flow import AgentResponse, ChatTurnResponse, PreviewData
1044
+ ```
1045
+
1046
+ Update `_execute_cad_code` to return `PreviewData | None`:
1047
+
1048
+ ```python
1049
+ def _execute_cad_code(
+     code: str,
+     prompt: str,
+     output_dir: Path,
+     backend: object | None = None,
+     max_retries: int = 2,
+     cam_plan: "CAMPlan | None" = None,
+ ) -> PreviewData | None:
1057
+ ```
1058
+
1059
+ Replace the dict construction with `PreviewData(...)`:
1060
+
1061
+ ```python
1062
+ if not exec_result.success:
+     return PreviewData(success=False, error=exec_result.error)
+
+ # ...
+
+ preview_data = PreviewData(
+     success=True,
+     part_name=part_name,
+     stl_url=f"/api/models/{part_name}.stl",
+     step_url=f"/api/models/{part_name}.step",
+     execution=exec_result.model_dump(by_alias=True),
+     validation=validation.model_dump(),
+ )
+
+ if cam_plan:
+     cam_operations = cam_plan.operations
+     cam_tool = cam_plan.to_tool_config()
+     cam_post = cam_plan.post_processor
+     if cam_operations:
+         cam_result = generate_gcode(
+             shape=exec_result.result,
+             operations=cam_operations,
+             tool_config=cam_tool,
+             post_processor=cam_post,
+         )
+         preview_data.cam = cam_result.model_dump()
+         if cam_result.success and cam_result.gcode:
+             gcode_path = output_dir / f"{part_name}.gcode"
+             gcode_path.write_text(cam_result.gcode)
+             preview_data.gcode_url = f"/api/models/{part_name}.gcode"
+
+ return preview_data
1094
+ ```
1095
+
1096
+ Update `MockChatBackend.chat_turn()`:
1097
+
1098
+ ```python
1099
+ def chat_turn(
+     self,
+     message: str,
+     history: list[dict],
+     mentions: list[str] | None = None,
+     max_history: int = 30,
+     design_state: DesignState | None = None,
+     plan_context: bool = False,
+ ) -> ChatTurnResponse:
+     """Return ChatTurnResponse."""
+     state = design_state if isinstance(design_state, DesignState) else DesignState(**(design_state or {}))
+     lower = message.lower()
+
+     if mentions:
+         active = mentions
+     else:
+         active = route_agents(message, mentions=[], is_approved_phase=False)
+
+     responses: list[AgentResponse] = []
+     preview = None
+
+     if "design" in active:
+         responses.append(AgentResponse.from_agent("design", self._design_response(lower)))
+
+     if "engineering" in active:
+         responses.append(AgentResponse.from_agent("engineering", self._engineering_response(lower)))
+
+     if "cnc" in active:
+         responses.append(AgentResponse.from_agent("cnc", self._cnc_response(lower)))
+
+     if "cad" in active:
+         from core.cadquery_prompts import build_messages
+         mock = MockBackend()
+         code = mock.generate(build_messages(message))
+         responses.append(
+             AgentResponse.from_agent("cad", "Model generated. Click the 3D viewer to inspect it.", code=code)
+         )
+         preview = _execute_cad_code(code, message, self.output_dir)
+
+     updated_state = extract_decisions(responses, state, message)
+
+     return ChatTurnResponse(responses=responses, preview=preview, design_state=updated_state)
1141
+ ```
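
The `isinstance(...) else DesignState(**(design_state or {}))` expression above appears again in Task 9. One option, sketched here under assumptions, is to name that normalization once. `coerce_design_state` and `DesignStateStandIn` are hypothetical; the plan itself keeps the inline expression.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class DesignStateStandIn:
    """Hypothetical stand-in for DesignState with two of its fields."""
    phase: str = "exploring"
    material: str = ""


def coerce_design_state(
    value: Union[DesignStateStandIn, dict, None],
) -> DesignStateStandIn:
    """Accepts a typed state, a legacy dict, or None and returns a typed state."""
    if isinstance(value, DesignStateStandIn):
        return value                      # already typed: pass through unchanged
    return DesignStateStandIn(**(value or {}))  # dict or None: build a fresh state


existing = DesignStateStandIn(material="steel")
from_none = coerce_design_state(None)
from_dict = coerce_design_state({"material": "aluminum"})
passthrough = coerce_design_state(existing)
```

Tolerating dicts is only useful while some callers still pass them; once every caller sends a `DesignState`, the branch could be dropped.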
1142
+
1143
+ - [ ] **Step 3: Update test_mock_orchestrator.py**
1144
+
1145
+ Tests now assert on `ChatTurnResponse` attributes:
1146
+
1147
+ ```python
1148
+ """Tests for agents/orchestrator.py — MockChatBackend and helpers."""
1149
+
1150
+ from agents.orchestrator import MockChatBackend
1151
+ from agents.agent_flow import AgentResponse, ChatTurnResponse
1152
+ from agents.definitions import AGENTS
1153
+
1154
+
1155
+ class TestMockChatBackend:
+     def test_response_shape(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("I need a bracket", history=[])
+         assert isinstance(result, ChatTurnResponse)
+         assert isinstance(result.responses, list)
+         assert len(result.responses) > 0
+         assert isinstance(result.responses[0], AgentResponse)
+
+     def test_bracket_routes_to_design(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("Design a mounting bracket", history=[])
+         agent_ids = [r.agent_id for r in result.responses]
+         assert "design" in agent_ids
+
+     def test_mention_overrides_routing(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("What do you think?", history=[], mentions=["cnc"])
+         agent_ids = [r.agent_id for r in result.responses]
+         assert agent_ids == ["cnc"]
+
+     def test_cad_mention_generates_code(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("Generate a 50mm cube", history=[], mentions=["cad"])
+         agent_ids = [r.agent_id for r in result.responses]
+         assert "cad" in agent_ids
+         cad_resp = next(r for r in result.responses if r.agent_id == "cad")
+         assert cad_resp.code is not None
+         assert "result" in cad_resp.code
+
+     def test_design_state_updated(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("Make it 60mm wide in aluminum", history=[])
+         assert result.design_state is not None
+
+     def test_engineering_keywords_trigger_engineering(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("Use M6 bolts with 3mm wall thickness", history=[])
+         agent_ids = [r.agent_id for r in result.responses]
+         assert "engineering" in agent_ids
+
+     def test_cnc_keywords_trigger_cnc(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("Can this be machined on a CNC mill?", history=[])
+         agent_ids = [r.agent_id for r in result.responses]
+         assert "cnc" in agent_ids
+
+     def test_generic_message_default_agents(self, tmp_output_dir):
+         mock = MockChatBackend(output_dir=tmp_output_dir)
+         result = mock.chat_turn("Hello there", history=[])
+         agent_ids = [r.agent_id for r in result.responses]
+         assert "design" in agent_ids
+         assert "engineering" in agent_ids
1208
+ ```
1209
+
1210
+ - [ ] **Step 4: Run tests**
1211
+
1212
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_mock_orchestrator.py tests/test_base_orchestrator.py -v`
1213
+ Expected: ALL PASS
1214
+
1215
+ - [ ] **Step 5: Commit**
1216
+
1217
+ ```bash
1218
+ git add agents/base.py agents/orchestrator.py tests/test_mock_orchestrator.py
1219
+ git commit -m "refactor: update BaseOrchestrator and MockChatBackend to return ChatTurnResponse"
1220
+ ```
1221
+
1222
+ ---
1223
+
1224
+ ### Task 9: Update CrewOrchestrator to Return ChatTurnResponse
1225
+
1226
+ **Files:**
1227
+ - Modify: `agents/crew_orchestrator.py`
1228
+ - Test: `tests/test_crew_orchestrator.py`
1229
+
1230
+ This is the largest change. The crew orchestrator currently builds dicts at multiple points and serializes `AgentResponse` objects to dicts. We reverse that and keep everything as typed models.
1231
+
1232
+ - [ ] **Step 1: Update CrewOrchestrator imports**
1233
+
1234
+ At the top of `agents/crew_orchestrator.py`, update imports:
1235
+
1236
+ ```python
1237
+ from agents.agent_flow import AgentResponse, ChatTurnResponse, PreviewData
1238
+ ```
1239
+
1240
+ - [ ] **Step 2: Update chat_turn method**
1241
+
1242
+ Change the signature to match the base:
1243
+
1244
+ ```python
1245
+ def chat_turn(
+     self,
+     message: str,
+     history: list[dict],
+     mentions: list[str] | None = None,
+     max_history: int = 30,
+     design_state: DesignState | None = None,
+     plan_context: bool = False,
+ ) -> ChatTurnResponse:
1254
+ ```
1255
+
1256
+ Update the plan trigger early return:
1257
+
1258
+ ```python
1259
+ state = design_state if isinstance(design_state, DesignState) else DesignState(**(design_state or {}))
+ if state.phase == "exploring" and _is_plan_trigger(message):
+     score = compute_score(state)
+     plan = DesignPlan.from_state(state, confidence_score=score)
+     state.phase = "planning"
+     state.plan = plan
+     return ChatTurnResponse(design_state=state)
1266
+ ```
1267
+
1268
+ Update error fallback return:
1269
+
1270
+ ```python
1271
+ return ChatTurnResponse(
+     responses=[AgentResponse.from_agent(
+         "design",
+         f"Backend error: {exc}. Fallback also failed: {fallback_exc}. "
+         f"Please check that your API key is set correctly.",
+     )],
+     design_state=DesignState(**(design_state.model_dump() if isinstance(design_state, DesignState) else (design_state or {}))),
+ )
1279
+ ```
1280
+
1281
+ - [ ] **Step 3: Update _run_crew method**
1282
+
1283
+ Change return type and body:
1284
+
1285
+ ```python
1286
+ def _run_crew(
+     self,
+     message: str,
+     history: list[dict],
+     mentions: list[str] | None,
+     max_history: int,
+     design_state: DesignState | None,
+     plan_context: bool = False,
+ ) -> ChatTurnResponse:
1295
+ ```
1296
+
1297
+ Update `set_design_state` call — pass the DesignState directly (not `.model_dump()`):
1298
+
1299
+ ```python
1300
+ set_design_state(state)
1301
+ ```
1302
+
1303
+ Remove the `responses = [r.model_dump() for r in agent_responses]` line. Keep responses as `list[AgentResponse]`.
1304
+
1305
+ Build `PreviewData` instead of dict for preview:
1306
+
1307
+ ```python
1308
+ preview = None
+ if cad_code:
+     from agents.tools import get_last_shape
+     shape = get_last_shape()
+     if shape is not None:
+         from core.executor import export_all
+         from core.validator import validate_for_cnc
+
+         part_name = derive_part_name(message)
+         base_path = self.output_dir / part_name
+         try:
+             export_all(shape, base_path)
+         except Exception:
+             pass
+
+         execution_data = {"success": True}
+         try:
+             bb = shape.val().BoundingBox()
+             execution_data["volume_mm3"] = shape.val().Volume()
+             execution_data["bounding_box_mm"] = [bb.xlen, bb.ylen, bb.zlen]
+             execution_data["face_count"] = len(shape.faces().vals())
+             execution_data["edge_count"] = len(shape.edges().vals())
+         except Exception:
+             pass
+
+         validation = validate_for_cnc(shape, part_name=part_name)
+         preview = PreviewData(
+             success=True,
+             part_name=part_name,
+             stl_url=f"/api/models/{part_name}.stl",
+             step_url=f"/api/models/{part_name}.step",
+             threemf_url=f"/api/models/{part_name}.3mf",
+             execution=execution_data,
+             validation=validation.model_dump(),
+         )
1343
+ ```
1344
+
1345
+ Update G-code generation to use `preview.part_name` etc.:
1346
+
1347
+ ```python
1348
+ if preview and preview.success and cam_plan:
+     from core.cam import generate_gcode
+     from agents.tools import get_last_shape
+     shape = get_last_shape()
+     if shape is not None:
+         cam_result = generate_gcode(
+             shape=shape,
+             operations=cam_plan.operations,
+             tool_config=cam_plan.to_tool_config(),
+             post_processor=cam_plan.post_processor,
+         )
+         preview.cam = cam_result.model_dump()
+         if cam_result.success and cam_result.gcode:
+             gcode_path = self.output_dir / f"{preview.part_name}.gcode"
+             gcode_path.write_text(cam_result.gcode)
+             preview.gcode_url = f"/api/models/{preview.part_name}.gcode"
1364
+ ```
1365
+
1366
+ Pass `agent_responses` (list of `AgentResponse`) directly to `extract_decisions` and `analyze_gaps`:
1367
+
1368
+ ```python
1369
+ updated_state = extract_decisions(agent_responses, state, message)
+
+ gap_result = analyze_gaps(agent_responses)
+ question_cards = []
+ if gap_result.has_gaps:
+     question_cards = generate_question_cards(gap_result, updated_state, user_message=message)
1375
+ ```
1376
+
1377
+ Update the NOT READY check to use `AgentResponse` attributes:
1378
+
1379
+ ```python
1380
+ if state.phase == "approved":
+     for r in agent_responses:
+         if r.agent_id == "cad" and r.message.upper().startswith("NOT READY:"):
+             updated_state.phase = "exploring"
+             updated_state.plan = None
+             break
1386
+ ```
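
The prefix test in this check is case-insensitive but sensitive to leading whitespace. The sketch below isolates it. `is_not_ready` mirrors the expression above; `is_not_ready_lenient` is a hypothetical variant that strips leading whitespace, which is an assumption about desired behavior and not something the plan specifies.

```python
def is_not_ready(message: str) -> bool:
    """Mirrors the check in the plan: case-insensitive prefix match."""
    return message.upper().startswith("NOT READY:")


def is_not_ready_lenient(message: str) -> bool:
    """Hypothetical variant: also tolerates leading whitespace or newlines."""
    return message.lstrip().upper().startswith("NOT READY:")


samples = [
    "NOT READY: Need dimensions.",
    "not ready: need material",
    "  NOT READY: padded by the model",
    "The part is not ready: still reviewing",
]

strict = [is_not_ready(s) for s in samples]
lenient = [is_not_ready_lenient(s) for s in samples]
```

Both versions ignore the phrase when it appears mid-sentence, so an agent that merely mentions "not ready" does not reset the phase.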
1387
+
1388
+ Return `ChatTurnResponse`:
1389
+
1390
+ ```python
1391
+ return ChatTurnResponse(
+     responses=agent_responses,
+     preview=preview,
+     design_state=updated_state,
+     question_cards=question_cards,
+ )
1397
+ ```
1398
+
1399
+ - [ ] **Step 4: Update _fallback method**
1400
+
1401
+ ```python
1402
+ def _fallback(
+     self,
+     message: str,
+     history: list[dict],
+     mentions: list[str] | None,
+     max_history: int,
+     design_state: DesignState | None,
+     plan_context: bool = False,
+ ) -> ChatTurnResponse:
+     """Fall back to MockChatBackend."""
+     from agents.tools import set_design_state
+     from agents.orchestrator import MockChatBackend
+
+     state = design_state if isinstance(design_state, DesignState) else DesignState(**(design_state or {}))
+     state = state.update_from_messages([], user_message=message)
+     set_design_state(state)
+
+     mock = MockChatBackend(output_dir=self.output_dir)
+     result = mock.chat_turn(message, history, mentions, design_state=state, plan_context=plan_context)
+     if not result.question_cards:
+         gap_result = analyze_gaps(result.responses)
+         if gap_result.has_gaps:
+             result.question_cards = generate_question_cards(gap_result, state, user_message=message)
+     return result
1426
+ ```
1427
+
1428
+ - [ ] **Step 5: Update test_crew_orchestrator.py**
1429
+
1430
+ Update tests to assert on `ChatTurnResponse` attributes:
1431
+
1432
+ ```python
1433
+ class TestCrewOrchestratorFallback:
+     def test_falls_back_when_crewai_unavailable(self, tmp_output_dir):
+         orch = CrewOrchestrator(backend_name="gemini", output_dir=tmp_output_dir)
+         orch._crew_available = False
+         result = orch.chat_turn("test", history=[])
+         assert isinstance(result, ChatTurnResponse)
+         assert result.preview is None or isinstance(result.preview, PreviewData)
+
+     def test_response_format(self, tmp_output_dir):
+         orch = CrewOrchestrator(backend_name="gemini", output_dir=tmp_output_dir)
+         orch._crew_available = False
+         result = orch.chat_turn("I need a bracket", history=[])
+         assert isinstance(result.responses, list)
+         assert isinstance(result.design_state, DesignState)
1447
+ ```
1448
+
1449
+ Add import at top:
1450
+
1451
+ ```python
1452
+ from agents.agent_flow import ChatTurnResponse, PreviewData
1453
+ from agents.design_state import DesignState
1454
+ ```
1455
+
1456
+ Update `TestGapAnalysis`:
1457
+
1458
+ ```python
1459
+ class TestGapAnalysis:
+     def test_not_ready_produces_question_cards(self):
+         orch = CrewOrchestrator(backend_name="mock")
+         result = orch.chat_turn(message="generate a bracket", history=[], design_state=None)
+         assert isinstance(result.question_cards, list)
+
+     def test_no_question_cards_when_no_gaps(self):
+         orch = CrewOrchestrator(backend_name="mock")
+         result = orch.chat_turn(
+             message="I need a bracket", history=[],
+             design_state=DesignState(material="aluminum", dimensions={"width": 60}),
+         )
+         assert isinstance(result.question_cards, list)
+
+     def test_plan_trigger_includes_question_cards_key(self):
+         orch = CrewOrchestrator(backend_name="mock")
+         result = orch.chat_turn(
+             message="show plan", history=[],
+             design_state=DesignState(material="aluminum"),
+         )
+         assert result.question_cards == []
1480
+ ```
1481
+
1482
+ Update `TestPlanningPhase`:
1483
+
1484
+ ```python
1485
+ class TestPlanningPhase:
+     def test_manual_plan_trigger(self):
+         orch = CrewOrchestrator(backend_name="mock")
+         state = DesignState(
+             part_name="bracket",
+             material="aluminum 6061",
+             dimensions={"width": 60, "height": 40, "depth": 20},
+             axis_recommendation="3-axis",
+         )
+         result = orch.chat_turn(message="show plan", history=[], design_state=state)
+         assert result.design_state.phase == "planning"
+         assert result.design_state.plan is not None
+         assert result.design_state.plan.material == "aluminum 6061"
+
+     def test_approved_phase_keeps_approved(self):
+         orch = CrewOrchestrator(backend_name="mock")
+         plan = DesignPlan(
+             part_name="bracket", description="test", material="aluminum",
+             dimensions={"width": 60}, features=[], constraints=[],
+             axis_recommendation="3-axis", machining_notes=[],
+             confidence_score=9.0,
+         )
+         state = DesignState(
+             phase="approved", plan=plan,
+             material="aluminum", dimensions={"width": 60},
+         )
+         result = orch.chat_turn(message="Generate the approved design", history=[], design_state=state)
+         assert isinstance(result.responses, list)
+
+     def test_planning_phase_resets_on_message(self):
+         orch = CrewOrchestrator(backend_name="mock")
+         plan = DesignPlan(
+             part_name="bracket", description="", material="steel",
+             dimensions={}, features=[], constraints=[],
+             axis_recommendation="", machining_notes=[],
+             confidence_score=5.0,
+         )
+         state = DesignState(phase="planning", plan=plan, material="steel")
+         result = orch.chat_turn(message="actually change the material", history=[], design_state=state)
+         assert result.design_state.phase in ("exploring", "planning")
1525
+ ```
1526
+
1527
+ - [ ] **Step 6: Run tests**
1528
+
1529
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_crew_orchestrator.py tests/test_mock_orchestrator.py -v`
1530
+ Expected: ALL PASS
1531
+
1532
+ - [ ] **Step 7: Commit**
1533
+
1534
+ ```bash
1535
+ git add agents/crew_orchestrator.py tests/test_crew_orchestrator.py
1536
+ git commit -m "refactor: update CrewOrchestrator to return ChatTurnResponse"
1537
+ ```
1538
+
1539
+ ---
1540
+
1541
+ ### Task 10: Update Server Routes to Use Typed Models
1542
+
1543
+ **Files:**
1544
+ - Modify: `server/routes.py`
1545
+ - Test: `tests/test_api_routes.py`
1546
+
1547
+ The server routes are the HTTP boundary: they receive JSON (parsed into dicts) from clients and return JSON. The key changes are to type the request model fields and to serialize the result with `ChatTurnResponse.model_dump()`.
1548
+
1549
+ - [ ] **Step 1: Update request models in server/routes.py**
1550
+
1551
+ ```python
1552
+ from agents.design_state import DesignState, DesignPlan
1553
+
1554
+
1555
+ class ChatRequest(BaseModel):
+     message: str = Field(..., min_length=1)
+     history: list[ChatMessage] = Field(default_factory=list)
+     mentions: list[str] = Field(default_factory=list)
+     backend: str = "gemini"
+     design_state: DesignState = Field(default_factory=DesignState)
+     plan_context: bool = False
+
+
+ class PlanApproveRequest(BaseModel):
+     plan: DesignPlan
+     design_state: DesignState = Field(default_factory=DesignState)
+
+
+ class PlanRejectRequest(BaseModel):
+     design_state: DesignState = Field(default_factory=DesignState)
1571
+ ```
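
A practical effect of typing `design_state` as a model is that nested JSON is validated when the request is parsed. The sketch below assumes Pydantic v2 and uses hypothetical, cut-down models (`DesignStateSketch`, `ChatRequestSketch`) whose fields are only a guess at a subset of the real ones.

```python
from pydantic import BaseModel, Field, ValidationError


class DesignStateSketch(BaseModel):
    """Hypothetical subset of DesignState."""
    phase: str = "exploring"
    material: str = ""
    dimensions: dict[str, float] = Field(default_factory=dict)


class ChatRequestSketch(BaseModel):
    """Hypothetical subset of ChatRequest."""
    message: str = Field(..., min_length=1)
    design_state: DesignStateSketch = Field(default_factory=DesignStateSketch)


# A nested dict from the client becomes a typed model.
full = ChatRequestSketch.model_validate(
    {"message": "hi", "design_state": {"material": "aluminum", "dimensions": {"width": 60}}}
)

# Omitting design_state falls back to the default factory.
bare = ChatRequestSketch.model_validate({"message": "hi"})

# A malformed nested value is rejected before any handler code runs.
try:
    ChatRequestSketch.model_validate(
        {"message": "hi", "design_state": {"dimensions": "wide"}}
    )
    rejected = False
except ValidationError:
    rejected = True
```

If these routes are served by FastAPI, as the `router.post` decorators suggest, such a validation failure would surface to the client as a 422 response, so clients that previously sent loosely shaped state may start seeing errors.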
1572
+
1573
+ - [ ] **Step 2: Update chat endpoint**
1574
+
1575
+ ```python
1576
+ @router.post("/api/chat")
+ async def chat(body: ChatRequest):
+     """Multi-agent chat turn."""
+     message = body.message.strip()
+     history = [m.model_dump() for m in body.history]
+     backend_name = body.backend
+
+     raw_mentions = body.mentions
+     if not raw_mentions:
+         message, raw_mentions = parse_mentions(message)
+
+     mentions = raw_mentions if raw_mentions else None
+
+     orchestrator = get_orchestrator(backend_name, output_dir=OUTPUT_DIR)
+
+     try:
+         result = orchestrator.chat_turn(
+             message=message,
+             history=history,
+             mentions=mentions,
+             design_state=body.design_state,
+             plan_context=body.plan_context,
+         )
+         return JSONResponse(result.model_dump())
+     except Exception as e:
+         import logging
+         logging.exception("Chat turn failed")
+         return JSONResponse(
+             {"error": f"Chat turn failed: {e}"},
+             status_code=500,
+         )
1607
+ ```
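
One caveat when passing `model_dump()` output to a JSON encoder: in its default mode, Pydantic v2 returns Python objects as-is, so a field holding a non-JSON type would not encode. Whether `ChatTurnResponse` has such a field is not stated in this plan, so the sketch below uses a hypothetical `PreviewSketch` model with a `Path` field purely to show the difference between the default mode and `mode="json"`.

```python
import json
from pathlib import Path

from pydantic import BaseModel


class PreviewSketch(BaseModel):
    """Hypothetical model; the Path field is an assumption for illustration."""
    part_name: str
    stl_path: Path


preview = PreviewSketch(part_name="bracket", stl_path=Path("output/bracket.stl"))

python_dump = preview.model_dump()           # keeps the Path object
json_dump = preview.model_dump(mode="json")  # converts to JSON-safe types

try:
    json.dumps(python_dump)
    python_dump_ok = True
except TypeError:
    # The standard encoder does not know how to serialize Path.
    python_dump_ok = False

encoded = json.dumps(json_dump)
```

If every field in the response models is already a string, number, bool, list, or dict, the two modes produce the same result and the plain `model_dump()` call in the endpoint is sufficient.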
1608
+
1609
+ - [ ] **Step 3: Update plan endpoints**
1610
+
1611
+ ```python
1612
+ @router.post("/api/plan/approve")
+ async def plan_approve(body: PlanApproveRequest):
+     """Approve (possibly edited) design plan, merge into state."""
+     plan = body.plan
+     state = body.design_state
+     state.part_name = plan.part_name
+     state.description = plan.description
+     state.material = plan.material
+     state.dimensions = dict(plan.dimensions)
+     state.features = list(plan.features)
+     state.constraints = list(plan.constraints)
+     state.axis_recommendation = plan.axis_recommendation
+     state.phase = "approved"
+     state.plan = plan
+     return JSONResponse({"design_state": state.model_dump()})
+
+
+ @router.post("/api/plan/reject")
+ async def plan_reject(body: PlanRejectRequest):
+     """Reject plan, reset to exploring."""
+     state = body.design_state
+     state.phase = "exploring"
+     state.plan = None
+     return JSONResponse({"design_state": state.model_dump()})
1636
+ ```
1637
+
1638
+ - [ ] **Step 4: Update report endpoint to access ChatMessage fields directly**
1639
+
1640
+ ```python
1641
+ @router.post("/api/report")
+ async def report(body: ReportRequest):
+     """Generate a design report from conversation history."""
+     part_name = body.part_name
+     report_sections = [f"# Design Report: {part_name}\n"]
+
+     design_decisions = []
+     engineering_specs = []
+     cnc_notes = []
+
+     for msg in body.history:
+         if msg.agent_id == "design":
+             design_decisions.append(msg.content)
+         elif msg.agent_id == "engineering":
+             engineering_specs.append(msg.content)
+         elif msg.agent_id == "cnc":
+             cnc_notes.append(msg.content)
+     # ... rest unchanged ...
1659
+ ```
1660
+
1661
+ - [ ] **Step 5: Run API tests**
1662
+
1663
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/test_api_routes.py -v`
1664
+ Expected: ALL PASS (the JSON shape should stay the same, since Pydantic serializes the models to the same structure the dicts had)
1665
+
1666
+ - [ ] **Step 6: Commit**
1667
+
1668
+ ```bash
1669
+ git add server/routes.py
1670
+ git commit -m "refactor: type server route request models with DesignState/DesignPlan"
1671
+ ```
1672
+
1673
+ ---
1674
+
1675
+ ### Task 11: Update conftest Fixtures and Run Full Test Suite
1676
+
1677
+ **Files:**
1678
+ - Modify: `tests/conftest.py`
1679
+
1680
+ - [ ] **Step 1: Update conftest fixtures**
1681
+
1682
+ ```python
1683
+ """Shared fixtures for NeuralCAD tests."""
1684
+
1685
+ import pytest
1686
+ from pathlib import Path
1687
+
1688
+ from agents.design_state import DesignState
1689
+
1690
+
1691
+ @pytest.fixture
+ def tmp_output_dir(tmp_path):
+     """Temporary output directory for model files."""
+     out = tmp_path / "output"
+     out.mkdir()
+     return out
+
+
+ @pytest.fixture
+ def sample_history():
+     """A typical multi-turn conversation history."""
+     return [
+         {"role": "user", "content": "I need a servo bracket for an MG996R"},
+         {"role": "agent", "agent_id": "design", "content": "I'd suggest an L-bracket with a servo pocket on the vertical face."},
+         {"role": "agent", "agent_id": "engineering", "content": "3mm wall thickness in aluminum 6061-T6 should handle the load."},
+         {"role": "user", "content": "Make it 60mm wide with M4 base mounting holes"},
+     ]
+
+
+ @pytest.fixture
+ def empty_design_state():
+     """Empty design state."""
+     return DesignState()
+
+
+ @pytest.fixture
+ def populated_design_state():
+     """Design state with some decisions already made."""
+     return DesignState(
+         part_name="servo_bracket",
+         material="aluminum 6061",
+         dimensions={"width": 60.0},
+         features=["4x M4 holes"],
+         decisions=["L-bracket form factor"],
+     )
+
+
+ class FakeLLMBackend:
+     """A controllable fake LLM backend for testing orchestrators."""
+
+     def __init__(self, response: str = '{"agents": []}'):
+         self.response = response
+         self.calls: list[list[dict]] = []
+
+     def generate(self, messages: list[dict]) -> str:
+         self.calls.append(messages)
+         return self.response
+
+
+ @pytest.fixture
+ def fake_backend():
+     """FakeLLMBackend factory — call with desired JSON response."""
+     def _make(response: str = '{"agents": []}'):
+         return FakeLLMBackend(response)
+     return _make
1746
+ ```
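As a rough illustration of how the `fake_backend` factory above might be used in a test, the sketch below repeats `FakeLLMBackend` so it runs on its own. The `make_backend` helper and the canned payload are hypothetical stand-ins for the fixture and for whatever JSON a real orchestrator test would need.

```python
import json


class FakeLLMBackend:
    """Same fake as in conftest.py, repeated so this sketch is self-contained."""

    def __init__(self, response: str = '{"agents": []}'):
        self.response = response
        self.calls: list[list[dict]] = []

    def generate(self, messages: list[dict]) -> str:
        # Record every prompt so a test can assert on what was sent.
        self.calls.append(messages)
        return self.response


def make_backend(response: str = '{"agents": []}') -> FakeLLMBackend:
    """Hypothetical stand-in for the inner factory returned by `fake_backend`."""
    return FakeLLMBackend(response)


def test_fake_backend_records_calls():
    canned = '{"agents": ["design"]}'  # hypothetical payload, not from the project
    backend = make_backend(canned)
    messages = [{"role": "user", "content": "I need a servo bracket"}]

    reply = backend.generate(messages)

    # The fake returns the canned string unchanged and logs the call.
    assert json.loads(reply) == {"agents": ["design"]}
    assert backend.calls == [messages]
```

Because the fake stores each `messages` list, a test can check both what the orchestrator did with the reply and what it sent to the backend.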
1747
+
1748
+ - [ ] **Step 2: Run the full test suite**
1749
+
1750
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/ -v --tb=short`
1751
+ Expected: ALL PASS
1752
+
1753
+ - [ ] **Step 3: Commit**
1754
+
1755
+ ```bash
1756
+ git add tests/conftest.py
1757
+ git commit -m "refactor: update test fixtures to use Pydantic models"
1758
+ ```
1759
+
1760
+ ---
1761
+
1762
+ ### Task 12: Update MCP Server to Use Typed Models
1763
+
1764
+ **Files:**
1765
+ - Modify: `server/mcp.py`
1766
+
1767
+ The MCP tools return JSON strings, so model usage is internal. The main change is using `ToolConfig` for the validate endpoint.
1768
+
1769
+ - [ ] **Step 1: Update validate_cnc_model config parameter**
1770
+
1771
+ In `server/mcp.py`, the `validate_cnc_model` function builds a config dict on lines 178-181 and passes it to `validate_for_cnc`:
1772
+
1773
+ ```python
1774
+ if exec_result.success:
1775
+ config = {
1776
+ "min_wall_thickness_mm": min_wall_thickness_mm,
1777
+ "max_part_size_mm": max_part_size_mm,
1778
+ }
1779
+ validation = validate_for_cnc(exec_result.result, part_name=part_name, config=config)
1780
+ ```
1781
+
1782
+ This stays as-is since `validate_for_cnc` already uses `_get_validation_config(overrides)` internally, and the config dict serves as override kwargs. No change needed here.
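To make the "config dict as override kwargs" idea concrete, here is a minimal sketch of how an override dict could be layered over typed defaults. It is not the project's `_get_validation_config`: the `ValidationConfig` class, its default values, and the choice to skip `None` entries are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ValidationConfig:
    """Hypothetical stand-in for the project's validation settings."""

    min_wall_thickness_mm: float = 1.0  # assumed default
    max_part_size_mm: float = 500.0     # assumed default


def get_validation_config(overrides: Optional[dict] = None) -> ValidationConfig:
    """Overlay caller-supplied values on the defaults.

    Entries whose value is None are treated as "not supplied" (an assumption).
    An unknown key raises TypeError, which surfaces typos early.
    """
    supplied = {k: v for k, v in (overrides or {}).items() if v is not None}
    return replace(ValidationConfig(), **supplied)


config = get_validation_config(
    {"min_wall_thickness_mm": 2.0, "max_part_size_mm": None}
)
print(config)
```

With this shape, callers such as the MCP tool can keep passing a plain dict while the validator works with a typed object internally.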
1783
+
1784
+ - [ ] **Step 2: Run full test suite one final time**
1785
+
1786
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/ -v --tb=short`
1787
+ Expected: ALL PASS
1788
+
1789
+ - [ ] **Step 3: Commit**
1790
+
1791
+ ```bash
1792
+ git commit --allow-empty -m "refactor: verify MCP server compatible with pydantic unification"
1793
+ ```
1794
+
1795
+ ---
1796
+
1797
+ ### Task 13: Final Cleanup — Remove Dead Imports and Verify
1798
+
1799
+ **Files:**
1800
+ - All modified files
1801
+
1802
+ - [ ] **Step 1: Check for dead imports**
1803
+
1804
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/ -v --tb=short 2>&1 | head -80`
1805
+
1806
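+ Verify that the output contains no `ImportError` or `AttributeError`. This only catches imports that broke during the refactor; imports that are merely unused do not fail tests and need a separate check.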
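A passing test run does not reveal imports that are still present but no longer used. One way to spot those, sketched here with only the standard library, is to compare the names bound by import statements against the names referenced in the module. The `unused_imports` helper is hypothetical and deliberately simple: it misses names used only inside string annotations or `__all__`.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return names bound by import statements that are never referenced."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names
                    imported.add(alias.asname or alias.name)

    # Every bare identifier that is read or written anywhere in the module.
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(imported - used)


sample = '''
import pytest
from pathlib import Path

@pytest.fixture
def tmp_output_dir(tmp_path):
    return tmp_path / "output"
'''

print(unused_imports(sample))  # ['Path']
```

A dedicated linter does this more thoroughly; the sketch only shows the idea behind such a check.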
1807
+
1808
+ - [ ] **Step 2: Verify no remaining dict returns from orchestrators**
1809
+
1810
+ Run: `cd /home/daniel/NeuralCAD && grep -rn -e "-> dict" agents/base.py agents/orchestrator.py agents/crew_orchestrator.py`
1811
+ Expected: No matches (all return `ChatTurnResponse` now)
1812
+
1813
+ - [ ] **Step 3: Verify no remaining dict parameters for design_state**
1814
+
1815
+ Run: `cd /home/daniel/NeuralCAD && grep -rn "design_state: dict" agents/ server/`
1816
+ Expected: No matches
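The grep checks in Steps 2 and 3 match raw text, so they can also hit comments or docstrings. As a possible complement, the sketch below walks the syntax tree and reports only real annotations. The `dict_annotations` helper and the sample functions are hypothetical and not part of the codebase.

```python
import ast


def _is_dict(annotation) -> bool:
    """True for a bare `dict` annotation or a subscripted `dict[...]`."""
    if isinstance(annotation, ast.Subscript):
        annotation = annotation.value
    return isinstance(annotation, ast.Name) and annotation.id == "dict"


def dict_annotations(source: str) -> list[str]:
    """Report functions that return `dict` or take `design_state: dict`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if _is_dict(node.returns):
            hits.append(f"{node.name}: returns dict")
        params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        for param in params:
            if param.arg == "design_state" and _is_dict(param.annotation):
                hits.append(f"{node.name}: design_state is dict")
    return hits


sample = '''
def old_turn(message: str, design_state: dict) -> dict:
    return {}

def new_turn(message: str, design_state: "DesignState") -> "ChatTurnResponse":
    ...
'''

for hit in dict_annotations(sample):
    print(hit)
```

Running it over the files named in Step 2 (for example via `Path(...).read_text()`) should print nothing once the migration is complete.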
1817
+
1818
+ - [ ] **Step 4: Run full test suite**
1819
+
1820
+ Run: `cd /home/daniel/NeuralCAD && python -m pytest tests/ -v`
1821
+ Expected: ALL PASS
1822
+
1823
+ - [ ] **Step 5: Final commit**
1824
+
1825
+ ```bash
1826
+ git add -A
1827
+ git commit -m "refactor: complete pydantic unification — all models typed, no raw dicts"
1828
+ ```