CallMeDaniel Claude Opus 4.6 (1M context) commited on
Commit
58e84f7
·
1 Parent(s): 9f111c1

docs: add CrewAI Flow refactor implementation plan

Browse files

8 tasks: AgentResponse model, FlowState, routing, readiness checks,
full AgentDispatchFlow, crew_orchestrator integration, RoutingEngine
removal, and final validation. TDD throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

docs/superpowers/plans/2026-04-12-crewai-flow-refactor.md ADDED
@@ -0,0 +1,1469 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # CrewAI Flow Refactor Implementation Plan
2
+
3
+ > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
+
5
+ **Goal:** Replace `Crew.kickoff()` and `RoutingEngine` with CrewAI Flow's `@start`/`@listen`/`@router` decorators for agent dispatch, enabling parallel advisory agents and router-gated generation.
6
+
7
+ **Architecture:** New `agents/agent_flow.py` contains typed models (`AgentResponse`, `AgentFlowState`) and `AgentDispatchFlow` — a Flow with two routers: `route_message` (replaces `RoutingEngine`) and `check_readiness` (gates CAD/CAM). Advisory agents run in one Crew; CAD and CAM run as single-agent Crews gated by readiness. The outer `_run_crew()` in `crew_orchestrator.py` delegates dispatch to the Flow and keeps post-processing.
8
+
9
+ **Tech Stack:** CrewAI 1.14 (`crewai.flow.flow.Flow`, `@start`, `@listen`, `@router`), Pydantic BaseModel, existing CrewAI Agent/Task/Crew/LLM
10
+
11
+ **Spec:** `docs/superpowers/specs/2026-04-12-crewai-flow-refactor-design.md`
12
+
13
+ ---
14
+
15
+ ### Task 1: AgentResponse model + tests
16
+
17
+ **Files:**
18
+ - Create: `agents/agent_flow.py`
19
+ - Create: `tests/test_agent_flow.py`
20
+
21
+ - [ ] **Step 1: Write failing tests for AgentResponse**
22
+
23
+ ```python
24
+ # tests/test_agent_flow.py
25
+ """Tests for agents/agent_flow.py — AgentResponse, AgentFlowState, AgentDispatchFlow."""
26
+
27
+ from agents.agent_flow import AgentResponse
28
+
29
+
30
+ class TestAgentResponse:
31
+ def test_from_agent_populates_all_fields(self):
32
+ resp = AgentResponse.from_agent("design", "Great bracket idea.")
33
+ assert resp.agent_id == "design"
34
+ assert resp.agent_name == "Design Agent"
35
+ assert resp.color == "#7c3aed"
36
+ assert resp.avatar == "DA"
37
+ assert resp.message == "Great bracket idea."
38
+ assert resp.code is None
39
+
40
+ def test_from_agent_with_code(self):
41
+ resp = AgentResponse.from_agent("cad", "Model generated.", code="import cq")
42
+ assert resp.code == "import cq"
43
+ assert resp.agent_id == "cad"
44
+
45
+ def test_from_agent_engineering(self):
46
+ resp = AgentResponse.from_agent("engineering", "Use 3mm walls.")
47
+ assert resp.agent_name == "Engineering Agent"
48
+ assert resp.color == "#00b4d8"
49
+
50
+ def test_model_dump_matches_format_response(self):
51
+ resp = AgentResponse.from_agent("cnc", "Looks machinable.")
52
+ d = resp.model_dump()
53
+ assert set(d.keys()) == {"agent_id", "agent_name", "message", "color", "avatar", "code"}
54
+ ```
55
+
56
+ - [ ] **Step 2: Run tests to verify they fail**
57
+
58
+ Run: `pytest tests/test_agent_flow.py -v`
59
+ Expected: FAIL — `ModuleNotFoundError: No module named 'agents.agent_flow'`
60
+
61
+ - [ ] **Step 3: Implement AgentResponse model**
62
+
63
+ ```python
64
+ # agents/agent_flow.py
65
+ """CrewAI Flow for multi-agent dispatch.
66
+
67
+ Replaces Crew.kickoff() and RoutingEngine with a Flow that routes messages
68
+ to advisory agents (parallel) and gates generation agents (CAD/CAM) via
69
+ readiness checks.
70
+ """
71
+
72
+ from __future__ import annotations
73
+
74
+ import re
75
+ from pydantic import BaseModel, Field
76
+
77
+ from agents.definitions import AGENTS
78
+
79
+
80
+ class AgentResponse(BaseModel):
81
+ """Typed agent response — replaces raw dicts from _format_response()."""
82
+ agent_id: str
83
+ agent_name: str
84
+ message: str
85
+ color: str
86
+ avatar: str
87
+ code: str | None = None
88
+
89
+ @classmethod
90
+ def from_agent(cls, agent_id: str, message: str,
91
+ code: str | None = None) -> AgentResponse:
92
+ """Build from agent_id, looking up metadata from AGENTS registry."""
93
+ agent_def = AGENTS[agent_id]
94
+ return cls(
95
+ agent_id=agent_id,
96
+ agent_name=agent_def.name,
97
+ message=message,
98
+ color=agent_def.color,
99
+ avatar=agent_def.avatar,
100
+ code=code,
101
+ )
102
+ ```
103
+
104
+ - [ ] **Step 4: Run tests to verify they pass**
105
+
106
+ Run: `pytest tests/test_agent_flow.py -v`
107
+ Expected: 4 passed
108
+
109
+ - [ ] **Step 5: Commit**
110
+
111
+ ```bash
112
+ git add agents/agent_flow.py tests/test_agent_flow.py
113
+ git commit -m "feat: add AgentResponse model with from_agent factory"
114
+ ```
115
+
116
+ ---
117
+
118
+ ### Task 2: AgentFlowState model + extract_code utility
119
+
120
+ **Files:**
121
+ - Modify: `agents/agent_flow.py`
122
+ - Modify: `tests/test_agent_flow.py`
123
+
124
+ - [ ] **Step 1: Write failing tests for AgentFlowState and extract_code**
125
+
126
+ ```python
127
+ # tests/test_agent_flow.py — append to file
128
+ from agents.agent_flow import AgentFlowState, extract_code
129
+
130
+
131
+ class TestAgentFlowState:
132
+ def test_defaults(self):
133
+ state = AgentFlowState()
134
+ assert state.message == ""
135
+ assert state.active_agent_ids == []
136
+ assert state.advisor_responses == []
137
+ assert state.cad_response is None
138
+ assert state.cam_response is None
139
+ assert state.cad_code is None
140
+ assert state.cam_plan is None
141
+ assert state.mentions == []
142
+ assert state.is_approved_phase is False
143
+
144
+ def test_with_inputs(self):
145
+ state = AgentFlowState(
146
+ message="build a bracket",
147
+ model_str="gemini/gemini-2.5-flash",
148
+ mentions=["design"],
149
+ is_approved_phase=True,
150
+ )
151
+ assert state.message == "build a bracket"
152
+ assert state.mentions == ["design"]
153
+ assert state.is_approved_phase is True
154
+
155
+
156
+ class TestExtractCode:
157
+ def test_extracts_fenced_python(self):
158
+ text = "Here is code:\n```python\nimport cadquery as cq\nresult = cq.Workplane('XY').box(10,10,10)\n```\nDone."
159
+ code = extract_code(text)
160
+ assert code is not None
161
+ assert "import cadquery" in code
162
+ assert "result =" in code
163
+
164
+ def test_extracts_unfenced_cq_code(self):
165
+ text = "import cadquery as cq\nresult = cq.Workplane('XY').box(5,5,5)"
166
+ code = extract_code(text)
167
+ assert code is not None
168
+ assert "cq.Workplane" in code
169
+
170
+ def test_returns_none_for_plain_text(self):
171
+ text = "NOT READY: I need dimensions."
172
+ assert extract_code(text) is None
173
+
174
+ def test_extracts_generic_fenced_block(self):
175
+ text = "```\nimport cadquery as cq\nresult = cq.Workplane('XY').box(1,1,1)\n```"
176
+ code = extract_code(text)
177
+ assert code is not None
178
+ ```
179
+
180
+ - [ ] **Step 2: Run tests to verify they fail**
181
+
182
+ Run: `pytest tests/test_agent_flow.py::TestAgentFlowState -v && pytest tests/test_agent_flow.py::TestExtractCode -v`
183
+ Expected: FAIL — `ImportError: cannot import name 'AgentFlowState'`
184
+
185
+ - [ ] **Step 3: Implement AgentFlowState and extract_code**
186
+
187
+ Add to `agents/agent_flow.py` after the `AgentResponse` class:
188
+
189
+ ```python
190
+ class AgentFlowState(BaseModel):
191
+ """Orchestration state for a single chat turn. Lives only during Flow execution."""
192
+ # Input (set by _run_crew before kickoff)
193
+ message: str = ""
194
+ context: str = ""
195
+ model_str: str = ""
196
+ mentions: list[str] = Field(default_factory=list)
197
+ is_approved_phase: bool = False
198
+
199
+ # Set by route_message router
200
+ active_agent_ids: list[str] = Field(default_factory=list)
201
+
202
+ # Set by prepare_agents
203
+ knowledge_sources_data: list[str] = Field(default_factory=list)
204
+
205
+ # Accumulated output (set by Flow steps)
206
+ advisor_responses: list[AgentResponse] = Field(default_factory=list)
207
+ cad_response: AgentResponse | None = None
208
+ cam_response: AgentResponse | None = None
209
+
210
+ # Generation artifacts
211
+ cad_code: str | None = None
212
+ cam_plan: "CAMPlan | None" = None
213
+
214
+
215
+ def extract_code(text: str) -> str | None:
216
+ """Extract Python code from LLM output.
217
+
218
+ Checks for markdown fenced blocks first, then bare CadQuery markers.
219
+ """
220
+ match = re.search(r"```(?:python)?\s*\n(.*?)```", text, re.DOTALL)
221
+ if match:
222
+ return match.group(1).strip()
223
+ if any(marker in text for marker in ["import cadquery", "cq.", "result ="]):
224
+ return text.strip()
225
+ return None
226
+ ```
227
+
228
+ Add forward-reference import at top of file for type annotation:
229
+
230
+ ```python
231
+ from __future__ import annotations
232
+ ```
233
+
234
+ - [ ] **Step 4: Run tests to verify they pass**
235
+
236
+ Run: `pytest tests/test_agent_flow.py -v`
237
+ Expected: 10 passed
238
+
239
+ - [ ] **Step 5: Commit**
240
+
241
+ ```bash
242
+ git add agents/agent_flow.py tests/test_agent_flow.py
243
+ git commit -m "feat: add AgentFlowState model and extract_code utility"
244
+ ```
245
+
246
+ ---
247
+
248
+ ### Task 3: route_message logic + tests
249
+
250
+ **Files:**
251
+ - Modify: `agents/agent_flow.py`
252
+ - Modify: `tests/test_agent_flow.py`
253
+
254
+ - [ ] **Step 1: Write failing tests for routing logic**
255
+
256
+ The routing logic will be implemented as a standalone function first (testable without the full Flow), then wired into the Flow's `@router` step in Task 5.
257
+
258
+ ```python
259
+ # tests/test_agent_flow.py — append to file
260
+ from agents.agent_flow import route_agents
261
+
262
+ class TestRouteAgents:
263
+ def test_approved_phase_locks_to_config(self):
264
+ ids = route_agents("anything", mentions=[], is_approved_phase=True)
265
+ assert ids == ["cad", "cnc"]
266
+
267
+ def test_mentions_override_routing(self):
268
+ ids = route_agents("anything", mentions=["design", "cam"], is_approved_phase=False)
269
+ assert ids == ["design", "cam"]
270
+
271
+ def test_design_keywords(self):
272
+ ids = route_agents("I want a sleek design with smooth shape", mentions=[], is_approved_phase=False)
273
+ assert "design" in ids
274
+
275
+ def test_engineering_keywords(self):
276
+ ids = route_agents("Use M6 bolts with 3mm wall thickness in aluminum", mentions=[], is_approved_phase=False)
277
+ assert "engineering" in ids
278
+
279
+ def test_cnc_keywords(self):
280
+ ids = route_agents("Can this be machined on a 3-axis CNC mill?", mentions=[], is_approved_phase=False)
281
+ assert "cnc" in ids
282
+
283
+ def test_cam_keywords(self):
284
+ ids = route_agents("Generate a toolpath for this part", mentions=[], is_approved_phase=False)
285
+ assert "cam" in ids
286
+
287
+ def test_default_when_no_match(self):
288
+ ids = route_agents("hello there", mentions=[], is_approved_phase=False)
289
+ assert ids == ["design", "engineering"]
290
+
291
+ def test_max_three_agents(self):
292
+ ids = route_agents("design shape in aluminum for CNC machining", mentions=[], is_approved_phase=False)
293
+ assert len(ids) <= 3
294
+
295
+ def test_cad_trigger_adds_cad(self):
296
+ ids = route_agents("Generate a preview", mentions=[], is_approved_phase=False)
297
+ assert "cad" in ids
298
+
299
+ def test_no_cad_trigger_without_keyword(self):
300
+ ids = route_agents("hello there", mentions=[], is_approved_phase=False)
301
+ assert "cad" not in ids
302
+
303
+ def test_cad_not_duplicated_when_already_routed(self):
304
+ ids = route_agents("generate code for this design", mentions=[], is_approved_phase=False)
305
+ assert ids.count("cad") <= 1
306
+ ```
307
+
308
+ - [ ] **Step 2: Run tests to verify they fail**
309
+
310
+ Run: `pytest tests/test_agent_flow.py::TestRouteAgents -v`
311
+ Expected: FAIL — `ImportError: cannot import name 'route_agents'`
312
+
313
+ - [ ] **Step 3: Implement route_agents function**
314
+
315
+ Add to `agents/agent_flow.py`:
316
+
317
+ ```python
318
+ from config.settings import settings
319
+
320
+ # Agent role sets for routing
321
+ ADVISOR_IDS = frozenset({"design", "engineering", "cnc"})
322
+ GENERATOR_IDS = frozenset({"cad", "cam"})
323
+
324
+
325
+ def route_agents(
326
+ message: str,
327
+ mentions: list[str],
328
+ is_approved_phase: bool,
329
+ ) -> list[str]:
330
+ """Select which agents should respond to this message.
331
+
332
+ Replaces RoutingEngine.route() + has_cad_trigger() + approved/mentions logic.
333
+ Reads routing keywords and CAD triggers from config.yaml.
334
+ """
335
+ if is_approved_phase:
336
+ return list(settings.planning.approved_agents)
337
+
338
+ if mentions:
339
+ return list(mentions)
340
+
341
+ # Keyword scoring
342
+ lower = message.lower()
343
+ keywords: dict[str, list[str]] = settings.routing.keywords
344
+ max_agents: int = settings.orchestration.max_active_agents
345
+
346
+ scores: dict[str, int] = {agent_id: 0 for agent_id in keywords}
347
+ for agent_id, kws in keywords.items():
348
+ for kw in kws:
349
+ if kw in lower:
350
+ scores[agent_id] += 1
351
+
352
+ active = [aid for aid, score in sorted(scores.items(), key=lambda x: -x[1]) if score > 0]
353
+ if not active:
354
+ active = ["design", "engineering"]
355
+ active = active[:max_agents]
356
+
357
+ # CAD trigger check
358
+ cad_triggers: list[str] = settings.routing.cad_trigger_keywords
359
+ if "cad" not in active and any(kw in lower for kw in cad_triggers):
360
+ active.append("cad")
361
+
362
+ return active
363
+ ```
364
+
365
+ - [ ] **Step 4: Run tests to verify they pass**
366
+
367
+ Run: `pytest tests/test_agent_flow.py::TestRouteAgents -v`
368
+ Expected: 11 passed
369
+
370
+ - [ ] **Step 5: Run full test suite**
371
+
372
+ Run: `pytest tests/test_agent_flow.py -v`
373
+ Expected: 21 passed
374
+
375
+ - [ ] **Step 6: Commit**
376
+
377
+ ```bash
378
+ git add agents/agent_flow.py tests/test_agent_flow.py
379
+ git commit -m "feat: add route_agents function replacing RoutingEngine"
380
+ ```
381
+
382
+ ---
383
+
384
+ ### Task 4: check_readiness logic + collect_results + tests
385
+
386
+ **Files:**
387
+ - Modify: `agents/agent_flow.py`
388
+ - Modify: `tests/test_agent_flow.py`
389
+
390
+ - [ ] **Step 1: Write failing tests**
391
+
392
+ ```python
393
+ # tests/test_agent_flow.py — append to file
394
+ from agents.agent_flow import check_readiness, collect_responses
395
+
396
+
397
+ class TestCheckReadiness:
398
+ def test_ready_when_advisors_clean(self):
399
+ responses = [
400
+ AgentResponse.from_agent("design", "L-bracket looks good."),
401
+ AgentResponse.from_agent("engineering", "3mm walls in aluminum."),
402
+ ]
403
+ result = check_readiness(responses, active_agent_ids=["design", "engineering", "cad"])
404
+ assert result == "READY"
405
+
406
+ def test_not_ready_when_advisor_flags(self):
407
+ responses = [
408
+ AgentResponse.from_agent("design", "L-bracket looks good."),
409
+ AgentResponse.from_agent("cnc", "NOT READY: Missing dimensions and material."),
410
+ ]
411
+ result = check_readiness(responses, active_agent_ids=["design", "cnc", "cad"])
412
+ assert result == "NOT_READY"
413
+
414
+ def test_skip_generation_when_no_generators(self):
415
+ responses = [
416
+ AgentResponse.from_agent("design", "Nice shape."),
417
+ ]
418
+ result = check_readiness(responses, active_agent_ids=["design", "engineering"])
419
+ assert result == "SKIP_GENERATION"
420
+
421
+ def test_not_ready_case_insensitive(self):
422
+ responses = [
423
+ AgentResponse.from_agent("engineering", "not ready: need wall thickness"),
424
+ ]
425
+ result = check_readiness(responses, active_agent_ids=["engineering", "cad"])
426
+ assert result == "NOT_READY"
427
+
428
+
429
+ class TestCollectResponses:
430
+ def test_merges_all_responses(self):
431
+ advisors = [AgentResponse.from_agent("design", "Shape OK.")]
432
+ cad = AgentResponse.from_agent("cad", "Model generated.", code="result = cq.box()")
433
+ cam = AgentResponse.from_agent("cam", "Machining plan ready.")
434
+ result = collect_responses(advisors, cad, cam)
435
+ assert len(result) == 3
436
+ assert result[0].agent_id == "design"
437
+ assert result[1].agent_id == "cad"
438
+ assert result[2].agent_id == "cam"
439
+
440
+ def test_handles_none_cad_and_cam(self):
441
+ advisors = [AgentResponse.from_agent("engineering", "Specs look good.")]
442
+ result = collect_responses(advisors, None, None)
443
+ assert len(result) == 1
444
+ assert result[0].agent_id == "engineering"
445
+
446
+ def test_handles_empty_advisors(self):
447
+ cad = AgentResponse.from_agent("cad", "NOT READY: need dimensions")
448
+ result = collect_responses([], cad, None)
449
+ assert len(result) == 1
450
+ assert result[0].agent_id == "cad"
451
+ ```
452
+
453
+ - [ ] **Step 2: Run tests to verify they fail**
454
+
455
+ Run: `pytest tests/test_agent_flow.py::TestCheckReadiness tests/test_agent_flow.py::TestCollectResponses -v`
456
+ Expected: FAIL — `ImportError`
457
+
458
+ - [ ] **Step 3: Implement check_readiness and collect_responses**
459
+
460
+ Add to `agents/agent_flow.py`:
461
+
462
+ ```python
463
+ def check_readiness(
464
+ advisor_responses: list[AgentResponse],
465
+ active_agent_ids: list[str],
466
+ ) -> str:
467
+ """Inspect advisor responses and active agents to determine generation path.
468
+
469
+ Returns:
470
+ "READY" — advisors OK, generators should run
471
+ "NOT_READY" — at least one advisor flagged NOT READY
472
+ "SKIP_GENERATION" — no generators (cad/cam) in active list
473
+ """
474
+ has_generators = bool(GENERATOR_IDS & set(active_agent_ids))
475
+ if not has_generators:
476
+ return "SKIP_GENERATION"
477
+
478
+ for resp in advisor_responses:
479
+ if resp.message.strip().upper().startswith("NOT READY:"):
480
+ return "NOT_READY"
481
+
482
+ return "READY"
483
+
484
+
485
+ def collect_responses(
486
+ advisor_responses: list[AgentResponse],
487
+ cad_response: AgentResponse | None,
488
+ cam_response: AgentResponse | None,
489
+ ) -> list[AgentResponse]:
490
+ """Merge all agent responses into a single ordered list."""
491
+ result = list(advisor_responses)
492
+ if cad_response is not None:
493
+ result.append(cad_response)
494
+ if cam_response is not None:
495
+ result.append(cam_response)
496
+ return result
497
+ ```
498
+
499
+ - [ ] **Step 4: Run tests to verify they pass**
500
+
501
+ Run: `pytest tests/test_agent_flow.py -v`
502
+ Expected: 28 passed
503
+
504
+ - [ ] **Step 5: Commit**
505
+
506
+ ```bash
507
+ git add agents/agent_flow.py tests/test_agent_flow.py
508
+ git commit -m "feat: add check_readiness and collect_responses helpers"
509
+ ```
510
+
511
+ ---
512
+
513
+ ### Task 5: AgentDispatchFlow — full Flow class
514
+
515
+ **Files:**
516
+ - Modify: `agents/agent_flow.py`
517
+ - Modify: `tests/test_agent_flow.py`
518
+
519
+ - [ ] **Step 1: Write failing tests for the Flow**
520
+
521
+ These tests validate the Flow wiring without making LLM calls. They test `prepare_agents`, `route_message`, and `no_agents` paths by mocking the Crew execution.
522
+
523
+ ```python
524
+ # tests/test_agent_flow.py — append to file
525
+ from unittest.mock import patch, MagicMock
526
+ from agents.agent_flow import AgentDispatchFlow
527
+
528
+
529
+ class TestAgentDispatchFlow:
530
+ def test_no_agents_path(self):
531
+ """Flow with no matching agents completes without error."""
532
+ flow = AgentDispatchFlow(initial_state=AgentFlowState(
533
+ message="xyzzy",
534
+ context="## User's latest message\nxyzzy",
535
+ model_str="gemini/gemini-2.5-flash",
536
+ ))
537
+ flow.kickoff()
538
+ assert flow.state.active_agent_ids == ["design", "engineering"]
539
+
540
+ def test_approved_phase_sets_generators(self):
541
+ """Approved phase locks agents to config approved_agents."""
542
+ flow = AgentDispatchFlow(initial_state=AgentFlowState(
543
+ message="build it",
544
+ context="## APPROVED PLAN",
545
+ model_str="gemini/gemini-2.5-flash",
546
+ is_approved_phase=True,
547
+ ))
548
+ # Mock _run_single_agent_crew to avoid actual LLM calls
549
+ with patch.object(flow, '_run_single_agent_crew', return_value="NOT READY: need dims"):
550
+ flow.kickoff()
551
+ assert flow.state.active_agent_ids == ["cad", "cnc"]
552
+
553
+ def test_mentions_override(self):
554
+ """Explicit mentions override keyword routing."""
555
+ flow = AgentDispatchFlow(initial_state=AgentFlowState(
556
+ message="check this",
557
+ context="",
558
+ model_str="gemini/gemini-2.5-flash",
559
+ mentions=["cam"],
560
+ ))
561
+ with patch.object(flow, '_run_single_agent_crew', return_value="Machining plan ready."):
562
+ flow.kickoff()
563
+ assert flow.state.active_agent_ids == ["cam"]
564
+
565
+ def test_collect_results_populated(self):
566
+ """Flow populates collected responses from advisor + cad paths."""
567
+ flow = AgentDispatchFlow(initial_state=AgentFlowState(
568
+ message="design a bracket",
569
+ context="",
570
+ model_str="gemini/gemini-2.5-flash",
571
+ ))
572
+ # Simulate: advisors respond, no generators routed
573
+ with patch.object(flow, '_run_advisor_crew', return_value=[
574
+ AgentResponse.from_agent("design", "L-bracket idea."),
575
+ ]):
576
+ flow.kickoff()
577
+ results = collect_responses(
578
+ flow.state.advisor_responses,
579
+ flow.state.cad_response,
580
+ flow.state.cam_response,
581
+ )
582
+ assert len(results) >= 1
583
+ ```
584
+
585
+ - [ ] **Step 2: Run tests to verify they fail**
586
+
587
+ Run: `pytest tests/test_agent_flow.py::TestAgentDispatchFlow -v`
588
+ Expected: FAIL — `ImportError: cannot import name 'AgentDispatchFlow'`
589
+
590
+ - [ ] **Step 3: Implement AgentDispatchFlow**
591
+
592
+ Add to `agents/agent_flow.py`:
593
+
594
+ ```python
595
+ import logging
596
+ from pathlib import Path
597
+
598
+ from crewai.flow.flow import Flow, listen, start, router
599
+
600
+ logger = logging.getLogger(__name__)
601
+
602
+ WIKI_DIR = Path(__file__).parent.parent / "docs" / "wiki"
603
+
604
+
605
+ class AgentDispatchFlow(Flow[AgentFlowState]):
606
+ """Flow-based agent dispatch replacing Crew.kickoff() + RoutingEngine.
607
+
608
+ Graph:
609
+ prepare_agents → route_message (router)
610
+ ├─ HAS_ADVISORS → run_advisors → check_readiness (router)
611
+ │ ├─ READY → run_cad → run_cam → collect_results
612
+ │ ├─ NOT_READY → run_cad_not_ready → skip_cam → collect_results
613
+ │ └─ SKIP_GENERATION → skip_generation → collect_results
614
+ ├─ GENERATORS_ONLY → run_cad_gen_only → run_cam_gen_only → collect_results
615
+ └─ NO_AGENTS → no_agents → collect_results
616
+ """
617
+
618
+ # ── Start ────────────────────────────────────────────────────────────
619
+
620
+ @start()
621
+ def prepare_agents(self):
622
+ """Build LLM config and load wiki knowledge sources."""
623
+ for filename in ("cutting-parameters.md", "gcode-reference.md"):
624
+ path = WIKI_DIR / filename
625
+ if path.exists():
626
+ self.state.knowledge_sources_data.append(path.read_text())
627
+
628
+ # ── Routing ──────────────────────────────────────────────────────────
629
+
630
+ @router(prepare_agents)
631
+ def route_message(self):
632
+ """Select agents and return path: HAS_ADVISORS | GENERATORS_ONLY | NO_AGENTS."""
633
+ self.state.active_agent_ids = route_agents(
634
+ self.state.message,
635
+ self.state.mentions,
636
+ self.state.is_approved_phase,
637
+ )
638
+ if not self.state.active_agent_ids:
639
+ return "NO_AGENTS"
640
+
641
+ has_advisors = bool(ADVISOR_IDS & set(self.state.active_agent_ids))
642
+ has_generators = bool(GENERATOR_IDS & set(self.state.active_agent_ids))
643
+
644
+ if has_advisors:
645
+ return "HAS_ADVISORS"
646
+ if has_generators:
647
+ return "GENERATORS_ONLY"
648
+ return "NO_AGENTS"
649
+
650
+ # ── Advisory path ────────────────────────────────────────────────────
651
+
652
+ @listen("HAS_ADVISORS")
653
+ def run_advisors(self):
654
+ """Run advisory agents (design, engineering, cnc) as one Crew."""
655
+ advisor_ids = [aid for aid in self.state.active_agent_ids if aid in ADVISOR_IDS]
656
+ if not advisor_ids:
657
+ return
658
+
659
+ responses = self._run_advisor_crew(advisor_ids)
660
+ self.state.advisor_responses = responses
661
+
662
+ @router(run_advisors)
663
+ def check_readiness_router(self):
664
+ """Gate generation based on advisor responses."""
665
+ return check_readiness(self.state.advisor_responses, self.state.active_agent_ids)
666
+
667
+ @listen("READY")
668
+ def run_cad(self):
669
+ """Run CAD Coder agent — expects code generation."""
670
+ self._run_cad_step()
671
+
672
+ @listen("NOT_READY")
673
+ def run_cad_not_ready(self):
674
+ """Run CAD Coder agent — expects NOT READY gap list."""
675
+ self._run_cad_step()
676
+
677
+ @listen("SKIP_GENERATION")
678
+ def skip_generation(self):
679
+ """No generators requested — pass through."""
680
+ pass
681
+
682
+ @listen(run_cad)
683
+ def run_cam(self):
684
+ """Run CAM agent after successful CAD generation."""
685
+ self._run_cam_step()
686
+
687
+ @listen(run_cad_not_ready)
688
+ def skip_cam_after_not_ready(self):
689
+ """CAM skipped — CAD wasn't ready."""
690
+ pass
691
+
692
+ # ── Generators-only path ─────────────────────────────────────────────
693
+
694
+ @listen("GENERATORS_ONLY")
695
+ def run_cad_gen_only(self):
696
+ """Run CAD Coder directly (no advisors ran)."""
697
+ self._run_cad_step()
698
+
699
+ @listen(run_cad_gen_only)
700
+ def run_cam_gen_only(self):
701
+ """Run CAM after CAD in generators-only path."""
702
+ self._run_cam_step()
703
+
704
+ # ── No-agents path ───────────────────────────────────────────────────
705
+
706
+ @listen("NO_AGENTS")
707
+ def no_agents(self):
708
+ """Defensive — no agents matched. Pass through."""
709
+ pass
710
+
711
+ # ── Collect ──────────────────────────────────────────────────────────
712
+
713
+ @listen(run_cam, skip_cam_after_not_ready, skip_generation,
714
+ run_cam_gen_only, no_agents)
715
+ def collect_results(self):
716
+ """Merge all responses — accessible via state after kickoff."""
717
+ pass # Responses already on state; caller reads them directly.
718
+
719
+ # ── Private helpers ──────────────────────────────────────────────────
720
+
721
+ def _build_llm(self):
722
+ """Build a CrewAI LLM from state.model_str."""
723
+ from crewai import LLM
724
+ return LLM(model=self.state.model_str, temperature=settings.temperature)
725
+
726
+ def _build_knowledge_sources(self):
727
+ """Build StringKnowledgeSources from loaded wiki data."""
728
+ sources = []
729
+ try:
730
+ from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
731
+ for content in self.state.knowledge_sources_data:
732
+ sources.append(StringKnowledgeSource(content=content))
733
+ except ImportError:
734
+ pass
735
+ return sources
736
+
737
+ def _build_crew_agent(self, agent_id: str, llm):
738
+ """Create a CrewAI Agent + Task for the given agent_id.
739
+
740
+ Returns (Agent, Task) tuple with per-agent tools, backstory, and
741
+ task description matching the current crew_orchestrator.py logic.
742
+ """
743
+ from crewai import Agent, Task
744
+ from agents.tools import (
745
+ query_design_state_tool, execute_cad_tool,
746
+ validate_cad_tool, generate_gcode_tool,
747
+ )
748
+ from core.cam import CAMPlan
749
+
750
+ agent_def = AGENTS[agent_id]
751
+ tools = [query_design_state_tool]
752
+ extra_backstory = ""
753
+ task_output_pydantic = None
754
+
755
+ if agent_id == "cad":
756
+ tools.extend([execute_cad_tool, validate_cad_tool])
757
+ from core.cadquery_prompts import CADQUERY_SYSTEM_PROMPT
758
+ extra_backstory = (
759
+ "\n\nBefore deciding if specs are sufficient, ALWAYS call the "
760
+ "Query Design State tool first to check what the orchestrator "
761
+ "already knows (dimensions, material, features). Only say "
762
+ "NOT READY if the tool confirms information is truly missing.\n\n"
763
+ "When generating code, use the Execute CadQuery Code tool "
764
+ "to test your code. If it fails, fix the errors and try again. "
765
+ "Use the Validate CNC Manufacturability tool to check the result. "
766
+ "Output ONLY valid CadQuery Python that assigns result to a "
767
+ "cq.Workplane. Import cadquery as cq.\n\n"
768
+ f"CadQuery reference:\n{CADQUERY_SYSTEM_PROMPT}"
769
+ )
770
+ elif agent_id == "cnc":
771
+ extra_backstory = (
772
+ "\n\nBefore deciding if manufacturability info is sufficient, "
773
+ "ALWAYS call the Query Design State tool first to check what "
774
+ "the orchestrator already knows (material, dimensions, features, "
775
+ "constraints, axis). Only say NOT READY if the tool confirms "
776
+ "information is truly missing."
777
+ )
778
+ elif agent_id == "cam":
779
+ tools.append(generate_gcode_tool)
780
+ task_output_pydantic = CAMPlan
781
+ elif agent_id in ("design", "engineering"):
782
+ extra_backstory = (
783
+ "\n\nBefore asking clarifying questions, call the Query Design "
784
+ "State tool to check what is already known. Do NOT ask about "
785
+ "fields the tool shows as already provided (e.g. do not ask "
786
+ "'What material?' if the tool returns a known material)."
787
+ )
788
+
789
+ knowledge_sources = self._build_knowledge_sources() if agent_id in ("cnc", "cam") else []
790
+
791
+ crew_agent = Agent(
792
+ role=agent_def.role,
793
+ goal=agent_def.goal,
794
+ backstory=agent_def.backstory + extra_backstory,
795
+ llm=llm,
796
+ tools=tools,
797
+ verbose=False,
798
+ allow_delegation=False,
799
+ knowledge_sources=knowledge_sources if knowledge_sources else None,
800
+ )
801
+
802
+ task_description = (
803
+ f"{self.state.context}\n\n"
804
+ f"As the {agent_def.role}, respond to the user's latest message. "
805
+ f"Keep your response concise (2-4 sentences). "
806
+ f"Do NOT repeat anything from the conversation history. "
807
+ f"Add NEW information from your expertise.\n\n"
808
+ f"Build on other agents' input — agree, disagree, refine, or add."
809
+ )
810
+
811
+ if agent_id == "cad":
812
+ task_description += (
813
+ "\n\nFIRST call the Query Design State tool to check what the "
814
+ "orchestrator already knows. Use the returned 'known' fields "
815
+ "as your specs. Only say 'NOT READY:' listing truly missing "
816
+ "items if the tool shows critical gaps (no shape, no dimensions, "
817
+ "no features). If enough info exists, generate CadQuery code and "
818
+ "use the Execute CadQuery Code tool to verify it works."
819
+ )
820
+ elif agent_id == "cam":
821
+ task_description += (
822
+ "\n\nFIRST call the Query Design State tool to check available "
823
+ "specs. If there is no CAD model generated yet or the tool shows "
824
+ "critical gaps (no material, no dimensions), start with "
825
+ "'NOT READY:' and list only the truly missing items. "
826
+ "If enough info exists, analyze the part geometry and create an "
827
+ "optimal machining strategy. Select operations in order (roughing "
828
+ "before finishing). Use the Generate G-code Toolpath tool to create "
829
+ "the G-code."
830
+ )
831
+ elif agent_id == "cnc":
832
+ task_description += (
833
+ "\n\nFIRST call the Query Design State tool to check what the "
834
+ "orchestrator already knows about dimensions, material, and "
835
+ "constraints. Only say 'NOT READY:' listing truly missing items "
836
+ "if the tool confirms critical gaps. If enough info exists, "
837
+ "provide your manufacturability assessment."
838
+ )
839
+ else:
840
+ task_description += (
841
+ "\n\nFIRST call the Query Design State tool to see what is "
842
+ "already known. Only ask clarifying questions about fields "
843
+ "the tool shows as missing in YOUR domain."
844
+ )
845
+
846
+ if agent_id == "cad":
847
+ expected_output = "Valid CadQuery Python code or a 'NOT READY:' message."
848
+ elif agent_id in ("cnc", "cam"):
849
+ expected_output = "A concise expert assessment or a 'NOT READY:' message listing missing items."
850
+ else:
851
+ expected_output = "A concise response from your expert perspective (2-4 sentences)."
852
+
853
+ task = Task(
854
+ description=task_description,
855
+ expected_output=expected_output,
856
+ agent=crew_agent,
857
+ output_pydantic=task_output_pydantic,
858
+ )
859
+
860
+ return crew_agent, task
861
+
862
+ def _run_advisor_crew(self, advisor_ids: list[str]) -> list[AgentResponse]:
863
+ """Run advisory agents as a single sequential Crew. Returns list of AgentResponse."""
864
+ from crewai import Crew, Process
865
+
866
+ llm = self._build_llm()
867
+ agents_and_tasks = [self._build_crew_agent(aid, llm) for aid in advisor_ids]
868
+ crew_agents = [at[0] for at in agents_and_tasks]
869
+ crew_tasks = [at[1] for at in agents_and_tasks]
870
+
871
+ crew = Crew(agents=crew_agents, tasks=crew_tasks, process=Process.sequential, verbose=False)
872
+ crew_result = crew.kickoff()
873
+
874
+ responses = []
875
+ task_outputs = crew_result.tasks_output if hasattr(crew_result, 'tasks_output') else []
876
+ for i, agent_id in enumerate(advisor_ids):
877
+ raw = str(task_outputs[i]) if i < len(task_outputs) else (str(crew_result) if i == 0 else "")
878
+ if raw.strip():
879
+ responses.append(AgentResponse.from_agent(agent_id, raw.strip()))
880
+ return responses
881
+
882
+ def _run_single_agent_crew(self, agent_id: str) -> str:
883
+ """Run a single agent as a one-agent Crew. Returns raw output string."""
884
+ from crewai import Crew, Process
885
+
886
+ llm = self._build_llm()
887
+ crew_agent, task = self._build_crew_agent(agent_id, llm)
888
+ crew = Crew(agents=[crew_agent], tasks=[task], process=Process.sequential, verbose=False)
889
+ crew_result = crew.kickoff()
890
+
891
+ task_outputs = crew_result.tasks_output if hasattr(crew_result, 'tasks_output') else []
892
+ return str(task_outputs[0]).strip() if task_outputs else str(crew_result).strip()
893
+
894
+ def _run_cad_step(self):
895
+ """Shared CAD agent execution for READY, NOT_READY, and GENERATORS_ONLY paths."""
896
+ if "cad" not in self.state.active_agent_ids:
897
+ return
898
+
899
+ raw_output = self._run_single_agent_crew("cad")
900
+ if not raw_output:
901
+ return
902
+
903
+ if raw_output.upper().startswith("NOT READY:"):
904
+ self.state.cad_response = AgentResponse.from_agent("cad", raw_output)
905
+ else:
906
+ code = extract_code(raw_output)
907
+ if code:
908
+ self.state.cad_response = AgentResponse.from_agent("cad", "Model generated.", code=code)
909
+ self.state.cad_code = code
910
+ else:
911
+ self.state.cad_response = AgentResponse.from_agent("cad", raw_output)
912
+
913
+ def _run_cam_step(self):
914
+ """Shared CAM agent execution — only runs if cad_code exists."""
915
+ if "cam" not in self.state.active_agent_ids:
916
+ return
917
+ if self.state.cad_code is None:
918
+ return
919
+
920
+ raw_output = self._run_single_agent_crew("cam")
921
+ if not raw_output:
922
+ return
923
+
924
+ # Try to parse CAMPlan from pydantic output
925
+ from core.cam import CAMPlan
926
+ try:
927
+ import json
928
+ plan_data = json.loads(raw_output)
929
+ cam_plan = CAMPlan(**plan_data)
930
+ self.state.cam_plan = cam_plan
931
+ self.state.cam_response = AgentResponse.from_agent(
932
+ "cam",
933
+ f"Machining plan: {', '.join(cam_plan.operations)} | "
934
+ f"{cam_plan.tool_diameter}mm endmill | {cam_plan.post_processor}",
935
+ )
936
+ except (json.JSONDecodeError, ValueError, KeyError):
937
+ self.state.cam_response = AgentResponse.from_agent("cam", raw_output)
938
+ ```
939
+
940
+ - [ ] **Step 4: Run tests to verify they pass**
941
+
942
+ Run: `pytest tests/test_agent_flow.py -v`
943
+ Expected: 32 passed
944
+
945
+ - [ ] **Step 5: Run full test suite to check for regressions**
946
+
947
+ Run: `pytest tests/ -x -q`
948
+ Expected: All tests pass (existing tests unmodified)
949
+
950
+ - [ ] **Step 6: Commit**
951
+
952
+ ```bash
953
+ git add agents/agent_flow.py tests/test_agent_flow.py
954
+ git commit -m "feat: add AgentDispatchFlow with router-gated agent dispatch"
955
+ ```
956
+
957
+ ---
958
+
959
+ ### Task 6: Wire Flow into crew_orchestrator.py
960
+
961
+ **Files:**
962
+ - Modify: `agents/crew_orchestrator.py`
963
+
964
+ - [ ] **Step 1: Replace _run_crew internals with Flow**
965
+
966
+ Replace the entire `_run_crew` method body in `agents/crew_orchestrator.py`. Keep phase manipulation, pre-extraction, context building, and post-processing. Replace the Crew creation + kickoff + response parsing block with Flow instantiation.
967
+
968
+ Replace `agents/crew_orchestrator.py` content:
969
+
970
+ ```python
971
+ """CrewAI orchestrator for all LLM backends.
972
+
973
+ Uses CrewAI Flow for agent dispatch where each specialist agent gets its own
974
+ focused LLM call with tools, structured output, and knowledge sources.
975
+
976
+ Falls back to MockChatBackend if CrewAI is not installed.
977
+ """
978
+
979
+ from __future__ import annotations
980
+
981
+ import logging
982
+ from pathlib import Path
983
+
984
+ from agents.base import BaseOrchestrator
985
+ from agents.definitions import AGENTS
986
+ from agents.design_state import DesignState, DesignPlan, extract_decisions, compute_score
987
+ from agents.gap_analyzer import analyze_gaps, generate_question_cards
988
+ from agents.orchestrator import _format_response
989
+ from config.settings import settings
990
+
991
+ logger = logging.getLogger(__name__)
992
+
993
+ DEFAULT_OUTPUT_DIR = Path(__file__).parent.parent / "output"
994
+
995
+
996
+ def _build_agent_context(
997
+ message: str,
998
+ history: list[dict],
999
+ design_state: DesignState,
1000
+ max_history: int = 20,
1001
+ approved_plan: DesignPlan | None = None,
1002
+ ) -> str:
1003
+ """Build a shared context string that each CrewAI agent receives."""
1004
+ parts = []
1005
+
1006
+ if approved_plan:
1007
+ parts.append(approved_plan.render_approved())
1008
+ else:
1009
+ spec = design_state.render()
1010
+ if spec:
1011
+ parts.append(f"## Current Design Spec\n{spec}")
1012
+
1013
+ recent = history[-max_history:] if len(history) > max_history else history
1014
+ if recent:
1015
+ lines = []
1016
+ for msg in recent:
1017
+ if msg.get("role") == "user":
1018
+ lines.append(f"USER: {msg.get('content', '')}")
1019
+ else:
1020
+ aid = msg.get("agent_id", "unknown")
1021
+ name = AGENTS.get(aid, AGENTS["design"]).name
1022
+ lines.append(f"{name.upper()}: {msg.get('content', '')}")
1023
+ parts.append("## Recent conversation\n" + "\n".join(lines))
1024
+
1025
+ parts.append(f"## User's latest message\n{message}")
1026
+ return "\n\n".join(parts)
1027
+
1028
+
1029
+ def _is_plan_trigger(message: str) -> bool:
1030
+ """Check if user message is requesting a plan review."""
1031
+ lower = message.lower().strip()
1032
+ for keyword in settings.planning.trigger_keywords:
1033
+ if keyword in lower:
1034
+ return True
1035
+ return False
1036
+
1037
+
1038
+ def _get_crewai_model(backend_name: str) -> str:
1039
+ """Get the CrewAI model string for a backend name."""
1040
+ return settings.backends.crewai_models.get(backend_name, "gemini/gemini-2.5-flash")
1041
+
1042
+
1043
+ class CrewOrchestrator(BaseOrchestrator):
1044
+ """Multi-call orchestrator using CrewAI Flow for agent dispatch.
1045
+
1046
+ Falls back to MockChatBackend if CrewAI is not installed.
1047
+ """
1048
+
1049
+ def __init__(self, backend_name: str = "gemini", output_dir=None):
1050
+ super().__init__(output_dir=output_dir or DEFAULT_OUTPUT_DIR)
1051
+ self.backend_name = backend_name
1052
+ self._crew_available = self._check_crewai()
1053
+
1054
+ @staticmethod
1055
+ def _check_crewai() -> bool:
1056
+ try:
1057
+ import importlib.util
1058
+ return importlib.util.find_spec("crewai") is not None
1059
+ except (ImportError, ModuleNotFoundError):
1060
+ return False
1061
+
1062
+ def chat_turn(
1063
+ self,
1064
+ message: str,
1065
+ history: list[dict],
1066
+ mentions: list[str] | None = None,
1067
+ max_history: int = 30,
1068
+ design_state: dict | None = None,
1069
+ plan_context: bool = False,
1070
+ ) -> dict:
1071
+ # Phase: manual plan trigger (before crew/fallback dispatch)
1072
+ state = DesignState(**(design_state or {}))
1073
+ if state.phase == "exploring" and _is_plan_trigger(message):
1074
+ score = compute_score(state)
1075
+ plan = DesignPlan.from_state(state, confidence_score=score)
1076
+ state.phase = "planning"
1077
+ state.plan = plan
1078
+ return {
1079
+ "responses": [],
1080
+ "preview": None,
1081
+ "design_state": state.model_dump(),
1082
+ "question_cards": [],
1083
+ }
1084
+
1085
+ if not self._crew_available:
1086
+ return self._fallback(message, history, mentions, max_history, design_state, plan_context)
1087
+
1088
+ try:
1089
+ return self._run_crew(message, history, mentions, max_history, design_state, plan_context)
1090
+ except Exception as exc:
1091
+ logger.warning("CrewAI run failed (%s), falling back", exc, exc_info=True)
1092
+ try:
1093
+ return self._fallback(message, history, mentions, max_history, design_state, plan_context)
1094
+ except Exception as fallback_exc:
1095
+ logger.error("Fallback also failed: %s", fallback_exc, exc_info=True)
1096
+ return {
1097
+ "responses": [_format_response(
1098
+ "design",
1099
+ f"Backend error: {exc}. Fallback also failed: {fallback_exc}. "
1100
+ f"Please check that your API key is set correctly.",
1101
+ )],
1102
+ "preview": None,
1103
+ "design_state": design_state or {},
1104
+ "question_cards": [],
1105
+ }
1106
+
1107
+ def _run_crew(
1108
+ self,
1109
+ message: str,
1110
+ history: list[dict],
1111
+ mentions: list[str] | None,
1112
+ max_history: int,
1113
+ design_state_dict: dict | None,
1114
+ plan_context: bool = False,
1115
+ ) -> dict:
1116
+ from agents.tools import set_design_state, get_last_shape
1117
+ from agents.agent_flow import AgentFlowState, AgentDispatchFlow, collect_responses
1118
+ from core.cam import generate_gcode
1119
+
1120
+ state = DesignState(**(design_state_dict or {}))
1121
+
1122
+ # Phase: if in planning and user sends a non-plan message, reset to exploring.
1123
+ if state.phase == "planning" and not plan_context:
1124
+ state.phase = "exploring"
1125
+ state.plan = None
1126
+
1127
+ # Phase: approved — pass flag to Flow for routing
1128
+ approved_plan = None
1129
+ is_approved = state.phase == "approved" and state.plan is not None
1130
+ if is_approved:
1131
+ approved_plan = state.plan
1132
+
1133
+ # Pre-extract decisions from the user's current message
1134
+ state = state.update_from_messages([], user_message=message)
1135
+
1136
+ # Expose design state to the orchestrator tool
1137
+ set_design_state(state.model_dump())
1138
+
1139
+ context = _build_agent_context(message, history, state, max_history, approved_plan=approved_plan)
1140
+
1141
+ # ── Run Flow ──────────────────────────────────────────────────
1142
+ flow = AgentDispatchFlow(initial_state=AgentFlowState(
1143
+ message=message,
1144
+ context=context,
1145
+ model_str=_get_crewai_model(self.backend_name),
1146
+ mentions=list(mentions) if mentions else [],
1147
+ is_approved_phase=is_approved,
1148
+ ))
1149
+ flow.kickoff()
1150
+
1151
+ # Read typed results
1152
+ agent_responses = collect_responses(
1153
+ flow.state.advisor_responses,
1154
+ flow.state.cad_response,
1155
+ flow.state.cam_response,
1156
+ )
1157
+ cad_code = flow.state.cad_code
1158
+ cam_plan = flow.state.cam_plan
1159
+
1160
+ # ── Post-processing (unchanged logic) ─────────────────────────
1161
+ # Convert AgentResponse models to dicts for API compatibility
1162
+ responses = [r.model_dump() for r in agent_responses]
1163
+ preview = None
1164
+
1165
+ # CAD execution + export
1166
+ if cad_code:
1167
+ shape = get_last_shape()
1168
+ if shape is not None:
1169
+ from core.executor import export_all
1170
+ from core.validator import validate_for_cnc
1171
+
1172
+ part_name = message[:40].strip().replace(" ", "_").lower()
1173
+ part_name = "".join(c for c in part_name if c.isalnum() or c == "_") or "part"
1174
+ base_path = self.output_dir / part_name
1175
+ try:
1176
+ export_all(shape, base_path)
1177
+ except Exception:
1178
+ pass
1179
+
1180
+ execution_data = {"success": True}
1181
+ try:
1182
+ bb = shape.val().BoundingBox()
1183
+ execution_data["volume_mm3"] = shape.val().Volume()
1184
+ execution_data["bounding_box_mm"] = [bb.xlen, bb.ylen, bb.zlen]
1185
+ execution_data["face_count"] = len(shape.faces().vals())
1186
+ execution_data["edge_count"] = len(shape.edges().vals())
1187
+ except Exception:
1188
+ pass
1189
+
1190
+ validation = validate_for_cnc(shape, part_name=part_name)
1191
+ preview = {
1192
+ "success": True,
1193
+ "part_name": part_name,
1194
+ "stl_url": f"/api/models/{part_name}.stl",
1195
+ "step_url": f"/api/models/{part_name}.step",
1196
+ "threemf_url": f"/api/models/{part_name}.3mf",
1197
+ "execution": execution_data,
1198
+ "validation": validation.model_dump(),
1199
+ }
1200
+
1201
+ # G-code generation
1202
+ if preview and preview.get("success") and cam_plan:
1203
+ shape = get_last_shape()
1204
+ if shape is not None:
1205
+ cam_result = generate_gcode(
1206
+ shape=shape,
1207
+ operations=cam_plan.operations,
1208
+ tool_config=cam_plan.to_tool_config(),
1209
+ post_processor=cam_plan.post_processor,
1210
+ )
1211
+ preview["cam"] = cam_result.model_dump()
1212
+ if cam_result.success and cam_result.gcode:
1213
+ part_name = preview["part_name"]
1214
+ gcode_path = self.output_dir / f"{part_name}.gcode"
1215
+ gcode_path.write_text(cam_result.gcode)
1216
+ preview["gcode_url"] = f"/api/models/{part_name}.gcode"
1217
+
1218
+ # Update design state
1219
+ agent_msgs = [{"message": r.get("message", "")} for r in responses]
1220
+ updated_state = extract_decisions(agent_msgs, state, message)
1221
+
1222
+ # Gap analysis
1223
+ gap_result = analyze_gaps(responses)
1224
+ question_cards = []
1225
+ if gap_result.has_gaps:
1226
+ cards = generate_question_cards(gap_result, updated_state, user_message=message)
1227
+ question_cards = [c.model_dump() for c in cards]
1228
+
1229
+ # Auto-trigger plan if score crosses threshold
1230
+ if updated_state.phase == "exploring":
1231
+ score = compute_score(updated_state)
1232
+ if score >= settings.planning.threshold:
1233
+ plan = DesignPlan.from_state(updated_state, confidence_score=score)
1234
+ updated_state.phase = "planning"
1235
+ updated_state.plan = plan
1236
+
1237
+ # If approved and CAD said NOT READY, reset
1238
+ if state.phase == "approved":
1239
+ for r in responses:
1240
+ if r.get("agent_id") == "cad" and r.get("message", "").upper().startswith("NOT READY:"):
1241
+ updated_state.phase = "exploring"
1242
+ updated_state.plan = None
1243
+ break
1244
+
1245
+ return {
1246
+ "responses": responses,
1247
+ "preview": preview,
1248
+ "design_state": updated_state.model_dump(),
1249
+ "question_cards": question_cards,
1250
+ }
1251
+
1252
+ def _fallback(
1253
+ self,
1254
+ message: str,
1255
+ history: list[dict],
1256
+ mentions: list[str] | None,
1257
+ max_history: int,
1258
+ design_state: dict | None,
1259
+ plan_context: bool = False,
1260
+ ) -> dict:
1261
+ """Fall back to MockChatBackend."""
1262
+ from agents.tools import set_design_state
1263
+ from agents.orchestrator import MockChatBackend
1264
+
1265
+ state = DesignState(**(design_state or {}))
1266
+ state = state.update_from_messages([], user_message=message)
1267
+ set_design_state(state.model_dump())
1268
+
1269
+ mock = MockChatBackend(output_dir=self.output_dir)
1270
+ result = mock.chat_turn(message, history, mentions, design_state=state.model_dump(), plan_context=plan_context)
1271
+ if "question_cards" not in result:
1272
+ result_responses = result.get("responses", [])
1273
+ gap_result = analyze_gaps(result_responses)
1274
+ if gap_result.has_gaps:
1275
+ cards = generate_question_cards(gap_result, state, user_message=message)
1276
+ result["question_cards"] = [c.model_dump() for c in cards]
1277
+ else:
1278
+ result["question_cards"] = []
1279
+ return result
1280
+ ```
1281
+
1282
+ - [ ] **Step 2: Run existing tests to verify no regressions**
1283
+
1284
+ Run: `pytest tests/test_crew_orchestrator.py -v`
1285
+ Expected: All tests pass (fallback path, planning phase, gap analysis)
1286
+
1287
+ - [ ] **Step 3: Run full test suite**
1288
+
1289
+ Run: `pytest tests/ -x -q`
1290
+ Expected: All tests pass
1291
+
1292
+ - [ ] **Step 4: Commit**
1293
+
1294
+ ```bash
1295
+ git add agents/crew_orchestrator.py
1296
+ git commit -m "refactor: replace Crew.kickoff() with AgentDispatchFlow in _run_crew"
1297
+ ```
1298
+
1299
+ ---
1300
+
1301
+ ### Task 7: Remove RoutingEngine + update MockChatBackend
1302
+
1303
+ **Files:**
1304
+ - Delete: `agents/routing.py`
1305
+ - Modify: `agents/orchestrator.py`
1306
+ - Modify: `tests/test_routing.py`
1307
+ - Modify: `tests/test_cam_routing.py`
1308
+
1309
+ - [ ] **Step 1: Update MockChatBackend to use route_agents**
1310
+
1311
+ In `agents/orchestrator.py`, replace the `RoutingEngine` import and `_router` singleton with `route_agents` from `agent_flow`:
1312
+
1313
+ Replace:
1314
+ ```python
1315
+ from agents.routing import RoutingEngine
1316
+ ```
1317
+ with:
1318
+ ```python
1319
+ from agents.agent_flow import route_agents
1320
+ ```
1321
+
1322
+ Replace:
1323
+ ```python
1324
+ _router = RoutingEngine()
1325
+ ```
1326
+ with nothing (remove the line).
1327
+
1328
+ In `MockChatBackend.chat_turn()`, replace:
1329
+ ```python
1330
+ active = _router.route(message)
1331
+ ```
1332
+ with:
1333
+ ```python
1334
+ active = route_agents(message, mentions=[], is_approved_phase=False)
1335
+ ```
1336
+
1337
+ - [ ] **Step 2: Update test_routing.py to test route_agents**
1338
+
1339
+ Replace `tests/test_routing.py`:
1340
+
1341
+ ```python
1342
+ """Tests for agent routing via route_agents function."""
1343
+
1344
+ from agents.agent_flow import route_agents
1345
+
1346
+
1347
+ class TestRouteAgents:
1348
+ def test_route_design_keywords(self):
1349
+ agents = route_agents("I want a sleek design with smooth shape", [], False)
1350
+ assert "design" in agents
1351
+
1352
+ def test_route_engineering_keywords(self):
1353
+ agents = route_agents("Use M6 bolts with 3mm wall thickness in aluminum", [], False)
1354
+ assert "engineering" in agents
1355
+
1356
+ def test_route_cnc_keywords(self):
1357
+ agents = route_agents("Can this be machined on a 3-axis CNC mill?", [], False)
1358
+ assert "cnc" in agents
1359
+
1360
+ def test_route_default_when_no_match(self):
1361
+ agents = route_agents("hello there", [], False)
1362
+ assert agents == ["design", "engineering"]
1363
+
1364
+ def test_route_max_three_agents(self):
1365
+ agents = route_agents("design shape in aluminum for CNC machining, generate preview", [], False)
1366
+ assert len(agents) <= 3
1367
+
1368
+ def test_has_cad_trigger_true(self):
1369
+ agents = route_agents("Generate a preview", [], False)
1370
+ assert "cad" in agents
1371
+
1372
+ def test_has_cad_trigger_false(self):
1373
+ agents = route_agents("hello there", [], False)
1374
+ assert "cad" not in agents
1375
+ ```
1376
+
1377
+ - [ ] **Step 3: Update test_cam_routing.py to test route_agents**
1378
+
1379
+ Replace `tests/test_cam_routing.py`:
1380
+
1381
+ ```python
1382
+ """Tests for CAM agent routing and definitions."""
1383
+
1384
+ from agents.agent_flow import route_agents
1385
+
1386
+
1387
+ class TestCAMRouting:
1388
+ def test_route_cam_keywords_toolpath(self):
1389
+ agents = route_agents("Generate a toolpath for this part", [], False)
1390
+ assert "cam" in agents
1391
+
1392
+ def test_route_cam_keywords_gcode(self):
1393
+ agents = route_agents("Create gcode for CNC milling", [], False)
1394
+ assert "cam" in agents
1395
+
1396
+ def test_route_cam_keywords_slicer(self):
1397
+ agents = route_agents("Run the slicer on this model", [], False)
1398
+ assert "cam" in agents
1399
+
1400
+ def test_cam_agent_exists_in_definitions(self):
1401
+ from agents.definitions import AGENTS
1402
+ assert "cam" in AGENTS
1403
+
1404
+ def test_cam_agent_has_color(self):
1405
+ from agents.definitions import AGENT_COLORS
1406
+ assert "cam" in AGENT_COLORS
1407
+
1408
+ def test_cam_agent_has_name(self):
1409
+ from agents.definitions import AGENT_NAMES
1410
+ assert AGENT_NAMES["cam"] == "CAM Agent"
1411
+ ```
1412
+
1413
+ - [ ] **Step 4: Delete agents/routing.py**
1414
+
1415
+ ```bash
1416
+ git rm agents/routing.py
1417
+ ```
1418
+
1419
+ - [ ] **Step 5: Run all tests**
1420
+
1421
+ Run: `pytest tests/ -x -q`
1422
+ Expected: All tests pass
1423
+
1424
+ - [ ] **Step 6: Commit**
1425
+
1426
+ ```bash
1427
+ git add agents/orchestrator.py tests/test_routing.py tests/test_cam_routing.py
1428
+ git commit -m "refactor: remove RoutingEngine, use route_agents from agent_flow"
1429
+ ```
1430
+
1431
+ ---
1432
+
1433
+ ### Task 8: Final validation + cleanup
1434
+
1435
+ **Files:**
1436
+ - Verify all files
1437
+
1438
+ - [ ] **Step 1: Run full test suite**
1439
+
1440
+ Run: `pytest tests/ -v`
1441
+ Expected: All tests pass with no warnings about missing modules
1442
+
1443
+ - [ ] **Step 2: Check for stale imports of routing.py**
1444
+
1445
+ Run: `grep -r "from agents.routing" --include="*.py" .`
1446
+ Expected: No results (all imports removed)
1447
+
1448
+ Run: `grep -r "RoutingEngine" --include="*.py" .`
1449
+ Expected: No results
1450
+
1451
+ - [ ] **Step 3: Verify API contract unchanged**
1452
+
1453
+ Run: `python -c "from agents.crew_orchestrator import CrewOrchestrator; o = CrewOrchestrator(); print(type(o))"`
1454
+ Expected: `<class 'agents.crew_orchestrator.CrewOrchestrator'>`
1455
+
1456
+ Run: `python -c "from agents.orchestrator import get_orchestrator; o = get_orchestrator('mock'); print(type(o))"`
1457
+ Expected: `<class 'agents.orchestrator.MockChatBackend'>`
1458
+
1459
+ - [ ] **Step 4: Verify Flow import works**
1460
+
1461
+ Run: `python -c "from agents.agent_flow import AgentDispatchFlow, AgentResponse, AgentFlowState, route_agents, check_readiness, collect_responses, extract_code; print('All imports OK')"`
1462
+ Expected: `All imports OK`
1463
+
1464
+ - [ ] **Step 5: Commit**
1465
+
1466
+ ```bash
1467
+ git add -A
1468
+ git commit -m "chore: final cleanup after CrewAI Flow refactor"
1469
+ ```