CallMeDaniel Claude Opus 4.7 (1M context) committed on
Commit
bdc65cb
·
1 Parent(s): cc2146c

docs: add ArchAgent Phase 1 design spec

Greenfield successor to NeuralCAD targeting house-scale architectural
design. Two-service architecture (web + Blender worker), LangGraph
agent orchestration, HouseSpec Pydantic DSL with deterministic bpy
translator, gallery + per-item persistence, chat + image inputs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs/superpowers/specs/2026-05-09-archagent-phase-1-design.md ADDED
@@ -0,0 +1,748 @@
1
+ # ArchAgent — Phase 1: Foundation + Architectural Shell
2
+
3
+ **Date:** 2026-05-09
4
+ **Status:** Draft
5
+ **Repo:** new (ArchAgent — greenfield, to be created)
6
+ **Phase:** 1 of 3 (this spec)
7
+
8
+ > Phase 2 (Interior composition) and Phase 3 (Exterior + site) are out of scope for this spec — they get their own spec → plan → implementation cycles after Phase 1 ships.
9
+
10
+ ## Context
11
+
12
+ ArchAgent is a multi-agent platform for generating 3D architectural designs from natural language and sketches. It is a greenfield successor to NeuralCAD (a CadQuery-based CNC mechanical design tool); none of NeuralCAD's code is carried over. ArchAgent targets house-scale architectural shells: walls, floors, doors, windows, roof, stairs, interior partitions, structural columns and beams, and fixed cabinetry.
13
+
14
+ The product is a conversational web app. A user opens a gallery of past designs, creates a new project, and converses with a team of specialist agents (Architect, Structural Engineer, Code Compliance, plus a Blender Coder and Construction Documenter). The agents accumulate design decisions across turns, gate generation behind readiness checks, and produce a 3D model plus floor plans plus a bill of materials when the user approves. Phase 1 deliberately excludes interior decoration, exterior visualization, and rendering quality concerns.
15
+
16
+ ## Decision Log
17
+
18
+ | # | Decision | Rationale |
19
+ |---|---|---|
20
+ | 1 | Greenfield repo, not a migration | NeuralCAD's identity, deploy story, and several architectural assumptions don't fit architectural CAD. Carrying old code adds friction without compounding value. |
21
+ | 2 | Phase 1 scope: extended architectural shell | Standard shell (walls/doors/windows/roof) plus structural elements and partitions makes the validators meaningful and matches what users mean by "house." |
22
+ | 3 | Two-service architecture: web + Blender worker | Failure isolation, lighter web image, Python-version freedom for the web service, clean local docker-compose deploy. |
23
+ | 4 | Blender worker uses persistent Blender subprocess | Full Blender features (Geometry Nodes, modifiers, all exporters), no bpy-wheel Python pin, more battle-tested than the bpy pip module. |
24
+ | 5 | Structured DSL (HouseSpec) → deterministic translator | LLMs emit JSON reliably; bpy code generation has retry-loop tax. Validators work on typed data, not on parsed bpy scenes. |
25
+ | 6 | LangGraph for agent orchestration | Five of the six flow primitives we need (cycles, conditional branches, parallel fan-out, human-in-loop interrupts, typed state, checkpointed resumability) are first-class in LangGraph. CrewAI's strengths (emergent multi-agent collaboration) don't apply. |
26
+ | 7 | Standalone HTML + Three.js for viewer | All Phase 1 rendering needs (glTF, PBR materials, shadows, picking, clipping, labels) work without a build step. ~500 LOC ceiling before reconsidering. |
27
+ | 8 | SQLite + filesystem gallery, single-user | Right-sized for a personal tool. Multi-user is a clean later layer if needed. |
28
+ | 9 | Inputs: text + image upload | Vision-capable LLM can parse sketches into HouseSpec deltas. Big UX win for architects, modest implementation cost. |
29
+ | 10 | UX: chat-primary with inline structured-question cards | Surfaces multi-agent reasoning without ambiguity tax. Cards are LangGraph `interrupt()` payloads rendered as small inline forms. |
30
+ | 11 | No HuggingFace Spaces deploy constraint | Default deploy is `docker-compose` on any Docker host. Cloud Run / Kubernetes work by swapping the shared volume for object storage. |
31
+ | 12 | Clean cutover: NeuralCAD CNC product is not preserved by ArchAgent | NeuralCAD repo is left as-is. ArchAgent is a separate codebase. |
32
+
33
+ ## System Architecture
34
+
35
+ ```
36
+ ┌──────────────────────────────────┐
37
+ browser ◀────────│ Web service (Python 3.12) │
38
+ │ │
39
+ │ FastAPI ── /api/{chat,gallery,...}│
40
+ │ FastMCP ── 8 MCP tools │
41
+ │ LangGraph ── agent state graph │
42
+ │ ├─ Architect node │
43
+ │ ├─ Structural node │
44
+ │ ├─ Code Compliance node │
45
+ │ ├─ Coder node (HouseSpec) │
46
+ │ └─ Documenter node │
47
+ │ Pydantic ── HouseSpec, DesignState│
48
+ │ ezdxf ── DXF/PDF floor plans │
49
+ │ SQLite ── gallery metadata │
50
+ │ static/ ── index.html (Three.js) │
51
+ └──────────────┬───────────────────────┘
52
+ │ HTTP REST (private)
53
+
54
+ ┌──────────────────────────────────┐
55
+ │ Blender worker (Python 3.11+) │
56
+ │ │
57
+ │ FastAPI ── /translate /thumbnail │
58
+ │ │ │
59
+ │ ▼ (JSON over UNIX socket) │
60
+ │ Blender 4.x (--background) │
61
+ │ ↑ persistent subprocess │
62
+ │ ↑ command-server addon │
63
+ │ ↑ translator builds bpy scene │
64
+ │ ↑ exports glTF/blend/stl │
65
+ └──────────────┬───────────────────────┘
66
+ │ writes
67
+
68
+ ┌──────────────────────────────────┐
69
+ │ data/ (shared volume) │
70
+ │ ├─ gallery.db (SQLite) │
71
+ │ └─ items/<id>/ │
72
+ │ ├─ state.json │
73
+ │ ├─ spec.json │
74
+ │ ├─ history.jsonl │
75
+ │ ├─ thumbnail.png │
76
+ │ └─ artifacts/ │
77
+ │ ├─ house.gltf │
78
+ │ ├─ house.blend │
79
+ │ ├─ floor_plan_l1.dxf │
80
+ │ ├─ floor_plan_l1.pdf │
81
+ │ └─ bom.json │
82
+ └────────────────────────────────────┘
83
+ ```
84
+
85
+ ### Data flow per chat turn
86
+
87
+ 1. Browser POSTs `/api/chat` with `{item_id, message, image?}`.
88
+ 2. Web service loads `Item` from gallery: rehydrates `DesignState`, recent chat history.
89
+ 3. LangGraph runs with the loaded state as initial input.
90
+ 4. Advisor nodes execute (parallel where possible) and return `READY` / `NOT_READY` with optional gap questions.
91
+ 5. Readiness gate (conditional edge):
92
+ - Any `NOT_READY` → emit `structured_question` interrupt → return to user as inline card.
93
+ - All `READY` plus generation-intent → continue to Coder node.
94
+ - All `READY` plus chat-only intent → continue to advisor consensus reply, no Coder.
95
+ 6. Coder node calls LLM with `HouseSpec` schema as a tool. Output is type-checked. If invalid, retry up to 3 times with the validation error appended.
96
+ 7. Schema-valid `HouseSpec` passes through reference validation and semantic validators. Failure → retry or surface to user.
97
+ 8. Web posts `HouseSpec` to worker `/translate`. Worker writes artifacts and returns geometric violations.
98
+ 9. Geometric validators surface failures the same way.
99
+ 10. Documenter node generates BoM and queues DXF/PDF export (web-side via `ezdxf`).
100
+ 11. Web persists updated `DesignState` and chat turn to `data/items/<id>/`.
101
+ 12. Response streams back to browser: text, validation summary, artifact URLs (gltf, dxf, pdf, bom).
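Steps 6 and 7 describe a validate-and-retry loop around the Coder. The following is a minimal sketch of that loop, not the project's implementation. `call_llm`, `validate`, `fake_llm`, and `require_levels` are hypothetical names, and "retry up to 3 times" is read here as one initial attempt plus three retries.

```python
from typing import Callable, Optional


def generate_with_retries(
    call_llm: Callable[[str], dict],
    validate: Callable[[dict], list],
    prompt: str,
    max_retries: int = 3,
) -> tuple:
    """Return (candidate, []) on success or (None, last_errors) once retries run out."""
    errors: list = []
    current_prompt = prompt
    # One initial attempt plus up to `max_retries` retries (an assumption).
    for _attempt in range(1 + max_retries):
        candidate: Optional[dict] = call_llm(current_prompt)
        errors = validate(candidate)
        if not errors:
            return candidate, []
        # Append the validation errors so the next attempt can correct them.
        current_prompt = prompt + "\n\nPrevious output was invalid:\n" + "\n".join(errors)
    return None, errors


# Stand-ins for the real LLM backend and the schema validator (hypothetical).
def fake_llm(prompt: str) -> dict:
    # This fake only emits `levels` after it has seen the error feedback.
    if "levels" in prompt:
        return {"metadata": {"name": "ranch"}, "levels": []}
    return {"metadata": {"name": "ranch"}}


def require_levels(candidate: dict) -> list:
    return [] if "levels" in candidate else ["missing required field: levels"]


spec, errors = generate_with_retries(fake_llm, require_levels, "small ranch house")
```

Keeping the original prompt and appending only the latest errors stops the prompt from growing on every attempt; whether the real Coder accumulates all prior errors instead is left open by the spec.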
102
+
103
+ ## HouseSpec DSL
104
+
105
+ `HouseSpec` is the central abstraction. It is a Pydantic v2 model emitted by the Coder agent and consumed by the translator. All units are meters; coordinates use a right-handed system with z up. Linear elements (walls, beams) are defined by their centerlines.
106
+
107
+ ```python
108
+ # core/spec.py (lives in web service; imported by worker too)
109
+
110
+ from pydantic import BaseModel, Field
111
+ from typing import Literal
112
+
113
+ Point2D = tuple[float, float]
114
+ Polygon = list[Point2D] # implicitly closed; do not repeat the first point as the last
115
+
116
+ WallMaterial = Literal["wood_frame", "cmu", "concrete", "steel_stud"]
117
+ StructuralSystem = Literal["wood_frame", "steel", "concrete"]
118
+ RoofType = Literal["gable", "hip", "flat", "shed"]
119
+ RoomProgram = Literal[
120
+ "bedroom", "bathroom", "kitchen", "living", "dining",
121
+ "circulation", "storage", "utility", "garage", "office"
122
+ ]
123
+
124
+ class ProjectMetadata(BaseModel):
125
+ name: str
126
+ occupancy_type: Literal["R-3", "R-2", "B"] = "R-3"
127
+ units: Literal["meters"] = "meters"
128
+
129
+ class SiteContext(BaseModel):
130
+ north_angle_deg: float = 0.0
131
+ lot_polygon: Polygon | None = None
132
+
133
+ class Level(BaseModel):
134
+ id: str
135
+ name: str # "Ground", "Second", "Basement"
136
+ elevation_m: float # absolute z of floor surface
137
+ height_m: float # floor-to-floor
138
+
139
+ class Wall(BaseModel):
140
+ id: str
141
+ level_id: str
142
+ start: Point2D
143
+ end: Point2D
144
+ height_m: float
145
+ thickness_m: float = 0.2
146
+ material: WallMaterial = "wood_frame"
147
+ is_load_bearing: bool = False
148
+
149
+ class InteriorPartition(BaseModel):
150
+ id: str
151
+ level_id: str
152
+ start: Point2D
153
+ end: Point2D
154
+ height_m: float
155
+ thickness_m: float = 0.1
156
+ material: WallMaterial = "wood_frame"
157
+
158
+ class Floor(BaseModel):
159
+ id: str
160
+ level_id: str
161
+ outline: Polygon
162
+ thickness_m: float = 0.25
163
+
164
+ class Door(BaseModel):
165
+ id: str
166
+ parent_wall_id: str
167
+ offset_along_wall_m: float
168
+ width_m: float
169
+ height_m: float
170
+ handing: Literal["left", "right"] = "left"
171
+ swing: Literal["in", "out"] = "in"
172
+ is_egress: bool = False
173
+
174
+ class Window(BaseModel):
175
+ id: str
176
+ parent_wall_id: str
177
+ offset_along_wall_m: float
178
+ sill_height_m: float
179
+ width_m: float
180
+ height_m: float
181
+ operable: bool = True
182
+
183
+ class Roof(BaseModel):
184
+ type: RoofType
185
+ parent_outline: Polygon
186
+ pitch_deg: float = 0.0 # 0 for flat
187
+ overhang_m: float = 0.5
188
+ base_elevation_m: float
189
+
190
+ class Stairs(BaseModel):
191
+ id: str
192
+ from_level_id: str
193
+ to_level_id: str
194
+ run_polyline: list[Point2D]
195
+ width_m: float
196
+ tread_depth_m: float = 0.25
197
+ riser_height_m: float = 0.18
198
+
199
+ class Column(BaseModel):
200
+ id: str
201
+ level_id: str
202
+ position: Point2D
203
+ height_m: float
204
+ cross_section_m: tuple[float, float] = (0.2, 0.2)
205
+ material: Literal["wood", "steel", "concrete"] = "steel"
206
+
207
+ class Beam(BaseModel):
208
+ id: str
209
+ level_id: str
210
+ start: Point2D
211
+ end: Point2D
212
+ elevation_offset_m: float = 0.0
213
+ cross_section_m: tuple[float, float] = (0.2, 0.4)
214
+ material: Literal["wood", "steel", "concrete"] = "steel"
215
+
216
+ class FixedCabinetry(BaseModel):
217
+ id: str
218
+ level_id: str
219
+ cabinet_type: Literal["kitchen_lower", "kitchen_upper", "bath_vanity", "closet", "builtin"]
220
+ footprint: Polygon
221
+ height_m: float
222
+
223
+ class Room(BaseModel):
224
+ id: str
225
+ name: str
226
+ program: RoomProgram
227
+ level_id: str
228
+ bounded_by_wall_ids: list[str] # closed loop required
229
+ floor_id: str
230
+
231
+ class HouseSpec(BaseModel):
232
+ metadata: ProjectMetadata
233
+ site: SiteContext = Field(default_factory=SiteContext)
234
+ structural_system: StructuralSystem = "wood_frame"
235
+ levels: list[Level]
236
+ walls: list[Wall] = []
237
+ floors: list[Floor] = []
238
+ roof: Roof | None = None
239
+ doors: list[Door] = []
240
+ windows: list[Window] = []
241
+ stairs: list[Stairs] = []
242
+ interior_partitions: list[InteriorPartition] = []
243
+ columns: list[Column] = []
244
+ beams: list[Beam] = []
245
+ cabinetry: list[FixedCabinetry] = []
246
+ rooms: list[Room] = []
247
+ ```
248
+
249
+ ### Validation layers
250
+
251
+ 1. **Schema** (Pydantic): types, required fields, enums, ranges.
252
+ 2. **Reference** (`core/validators/reference.py`): every `parent_wall_id` and `bounded_by_wall_ids` resolves; every Room's wall loop is closed.
253
+ 3. **Architect** (`core/validators/architect.py`): room minimum areas (jurisdiction-dependent), program adjacencies (no bedroom adjacent to garage without buffer), corridor widths.
254
+ 4. **Structural** (`core/validators/structural.py`): load-bearing walls have continuous foundation below; beams span column-to-column or column-to-load-bearing; max spans per material from `config/materials.yaml`.
255
+ 5. **Code Compliance** (`core/validators/code.py`): egress doors per occupancy; ceiling heights ≥ jurisdiction min; window-to-floor-area ratio for habitable rooms; stair geometry; ADA where applicable. Rules data-driven via `config/jurisdictions/IRC_2021.yaml`.
256
+ 6. **Geometric** (worker, `geometric_validator.py`): collisions between non-adjacent solids, manifold check on each primitive, no overlapping walls.
257
+
258
+ `Violation` shape:
259
+
260
+ ```python
261
+ class Violation(BaseModel):
262
+ level: Literal["error", "warning"]
263
+ code: str # "IRC_R304_3", "ARCH_ROOM_MIN_AREA", "GEOM_COLLISION"
264
+ primitive_ids: list[str]
265
+ message: str
266
+ suggested_fix: str | None = None
267
+ ```
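As an illustration of layer 2 (reference validation), the sketch below checks that openings point at existing walls and that each room's wall loop is closed. It is an assumption-laden example: it works on plain dicts and a dataclass stand-in rather than the Pydantic models, the codes `REF_UNKNOWN_WALL` and `REF_ROOM_LOOP_OPEN` are invented for the example, and the loop test is simplified to "every endpoint is shared by exactly two walls", which does not rule out two disjoint loops.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Violation:
    # Stand-in for the Pydantic `Violation` model, trimmed to the fields used here.
    level: str
    code: str
    primitive_ids: list
    message: str


def check_references(spec: dict) -> list:
    """Check that wall references resolve and that each room's wall loop is closed."""
    walls = {wall["id"]: wall for wall in spec.get("walls", [])}
    violations = []

    for kind in ("doors", "windows"):
        for opening in spec.get(kind, []):
            if opening["parent_wall_id"] not in walls:
                violations.append(Violation(
                    "error", "REF_UNKNOWN_WALL", [opening["id"]],
                    f"{opening['id']} references missing wall {opening['parent_wall_id']}",
                ))

    for room in spec.get("rooms", []):
        wall_ids = room["bounded_by_wall_ids"]
        missing = [wall_id for wall_id in wall_ids if wall_id not in walls]
        if missing:
            violations.append(Violation(
                "error", "REF_UNKNOWN_WALL", [room["id"]],
                f"{room['id']} references missing walls {missing}",
            ))
            continue
        # Simplified closure test: every endpoint is shared by exactly two walls.
        endpoint_use = Counter()
        for wall_id in wall_ids:
            for x, y in (walls[wall_id]["start"], walls[wall_id]["end"]):
                endpoint_use[(round(x, 3), round(y, 3))] += 1
        if not endpoint_use or any(count != 2 for count in endpoint_use.values()):
            violations.append(Violation(
                "error", "REF_ROOM_LOOP_OPEN", [room["id"]],
                f"walls bounding {room['id']} do not form a closed loop",
            ))
    return violations
```

Rounding endpoints to the millimeter is an arbitrary tolerance chosen so that nearly coincident corners still match.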
268
+
269
+ ## Agent Layer (LangGraph)
270
+
271
+ Five agents, all defined as nodes in a LangGraph `StateGraph`. State is a Pydantic model.
272
+
273
+ ```python
274
+ # agents/state.py
275
+
276
+ class AdvisorReport(BaseModel):
277
+ agent: Literal["architect", "structural", "code"]
278
+ status: Literal["READY", "NOT_READY"]
279
+ text: str
280
+ gap_questions: list["StructuredQuestion"] = []
281
+
282
+ class StructuredQuestion(BaseModel):
283
+ field: str # "program.bedroom_count"
284
+ label: str # "How many bedrooms?"
285
+ type: Literal["int", "float", "enum", "text"]
286
+ options: list[str] | None = None
287
+ default: str | int | float | None = None
288
+ required: bool = True
289
+
290
+ class ArchAgentState(BaseModel):
291
+ item_id: str
292
+ user_message: str
293
+ user_image_b64: str | None = None
294
+ history: list[dict] # prior turns, last 20
295
+ design_state: DesignState
296
+ advisor_reports: list[AdvisorReport] = []
297
+ intent: Literal["chat", "generate"] = "chat"
298
+ spec: HouseSpec | None = None
299
+ spec_violations: list[Violation] = []
300
+ geometric_violations: list[Violation] = []
301
+ artifacts: dict[str, str] = {} # {"gltf": "data/items/<id>/artifacts/house.gltf", ...}
302
+ documenter_output: dict | None = None
303
+ structured_questions: list[StructuredQuestion] = []
304
+ error: str | None = None
305
+ ```
306
+
307
+ ### Graph topology
308
+
309
+ ```
310
+ START
+   │
+   ▼
+ classify_intent ── decides: "chat" | "generate"
+   │
+   ▼
+ fan_out_advisors ── Send to architect, structural, code in parallel
+   │
+   ▼ (joins)
+ readiness_gate ── conditional edge:
+   │   all READY + generate → coder
+   │   all READY + chat → advisor_reply
+   │   any NOT_READY → emit_questions (interrupt)
+   │
+   ├──▶ emit_questions ──▶ END (interrupt; resumed on next turn with answers)
+   │
+   ├──▶ advisor_reply ──▶ persist ──▶ END
+   │
+   └──▶ coder
+          │
+          ▼
+        validate_schema_and_references
+          │   on fail: retry coder up to 3 times
+          ▼
+        validate_semantic (architect + structural + code on the spec)
+          │   on fail: retry coder up to 3 times
+          ▼
+        call_worker_translate
+          │
+          ▼
+        handle_geometric_violations
+          │   on fail: retry coder up to 2 times
+          ▼
+        documenter
+          │
+          ▼
+        persist ──▶ END
347
+ ```
348
+
349
+ Conditional edges and cycles are first-class LangGraph features. The graph is compiled once at startup and reused per turn.
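The readiness gate reduces to a pure routing function over the advisor reports and the classified intent. The sketch below is one way to write it, with a dataclass standing in for `AdvisorReport`; the function name `route_after_advisors` is hypothetical. In LangGraph a function of this shape would typically be passed to `StateGraph.add_conditional_edges`, which is left out here so the example runs without the library.

```python
from dataclasses import dataclass, field


@dataclass
class AdvisorReport:
    # Stand-in for the Pydantic AdvisorReport model, trimmed for the example.
    agent: str
    status: str  # "READY" or "NOT_READY"
    gap_questions: list = field(default_factory=list)


def route_after_advisors(reports: list, intent: str) -> str:
    """Return the name of the next node, mirroring the three branches of the gate."""
    if any(report.status == "NOT_READY" for report in reports):
        return "emit_questions"
    if intent == "generate":
        return "coder"
    return "advisor_reply"
```

Because the function has no side effects, the three branches can be unit-tested without compiling the graph or calling an LLM.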
350
+
351
+ ### Agent node prompts
352
+
353
+ Each advisor node has a prompt scaffold defined in `config/agents.yaml`:
354
+
355
+ ```yaml
356
+ architect:
357
+ role: "Senior residential architect"
358
+ goal: "Refine room program, layout intent, and aesthetic constraints into a buildable HouseSpec."
359
+ responsibilities:
360
+ - "Translate user wishes into specific room counts, sizes, adjacencies."
361
+ - "Flag missing info as structured_questions."
362
+ - "Return READY only when program + dimensions + structural system are all set."
363
+ output_format: "JSON: {status, text, gap_questions}"
364
+ ```
365
+
366
+ The Coder is a thin LangGraph node that calls the LLM with `HouseSpec` as a tool schema (Anthropic / OpenAI structured outputs). No CrewAI-style backstory needed — its job is fully constrained by the typed output.
367
+
368
+ ### Advisors vs validators — same names, different roles
369
+
370
+ `architect`, `structural`, and `code` exist as **two distinct things**:
371
+
372
+ - **Advisor nodes** (in `agents/nodes/`) — LangGraph nodes that call an LLM with the agent's role prompt. They run *before* generation, read `DesignState` + chat history, and return guidance + gap questions.
373
+ - **Validator modules** (in `core/validators/`) — pure-Python rule-based checks that run *after* the Coder emits a spec. They take a `HouseSpec` and return a list of `Violation`s. No LLM calls.
374
+
375
+ The two layers are complementary: advisors prevent missing information from reaching the Coder; validators catch mistakes the Coder made anyway. The Documenter node consumes the same spec + violations and renders the final outputs.
376
+
377
+ ### Documenter node outputs
378
+
379
+ When invoked after geometric validation passes, the Documenter writes:
380
+
381
+ - `data/items/<id>/artifacts/bom.json` — counted primitives by type and material, total wall length per material, opening counts, total floor area per level, total enclosed volume.
382
+ - `data/items/<id>/artifacts/floor_plan_<level_id>.dxf` — one DXF per level. Walls as polylines, doors as arcs (showing swing), windows as breaks, room labels as text entities.
383
+ - `data/items/<id>/artifacts/floor_plan_<level_id>.pdf` — matplotlib render of the corresponding DXF.
384
+
385
+ DXF and PDF generation run in the web service via `ezdxf`, so no worker call is needed. The Documenter node returns the artifact paths and a short summary rendered in chat (for example, "Generated BoM, 2 DXF floor plans, 2 PDFs").
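Part of the BoM can be computed directly from the spec. The sketch below covers three of the quantities listed above (wall length per material, floor area per level, opening counts), assuming the spec is available as plain dicts. The function names and the output keys `wall_length_m`, `floor_area_m2`, and `openings` are illustrative, not the project's `bom.json` schema.

```python
import math
from collections import defaultdict


def polygon_area(outline: list) -> float:
    """Shoelace formula for an implicitly closed polygon."""
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(outline, outline[1:] + outline[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def build_bom(spec: dict) -> dict:
    # Wall length is measured along the centerline, matching the spec convention.
    wall_length = defaultdict(float)
    for wall in spec.get("walls", []):
        (x0, y0), (x1, y1) = wall["start"], wall["end"]
        wall_length[wall["material"]] += math.hypot(x1 - x0, y1 - y0)

    floor_area = defaultdict(float)
    for floor in spec.get("floors", []):
        floor_area[floor["level_id"]] += polygon_area(floor["outline"])

    return {
        "wall_length_m": {m: round(v, 3) for m, v in sorted(wall_length.items())},
        "floor_area_m2": {lvl: round(a, 3) for lvl, a in sorted(floor_area.items())},
        "openings": {
            "doors": len(spec.get("doors", [])),
            "windows": len(spec.get("windows", [])),
        },
    }
```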
386
+
387
+ ### Checkpointing
388
+
389
+ LangGraph's `SqliteSaver` persists graph state per `thread_id`. We use `item_id` as `thread_id` so resumes happen automatically when a user re-opens an item. The checkpoint database is separate from the gallery DB to keep concerns clean: `data/checkpoints.db` (LangGraph) vs `data/gallery.db` (gallery metadata).
390
+
391
+ ## Web Service
392
+
393
+ ```
394
+ services/web/
395
+ ├── core/
396
+ │ ├── spec.py # HouseSpec models (single source of truth)
397
+ │ ├── design_state.py # DesignState model
398
+ │ ├── llm/
399
+ │ │ ├── anthropic_backend.py
400
+ │ │ ├── openai_backend.py
401
+ │ │ ├── gemini_backend.py
402
+ │ │ └── mock_backend.py
403
+ │ ├── validators/
404
+ │ │ ├── reference.py
405
+ │ │ ├── architect.py
406
+ │ │ ├── structural.py
407
+ │ │ └── code.py
408
+ │ ├── jurisdictions.py # loader for config/jurisdictions/*.yaml
409
+ │ ├── exporters/
410
+ │ │ ├── dxf.py # ezdxf — runs without bpy
411
+ │ │ ├── pdf.py # ezdxf matplotlib renderer
412
+ │ │ └── bom.py # JSON bill of materials
413
+ │ ├── gallery/
414
+ │ │ ├── store.py # SQLite + filesystem
415
+ │ │ ├── models.py
416
+ │ │ └── schema.sql
417
+ │ └── worker_client.py # httpx client for worker /translate, /thumbnail
418
+ ├── agents/
419
+ │ ├── state.py # ArchAgentState, AdvisorReport, StructuredQuestion
420
+ │ ├── graph.py # build_graph() returns compiled LangGraph
421
+ │ ├── nodes/
422
+ │ │ ├── classify_intent.py
423
+ │ │ ├── architect.py
424
+ │ │ ├── structural.py
425
+ │ │ ├── code.py
426
+ │ │ ├── coder.py
427
+ │ │ ├── documenter.py
428
+ │ │ └── readiness.py
429
+ │ └── prompts.py # loaded from config/agents.yaml
430
+ ├── server/
431
+ │ ├── routes.py # FastAPI endpoints
432
+ │ ├── mcp.py # FastMCP server (8 tools)
433
+ │ ├── streaming.py # SSE for chat streaming
434
+ │ └── deps.py # DI: gallery, llm, worker client
435
+ ├── config/
436
+ │ ├── settings.py # Pydantic Settings
437
+ │ ├── agents.yaml
438
+ │ ├── materials.yaml
439
+ │ └── jurisdictions/
440
+ │ └── IRC_2021.yaml
441
+ ├── static/
442
+ │ └── index.html # Three.js + chat UI (vanilla)
443
+ ├── tests/
444
+ │ ├── test_spec.py
445
+ │ ├── test_validators_*.py
446
+ │ ├── test_gallery.py
447
+ │ ├── test_graph.py # LangGraph compile + happy path with mock backend
448
+ │ └── test_routes.py
449
+ ├── Dockerfile
450
+ └── pyproject.toml
451
+ ```
452
+
453
+ ### REST endpoints
454
+
455
+ | Endpoint | Method | Purpose |
456
+ |---|---|---|
457
+ | `/api/chat` | POST | Run a chat turn for an item. Returns text + violations + artifact URLs. Supports SSE streaming. |
+ | `/api/chat/resume` | POST | Resume an interrupted turn with structured-question answers (`{item_id, answers}`). |
458
+ | `/api/gallery` | GET | List items (filter, sort, paginate) |
459
+ | `/api/gallery` | POST | Create new item |
460
+ | `/api/gallery/{id}` | GET | Get full item (state, spec, recent history) |
461
+ | `/api/gallery/{id}` | PATCH | Rename, change status |
462
+ | `/api/gallery/{id}` | DELETE | Soft delete (`?hard=true` for hard delete) |
463
+ | `/api/gallery/{id}/duplicate` | POST | Fork an item |
464
+ | `/api/gallery/{id}/approve` | POST | Lock spec from edits |
465
+ | `/api/gallery/{id}/artifacts/{format}` | GET | Stream artifact (gltf, blend, dxf, pdf, bom, stl) |
466
+ | `/api/gallery/{id}/thumbnail.png` | GET | Stream thumbnail |
467
+ | `/api/jurisdictions` | GET | List supported codes |
468
+ | `/api/agents` | GET | List agent metadata for UI |
469
+ | `/api/healthz` | GET | Health (and worker reachability) |
470
+
471
+ ### MCP tools
472
+
473
+ | Tool | Signature | Purpose |
474
+ |---|---|---|
475
+ | `chat_turn` | `(message, item_id?, mentions?, image?, backend?) -> ChatTurnResponse` | Multi-agent dialog |
476
+ | `generate_house` | `(prompt, name?, backend?, max_retries?) -> {item_id, gltf_url, validation, bom}` | One-shot end-to-end |
477
+ | `validate_spec` | `(spec_json, jurisdiction) -> [Violation]` | Validate without rendering |
478
+ | `list_items` | `(status_filter?, sort_by?, limit?, offset?) -> [ItemSummary]` | Gallery browse |
479
+ | `get_item` | `(item_id) -> Item` | Full state |
480
+ | `delete_item` | `(item_id, hard?) -> {ok}` | Soft/hard delete |
481
+ | `duplicate_item` | `(item_id, new_name) -> Item` | Fork |
482
+ | `get_jurisdictions` | `() -> [Jurisdiction]` | List code rules |
483
+
484
+ Resources: `archagent://schema`, `archagent://jurisdictions/{id}`.
485
+
486
+ ## Blender Worker Service
487
+
488
+ ```
489
+ services/worker/
490
+ ├── app.py # FastAPI; spawns Blender subprocess at startup
491
+ ├── blender_client.py # JSON-over-UNIX-socket client to Blender addon
492
+ ├── translator/
493
+ │ ├── walls.py
494
+ │ ├── floors.py
495
+ │ ├── doors_windows.py
496
+ │ ├── roof.py
497
+ │ ├── stairs.py
498
+ │ ├── partitions.py
499
+ │ ├── columns_beams.py
500
+ │ └── cabinetry.py
501
+ ├── exporters.py # gltf, blend, stl via bpy
502
+ ├── geometric_validator.py # collision/manifold checks
503
+ ├── thumbnail.py # camera + Eevee render to PNG
504
+ ├── addons/
505
+ │ └── archagent_command_server/ # custom Blender addon
506
+ │ ├── __init__.py
507
+ │ └── server.py # listens on UNIX socket, dispatches commands
508
+ ├── Dockerfile # apt: blender-4.x; pip: fastapi, httpx
509
+ ├── requirements.txt
510
+ └── tests/
511
+ ├── test_translator_*.py # goldens: spec → expected bpy ops
512
+ └── test_exporters.py
513
+ ```
514
+
515
+ ### Worker startup
516
+
517
+ 1. Container starts.
518
+ 2. `app.py` lifespan handler launches Blender:
519
+ ```bash
520
+ blender --background --addons archagent_command_server --python-expr "register()"
521
+ ```
522
+ 3. Addon registers a UNIX socket listener at `/tmp/archagent.sock`.
523
+ 4. FastAPI is ready; `/healthz` returns OK once socket connection succeeds.
524
+ 5. On crash (Blender exits), the lifespan handler restarts it; in-flight requests fail with 503 and the web service retries once.
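The spec says the worker talks to the addon as JSON over a UNIX socket but does not fix a wire format. The sketch below assumes newline-delimited JSON with a `{"command", "params"}` envelope; both are assumptions, as are the function names. It exercises the framing against a fake in-process peer built with `socket.socketpair()`, so no Blender is needed to run it.

```python
import json
import socket
import threading


def _read_message(sock: socket.socket) -> dict:
    """Read bytes until a newline terminator, then decode one JSON object."""
    buffer = b""
    while not buffer.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed the socket mid-message")
        buffer += chunk
    return json.loads(buffer)


def send_command(sock: socket.socket, command: str, **params) -> dict:
    """Send one newline-terminated JSON request and wait for one reply."""
    request = json.dumps({"command": command, "params": params}) + "\n"
    sock.sendall(request.encode("utf-8"))
    return _read_message(sock)


def demo_round_trip() -> dict:
    """Run one request against a fake addon living in a background thread."""
    client, server = socket.socketpair()

    def fake_addon() -> None:
        request = _read_message(server)
        reply = {"ok": True, "echo": request["command"], "params": request["params"]}
        server.sendall((json.dumps(reply) + "\n").encode("utf-8"))

    worker = threading.Thread(target=fake_addon)
    worker.start()
    try:
        return send_command(client, "translate", item_id="test")
    finally:
        worker.join()
        client.close()
        server.close()
```

Against the real addon, the client would instead create `socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)` and connect to `/tmp/archagent.sock`; a `ConnectionError` from `_read_message` is the natural point to map a Blender crash to the 503 described in step 5.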
525
+
526
+ ### Worker HTTP API
527
+
528
+ | Endpoint | Body | Returns |
529
+ |---|---|---|
530
+ | `POST /translate` | `{spec: HouseSpec, item_id: str}` | `{artifacts: {gltf, blend, stl}, geometric_violations: [Violation]}` |
531
+ | `POST /thumbnail` | `{item_id: str, view: "iso"\|"plan"}` | `{path: str}` |
532
+ | `POST /validate-geometric` | `{spec, item_id}` | `[Violation]` (no artifacts written) |
533
+ | `GET /healthz` | — | `{ok: true, blender_version: str}` |
534
+
535
+ ### Translator design
536
+
537
+ `translator/__init__.py:translate(spec: HouseSpec) -> dict[str, str]` is the entry point. It clears the bpy scene, then calls each per-primitive builder in dependency order:
538
+
539
+ 1. Levels (just data; not built)
540
+ 2. Floors (extrude polygons by thickness, anchor to level elevation)
541
+ 3. Walls (extrude centerline rectangles; subtract door + window openings)
542
+ 4. Interior partitions (same as walls but flagged non-load-bearing)
543
+ 5. Columns (extrude rectangles)
544
+ 6. Beams (extrude rectangles along centerline at level elevation + offset)
545
+ 7. Roof (build per type: gable/hip/flat/shed)
546
+ 8. Stairs (sweep tread profile along run polyline)
547
+ 9. Doors and windows (placeholder geometry in their wall openings)
548
+ 10. Cabinetry (extrude footprint by height, place at level)
549
+ 11. Rooms (no geometry; tagged metadata for room nodes in glTF)
550
+
551
+ Each builder returns a list of bpy object names; the dispatcher accumulates an `id_to_bpy_name` mapping for the geometric validator.
552
+
553
+ Output: `{gltf: "data/items/<id>/artifacts/house.gltf", blend: "...", stl: "..."}`.
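Step 3 ("extrude centerline rectangles") starts from a 2D footprint derived from the wall's centerline and thickness. The geometry can be sketched without bpy as below; `wall_footprint` is a hypothetical helper, and the actual extrusion and opening subtraction in Blender are not shown.

```python
import math


def wall_footprint(start: tuple, end: tuple, thickness_m: float) -> list:
    """Return the four footprint corners (counter-clockwise) of a wall centerline."""
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("wall start and end coincide")
    # Unit normal: the centerline direction rotated by +90 degrees.
    nx, ny = -(y1 - y0) / length, (x1 - x0) / length
    half = thickness_m / 2.0
    return [
        (x0 - nx * half, y0 - ny * half),
        (x1 - nx * half, y1 - ny * half),
        (x1 + nx * half, y1 + ny * half),
        (x0 + nx * half, y0 + ny * half),
    ]
```

For a wall from `(0, 0)` to `(4, 0)` with the default thickness of 0.2 m, the footprint spans y from -0.1 to 0.1 along the full 4 m length.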
554
+
555
+ ## Gallery & Persistence
556
+
557
+ ```sql
558
+ -- core/gallery/schema.sql
559
+ CREATE TABLE items (
560
+ id TEXT PRIMARY KEY,
561
+ name TEXT NOT NULL,
562
+ status TEXT NOT NULL, -- draft|generated|approved|archived
563
+ created_at TEXT NOT NULL,
564
+ updated_at TEXT NOT NULL,
565
+ jurisdiction TEXT,
566
+ occupancy TEXT,
567
+ num_stories INTEGER,
568
+ gross_area_m2 REAL,
569
+ validation TEXT, -- JSON {errors, warnings}
570
+ thumb_path TEXT
571
+ );
572
+ CREATE INDEX idx_items_updated ON items(updated_at DESC);
573
+ CREATE INDEX idx_items_status ON items(status);
574
+ ```
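To show how the gallery listing might query this table, here is a small sketch using the standard-library `sqlite3` module against an in-memory database. The schema is a trimmed copy of the one above, `list_items` is a hypothetical helper, and hiding archived rows by default is an assumption about how soft delete should behave.

```python
import sqlite3

# Trimmed copy of the `items` table; the omitted columns are not needed here.
SCHEMA = """
CREATE TABLE items (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    status TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
CREATE INDEX idx_items_updated ON items(updated_at DESC);
"""


def list_items(conn: sqlite3.Connection, status=None, limit: int = 20, offset: int = 0) -> list:
    """Return (id, name, status) rows, newest first; archived rows are hidden by default."""
    if status is None:
        where, args = "status != 'archived'", ()
    else:
        where, args = "status = ?", (status,)
    query = (
        "SELECT id, name, status FROM items "
        f"WHERE {where} ORDER BY updated_at DESC LIMIT ? OFFSET ?"
    )
    return conn.execute(query, (*args, limit, offset)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [
        ("a", "Ranch", "draft", "2026-05-01T10:00:00Z", "2026-05-03T09:00:00Z"),
        ("b", "ADU", "generated", "2026-05-02T10:00:00Z", "2026-05-04T09:00:00Z"),
        ("c", "Old draft", "archived", "2026-04-01T10:00:00Z", "2026-05-05T09:00:00Z"),
    ],
)
```

Storing timestamps as ISO 8601 text in UTC keeps `ORDER BY updated_at` correct, because such strings sort lexicographically in chronological order.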
575
+
576
+ Per-item filesystem layout under `data/items/<id>/`:
577
+
578
+ - `state.json` — `DesignState` snapshot
579
+ - `spec.json` — latest `HouseSpec`
580
+ - `history.jsonl` — append-only chat log (one turn per line)
581
+ - `inputs/` — uploaded images
582
+ - `thumbnail.png` — 512×512 iso render
583
+ - `artifacts/` — generated outputs
584
+
585
+ Item lifecycle:
586
+
587
+ | Op | Effect |
588
+ |---|---|
589
+ | Create | UUIDv7 → mkdir, insert row (`status=draft`) |
590
+ | Load | read state.json + spec.json + tail of history.jsonl into memory |
591
+ | Update (turn) | overwrite state.json, append history.jsonl, update `updated_at` |
592
+ | Generate | write spec.json, call worker, write artifacts/, render thumbnail, `status=generated` |
593
+ | Approve | `status=approved` |
594
+ | Duplicate | copy directory, new id, name="<original> (copy)" |
595
+ | Delete | soft `status=archived`; hard removes directory |
596
+
597
+ In-memory LRU cache (max 16 items) avoids re-reading on rapid turns. Invalidated on duplicate/delete/approve.
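A cache with these properties can be sketched with `collections.OrderedDict`. The class name `ItemCache` and its method names are illustrative; the spec only fixes the capacity (16) and the invalidation triggers.

```python
from collections import OrderedDict


class ItemCache:
    """Least-recently-used cache keyed by item id."""

    def __init__(self, max_items: int = 16) -> None:
        self.max_items = max_items
        self._items: OrderedDict = OrderedDict()

    def get(self, item_id: str):
        if item_id not in self._items:
            return None
        # Mark as most recently used.
        self._items.move_to_end(item_id)
        return self._items[item_id]

    def put(self, item_id: str, item) -> None:
        self._items[item_id] = item
        self._items.move_to_end(item_id)
        # Evict from the least recently used end until within capacity.
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)

    def invalidate(self, item_id: str) -> None:
        # Called on duplicate, delete, and approve.
        self._items.pop(item_id, None)

    def __len__(self) -> int:
        return len(self._items)
```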
598
+
599
+ ## Web Viewer
600
+
601
+ `services/web/static/index.html` — single file. Estimated 500-700 LOC of JS plus inline CSS.
602
+
603
+ - Three.js r160+ via importmap from jsdelivr.
604
+ - Modules: `GLTFLoader`, `OrbitControls`, `RGBELoader` (HDRI environment), `CSS2DRenderer` (room labels), `Raycaster` (picking).
605
+ - Three render modes: 3D iso (default), top-down floor plan (clipping plane at level + 0.6m, orthographic camera), exploded axonometric (Phase 2 candidate).
606
+ - Click a primitive → highlight (outline pass) + sidebar shows spec id, type, dimensions, validation issues touching it.
607
+ - Validation overlay: red outline for primitives with errors; yellow for warnings.
608
+ - Chat panel on the right: messages, agent avatars (color-coded), inline `structured_question` cards.
609
+ - Gallery view: grid of cards, thumbnail + name + status pill + last-modified.
610
+
611
+ ## Inputs & UX
612
+
613
+ ### Image upload
614
+
615
+ - File picker accepts JPEG/PNG, max 8 MB.
616
+ - Encoded base64 alongside the user message in `/api/chat`.
617
+ - Web service forwards image to a vision-capable model (Anthropic Claude Sonnet 4.6 default; OpenAI GPT-4o, Gemini 2.x as alternates).
618
+ - System prompt includes: "If an image is attached, treat it as a sketch or photo of a floor plan. Extract architectural intent: room layout, approximate dimensions, openings. Propose HouseSpec deltas as advisors gather more information."
619
+ - Uploaded image stored in `data/items/<id>/inputs/<timestamp>.png`, referenced from history.jsonl.
620
+
621
+ ### Inline structured-question cards
622
+
623
+ When an advisor returns `gap_questions`, LangGraph emits an `interrupt`. The web layer renders each `StructuredQuestion` as a small inline card:
624
+
625
+ ```
626
+ ┌──────────────────────────────────────────┐
627
+ │ Architect needs to know: │
628
+ │ │
629
+ │ How many bedrooms? │
630
+ │ ( ) 2 ( ) 3 ( ) 4 ( ) 5+ │
631
+ │ │
632
+ │ What ceiling height? │
633
+ │ [ 2.7 ] meters │
634
+ │ │
635
+ │ [ Submit answers ] │
636
+ └──────────────────────────────────────────┘
637
+ ```
638
+
639
+ Submit posts `{item_id, answers: {field: value, ...}}` to `/api/chat/resume`; LangGraph resumes from the interrupt, advisor re-runs with the new info, and the turn proceeds.
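On resume, the submitted answers have to be merged into the design state before the advisors re-run. `StructuredQuestion.field` uses dotted paths such as `program.bedroom_count`, so one plausible approach is a dotted-path merge, sketched below. It assumes the design state can be treated as a nested dict (the shape of `DesignState` is not defined in this spec), and `apply_answers` is a hypothetical helper.

```python
import copy


def apply_answers(design_state: dict, answers: dict) -> dict:
    """Return a copy of the state with each dotted-path answer written into it."""
    updated = copy.deepcopy(design_state)
    for dotted_field, value in answers.items():
        *parents, leaf = dotted_field.split(".")
        node = updated
        for key in parents:
            # Create intermediate sections that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return updated
```

Returning a copy rather than mutating in place keeps the previously persisted state intact if the resumed turn later fails.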
640
+
641
+ ## Build Sequence
642
+
643
+ | # | Milestone | Scope | Deliverable |
644
+ |---|---|---|---|
645
+ | **M0** | Bootstrap (1-2 days) | Monorepo, `services/web` + `services/worker` scaffolds, `docker-compose.yml` with shared `./data` mount, ruff/pytest/mypy at workspace root, GitHub Actions builds both images, ArchAgent repo created on GitHub | `docker compose up` brings both `/healthz` green |
646
+ | **M1** | Spec + Translator + Worker (week 1) | `core/spec.py`, full `translator/` per-primitive builders, worker `/translate`, gltf/blend/stl exporters, golden tests for ranch/two-story/ADU | `curl POST /translate` produces valid gltf in `data/items/test/artifacts/`; goldens pass |
647
+ | **M2** | Validators + jurisdictions (week 2) | reference / architect / structural / code validators in web; geometric validator in worker; `IRC_2021.yaml`; ezdxf DXF + PDF exporters; BoM | `archagent validate sample/ranch.json` works; DXF + PDF written |
648
+ | **M3** | LLM backends + Coder node (week 3) | `core/llm/`, Coder LangGraph node with structured-output enforcement, three retry stages | `archagent generate "small ranch" --backend anthropic` emits valid spec end-to-end |
649
+ | **M4** | Multi-agent flow (week 4) | LangGraph nodes for advisors, readiness gate, Documenter, parallel fan-out, interrupt-based gap questions | CLI multi-turn chat with advisor gating works |
650
+ | **M5** | Gallery & persistence (week 5) | SQLite store, item CRUD, per-item DesignState rehydration, MCP gallery tools, thumbnail trigger to worker, LangGraph SqliteSaver | `archagent gallery list/show/duplicate` works; items survive restart; resumed turns work |
651
+ | **M6** | Server + MCP + web UI (week 6) | FastAPI routes, FastMCP, Three.js + GLTFLoader frontend, gallery grid, item view, structured-question cards, image upload, validation overlay | Browser flow: gallery → new design → chat (with image) → 3D model → approve → re-open later |
652
+ | **M7** | Polish + deploy (week 7) | Docker images published to GHCR, README with `docker compose up` quickstart, deploy guide for self-hosted VPS, Playwright happy-path tests, perf budget (cold start <30s, warm chat turn <10s) | Tagged release, deploy guide verified on a clean VPS |
653
+
+ ## Risks
+
+ 1. **LangGraph ramp-up.** Solo developer learning a new framework. Mitigation: prototype the readiness-gate-with-interrupts pattern in M0/M1 before building all five nodes.
+ 2. **LLM JSON-vs-schema reliability.** Even with structured outputs, models miss enums or violate constraints. Mitigation: 3 retries per stage with the validation error appended to the prompt; budget caps prevent runaway loops; on exhaustion, fall back to "I couldn't generate, try rephrasing" with the violation summary visible.
+ 3. **Translator complexity.** Extended shell already includes 11 primitive types. Mitigation: hold the Phase 1 line: no MEP, electrical, framing detail, or rendering.
+ 4. **DXF rendering quality.** ezdxf's matplotlib backend is functional but not architectural-publication quality. Acceptable for Phase 1; revisit in M7 or Phase 2 if users push back.
+ 5. **Worker crash recovery.** The persistent Blender subprocess can crash on malformed bpy state. Mitigation: the lifespan handler restarts the subprocess, the web service retries once, and a clear error is surfaced in chat.
+ 6. **Shared volume permissions.** Docker uid/gid mismatch is a common footgun. Mitigation: pin a non-root user with explicit uid/gid in both Dockerfiles and document it in the README.
+ 7. **Vision LLM hallucinations on sketches.** Sketches are noisy; vision models will guess. Mitigation: route the parsed result through the advisor agents, which flag low-confidence inferences as `structured_question`s rather than committing them silently.
+
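The retry mitigation for LLM JSON-vs-schema reliability can be sketched as below. This is a minimal illustration, not the Coder node itself: `generate_with_retries`, the `call_model` and `validate` callables, and the one-value enum in `demo_validate` are all hypothetical stand-ins, and the budget caps mentioned above are reduced to a simple attempt count.

```python
import json
from typing import Callable, Optional


def generate_with_retries(call_model: Callable[[str], str],
                          validate: Callable[[dict], list[str]],
                          prompt: str,
                          max_retries: int = 3) -> tuple[Optional[dict], list[str]]:
    """Return (spec, []) on success, or (None, last_errors) once retries run out."""
    errors: list[str] = []
    attempt_prompt = prompt
    for _ in range(1 + max_retries):  # first try plus up to max_retries retries
        raw = call_model(attempt_prompt)
        try:
            candidate = json.loads(raw)
            errors = validate(candidate)
        except json.JSONDecodeError as exc:
            candidate, errors = None, [f"invalid JSON: {exc.msg}"]
        if not errors:
            return candidate, []
        # Append the violation summary so the next attempt can correct it.
        attempt_prompt = (prompt + "\n\nPrevious attempt failed validation:\n"
                          + "\n".join(errors))
    return None, errors  # caller shows the fallback message with these errors


def demo_validate(spec: dict) -> list[str]:
    allowed = {"wood_frame"}  # hypothetical enum, for the demo only
    if spec.get("structural_system") not in allowed:
        return [f"structural_system must be one of {sorted(allowed)}"]
    return []


# Stub model: a bad enum, then malformed JSON, then a valid reply.
replies = iter(['{"structural_system": "timber"}',
                "not json",
                '{"structural_system": "wood_frame"}'])
seen_prompts: list[str] = []


def fake_model(p: str) -> str:
    seen_prompts.append(p)
    return next(replies)


spec, errs = generate_with_retries(fake_model, demo_validate, "small ranch")
```

Each retry prompt is rebuilt from the original prompt plus only the latest errors, so the prompt does not grow without bound across attempts.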
+ ## Out of Scope (Phase 2 / Phase 3)
+
+ Phase 2 (Interior composition):
+ - Asset library integrations (Poly Haven, Sketchfab, Hyper3D Rodin, Hunyuan3D)
+ - Furniture placement, materials, lighting
+ - Interior Designer / Lighting Designer / Materials specialist agents
+ - Higher-fidelity material authoring
+
+ Phase 3 (Exterior + site):
+ - Site grading, landscaping, plantings
+ - Facade detailing
+ - Sun study and analytical lighting
+ - Render-quality output (path tracing, denoising)
+
+ Other deferrals:
+ - Multi-user / auth / sharing
+ - DXF / IFC import
+ - Direct-manipulation canvas UX
+ - IFC export
+ - Rendering presets beyond glTF preview
+ - Mobile / native apps
+ - Real-time collaboration
+ - Subscription / billing flows
+
+ ## Open Questions
+
+ These are not blockers for writing the implementation plan; they can be settled in M0 or with a small RFC during build:
+
+ 1. **Repo location** (GitHub user/org for `ArchAgent`). Default: `github.com/danghoangnhan/ArchAgent`.
+ 2. **LLM default** for the Coder. Default: Claude Sonnet 4.6. Confirm based on tool-use reliability for the `HouseSpec` schema.
+ 3. **Jurisdiction default**. IRC 2021 (US residential) is a reasonable default; international users will want IBC + local codes later.
+ 4. **Thumbnail style** (flat shading / clay render / styled). Default: clay render with a single sun for fast Eevee renders.
+ 5. **Rate limits** on the chat endpoint. Default: per-IP 60 turns/hour; revisit when the deploy target is fixed.
+ 6. **Shared volume vs. object storage** for multi-host deploys. Phase 1 ships volume-only; document the swap path.
+
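For the rate-limit default, one possible shape is an in-memory sliding window keyed by client IP. This is only a sketch under assumptions: the class name `SlidingWindowLimiter` and its parameters are invented for the example, state lives in a single process, and nothing here is tied to a specific web framework.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` turns per client within the last `window_s` seconds."""

    def __init__(self, limit: int = 60, window_s: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so tests need not sleep
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Demo with a fake clock and a tiny budget: 2 turns per 10 seconds.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=2, window_s=10.0, clock=lambda: fake_now[0])
first_three = [limiter.allow("203.0.113.7") for _ in range(3)]
other_client = limiter.allow("198.51.100.4")
fake_now[0] = 10.0  # the earlier hits are now outside the window
after_window = limiter.allow("203.0.113.7")
```

An in-process limiter like this resets on restart and does not span multiple web replicas, which is part of why the default is worth revisiting once the deploy target is fixed.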
+ ## Appendix A — Sample HouseSpec
+
+ ```json
+ {
+   "metadata": {"name": "Sunset Ranch", "occupancy_type": "R-3"},
+   "structural_system": "wood_frame",
+   "levels": [{"id": "l1", "name": "Ground", "elevation_m": 0.0, "height_m": 2.7}],
+   "walls": [
+     {"id": "w1", "level_id": "l1", "start": [0, 0], "end": [12, 0],
+      "height_m": 2.7, "thickness_m": 0.2, "material": "wood_frame", "is_load_bearing": true},
+     {"id": "w2", "level_id": "l1", "start": [12, 0], "end": [12, 8],
+      "height_m": 2.7, "thickness_m": 0.2, "material": "wood_frame", "is_load_bearing": true},
+     {"id": "w3", "level_id": "l1", "start": [12, 8], "end": [0, 8],
+      "height_m": 2.7, "thickness_m": 0.2, "material": "wood_frame", "is_load_bearing": true},
+     {"id": "w4", "level_id": "l1", "start": [0, 8], "end": [0, 0],
+      "height_m": 2.7, "thickness_m": 0.2, "material": "wood_frame", "is_load_bearing": true}
+   ],
+   "floors": [
+     {"id": "f1", "level_id": "l1",
+      "outline": [[0, 0], [12, 0], [12, 8], [0, 8]], "thickness_m": 0.25}
+   ],
+   "doors": [
+     {"id": "d1", "parent_wall_id": "w1", "offset_along_wall_m": 1.5,
+      "width_m": 0.9, "height_m": 2.1, "is_egress": true}
+   ],
+   "windows": [
+     {"id": "win1", "parent_wall_id": "w1", "offset_along_wall_m": 4.0,
+      "sill_height_m": 0.9, "width_m": 1.5, "height_m": 1.2, "operable": true}
+   ],
+   "rooms": [
+     {"id": "r1", "name": "Living", "program": "living", "level_id": "l1",
+      "bounded_by_wall_ids": ["w1", "w2", "w3", "w4"], "floor_id": "f1"}
+   ],
+   "roof": {"type": "gable", "parent_outline": [[0, 0], [12, 0], [12, 8], [0, 8]],
+            "pitch_deg": 22.5, "overhang_m": 0.5, "base_elevation_m": 2.7}
+ }
+ ```
+
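A spec of this shape carries several cross-references (`level_id`, `parent_wall_id`, `bounded_by_wall_ids`, `floor_id`) that a reference validator would need to resolve. The sketch below shows one way such a check might look on a plain dict. The function `check_references` and its error wording are hypothetical, the field names follow the sample above, and the demo spec is a deliberately broken, trimmed-down example rather than the appendix sample.

```python
import math


def check_references(spec: dict) -> list[str]:
    """Return human-readable problems for dangling ids and out-of-wall openings."""
    errors: list[str] = []
    levels = {lv["id"] for lv in spec.get("levels", [])}
    floors = {f["id"] for f in spec.get("floors", [])}
    walls = {w["id"]: w for w in spec.get("walls", [])}

    for wall in walls.values():
        if wall["level_id"] not in levels:
            errors.append(f"wall {wall['id']}: unknown level {wall['level_id']}")

    for opening in spec.get("doors", []) + spec.get("windows", []):
        wall = walls.get(opening["parent_wall_id"])
        if wall is None:
            errors.append(f"opening {opening['id']}: unknown wall "
                          f"{opening['parent_wall_id']}")
            continue
        # An opening must fit between its offset and the end of the wall.
        length = math.dist(wall["start"], wall["end"])
        if opening["offset_along_wall_m"] + opening["width_m"] > length:
            errors.append(f"opening {opening['id']}: extends past the end of "
                          f"wall {wall['id']}")

    for room in spec.get("rooms", []):
        for wall_id in room["bounded_by_wall_ids"]:
            if wall_id not in walls:
                errors.append(f"room {room['id']}: unknown wall {wall_id}")
        if room["floor_id"] not in floors:
            errors.append(f"room {room['id']}: unknown floor {room['floor_id']}")
    return errors


# Demo spec with two deliberate mistakes: win1 overruns w2, and r1 cites w9.
demo = {
    "levels": [{"id": "l1"}],
    "walls": [
        {"id": "w1", "level_id": "l1", "start": [0, 0], "end": [12, 0]},
        {"id": "w2", "level_id": "l1", "start": [12, 0], "end": [12, 8]},
    ],
    "floors": [{"id": "f1"}],
    "doors": [{"id": "d1", "parent_wall_id": "w1",
               "offset_along_wall_m": 1.5, "width_m": 0.9}],
    "windows": [{"id": "win1", "parent_wall_id": "w2",
                 "offset_along_wall_m": 7.5, "width_m": 1.5}],
    "rooms": [{"id": "r1", "level_id": "l1",
               "bounded_by_wall_ids": ["w1", "w2", "w9"], "floor_id": "f1"}],
}
problems = check_references(demo)
```

In the actual design these checks would more likely sit behind the Pydantic `HouseSpec` model; operating on a dict keeps the example free of third-party dependencies.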
+ ## Appendix B — Tech Stack Summary
+
+ | Layer | Choice | Notes |
+ |---|---|---|
+ | Web language | Python 3.12 | No bpy dependency, so no Blender-imposed Python pin |
+ | Worker language | Python 3.11+ | Blender 4.1+ bundles Python 3.11 (4.0 ships 3.10) |
+ | Web framework | FastAPI | Async, OpenAPI |
+ | MCP server | FastMCP | Same process as web |
+ | Agent orchestration | LangGraph | Cycles, interrupts, checkpointing |
+ | LLM SDKs | Anthropic, OpenAI, Gemini | Direct, no LangChain wrappers needed |
+ | Schema | Pydantic v2 | HouseSpec, DesignState, all messages |
+ | Persistence | SQLite + filesystem | Two DBs: gallery, checkpoints |
+ | Frontend | Three.js r160+ vanilla | Single HTML, importmap |
+ | Floor plans | ezdxf | DXF + PDF (matplotlib backend) |
+ | Container runtime | Docker + docker-compose | Default deploy |
+ | Tooling | uv, ruff, pytest, mypy | Workspace at repo root |