riazmo Claude Opus 4.6 committed on
Commit
f0ceb42
·
1 Parent(s): 6b43e51

docs: update all docs for v3.2 + add Part 2 component generation research


- Rewrite Medium article for v3.2: color classifier, naming authority chain,
DTCG compliance, 8-source extraction, component generation teaser
- Fix AURORA prompt contradiction in llm_agents.py: align SYSTEM_PROMPT
(advisory only) with PROMPT_TEMPLATE (optional naming_map)
- Update LinkedIn post, image guide, and context doc for v3.2
- Add PART2_COMPONENT_GENERATION.md: 30+ tool research, custom plugin
decision, MVP scope (5 components, 86 variants), architecture plan
- Update CLAUDE.md Phase 5 with research findings and decision

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CLAUDE.md CHANGED
@@ -1123,18 +1123,34 @@ PHASE 4: EXTRACTION IMPROVEMENTS (NOT STARTED)
1123
  4c. ❌ Rule engine: shadow elevation analysis
1124
  ```
1125
 
1126
- ### PHASE 5: COMPONENT GENERATION (FUTURE, NOT STARTED)
1127
 
1128
- Based on strategic research (Feb 2026), the next major feature is automated component generation in Figma:
 
 
 
 
1129
 
1130
  ```
1131
  PHASE 5: FIGMA COMPONENT GENERATION
1132
  5a. Component Definition Schema (JSON defining anatomy + token bindings + variants)
1133
- 5b. Token-to-Component binding engine
1134
- 5c. Figma Plugin: createComponent() + combineAsVariants() + setBoundVariable()
1135
- 5d. MVP Components: Button (60 variants), TextInput (8), Card (2), Toast (4), Checkbox+Radio (12)
1136
- 5e. Variable Collections: Primitives, Semantic, Spacing, Radius, Typography
 
 
 
 
 
 
 
 
1137
 
 
 
 
 
1138
  PHASE 6: ECOSYSTEM INTEGRATION
1139
  6a. Style Dictionary v4 compatible output (50+ platform formats for free)
1140
  6b. Tokens Studio compatible JSON import
@@ -1151,6 +1167,8 @@ PHASE 7: MCP INTEGRATION
1151
 
1152
  **"Lighthouse for Design Systems"** — We are NOT a token management platform (Tokens Studio), NOT a documentation platform (Zeroheight), NOT an extraction tool (Dembrandt). We are the **automated audit + bootstrap tool** that sits upstream of all of those.
1153
 
 
 
1154
  **Unique differentiators no competitor has:**
1155
  - Type scale ratio detection + standard scale matching
1156
  - Spacing grid detection (GCD-based, base-8 alignment scoring)
@@ -1158,12 +1176,15 @@ PHASE 7: MCP INTEGRATION
1158
  - Holistic design system quality score (0-100)
1159
  - Visual spec page auto-generated in Figma
1160
  - Benchmark comparison against established design systems
 
1161
 
1162
  **Key competitors to watch:**
1163
- - Dembrandt (1,300) — does extraction better, but no analysis
1164
- - Tokens Studio (264K users) — does Figma management better, but no extraction
1165
  - Knapsack ($10M funding) — building ingestion engine, biggest strategic threat
1166
- - html.to.design captures layouts but not tokens/variables
 
 
1167
 
1168
  ---
1169
 
 
1123
  4c. ❌ Rule engine: shadow elevation analysis
1124
  ```
1125
 
1126
+ ### PHASE 5: COMPONENT GENERATION (NEXT, RESEARCH COMPLETE)
1127
 
1128
+ **Full context**: See `PART2_COMPONENT_GENERATION.md` for detailed research, API checks, and architecture.
1129
+
1130
+ **Research finding (Feb 2026)**: 30+ tools evaluated. No production tool takes DTCG JSON -> Figma Components. This is a genuine market gap.
1131
+
1132
+ **Decision**: Custom Figma Plugin (Option A) — extend existing `code.js` with component generation.
1133
 
1134
  ```
1135
  PHASE 5: FIGMA COMPONENT GENERATION
1136
  5a. Component Definition Schema (JSON defining anatomy + token bindings + variants)
1137
+ 5b. Token-to-Component binding engine (resolveTokenValue, bindTokenToVariable)
1138
+ 5c. Variable Collection builder (primitives, semantic, spacing, radius, shadow, typography)
1139
+ 5d. MVP Components:
1140
+ - Button: 4 variants x 3 sizes x 5 states = 60 variants (2-3 days)
1141
+ - TextInput: 4 states x 2 sizes = 8 variants (1-2 days)
1142
+ - Card: 2 configurations (1 day)
1143
+ - Toast: 4 types success/error/warn/info (1 day)
1144
+ - Checkbox+Radio: ~12 variants (1-2 days)
1145
+ 5e. Post-MVP: Toggle (4), Select (multi-state), Modal (3 sizes), Table (template)
1146
+
1147
+ Estimated: ~1400 lines new plugin code, 8-12 days total
1148
+ ```
1149
 
1150
+ **Figma Plugin API confirmed**: createComponent(), combineAsVariants(), setBoundVariable(),
1151
+ setBoundVariableForPaint(), addComponentProperty(), setReactionsAsync() — ALL supported.
1152
+
1153
+ ```
1154
  PHASE 6: ECOSYSTEM INTEGRATION
1155
  6a. Style Dictionary v4 compatible output (50+ platform formats for free)
1156
  6b. Tokens Studio compatible JSON import
 
1167
 
1168
  **"Lighthouse for Design Systems"** — We are NOT a token management platform (Tokens Studio), NOT a documentation platform (Zeroheight), NOT an extraction tool (Dembrandt). We are the **automated audit + bootstrap tool** that sits upstream of all of those.
1169
 
1170
+ **With Phase 5**: We become the ONLY tool that goes from URL -> complete Figma design system WITH components. Fully automated. Nobody else does this end-to-end.
1171
+
1172
  **Unique differentiators no competitor has:**
1173
  - Type scale ratio detection + standard scale matching
1174
  - Spacing grid detection (GCD-based, base-8 alignment scoring)
 
1176
  - Holistic design system quality score (0-100)
1177
  - Visual spec page auto-generated in Figma
1178
  - Benchmark comparison against established design systems
1179
+ - (Phase 5) Automated component generation from extracted tokens
1180
 
1181
  **Key competitors to watch:**
1182
+ - Dembrandt (1,300 stars) — does extraction better, but no analysis, no components
1183
+ - Tokens Studio (1M+ installs) — manages tokens, no extraction, no component generation
1184
  - Knapsack ($10M funding) — building ingestion engine, biggest strategic threat
1185
+ - Figr Identity generates components but from brand config, not extracted tokens
1186
+ - html.to.design — captures layouts but not tokens/variables/components
1187
+ - story.to.design — Storybook->Figma components, but needs full code pipeline
1188
 
1189
  ---
1190
 
PART2_COMPONENT_GENERATION.md ADDED
@@ -0,0 +1,418 @@
1
+ # Design System Extractor — Part 2: Component Generation
2
+
3
+ ## Session Context
4
+
5
+ **Prerequisite**: Part 1 (Token Extraction + Analysis) is COMPLETE at v3.2
6
+ - Phases 1-3 DONE: Normalizer, Stage 2 agents, Export all working
7
+ - 113 tests passing, W3C DTCG v1 compliant output
8
+ - GitHub: https://github.com/hiriazmo/design-system-extractor-v3
9
+ - Project: `/Users/yahya/design-system-extractor-v3/`
10
+
11
+ **This session**: Build automated component generation from extracted tokens into Figma.
12
+
13
+ ---
14
+
15
+ ## THE GAP: Nobody Does This
16
+
17
+ Exhaustive research of 30+ tools (Feb 2026) confirms:
18
+
19
+ **No production tool takes DTCG JSON and outputs Figma Components.**
20
+
21
+ ```
22
+ YOUR EXTRACTOR THE GAP FIGMA
23
+ +--------------+ +----------------------------+ +------------------+
24
+ | DTCG JSON |--->| ??? Nothing does this |--->| Button component |
25
+ | with tokens | | tokens -> components | | with 60 variants |
26
+ +--------------+ +----------------------------+ +------------------+
27
+ ```
28
+
29
+ ### What Exists (and What It Can't Do)
30
+
31
+ | Category | Best Tool | What It Does | Creates Components? |
32
+ |----------|-----------|-------------|-------------------|
33
+ | Token Importers | Tokens Studio (1M+ installs) | JSON -> Figma Variables | NO - variables only |
34
+ | AI Design | Figma Make | Prompt -> prototype | NO - not token-driven |
35
+ | MCP Bridges | Figma Console MCP (543 stars) | AI writes to Figma | YES but non-deterministic |
36
+ | Code-to-Figma | story.to.design | Storybook -> Figma components | YES but needs full Storybook |
37
+ | Generators | Figr Identity | Brand config -> components | YES but can't consume YOUR tokens |
38
+ | Commercial | Knapsack ($10M), Supernova | Token management | NO - manages, doesn't create |
39
+ | DEAD | Specify.app (shutting down), Backlight.dev (shut down June 2025) | - | - |
40
+
41
+ ### Key Findings Per Category
42
+
43
+ **Token Importers** (7+ tools evaluated): Tokens Studio, TokensBrucke, Styleframe, DTCG Token Manager, GitFig, Supa Design Tokens, Design System Automator — ALL create Figma Variables from JSON, NONE create components.
44
+
45
+ **MCP Bridges** (5 tools): Figma Console MCP (Southleft), claude-talk-to-figma-mcp, cursor-talk-to-figma-mcp (Grab), figma-mcp-write-server, Figma-MCP-Write-Bridge — ALL have full write access, but component creation is AI-interpreted (non-deterministic, varies per run).
46
+
47
+ **Code-to-Figma**: story.to.design is the standout — creates REAL Figma components with proper variants from Storybook. But requires a full coded component library + running Storybook instance as intermediary.
48
+
49
+ **figma-json2component** (GitHub): Experimental proof-of-concept that generates components from custom JSON schema. Not DTCG, not production quality, but validates the concept IS possible.
50
+
51
+ ---
52
+
53
+ ## FOUR APPROACHES — RANKED
54
+
55
+ ### Option A: Custom Figma Plugin (RECOMMENDED)
56
+ ```
57
+ DTCG JSON -> Your Plugin reads JSON -> Creates Variables -> Generates Components -> Done
58
+ ```
59
+ - **Effort**: 4-8 weeks end to end, including infrastructure and polish (~1400 lines of plugin code for 5 MVP components; the components alone are estimated at 8-12 days)
60
+ - **Quality**: Highest — fully deterministic, consistent every run
61
+ - **Advantage**: We already have a working plugin (code.js) that imports tokens
62
+ - **Risk**: Low — Figma Plugin API supports everything needed
63
+
64
+ ### Option B: Pipeline — shadcn + Storybook + story.to.design
65
+ ```
66
+ DTCG JSON -> Style Dictionary -> CSS vars -> shadcn themed -> Storybook -> story.to.design -> Figma
67
+ ```
68
+ - **Effort**: 2-3 days setup, then 15-30 min per extraction
69
+ - **Quality**: High — battle-tested shadcn components
70
+ - **Dependency**: story.to.design (commercial, paid)
71
+ - **Risk**: Medium — many moving parts
72
+
73
+ ### Option C: MCP + Claude AI Chain
74
+ ```
75
+ DTCG JSON -> Claude reads tokens -> Figma Console MCP -> AI creates components -> Figma
76
+ ```
77
+ - **Effort**: 2-3 weeks
78
+ - **Quality**: Medium — non-deterministic
79
+ - **Risk**: High — AI output varies per run
80
+
81
+ ### Option D: Figr Identity + Manual Token Swap
82
+ ```
83
+ Figr Identity generates base system -> Manually swap tokens -> Adjust
84
+ ```
85
+ - **Effort**: 1-2 days
86
+ - **Quality**: Medium — not YOUR tokens
87
+ - **Risk**: Medium — manual alignment needed
88
+
89
+ **Decision: Option A (Custom Plugin)** — we already have 80% of the infrastructure, it's deterministic, no external dependencies, and fills a genuine market gap.
90
+
91
+ ---
92
+
93
+ ## FIGMA PLUGIN API: FULL CAPABILITY CHECK
94
+
95
+ Every feature needed for component generation is supported:
96
+
97
+ | Requirement | API Method | Status |
98
+ |------------|-----------|--------|
99
+ | Create components | `figma.createComponent()` | Supported |
100
+ | Variant sets (60 variants) | `figma.combineAsVariants()` | Supported |
101
+ | Auto-layout with padding | `layoutMode`, `paddingTop/Right/Bottom/Left`, `itemSpacing` | Supported |
102
+ | Text labels | `figma.createText()` + `loadFontAsync()` | Supported |
103
+ | Icon slot (optional) | `addComponentProperty("ShowIcon", "BOOLEAN", true)` | Supported |
104
+ | Instance swap (icons) | `addComponentProperty("Icon", "INSTANCE_SWAP", id)` | Supported |
105
+ | Border radius from tokens | `setBoundVariable('topLeftRadius', radiusVar)` | Supported |
106
+ | Colors from tokens | `setBoundVariableForPaint()` -> binds to variables | Supported |
107
+ | Shadows from tokens | `setBoundVariableForEffect()` | Supported (has spread bug, workaround exists) |
108
+ | Hover/press interactions | `node.setReactionsAsync()` with `ON_HOVER`/`ON_PRESS` | Supported |
109
+ | Expose text property | `addComponentProperty("Label", "TEXT", "Button")` | Supported |
110
+ | Disabled opacity | `node.opacity = 0.5` | Supported |
111
+
112
+ ---
113
+
114
+ ## MVP SCOPE: 5 Components, ~86 Variants
115
+
116
+ | Component | Variants | Automatable? | Effort |
117
+ |-----------|---------|-------------|--------|
118
+ | **Button** | 4 variants x 3 sizes x 5 states = 60 | Fully | 2-3 days |
119
+ | **Text Input** | 4 states x 2 sizes = 8 | Fully | 1-2 days |
120
+ | **Card** | 2 configurations | Semi | 1 day |
121
+ | **Toast/Notification** | 4 types (success/error/warn/info) | Fully | 1 day |
122
+ | **Checkbox + Radio** | ~12 variants | Fully | 1-2 days |
123
+ | **Total** | **~86 variants** | | **8-12 days** |
124
+
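The totals in the table can be checked with a few lines of arithmetic. This is a minimal sketch: the dictionary name `mvp_axes` is hypothetical, and Card, Toast, and Checkbox+Radio are entered as flat counts because the table does not break them down into axes.

```python
from math import prod

# Axis sizes per component, copied from the MVP table.
# Components without a stated axis breakdown are entered as a single flat count.
mvp_axes = {
    "Button": [4, 3, 5],      # variants x sizes x states
    "TextInput": [4, 2],      # states x sizes
    "Card": [2],              # configurations
    "Toast": [4],             # success / error / warn / info
    "Checkbox+Radio": [12],   # approximate count from the table
}

# Multiply the axes of each component, then sum across components.
variant_counts = {name: prod(axes) for name, axes in mvp_axes.items()}
total_variants = sum(variant_counts.values())

print(variant_counts)
print(total_variants)
```

Button accounts for 60 of the variants, which is why it is scheduled first to validate the full pipeline.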
125
+ ### Post-MVP Components
126
+
127
+ | Component | Variants | Automatable? | Effort |
128
+ |-----------|---------|-------------|--------|
129
+ | Toggle/Switch | on/off x enabled/disabled = 4 | Fully | 0.5 day |
130
+ | Select/Dropdown | Multiple states | Semi | 1-2 days |
131
+ | Modal/Dialog | 3 sizes | Semi | 1 day |
132
+ | Table | Header + data rows | Template-based | 2 days |
133
+
134
+ ---
135
+
136
+ ## TOKEN-TO-COMPONENT MAPPING
137
+
138
+ How extracted tokens bind to component properties:
139
+
140
+ ### Button Example
141
+ ```
142
+ Token -> Figma Property
143
+ -------------------------------------------------
144
+ color.brand.primary -> Fill (default state)
145
+ color.brand.600 -> Fill (hover state)
146
+ color.brand.700 -> Fill (pressed state)
147
+ color.text.inverse -> Text color
148
+ color.neutral.200 -> Fill (secondary variant)
149
+ color.neutral.300 -> Fill (secondary hover)
150
+ radius.md -> Corner radius (all corners)
151
+ shadow.sm -> Drop shadow (elevated variant)
152
+ spacing.3 -> Padding horizontal (16px)
153
+ spacing.2 -> Padding vertical (8px)
154
+ font.body.md -> Text style (label)
155
+ ```
156
+
157
+ ### Variable Collections Needed
158
+ ```
159
+ 1. Primitives -> Raw color palette (blue.50 through blue.900, etc.)
160
+ 2. Semantic -> Role-based aliases (brand.primary -> blue.500)
161
+ 3. Spacing -> 4px grid (spacing.1=4, spacing.2=8, spacing.3=16...)
162
+ 4. Radius -> none/sm/md/lg/xl/full
163
+ 5. Shadow -> xs/sm/md/lg/xl elevation levels
164
+ 6. Typography -> Font families, sizes, weights, line-heights
165
+ ```
166
+
167
+ ---
168
+
169
+ ## COMPONENT DEFINITION SCHEMA (Proposed)
170
+
171
+ Each component needs a JSON definition describing its anatomy, token bindings, and variant matrix:
172
+
173
+ ```json
174
+ {
175
+ "component": "Button",
176
+ "anatomy": {
177
+ "root": {
178
+ "type": "frame",
179
+ "layout": "horizontal",
180
+ "padding": { "h": "spacing.3", "v": "spacing.2" },
181
+ "radius": "radius.md",
182
+ "fill": "color.brand.primary",
183
+ "gap": "spacing.2"
184
+ },
185
+ "icon_slot": {
186
+ "type": "instance_swap",
187
+ "size": 16,
188
+ "visible": false,
189
+ "property": "ShowIcon"
190
+ },
191
+ "label": {
192
+ "type": "text",
193
+ "style": "font.body.md",
194
+ "color": "color.text.inverse",
195
+ "content": "Button",
196
+ "property": "Label"
197
+ }
198
+ },
199
+ "variants": {
200
+ "Variant": ["Primary", "Secondary", "Outline", "Ghost"],
201
+ "Size": ["Small", "Medium", "Large"],
202
+ "State": ["Default", "Hover", "Pressed", "Focused", "Disabled"]
203
+ },
204
+ "variant_overrides": {
205
+ "Variant=Secondary": {
206
+ "root.fill": "color.neutral.200",
207
+ "label.color": "color.text.primary"
208
+ },
209
+ "Variant=Outline": {
210
+ "root.fill": "transparent",
211
+ "root.stroke": "color.border.primary",
212
+ "root.strokeWeight": 1,
213
+ "label.color": "color.brand.primary"
214
+ },
215
+ "Variant=Ghost": {
216
+ "root.fill": "transparent",
217
+ "label.color": "color.brand.primary"
218
+ },
219
+ "State=Hover": {
220
+ "root.fill": "color.brand.600"
221
+ },
222
+ "State=Pressed": {
223
+ "root.fill": "color.brand.700"
224
+ },
225
+ "State=Disabled": {
226
+ "root.opacity": 0.5
227
+ },
228
+ "Size=Small": {
229
+ "root.padding.h": "spacing.2",
230
+ "root.padding.v": "spacing.1",
231
+ "label.style": "font.body.sm"
232
+ },
233
+ "Size=Large": {
234
+ "root.padding.h": "spacing.4",
235
+ "root.padding.v": "spacing.3",
236
+ "label.style": "font.body.lg"
237
+ }
238
+ }
239
+ }
240
+ ```
241
+
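One possible way to turn this definition into the anatomy of a single variant is sketched below in Python for readability; the plugin itself would need the same logic in JavaScript. The helper names `set_by_path` and `resolve_variant` are hypothetical, and the rule that later overrides win is an assumption, since the schema does not yet define precedence.

```python
import copy

def set_by_path(tree, dotted_path, value):
    """Set tree["a"]["b"]["c"] = value for dotted_path "a.b.c", creating dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = tree
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value

def resolve_variant(definition, combination):
    """Return a copy of the base anatomy with matching variant_overrides applied.

    Overrides are applied in the order they appear in the definition,
    so a later selector wins if two touch the same property (assumption).
    """
    anatomy = copy.deepcopy(definition["anatomy"])
    for selector, overrides in definition.get("variant_overrides", {}).items():
        axis, _, value = selector.partition("=")
        if combination.get(axis) == value:
            for path, token in overrides.items():
                set_by_path(anatomy, path, token)
    return anatomy

# Trimmed-down copy of the Button definition, for illustration only.
definition = {
    "anatomy": {
        "root": {"fill": "color.brand.primary",
                 "padding": {"h": "spacing.3", "v": "spacing.2"}},
        "label": {"color": "color.text.inverse"},
    },
    "variant_overrides": {
        "Variant=Secondary": {"root.fill": "color.neutral.200",
                              "label.color": "color.text.primary"},
        "State=Disabled": {"root.opacity": 0.5},
        "Size=Small": {"root.padding.h": "spacing.2", "root.padding.v": "spacing.1"},
    },
}

resolved = resolve_variant(
    definition, {"Variant": "Secondary", "Size": "Small", "State": "Disabled"}
)
print(resolved)
```

A limitation worth noting: with single-axis selectors, `State=Hover` always sets `root.fill` to `color.brand.600`, so the secondary hover fill (`color.neutral.300` in the token mapping) cannot be expressed. Compound selectors such as a hypothetical `Variant=Secondary,State=Hover` key would be one way to cover it.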
242
+ ### Component Generation Pattern (Plugin Code)
243
+
244
+ Every component follows the same pipeline:
245
+ ```
246
+ 1. Read tokens from DTCG JSON
247
+ 2. Create Variable Collections (if not exist)
248
+ 3. For each variant combination:
249
+ a. Create frame with auto-layout
250
+ b. Add child nodes (icon slot, label, etc.)
251
+ c. Apply token bindings via setBoundVariable()
252
+ d. Apply variant-specific overrides
253
+ 4. combineAsVariants() -> component set
254
+ 5. Add component properties (Label text, ShowIcon boolean)
255
+ ```
256
+
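Step 3 depends on enumerating every variant combination. The sketch below shows one way `buildVariantMatrix()` could work, written in Python for illustration (the plugin version would be JavaScript). The function name `build_variant_matrix` is hypothetical, and the `Property=Value, Property=Value` naming is what we understand `combineAsVariants()` to expect for variant names, which should be verified against the Figma docs.

```python
from itertools import product

def build_variant_matrix(variants):
    """Yield (name, combination) for every combination of variant property values."""
    axes = list(variants.keys())
    for values in product(*(variants[axis] for axis in axes)):
        combination = dict(zip(axes, values))
        # Name each variant as "Axis=Value, Axis=Value" in axis order.
        name = ", ".join(f"{axis}={value}" for axis, value in combination.items())
        yield name, combination

# Variant axes copied from the Button definition.
button_variants = {
    "Variant": ["Primary", "Secondary", "Outline", "Ghost"],
    "Size": ["Small", "Medium", "Large"],
    "State": ["Default", "Hover", "Pressed", "Focused", "Disabled"],
}

matrix = list(build_variant_matrix(button_variants))
print(len(matrix))
print(matrix[0][0])
```

Because the matrix is a plain Cartesian product, letting the user pick a subset of variants only requires filtering the axis lists before calling the function.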
257
+ ---
258
+
259
+ ## ARCHITECTURE FOR PLUGIN EXTENSION
260
+
261
+ Current plugin (`code.js`) already does:
262
+ - Parse DTCG JSON (isDTCGFormat detection)
263
+ - Create paint styles from colors
264
+ - Create text styles from typography
265
+ - Create effect styles from shadows
266
+ - Create variable collections
267
+
268
+ What needs to be ADDED:
269
+ ```
270
+ code.js (existing ~1200 lines)
271
+ |
272
+ +-- componentGenerator.js (NEW ~1400 lines)
273
+ | |-- generateButton() ~250 lines
274
+ | |-- generateTextInput() ~200 lines
275
+ | |-- generateCard() ~150 lines
276
+ | |-- generateToast() ~150 lines
277
+ | |-- generateCheckbox() ~200 lines
278
+ | |-- generateRadio() ~150 lines
279
+ | +-- shared utilities ~300 lines
280
+ | |-- createAutoLayoutFrame()
281
+ | |-- bindTokenToVariable()
282
+ | |-- buildVariantMatrix()
283
+ | |-- resolveTokenValue()
284
+ |
285
+ +-- componentDefinitions.json (NEW ~500 lines)
286
+ |-- Button definition
287
+ |-- TextInput definition
288
+ |-- Card definition
289
+ |-- Toast definition
290
+ +-- Checkbox/Radio definition
291
+ ```
292
+
293
+ ### Implementation Order
294
+ ```
295
+ Week 1-2: Infrastructure
296
+ - Variable collection builder (primitives, semantic, spacing, radius, shadow)
297
+ - Token resolver (DTCG path -> Figma variable reference)
298
+ - Auto-layout frame builder with token bindings
299
+ - Variant matrix generator
300
+
301
+ Week 3-4: MVP Components
302
+ - Button (60 variants) — most complex, validates the full pipeline
303
+ - TextInput (8 variants) — validates form patterns
304
+ - Toast (4 variants) — validates feedback patterns
305
+
306
+ Week 5-6: Remaining MVP + Polish
307
+ - Card (2 configs) — validates layout composition
308
+ - Checkbox + Radio (12 variants) — validates toggle patterns
309
+ - Error handling, edge cases, testing
310
+
311
+ Week 7-8: Post-MVP (if time)
312
+ - Toggle/Switch, Select, Modal
313
+ - Documentation
314
+ ```
315
+
316
+ ---
317
+
318
+ ## EXISTING FILES TO KNOW ABOUT
319
+
320
+ | File | Purpose | Lines |
321
+ |------|---------|-------|
322
+ | `app.py` | Main Gradio app, token extraction orchestration | ~5000 |
323
+ | `agents/llm_agents.py` | AURORA, ATLAS, SENTINEL, NEXUS LLM agents | ~1200 |
324
+ | `agents/normalizer.py` | Token normalization (colors, radius, shadows) | ~950 |
325
+ | `core/color_classifier.py` | Rule-based color classification (PRIMARY authority) | ~815 |
326
+ | `core/color_utils.py` | Color math (hex/RGB/HSL, contrast, ramps) | ~400 |
327
+ | `core/rule_engine.py` | Type scale, WCAG, spacing grid analysis | ~1100 |
328
+ | `output_json/figma-plugin-extracted/figma-design-token-creator 5/src/code.js` | **Figma plugin — EXTEND THIS** | ~1200 |
329
+ | `output_json/figma-plugin-extracted/figma-design-token-creator 5/src/ui.html` | Plugin UI | ~500 |
330
+
331
+ ### DTCG Output Format (What the Plugin Receives)
332
+
333
+ ```json
334
+ {
335
+ "color": {
336
+ "brand": {
337
+ "primary": {
338
+ "$type": "color",
339
+ "$value": "#005aa3",
340
+ "$description": "[classifier] brand: primary_action",
341
+ "$extensions": {
342
+ "com.design-system-extractor": {
343
+ "frequency": 47,
344
+ "confidence": "high",
345
+ "category": "brand",
346
+ "evidence": ["background-color on <a>", "background-color on <button>"]
347
+ }
348
+ }
349
+ }
350
+ }
351
+ },
352
+ "radius": {
353
+ "md": { "$type": "dimension", "$value": "8px" },
354
+ "lg": { "$type": "dimension", "$value": "16px" },
355
+ "full": { "$type": "dimension", "$value": "9999px" }
356
+ },
357
+ "shadow": {
358
+ "sm": {
359
+ "$type": "shadow",
360
+ "$value": {
361
+ "offsetX": "0px",
362
+ "offsetY": "2px",
363
+ "blur": "8px",
364
+ "spread": "0px",
365
+ "color": "#00000026"
366
+ }
367
+ }
368
+ },
369
+ "typography": {
370
+ "body": {
371
+ "md": {
372
+ "$type": "typography",
373
+ "$value": {
374
+ "fontFamily": "Inter",
375
+ "fontSize": "16px",
376
+ "fontWeight": 400,
377
+ "lineHeight": 1.5,
378
+ "letterSpacing": "0px"
379
+ }
380
+ }
381
+ }
382
+ },
383
+ "spacing": {
384
+ "1": { "$type": "dimension", "$value": "4px" },
385
+ "2": { "$type": "dimension", "$value": "8px" },
386
+ "3": { "$type": "dimension", "$value": "16px" }
387
+ }
388
+ }
389
+ ```
390
+
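Given this format, a token resolver mostly has to walk dotted paths and normalize values. The following is a rough sketch under stated assumptions: the helper names `resolve_token`, `parse_px`, and `hex_to_rgba` are hypothetical, dimensions are assumed to always be `px` strings as in the sample, and colors are converted to 0-1 floats because that is the range Figma uses for RGB channels. Returning a default for a missing path is one possible answer to the missing-token question, not a decision.

```python
def resolve_token(tokens, dotted_path, default=None):
    """Walk a DTCG tree by dotted path and return the token's $value, or default if absent."""
    node = tokens
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    if isinstance(node, dict) and "$value" in node:
        return node["$value"]
    return default  # the path points at a group, not a token

def parse_px(value):
    """Convert a dimension string such as '8px' to a float (assumes px units only)."""
    text = str(value).strip()
    if text.endswith("px"):
        text = text[:-2]
    return float(text)

def hex_to_rgba(hex_color):
    """Convert '#rrggbb' or '#rrggbbaa' to (r, g, b, a) floats in the 0-1 range."""
    digits = hex_color.lstrip("#")
    if len(digits) not in (6, 8):
        raise ValueError(f"unsupported hex color: {hex_color!r}")
    r, g, b = (int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))
    a = int(digits[6:8], 16) / 255 if len(digits) == 8 else 1.0
    return (r, g, b, a)

# Small excerpt of the sample output, for illustration only.
tokens = {
    "color": {"brand": {"primary": {"$type": "color", "$value": "#005aa3"}}},
    "radius": {"md": {"$type": "dimension", "$value": "8px"}},
    "spacing": {"3": {"$type": "dimension", "$value": "16px"}},
}

print(hex_to_rgba(resolve_token(tokens, "color.brand.primary")))
print(parse_px(resolve_token(tokens, "spacing.3")))
print(resolve_token(tokens, "shadow.sm"))  # missing token falls back to the default
```

Keeping the lookup separate from the unit and color parsing means the same resolver can serve both variable creation and direct property assignment.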
391
+ ---
392
+
393
+ ## COMPETITIVE ADVANTAGE
394
+
395
+ Building this fills a genuine market gap:
396
+ - **Tokens Studio** (1M+ installs) = token management, no component generation
397
+ - **Figr Identity** = generates components but from brand config, not YOUR tokens
398
+ - **story.to.design** = needs full Storybook pipeline as intermediary
399
+ - **MCP bridges** = non-deterministic AI interpretation
400
+ - **Us** = DTCG JSON in, deterministic Figma components out. Nobody else does this.
401
+
402
+ ### Strategic Position
403
+ ```
404
+ [Extract from website] -> [Analyze & Score] -> [Generate Components in Figma]
405
+ Part 1 (DONE) Part 1 (DONE) Part 2 (THIS)
406
+ ```
407
+
408
+ We become the only tool that goes from URL to complete Figma design system with components — fully automated.
409
+
410
+ ---
411
+
412
+ ## OPEN QUESTIONS FOR THIS SESSION
413
+
414
+ 1. Should component definitions live in JSON (data-driven) or be hardcoded in JS (simpler)?
415
+ 2. Should we generate all 60 Button variants at once, or let user pick which variants?
416
+ 3. How to handle missing tokens? (e.g., site has no shadow tokens — skip shadow on buttons or use defaults?)
417
+ 4. Should we support dark mode variants from the start, or add later?
418
+ 5. Icon system — use a bundled icon set (Lucide?) or just placeholder frames?
agents/llm_agents.py CHANGED
@@ -364,7 +364,7 @@ For each area: THINK → ACT → OBSERVE → VERIFY.
364
  "palette_strategy": "complementary|analogous|triadic|monochromatic|random",
365
  "cohesion_score": N,
366
  "cohesion_notes": "...",
367
- "naming_map": {},
368
  "typography_notes": "Heading: Inter 700, Body: Inter 400. Clean hierarchy.",
369
  "spacing_notes": "8px grid, 92% aligned.",
370
  "radius_notes": "Rounded style: 4px inputs, 8px cards.",
@@ -391,7 +391,7 @@ Return ONLY valid JSON."""
391
  ## SHADOWS
392
  {shadow_data}
393
 
394
- Use ReAct for each area. Name EVERY color in naming_map."""
395
 
396
  def __init__(self, hf_client):
397
  self.hf_client = hf_client
 
364
  "palette_strategy": "complementary|analogous|triadic|monochromatic|random",
365
  "cohesion_score": N,
366
  "cohesion_notes": "...",
367
+ "naming_map": {}, // Optional: ONLY semantic role suggestions (brand.primary, text.secondary, etc.)
368
  "typography_notes": "Heading: Inter 700, Body: Inter 400. Clean hierarchy.",
369
  "spacing_notes": "8px grid, 92% aligned.",
370
  "radius_notes": "Rounded style: 4px inputs, 8px cards.",
 
391
  ## SHADOWS
392
  {shadow_data}
393
 
394
+ Use ReAct for each area. If you see clear semantic roles (brand primary, text color, etc.), suggest them in naming_map. Otherwise leave naming_map empty — the rule-based classifier handles naming."""
395
 
396
  def __init__(self, hf_client):
397
  self.hf_client = hf_client
docs/CONTEXT.md CHANGED
@@ -1,797 +1,190 @@
1
- # Design System Extractor v2 — Master Context File
2
 
3
  > **Upload this file to refresh Claude's context when continuing work on this project.**
4
 
5
- **Last Updated:** January 2026
6
 
7
  ---
8
 
9
- ## 📁 Files Changed in Latest Session
10
 
11
- | File | What Changed |
12
- |------|--------------|
13
- | `agents/extractor.py` | Enhanced 7-source extraction (DOM, CSS vars, SVG, inline, stylesheets, external CSS, page scan) |
14
- | `agents/firecrawl_extractor.py` | **NEW** Agent 1B for deep CSS parsing |
15
- | `agents/semantic_analyzer.py` | **NEW** Agent 1C for semantic color categorization (brand/text/bg/border) |
16
- | `core/preview_generator.py` | AS-IS previews + Color Ramps sorted by brand priority |
17
- | `app.py` | Stage 1 UI now has 6 preview tabs including Semantic Colors |
18
- | `docs/CONTEXT.md` | Updated with semantic analyzer, full architecture diagrams |
 
19
 
20
  ---
21
 
22
- ## 🎯 Project Goal
23
 
24
- Build a **semi-automated, human-in-the-loop agentic system** that:
25
  1. Reverse-engineers a design system from a live website
26
- 2. Reconstructs and upgrades it into a modern, scalable design system
27
- 3. Outputs production-ready JSON tokens (Figma Tokens Studio compatible)
28
-
29
- **Philosophy:** This is a design-aware co-pilot, NOT a magic button. Humans decide, agents propose.
30
-
31
- ---
32
-
33
- ## 🤔 Why This Project? (Market Differentiation)
34
-
35
- ### The Problem We Solve
36
-
37
- | Pain Point | Who Has It | Current Solutions | Why They Fail |
38
- |------------|------------|-------------------|---------------|
39
- | Legacy websites with no design system | Enterprise teams | Manual audit (weeks) | Time-consuming, error-prone |
40
- | Inconsistent design tokens scattered in CSS | Agencies inheriting projects | Figma plugins (style extractors) | Only extract from Figma, not live sites |
41
- | Need to modernize without breaking existing | Product teams | Design system generators | Generate new, don't reverse-engineer existing |
42
- | AA compliance gaps unknown | Accessibility teams | Contrast checkers | Check one color at a time, no system view |
43
-
44
- ### Existing Tools & Their Gaps
45
-
46
- | Tool | What It Does | Gap We Fill |
47
- |------|--------------|-------------|
48
- | **Figma Tokens Studio** | Manages tokens in Figma | Doesn't extract from websites |
49
- | **Style Dictionary** | Transforms tokens to code | Needs tokens first (we create them) |
50
- | **Polypane/VisBug** | Inspect live sites | No systematic extraction or upgrade |
51
- | **AI Design Tools** (Galileo, Uizard) | Generate new designs | Don't reverse-engineer existing |
52
- | **CSS Stats** | Analyze CSS files | Statistics only, no actionable tokens |
53
- | **Chromatic/Percy** | Visual regression | Compare, don't extract or upgrade |
54
-
55
- ### Our Unique Value Proposition
56
-
57
- ```
58
- ┌─────────────────────────────────────────────────────────────────────────────┐
59
- │ WHAT MAKES US DIFFERENT │
60
- ├─────────────────────────────────────────────────────────────────────────────┤
61
- │ │
62
- │ 1. REVERSE-ENGINEERING (not generation) │
63
- │ • Extracts from LIVE websites, not design files │
64
- │ • Preserves what's working, upgrades what's broken │
65
- │ • Respects existing brand decisions │
66
- │ │
67
- │ 2. MULTI-AGENT REASONING (not single LLM) │
68
- │ • Two analysts with different perspectives │
69
- │ • HEAD compiler resolves conflicts │
70
- │ • Shows reasoning, not just results │
71
- │ │
72
- │ 3. HUMAN-IN-THE-LOOP (not magic button) │
73
- │ • Designer reviews every stage │
74
- │ • Accept/reject individual tokens │
75
- │ • Choose from upgrade OPTIONS, not forced decisions │
76
- │ │
77
- │ 4. VISUAL PREVIEWS (not just data tables) │
78
- │ • Typography rendered in actual detected font │
79
- │ • Color ramps with AA compliance per shade │
80
- │ • See before you export │
81
- │ │
82
- │ 5. COST-TRANSPARENT (not black box) │
83
- │ • Shows token usage and cost per analysis │
84
- │ • Uses HF free tier ($0.10/mo) or Pro ($2/mo) │
85
- │ • ~$0.05 per full analysis │
86
- │ │
87
- └─────────────────────────────────────────────────────────────────────────────┘
88
- ```
89
-
90
- ### Target Users
91
-
92
- | User | Use Case | Value |
93
- |------|----------|-------|
94
- | **UX Managers** (like you!) | Modernize legacy booking platforms | Weeks → Hours |
95
- | **Design System Teams** | Audit and standardize existing properties | Systematic, not ad-hoc |
96
- | **Agencies** | Onboard client projects with no documentation | Instant design inventory |
97
- | **Accessibility Consultants** | AA compliance audit with fixes | Full palette view |
98
- | **Developers** | Get production-ready tokens from designer's website | No manual translation |
99
-
100
- ### Why Not Just Use [X]?
101
-
102
- **"Why not just inspect the CSS manually?"**
103
- → You could, but it takes weeks for a complex site. We do it in minutes with systematic coverage.
104
-
105
- **"Why not use Figma's native styles?"**
106
- → Many legacy sites were never in Figma. We extract from the source of truth: the live website.
107
-
108
- **"Why do you need AI? Can't rules handle this?"**
109
- → Rules extract tokens. AI understands *design intent* — why is this color used here? What scale was intended? Where does it deviate from best practices?
110
-
111
- **"Isn't this just CSS Stats with AI?"**
112
- → CSS Stats tells you what exists. We tell you what it *should* be and give you actionable upgrade paths.
113
-
114
- ---
115
-
116
- ## 🏗️ Architecture Overview
117
-
118
- ```
119
- ┌─────────────────────────────────────────────────────────────────────────────┐
120
- │ TECH STACK │
121
- ├─────────────────────────────────────────────────────────────────────────────┤
122
- │ Frontend: Gradio (long-scroll, sectioned UI with live preview) │
123
- │ Orchestration: LangGraph (agent state management & workflow) │
124
- │ Models: HuggingFace Inference Providers (Novita, Groq, etc.) │
125
- │ Hosting: Hugging Face Spaces │
126
- │ Storage: HF Spaces persistent storage │
127
- │ Output: Platform-agnostic JSON tokens (Figma Tokens Studio) │
128
- └─────────────────────────────────────────────────────────────────────────────┘
129
- ```
130
-
131
- ---
132
-
133
- ## 🧠 Model Assignments
134
-
135
- ### Stage 2: Multi-Agent Analysis (4 Named Agents + Rule Engine)
136
-
137
- | Agent | Persona | Model | Temperature | Cost |
138
- |-------|---------|-------|-------------|------|
139
- | **Rule Engine** | — (deterministic) | None | — | FREE |
140
- | **AURORA** | Brand Color Analyst | `Qwen/Qwen2.5-72B-Instruct` | 0.4 | ~Free (HF PRO) |
141
- | **ATLAS** | Benchmark Advisor | `meta-llama/Llama-3.3-70B-Instruct` | 0.25 | ~Free (HF PRO) |
142
- | **SENTINEL** | Best Practices Auditor | `Qwen/Qwen2.5-72B-Instruct` | 0.2 | ~Free (HF PRO) |
143
- | **NEXUS** | Head Synthesizer | `meta-llama/Llama-3.3-70B-Instruct` | 0.3 | ~$0.001 |
144
-
145
- **Architecture:**
146
- ```
147
- ┌─────────────────────────────────────────────────────────────────────────────┐
148
- │ LAYER 1: DETERMINISTIC (Free — $0.00) │
149
- │ ├─ WCAG Contrast Checker (actual FG/BG pairs, not just vs white) │
150
- │ ├─ Type Scale Detection (ratio math, variance, standard comparison) │
151
- │ ├─ Spacing Grid Analysis (GCD math, alignment %) │
152
- │ └─ Color Statistics (unique, near-duplicates, hue distribution) │
153
- │ │
154
- │ LAYER 2: 4 AI AGENTS (~$0.003 total) │
155
- │ │
156
- │ Rule Engine Results │
157
- │ │ │
158
- │ ┌────┼────────────────┐ │
159
- │ ↓ ↓ ↓ │
160
- │ ┌────────┐ ┌────────┐ ┌──────────┐ │
161
- │ │ AURORA │ │ ATLAS │ │ SENTINEL │ (analyze in parallel) │
162
- │ │ Brand │ │ Bench- │ │ Best │ │
163
- │ │ Colors │ │ marks │ │ Practices│ │
164
- │ │Qwen 72B│ │Llama70B│ │ Qwen 72B │ │
165
- │ └───┬────┘ └───┬────┘ └────┬─────┘ │
166
- │ └───────────┼────────────┘ │
167
- │ ↓ │
168
- │ ┌───────────┐ │
169
- │ │ NEXUS │ (final synthesis) │
170
- │ │ Llama 70B │ │
171
- │ │ • Resolve │ │
172
- │ │ • Score │ │
173
- │ │ • Top 3 │ │
174
- │ └───────────┘ │
175
- └─────────────────────────────────────────────────────────────────────────────┘
176
- ```
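The spacing check in Layer 1 is described only as "GCD math, alignment %". A minimal sketch of that idea follows, assuming spacing values arrive as integer pixels. The function names and the sample values are illustrative and are not taken from the project's rule engine.

```python
from functools import reduce
from math import gcd


def detect_base_unit(values_px: list[int]) -> int:
    """Return the greatest common divisor of all non-zero spacing values."""
    nonzero = [v for v in values_px if v > 0]
    return reduce(gcd, nonzero) if nonzero else 0


def alignment_percent(values_px: list[int], base: int) -> float:
    """Percentage of values that are exact multiples of `base`."""
    if not values_px or base <= 0:
        return 0.0
    aligned = sum(1 for v in values_px if v % base == 0)
    return 100.0 * aligned / len(values_px)


# Hypothetical extracted spacing values; 30 is the off-grid outlier.
spacing = [4, 8, 12, 16, 24, 32, 30]
base = detect_base_unit(spacing)            # GCD of the whole set
on_8_grid = alignment_percent(spacing, 8)   # share of values on an 8px grid
```

A single off-grid value drags the GCD down, which is why the alignment percentage against a candidate base (here 8px) is reported alongside it.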
177
-
178
- ### Other Agents
179
-
180
- | Agent | Role | Model | Provider | Why |
181
- |-------|------|-------|----------|-----|
182
- | **Agent 1** | Crawler & Extractor | None (Rule-based) | — | Pure CSS extraction, no LLM needed |
183
- | **Agent 2** | Normalizer | `microsoft/Phi-3.5-mini-instruct` | Novita | Fast, great structured output |
184
- | **Agent 4** | Generator | `mistralai/Codestral-22B-v0.1` | Novita | Code specialist, JSON formatting |
185
-
186
- ### Provider Configuration
187
-
188
- Default provider: **Novita** (configurable in `config/agents.yaml`)
189
-
190
- Available providers (via HuggingFace Inference Providers):
191
- - **novita** - Default, good balance
192
- - **groq** - Fastest
193
- - **cerebras** - Ultra-fast
194
- - **sambanova** - Good for Llama
195
- - **together** - Wide model selection
196
-
197
- ### Cost Tracking
198
-
199
- Estimated cost per Stage 2 analysis: **~$0.003**
200
- - Rule Engine: $0.00 (free — pure math)
201
- - AURORA + ATLAS + SENTINEL: ~Free within HF PRO ($9/mo subscription)
202
- - NEXUS: ~$0.001
203
- - HuggingFace PRO tier: $9/month (covers inference for all models)
204
-
205
- ---
206
-
207
- ## 👁️ Visual Previews
208
-
209
- ### Stage 1: AS-IS Previews (No Enhancements)
210
-
211
- Shows raw extracted values exactly as found on the website:
212
-
213
- | Preview | What It Shows |
214
- |---------|---------------|
215
- | **Typography** | Actual font rendered with detected styles |
216
- | **Colors** | Simple swatches with hex, frequency, context, AA status |
217
- | **Spacing** | Visual bars representing each spacing value |
218
- | **Radius** | Boxes with each border-radius applied |
219
- | **Shadows** | Cards with each box-shadow applied |
220
-
221
- ### Stage 2: Enhanced Previews (Upgraded)
222
-
223
- Shows proposed upgrades and improvements:
224
-
225
- | Preview | What It Shows |
226
- |---------|---------------|
227
- | **Typography** | Type scale comparison (1.2, 1.25, 1.333 ratios) |
228
- | **Color Ramps** | 11 shades (50-950) with AA compliance per shade |
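The type scale comparison above relies on the "ratio math, variance, standard comparison" listed for the rule engine. The sketch below shows one way that could work, assuming font sizes in pixels. The function name, the returned keys, and the use of population variance are assumptions, not the project's implementation.

```python
from statistics import mean, pvariance

# The three ratios offered in the Stage 2 preview, with their common names.
STANDARD_RATIOS = {
    "minor third": 1.2,
    "major third": 1.25,
    "perfect fourth": 1.333,
}


def analyze_type_scale(sizes_px: list[float]) -> dict:
    """Compare consecutive font-size ratios against standard scales."""
    sizes = sorted(set(sizes_px))
    ratios = [larger / smaller for smaller, larger in zip(sizes, sizes[1:])]
    if not ratios:
        raise ValueError("need at least two distinct font sizes")
    average = mean(ratios)
    # Pick the standard ratio closest to the observed average.
    name, target = min(STANDARD_RATIOS.items(), key=lambda kv: abs(kv[1] - average))
    return {
        "average_ratio": average,
        "variance": pvariance(ratios),  # low variance = consistent scale
        "closest": name,
        "target_ratio": target,
    }


result = analyze_type_scale([16, 20, 25, 31.25])
```

A low variance suggests the site already follows a single ratio; a high variance suggests sizes were picked ad hoc and a standard scale is worth proposing.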
229
-
230
- ---
231
-
232
- ## 🔍 Enhanced Extraction (Agent 1)
233
-
234
- Agent 1 now extracts from **5 sources** to capture ALL colors:
235
-
236
- ```
237
- ┌─────────────────────────────────────────────────────────────────────────────┐
238
- │ ENHANCED EXTRACTION SOURCES │
239
- ├─────────────────────────────────────────────────────────────────────────────┤
240
- │ │
241
- │ 1. DOM Computed Styles │
242
- │ • window.getComputedStyle(element) │
243
- │ • Captures: color, background-color, border-color, etc. │
244
- │ │
245
- │ 2. CSS Variables │
246
- │ • :root { --primary-color: #3860be; } │
247
- │ • Parses all stylesheets for CSS custom properties │
248
- │ │
249
- │ 3. SVG Colors │
250
- │ • <svg fill="#00c4cc"> │
251
- │ • <path stroke="#3860be"> │
252
- │ │
253
- │ 4. Inline Styles │
254
- │ • <div style="background-color: #bcd432;"> │
255
- │ • Parses style attributes for color values │
256
- │ │
257
- │ 5. Stylesheet Rules │
258
- │ • Parses CSS rules that may not be applied to visible elements │
259
- │ • Catches hover states, pseudo-elements, etc. │
260
- │ │
261
- └─────────────────────────────────────────────────────────────────────────────┘
262
- ```
263
-
264
- ---
265
-
266
- ## 📋 Enhanced Logging
267
-
268
- ### Stage 1 Extraction Logs
269
-
270
- Shows detailed extraction progress:
271
- ```
272
- ============================================================
273
- 🖥️ DESKTOP EXTRACTION (1440px)
274
- ============================================================
275
-
276
- 📡 Enhanced extraction from 7 sources:
277
- 1. DOM computed styles (getComputedStyle)
278
- 2. CSS variables (:root { --color: })
279
- 3. SVG colors (fill, stroke)
280
- 4. Inline styles (style='color:')
281
- 5. Stylesheet rules (CSS files)
282
- 6. External CSS files (fetch & parse)
283
- 7. Page content scan (brute-force)
284
-
285
- 📊 EXTRACTION RESULTS:
286
- Colors: 45 unique
287
- Typography: 12 styles
288
- Spacing: 28 values
289
- Radius: 8 values
290
- Shadows: 4 values
291
-
292
- 🎨 CSS Variables found: 15
293
- --primary-color: #3860be
294
- --accent-color: #00c4cc
295
- --brand-lime: #bcd432
296
- ... and 12 more
297
-
298
- 🔄 Normalizing (deduping, naming)...
299
- ✅ Normalized: 32 colors, 10 typography, 18 spacing
300
-
301
- ============================================================
302
- 🔥 FIRECRAWL CSS EXTRACTION
303
- ============================================================
304
-
305
- 🌐 Scraping: https://example.com
306
- ✅ Page scraped (125000 chars)
307
- 📝 Parsing <style> blocks...
308
- Found 5 style blocks
309
- 🔗 Finding linked CSS files...
310
- Found 8 CSS files
311
- 📄 Fetching: main.css...
312
- ✅ Parsed (234 colors)
313
- 📄 Fetching: theme.css...
314
- ✅ Parsed (45 colors)
315
-
316
- 📊 FIRECRAWL RESULTS:
317
- CSS files parsed: 8
318
- Style blocks parsed: 5
319
- CSS variables found: 23
320
- Unique colors found: 156
321
-
322
- 🎨 Top colors found:
323
- #06b2c4 (used 45x)
324
- #c1df1f (used 38x)
325
- #373737 (used 120x)
326
-
327
- 🔀 Merging Firecrawl colors with Playwright extraction...
328
- ✅ Added 12 new colors from Firecrawl
329
- 📊 Total colors now: 44
330
-
331
- ============================================================
332
- 🧠 SEMANTIC COLOR ANALYSIS
333
- ============================================================
334
-
335
- 📊 Analyzing 143 colors...
336
- Using rule-based analysis (no LLM)
337
-
338
- 📊 SEMANTIC ANALYSIS RESULTS:
339
-
340
- 🎨 BRAND COLORS:
341
- primary: #06b2c4 (high)
342
- └─ Most frequent saturated color on interactive elements (freq: 33)
343
- secondary: #c1df1f (medium)
344
- └─ Second most frequent brand color (freq: 15)
345
-
346
- 📝 TEXT COLORS:
347
- primary: #373737 (high)
348
- secondary: #666666 (medium)
349
-
350
- 🖼️ BACKGROUND COLORS:
351
- primary: #ffffff (high)
352
- secondary: #f5f5f5 (medium)
353
-
354
- 📈 SUMMARY:
355
- Total colors analyzed: 143
356
- Brand colors found: 2
357
- Clear hierarchy: Yes
358
- Analysis method: rule-based
359
- ```
360
-
361
- ### Stage 2 LLM Analysis Logs (With Semantic Context)
362
-
363
- Shows detailed reasoning from each agent WITH semantic context:
364
-
365
- ```
366
- ============================================================
367
- 🧠 STAGE 2: MULTI-AGENT ANALYSIS
368
- ============================================================
369
-
370
- 🧠 SEMANTIC CONTEXT FROM STAGE 1:
371
- Brand Primary: #06b2c4
372
- Text Primary: #373737
373
- Analysis Method: rule-based
374
-
375
- =======================================================
376
- 🤖 LLM 1: meta-llama/Llama-3.1-70B-Instruct
377
- =======================================================
378
- Provider: novita
379
- 💰 Cost: $0.29/M in, $0.59/M out
380
- 📝 Task: Typography, Colors, AA, Spacing analysis
381
- 🧠 Semantic context: Yes ← NEW: LLM knows color roles!
382
-
383
- 📊 LLM 1 FINDINGS:
384
-
385
- COLORS (with semantic context):
386
- ├─ Brand Primary (#06b2c4): "Fails AA on white (3.2:1)"
387
- ├─ Suggested fix: "#0891a8 (4.6:1)"
388
- └─ Score: 6/10
389
-
390
- =======================================================
391
- 🎯 HEAD: Compiling final recommendations...
392
- =======================================================
393
-
394
- 📥 INPUT: Analyzing outputs from LLM 1 + LLM 2 + Rules + Semantic...
395
-
396
- 📊 HEAD SYNTHESIS:
397
-
398
- COLOR RECOMMENDATIONS (per semantic role):
399
- ├─ brand.primary: #06b2c4 → Keep for branding, use #0891a8 for text
400
- ├─ text.primary: #373737 → Keep (passes AA)
401
- └─ Generate ramps for: brand.primary, brand.secondary, neutral
402
- ```
403
-
404
- ---
405
-
406
- ## 🤖 Agent Personas
407
-
408
- ### Agent 1A: Website Crawler & Enhanced Extractor
409
- - **Persona:** Meticulous Design Archaeologist
410
- - **Tool:** Playwright
411
- - **Job:**
412
- - Auto-discover 10+ pages from base URL
413
- - Crawl Desktop (1440px) + Mobile (375px) separately
414
- - Scroll to bottom + wait for network idle
415
- - **ENHANCED: Extract from 7 sources:**
416
- 1. DOM computed styles (`getComputedStyle`)
417
- 2. CSS variables (`:root { --primary: #xxx }`)
418
- 3. SVG colors (`fill`, `stroke` attributes)
419
- 4. Inline styles (`style="background-color: #xxx"`)
420
- 5. Stylesheet rules (CSS files, hover states, pseudo-elements)
421
- 6. External CSS files (fetch & parse to bypass CORS)
422
- 7. Page content scan (brute-force regex on HTML)
423
- - **Output:** Raw tokens with frequency, context, confidence, source type
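Source 7 in the list above, the brute-force regex scan on HTML, can be sketched as follows. This is an illustration only: the pattern, the helper names, and the decision to expand 3-digit hex codes are assumptions. A scan like this also matches anything that merely looks like a hex color (for example a URL fragment such as `#abc`), so its results would need the lowest confidence.

```python
import re
from collections import Counter

# Matches #rgb and #rrggbb; longer hex runs (e.g. 8-digit) are skipped.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def normalize_hex(color: str) -> str:
    """Lowercase the color and expand #rgb shorthand to #rrggbb."""
    c = color.lower()
    if len(c) == 4:
        c = "#" + "".join(ch * 2 for ch in c[1:])
    return c


def scan_hex_colors(html: str) -> Counter:
    """Count every hex color literal found anywhere in the page source."""
    return Counter(normalize_hex(m) for m in HEX_COLOR.findall(html))


sample = (
    '<div style="background-color: #BCD432;">'
    '<svg fill="#00c4cc"></svg><p style="color:#0CC">x</p></div>'
)
counts = scan_hex_colors(sample)
```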
424
-
425
- ### Agent 1B: Firecrawl CSS Deep Diver
426
- - **Persona:** CSS Deep Diver
427
- - **Tool:** Firecrawl / httpx fallback
428
- - **Job:**
429
- - Fetch and parse ALL linked CSS files
430
- - Extract colors from CSS rules and variables
431
- - Bypass CORS restrictions
432
- - Find colors missed by DOM inspection
433
- - **Output:** Additional colors merged into main extraction
434
-
435
- ### Agent 1C: Semantic Color Analyzer (NEW - LLM)
436
- - **Persona:** Design System Semanticist
437
- - **Tool:** Rule-based analysis (LLM optional)
438
- - **Job:**
439
- - Analyze colors based on actual CSS usage (not guessing)
440
- - Categorize into semantic roles:
441
- - **Brand Colors:** Used on buttons, CTAs, links (interactive elements)
442
- - **Text Colors:** Used with `color` property on p, span, h1-h6
443
- - **Background Colors:** Used with `background-color` on containers
444
- - **Border Colors:** Used with `border-color` properties
445
- - **Feedback Colors:** Error (red), success (green), warning (yellow)
446
- - Detect color hierarchy (primary → secondary → muted)
447
- - **Input:** Colors WITH context data (css_properties, elements, frequency)
448
- - **Output:** Semantic categorization with confidence levels
449
- - **Why:** Stage 2 LLMs can now give SPECIFIC recommendations per role
450
-
451
- ### Agent 2: Token Normalizer & Structurer
452
- - **Persona:** Design System Librarian
453
- - **Job:**
454
- - Clean noisy extraction, dedupe
455
- - Infer naming patterns
456
- - Tag tokens as: `detected` | `inferred` | `low-confidence`
457
- - **Output:** Structured token sets with metadata
458
-
459
- ### Agent 3: Design System Best Practices Advisor
460
- - **Persona:** Senior Staff Design Systems Architect
461
- - **Job:**
462
- - Research modern DS patterns (Material, Polaris, Carbon, etc.)
463
- - Propose upgrade OPTIONS (not decisions)
464
- - Suggest: type scales (3 options), spacing (8px), color ramps (AA compliant), naming conventions
465
- - **Output:** Option sets with rationale
466
-
467
- ### Agent 4: Plugin & JSON Generator
468
- - **Persona:** Automation Engineer
469
- - **Job:**
470
- - Convert finalized tokens to Figma-compatible JSON
471
- - Generate: typography, color (with tints/shades), spacing variables
472
- - Maintain Desktop + Mobile + version metadata
473
- - **Output:** Production-ready JSON (flat structure for Figma Tokens Studio)
474
-
475
- ---
476
-
477
- ## 🖥️ UI Stages (3 Stages)
478
-
479
- ### Stage 1: Extraction Review (AS-IS)
480
- - **Purpose:** Trust building — show exactly what was extracted
481
- - **Shows:**
482
- - Token tables (colors, typography, spacing)
483
- - **6 Visual Preview Tabs (AS-IS, no enhancements):**
484
- 1. 🔤 Typography — actual font rendered
485
- 2. 🎨 Colors — simple swatches sorted by frequency (no ramps)
486
- 3. 🧠 Semantic Colors — colors organized by usage (brand/text/bg/border)
487
- 4. 📏 Spacing — visual bars
488
- 5. 🔘 Radius — rounded boxes
489
- 6. 🌑 Shadows — shadow cards
490
- - **Human Actions:** Accept/reject tokens, flag anomalies, toggle Desktop↔Mobile
491
-
492
- ### Stage 2: Upgrade Playground (MOST IMPORTANT)
493
- - **Purpose:** Decision-making through live visuals
494
- - **Shows:**
495
- - Side-by-side option selector + live preview
496
- - **Color Ramps (50-950 shades with AA compliance)**
497
- - Type scale options (1.2, 1.25, 1.333)
498
- - **Semantic-aware recommendations:** "Your brand primary #06b2c4 fails AA, consider #0891a8"
499
- - **Human Actions:** Select type scale A/B/C, spacing system, color ramps — preview updates instantly
500
-
501
- ### Stage 3: Final Review & Export
502
- - **Purpose:** Confidence before export
503
- - **Shows:** Token preview, JSON tree, diff view (original vs final)
504
- - **Human Actions:** Download JSON, save version, label version
505
-
506
- ---
507
-
508
- ## 📁 Project Structure
509
-
510
- ```
511
- design-system-extractor/
512
- ├── app.py # Gradio main entry point
513
- ├── requirements.txt
514
- ├── README.md
515
-
516
- ├── config/
517
- │ ├── .env.example # Environment variables template
518
- │ ├── agents.yaml # Agent personas & configurations
519
- │ └── settings.py # Application settings
520
-
521
- ├── agents/
522
- │ ├── __init__.py
523
- │ ├── state.py # LangGraph state definitions
524
- │ ├── graph.py # LangGraph workflow orchestration
525
- │ ├── crawler.py # Agent 1A: Website crawler
526
- │ ├── extractor.py # Agent 1A: Token extraction (7 sources)
527
- │ ├── firecrawl_extractor.py # Agent 1B: Deep CSS parsing
528
- │ ├── semantic_analyzer.py # Agent 1C: Semantic color categorization
529
- │ ├── normalizer.py # Agent 2: Token normalization
530
- │ ├── advisor.py # Agent 3: Best practices
531
- │ ├── stage2_graph.py # Stage 2 multi-agent LLM workflow
532
- │ └── generator.py # Agent 4: JSON generator
533
-
534
- ├── core/
535
- │ ├── __init__.py
536
- │ ├── color_utils.py # Color analysis, contrast, ramps
537
- │ ├── preview_generator.py # HTML preview generation
538
- │ ├── hf_inference.py # HuggingFace LLM inference
539
- │ └── token_schema.py # Token data structures (Pydantic)
540
-
541
- ├── ui/
542
- │ └── __init__.py
543
-
544
- ├── templates/
545
-
546
- ├── storage/
547
- │ └── __init__.py
548
-
549
- ├── tests/
550
- │ └── __init__.py
551
-
552
- └── docs/
553
- └── CONTEXT.md # THIS FILE - upload for context refresh
554
- ```
555
-
556
- ---
557
-
558
- ## 🔧 Key Technical Decisions
559
 
560
  | Decision | Choice | Rationale |
561
  |----------|--------|-----------|
562
- | Viewports | Fixed 1440px + 375px | Simplicity, covers main use cases |
563
- | Scrolling | Bottom + network idle | Captures lazy-loaded content |
564
- | Infinite scroll | Skip | Avoid complexity |
565
- | Modals | Manual trigger | User decides what to capture |
566
- | Color ramps | 5-10 shades, AA compliant | Industry standard |
567
- | Type scales | 3 options (1.25, 1.333, 1.414) | User selects |
568
- | Spacing | 8px base system | Modern standard |
569
- | ML models | Minimal, rule-based preferred | Simplicity, reliability |
570
- | Versioning | HF Spaces persistent storage | Built-in, free |
571
- | Preview | Gradio + iframe (best for dynamic) | Smooth updates |
572
 
573
  ---
574
 
575
- ## 📊 Token Schema (Core Data Structures)
576
-
577
- ```python
578
- class TokenSource(Enum):
579
- DETECTED = "detected" # Directly found in CSS
580
- INFERRED = "inferred" # Derived from patterns
581
- UPGRADED = "upgraded" # User-selected improvement
582
-
583
- class Confidence(Enum):
584
- HIGH = "high" # 10+ occurrences
585
- MEDIUM = "medium" # 3-9 occurrences
586
- LOW = "low" # 1-2 occurrences
587
-
588
- class Viewport(Enum):
589
- DESKTOP = "desktop" # 1440px
590
- MOBILE = "mobile" # 375px
591
- ```
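The occurrence thresholds in the `Confidence` comments can be turned into a small mapping function. The sketch below redefines the enum so it runs on its own; `confidence_from_frequency` is a hypothetical helper name, and rejecting counts below 1 is an assumption.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "high"      # 10+ occurrences
    MEDIUM = "medium"  # 3-9 occurrences
    LOW = "low"        # 1-2 occurrences


def confidence_from_frequency(occurrences: int) -> Confidence:
    """Map how often a token was seen to a confidence level."""
    if occurrences < 1:
        raise ValueError("a detected token must occur at least once")
    if occurrences >= 10:
        return Confidence.HIGH
    if occurrences >= 3:
        return Confidence.MEDIUM
    return Confidence.LOW


levels = [confidence_from_frequency(n) for n in (1, 3, 9, 10, 120)]
```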
592
 
593
- ### Token Types:
594
- - **ColorToken:** value, frequency, contexts, elements, contrast ratios
595
- - **TypographyToken:** family, size, weight, line-height, elements
596
- - **SpacingToken:** value, frequency, contexts, fits_base_8
597
- - **RadiusToken:** value, frequency, elements
598
- - **ShadowToken:** value, frequency, elements
599
 
600
- ---
601
-
602
- ## 🔄 LangGraph Workflow
603
-
604
- ```
605
- ┌─────────────┐
606
- │ START │
607
- └──────┬──────┘
608
-
609
-
610
- ┌─────────────┐
611
- │ URL Input │
612
- └──────┬──────┘
613
-
614
-
615
- ┌────────────────────────┐
616
- │ Agent 1: Discover │
617
- │ (find pages) │
618
- └───────────┬────────────┘
619
-
620
-
621
- ┌────────────────────────┐
622
- │ HUMAN: Confirm pages │◄─── Checkpoint 1
623
- └───────────┬────────────┘
624
-
625
-
626
- ┌────────────────────────┐
627
- │ Agent 1: Extract │
628
- │ (crawl & extract) │
629
- └───────────┬────────────┘
630
-
631
-
632
- ┌────────────────────────┐
633
- │ Agent 2: Normalize │
634
- └───────────┬────────────┘
635
-
636
-
637
- ┌────────────────────────┐
638
- │ HUMAN: Review tokens │◄─── Checkpoint 2 (Stage 1 UI)
639
- └───────────┬────────────┘
640
-
641
- ┌───────────────┴───────────────┐
642
- │ │
643
- ▼ ▼
644
- ┌──────────────────┐ ┌──────────────────┐
645
- │ Agent 3: Advise │ │ (parallel) │
646
- │ (best practices) │ │ │
647
- └────────┬─────────┘ └──────────────────┘
648
-
649
-
650
- ┌────────────────────────┐
651
- │ HUMAN: Select options │◄─── Checkpoint 3 (Stage 2 UI)
652
- └───────────┬────────────┘
653
-
654
-
655
- ┌────────────────────────┐
656
- │ Agent 4: Generate │
657
- │ (final JSON) │
658
- └───────────┬────────────┘
659
-
660
-
661
- ┌────────────────────────┐
662
- │ HUMAN: Export │◄─── Checkpoint 4 (Stage 3 UI)
663
- └───────────┬────────────┘
664
-
665
-
666
- ┌─────────┐
667
- │ END │
668
- └─────────┘
669
  ```
670
-
671
- ---
672
-
673
- ## 🚦 Human-in-the-Loop Rules
674
-
675
- 1. **No irreversible automation**
676
- 2. **Agents propose → Humans decide**
677
- 3. **Every auto action must be:**
678
- - Visible
679
- - Reversible
680
- - Previewed
681
-
682
- ---
683
-
684
- ## 📦 Output JSON Format
685
-
686
- ```json
687
- {
688
- "metadata": {
689
- "source_url": "https://example.com",
690
- "extracted_at": "2025-01-23T10:00:00Z",
691
- "version": "v1-recovered",
692
- "viewport": "desktop"
693
- },
694
- "colors": {
695
- "primary": {
696
- "50": { "value": "#e6f2ff", "source": "upgraded" },
697
- "500": { "value": "#007bff", "source": "detected" },
698
- "900": { "value": "#001a33", "source": "upgraded" }
699
- }
700
- },
701
- "typography": {
702
- "heading-xl": {
703
- "fontFamily": "Inter",
704
- "fontSize": "32px",
705
- "fontWeight": 700,
706
- "lineHeight": "1.2",
707
- "source": "detected"
708
- }
709
- },
710
- "spacing": {
711
- "xs": { "value": "4px", "source": "upgraded" },
712
- "sm": { "value": "8px", "source": "detected" },
713
- "md": { "value": "16px", "source": "detected" }
714
- }
715
- }
716
  ```
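The old doc mentions a "flat structure" for Figma compatibility without showing it. One plausible reading is a dotted-path flattening of the nested JSON above, sketched here. The key format (`colors.primary.500`) and the rule for what counts as a leaf token are assumptions.

```python
def flatten_tokens(node: dict, prefix: str = "") -> dict:
    """Flatten nested token groups into dotted-path keys.

    A dict is treated as a leaf token when none of its values is itself
    a dict (e.g. {"value": "#007bff", "source": "detected"}).
    """
    flat = {}
    for key, child in node.items():
        path = f"{prefix}.{key}" if prefix else key
        is_group = isinstance(child, dict) and any(
            isinstance(v, dict) for v in child.values()
        )
        if is_group:
            flat.update(flatten_tokens(child, path))
        else:
            flat[path] = child
    return flat


nested = {
    "colors": {"primary": {"500": {"value": "#007bff", "source": "detected"}}},
    "spacing": {"sm": {"value": "8px", "source": "detected"}},
}
flat = flatten_tokens(nested)
```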
717
 
718
- ---
719
-
720
- ## 🛠️ Implementation Phases & Current Status
721
-
722
- ### Phase 1 ✅ COMPLETE
723
- - [x] Project structure
724
- - [x] Configuration files
725
- - [x] Token schema (Pydantic models)
726
- - [x] Agent 1: Crawler (page discovery)
727
- - [x] Agent 1: Enhanced Extractor (5-source extraction)
728
- - [x] Agent 2: Normalizer
729
- - [x] Stage 1 UI with 5 AS-IS preview tabs
730
- - [x] LangGraph basic workflow
731
- - [x] JSON export (flat structure for Figma)
732
-
733
- ### Phase 2 ✅ MOSTLY COMPLETE
734
- - [x] Agent 3: Multi-LLM Advisor (Qwen + Llama + HEAD)
735
- - [x] Stage 2 UI (Upgrade Playground)
736
- - [x] Live preview system (typography, color ramps)
737
- - [x] Enhanced LLM logging with reasoning
738
- - [ ] Accept/Reject checkbox wiring to export
739
-
740
- ### Phase 3 🔄 IN PROGRESS
741
- - [ ] Agent 4: Generator (component patterns)
742
- - [ ] Stage 3 UI (diff view)
743
- - [ ] Arabic page filtering
744
-
745
- ### Phase 4 ⏳ PENDING
746
- - [ ] Full LangGraph orchestration
747
- - [ ] HF Spaces deployment
748
- - [ ] Persistent storage
749
- - [ ] MCP Claude / Figma plugin integration (Part 2 of article)
750
-
751
- ---
752
-
753
- ## 🐛 Known Issues & Pending Fixes
754
-
755
- | Issue | Status | Fix |
756
- |-------|--------|-----|
757
- | Arabic pages included | Pending | Filter `/ar/` URLs in crawler |
758
- | Accept/Reject not wired | Pending | Export should respect checkbox state |
759
- | Stage 1 vs Stage 2 preview confusion | ✅ Fixed | Stage 1 now shows AS-IS (no ramps) |
760
- | Colors missed from CSS variables | ✅ Fixed | Enhanced 5-source extraction |
761
- | JSON nested structure | ✅ Fixed | Flat structure for Figma compatibility |
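The pending fix for Arabic pages ("Filter `/ar/` URLs in crawler") could look like the sketch below. The helper names are hypothetical, and matching on an exact `ar` path segment, rather than a substring, is an assumption made to avoid dropping URLs such as `/articles/`.

```python
from urllib.parse import urlparse


def is_arabic_page(url: str) -> bool:
    """True when any path segment is exactly 'ar' (e.g. /ar/about)."""
    segments = [s for s in urlparse(url).path.lower().split("/") if s]
    return "ar" in segments


def filter_crawl_targets(urls: list[str]) -> list[str]:
    """Drop Arabic-locale pages before extraction."""
    return [u for u in urls if not is_arabic_page(u)]


discovered = [
    "https://example.com/",
    "https://example.com/ar/about",
    "https://example.com/articles/ar-design",
    "https://example.com/en/AR/",
]
kept = filter_crawl_targets(discovered)
```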
762
 
763
- ---
764
-
765
- ## 🔑 Environment Variables
766
-
767
- ```env
768
- # Required
769
- HF_TOKEN=your_huggingface_token
770
-
771
- # Model Configuration (defaults shown — diverse providers)
772
- AGENT2_MODEL=microsoft/Phi-3.5-mini-instruct # Microsoft - Fast naming
773
- AGENT3_MODEL=meta-llama/Llama-3.1-70B-Instruct # Meta - Strong reasoning
774
- AGENT4_MODEL=mistralai/Codestral-22B-v0.1 # Mistral - Code/JSON
775
-
776
- # Optional
777
- DEBUG=true
778
- LOG_LEVEL=INFO
779
- ```
780
 
781
  ---
782
 
783
- ## 📝 Notes for Claude
784
 
785
- When continuing this project:
786
- 1. **Check current phase** in Implementation Phases section
787
- 2. **Review agent personas** in agents.yaml for consistent behavior
788
- 3. **Follow token schema** defined in core/token_schema.py
789
- 4. **Maintain LangGraph state** consistency across agents
790
- 5. **Use Gradio components** from ui/components.py for consistency
791
- 6. **Test with** real websites before deployment
792
- 7. **Enhanced extraction** captures from 5 sources — check logs to verify
793
- 8. **Stage 1 = AS-IS** (no ramps), **Stage 2 = Enhanced** (with ramps)
794
 
795
  ---
796
 
797
- *Last updated: 2025-01-23*
 
1
+ # Design System Extractor v3.2 — Master Context File
2
 
3
  > **Upload this file to refresh Claude's context when continuing work on this project.**
4
 
5
+ **Last Updated:** February 2026
6
 
7
  ---
8
 
9
+ ## Current Status
10
 
11
+ | Component | Status | Version |
12
+ |-----------|--------|---------|
13
+ | Token Extraction (Part 1) | COMPLETE | v3.2 |
14
+ | Color Classification | COMPLETE | v3.1 |
15
+ | DTCG Compliance | COMPLETE | v3.2 |
16
+ | Naming Authority Chain | COMPLETE | v3.2 |
17
+ | Figma Plugin (Visual Spec) | COMPLETE | v7 |
18
+ | Component Generation (Part 2) | RESEARCH DONE | - |
19
+ | Tests | 113 passing | - |
20
 
21
  ---
22
 
23
+ ## Project Goal
24
 
25
+ Build a **semi-automated, human-in-the-loop system** that:
26
  1. Reverse-engineers a design system from a live website
27
+ 2. Classifies colors deterministically by CSS evidence
28
+ 3. Audits against industry benchmarks and best practices
29
+ 4. Outputs W3C DTCG v1 compliant JSON
30
+ 5. Generates Figma Variables, Styles, and Visual Spec pages
31
+ 6. (Part 2) Auto-generates Figma components from tokens
32
+
33
+ **Philosophy:** AI as copilot, not autopilot. Humans decide, agents propose.
34
+
35
+ ---
36
+
37
+ ## Architecture (v3.2)
38
+
39
+ ```
40
+ +--------------------------------------------------+
41
+ | LAYER 1: EXTRACTION + NORMALIZATION (Free) |
42
+ | +- Crawler + 7-Source Extractor (Playwright) |
43
+ | +- Normalizer: colors, radius, shadows, typo |
44
+ | +- Firecrawl: deep CSS parsing |
45
+ +--------------------------------------------------+
46
+ | LAYER 2: CLASSIFICATION + RULE ENGINE (Free) |
47
+ | +- Color Classifier (815 lines, deterministic) |
48
+ | +- WCAG Contrast Checker (actual FG/BG pairs) |
49
+ | +- Type Scale Detection (ratio math) |
50
+ | +- Spacing Grid Analysis (GCD math) |
51
+ +--------------------------------------------------+
52
+ | LAYER 3: 4 AI AGENTS (~$0.003) |
53
+ | +- AURORA - Brand Advisor (Qwen 72B) |
54
+ | +- ATLAS - Benchmark Advisor (Llama 70B) |
55
+ | +- SENTINEL - Best Practices Audit (Qwen 72B) |
56
+ | +- NEXUS - Head Synthesizer (Llama 70B) |
57
+ +--------------------------------------------------+
58
+ | EXPORT: W3C DTCG v1 Compliant JSON |
59
+ | +- $type, $value, $description, $extensions |
60
+ | +- Figma Plugin: Variables + Styles + Visual Spec|
61
+ +--------------------------------------------------+
62
+ ```
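The WCAG Contrast Checker in Layer 2 works on actual foreground/background pairs. The underlying math is the standard WCAG 2.x relative-luminance and contrast-ratio formula, sketched below. The function names are illustrative and are not taken from `core/color_utils.py`; only 6-digit hex input is handled.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert an sRGB channel (0-255) to linear light per WCAG 2.x."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a #rrggbb color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """Contrast ratio between two colors, from 1.0 to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: str, bg: str, large_text: bool = False) -> bool:
    """WCAG AA: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


text_ok = passes_aa("#373737", "#ffffff")   # dark text on white
brand_ok = passes_aa("#06b2c4", "#ffffff")  # brand color used as text on white
```

Checking each real FG/BG pair found in the DOM, instead of every color against white, is what keeps the audit from flagging colors that are never used for text.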
63
+
64
+ ### Naming Authority Chain (v3.2)
65
+
66
+ ```
67
+ 1. Color Classifier (PRIMARY) - deterministic, covers ALL colors
68
+ +- CSS evidence -> category -> token name
69
+ +- 100% reproducible, logged with evidence
70
+
71
+ 2. AURORA LLM (SECONDARY) - semantic role enhancer ONLY
72
+ +- Can promote "color.blue.500" -> "color.brand.primary"
73
+ +- CANNOT rename palette colors
74
+ +- filter_aurora_naming_map() enforces boundary
75
+
76
+ 3. Normalizer (FALLBACK) - preliminary hue+shade names
77
+ ```
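The boundary between the classifier and AURORA can be illustrated with a small filter. The real `filter_aurora_naming_map()` is not shown in this diff, so the sketch below uses a different name, and its signature, the allowed prefix list, and the rejection rules are all assumptions.

```python
# Assumption: only names under these prefixes count as semantic roles.
SEMANTIC_PREFIXES = ("color.brand.",)


def filter_naming_map_sketch(
    classifier_names: set[str], proposed: dict[str, str]
) -> dict[str, str]:
    """Keep only LLM proposals that promote a known token to a semantic role."""
    accepted = {}
    for old_name, new_name in proposed.items():
        if old_name not in classifier_names:
            continue  # the LLM may not introduce tokens the classifier never saw
        if not new_name.startswith(SEMANTIC_PREFIXES):
            continue  # palette-to-palette renames are rejected
        accepted[old_name] = new_name
    return accepted


classifier_names = {"color.blue.500", "color.neutral.900"}
proposed = {
    "color.blue.500": "color.brand.primary",    # promotion: allowed
    "color.neutral.900": "color.gray.900",      # palette rename: rejected
    "color.red.500": "color.brand.secondary",   # unknown token: rejected
}
accepted = filter_naming_map_sketch(classifier_names, proposed)
```

Because the filter only ever removes entries, the classifier's deterministic names remain the result whenever the LLM output is missing or invalid.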
78
+
79
+ ---
80
+
81
+ ## File Structure
82
+
83
+ ```
84
+ design-system-extractor-v3/
85
+ +-- app.py # Main Gradio app (~5000 lines)
86
+ +-- CLAUDE.md # Project context and architecture
87
+ +-- PART2_COMPONENT_GENERATION.md # Part 2 research + plan
88
+ |
89
+ +-- agents/
90
+ | +-- crawler.py # Page discovery
91
+ | +-- extractor.py # Playwright 7-source extraction
92
+ | +-- firecrawl_extractor.py # Deep CSS parsing
93
+ | +-- normalizer.py # Token normalization (~950 lines)
94
+ | +-- llm_agents.py # AURORA, ATLAS, SENTINEL, NEXUS
95
+ | +-- semantic_analyzer.py # DEPRECATED in v3.2
96
+ | +-- stage2_graph.py # DEPRECATED in v3.2
97
+ |
98
+ +-- core/
99
+ | +-- color_classifier.py # Rule-based classification (815 lines)
100
+ | +-- color_utils.py # Color math (hex/RGB/HSL, contrast)
101
+ | +-- rule_engine.py # Type scale, WCAG, spacing grid (~1100 lines)
102
+ | +-- hf_inference.py # HuggingFace Inference API client
103
+ | +-- token_schema.py # Pydantic models
104
+ |
105
+ +-- config/
106
+ | +-- settings.py # Configuration
107
+ |
108
+ +-- tests/
109
+ | +-- test_stage1_extraction.py # 82 deterministic tests
110
+ | +-- test_agent_evals.py # 27 LLM agent schema/behavior tests
111
+ | +-- test_stage2_pipeline.py # Pipeline integration tests
112
+ |
113
+ +-- output_json/
114
+ | +-- figma-plugin-extracted/
115
+ | +-- figma-design-token-creator 5/
116
+ | +-- src/code.js # Figma plugin (~1200 lines)
117
+ | +-- src/ui.html # Plugin UI (~500 lines)
118
+ |
119
+ +-- docs/
120
+ +-- MEDIUM_ARTICLE_EPISODE_6.md # Medium article
121
+ +-- LINKEDIN_POST_EPISODE_6.md # LinkedIn post
122
+ +-- IMAGE_GUIDE_EPISODE_6.md # Image specs for article
123
+ +-- FIGMA_SPECIMEN_IDEAS.md # Visual spec layout reference
124
+ +-- CONTEXT.md # THIS FILE
125
+ ```
126
+
127
+ ---
128
+
129
+ ## Model Assignments
130
+
131
+ | Agent | Model | Temperature | Role |
132
+ |-------|-------|-------------|------|
133
+ | Rule Engine | None | - | WCAG, type scale, spacing (FREE) |
134
+ | Color Classifier | None | - | CSS evidence -> category (FREE) |
135
+ | AURORA | Qwen/Qwen2.5-72B-Instruct | 0.4 | Brand advisor (SECONDARY) |
136
+ | ATLAS | meta-llama/Llama-3.3-70B-Instruct | 0.25 | Benchmark comparison |
137
+ | SENTINEL | Qwen/Qwen2.5-72B-Instruct | 0.2 | Best practices audit |
138
+ | NEXUS | meta-llama/Llama-3.3-70B-Instruct | 0.3 | Final synthesis |
139
+
140
+ **Total cost per analysis:** ~$0.003
141
+
142
+ ---
143
+
144
+ ## Key Technical Decisions
145
 
146
  | Decision | Choice | Rationale |
147
  |----------|--------|-----------|
148
+ | Color naming | Numeric shades (50-900) | Never words (light/dark/base) |
149
+ | Naming authority | Classifier PRIMARY, LLM SECONDARY | One source of truth |
150
+ | Export format | W3C DTCG v1 | Industry standard (Oct 2025) |
151
+ | Token metadata | $extensions (namespaced) | Frequency, confidence, evidence |
152
+ | Radius processing | Parse, deduplicate, sort, name | none/sm/md/lg/xl/2xl/full |
153
+ | Shadow processing | Parse, sort by blur, name | xs/sm/md/lg/xl (always 5 levels) |
154
+ | Accessibility | Actual FG/BG pairs from DOM | Not just color vs white |
155
+ | Figma output | Variables + Styles + Visual Spec | Auto-generated specimen page |
156
+ | LLM role | Advisory only, never naming authority | Deterministic reproducibility |
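The export decisions in the table (DTCG keys plus namespaced `$extensions` metadata) can be sketched as a token builder. The extension namespace, the helper name, and the metadata field names are hypothetical. The color `$value` is kept as a hex string for brevity; the exact shape the DTCG spec requires for color values depends on the spec version and should be checked against it.

```python
import json

# Hypothetical reverse-domain namespace for tool-specific metadata.
EXTENSION_NAMESPACE = "com.example.design-system-extractor"


def make_color_token(
    hex_value: str, description: str, frequency: int, confidence: str, evidence: str
) -> dict:
    """Build one DTCG-style color token with namespaced metadata."""
    return {
        "$type": "color",
        "$value": hex_value,
        "$description": description,
        "$extensions": {
            EXTENSION_NAMESPACE: {
                "frequency": frequency,
                "confidence": confidence,
                "evidence": evidence,
            }
        },
    }


token_file = {
    "color": {
        "brand": {
            "primary": make_color_token(
                "#06b2c4",
                "Most frequent saturated color on interactive elements",
                frequency=33,
                confidence="high",
                evidence="background-color on buttons and links",
            )
        }
    }
}
serialized = json.dumps(token_file, indent=2)
```

Keeping frequency, confidence, and evidence under `$extensions` means tools that only understand the standard keys can still read the file and ignore the rest.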
 
157
 
158
  ---
159
 
160
+ ## Execution Status
161
 
162
+ ### Part 1: Token Extraction + Analysis (COMPLETE)
163

164
  ```
165
+ PHASE 1: NORMALIZER [DONE]
166
+ PHASE 2: STAGE 2 AGENTS [DONE]
167
+ PHASE 3: EXPORT + DTCG [DONE]
168
+ PHASE 4: EXTRACTION IMPROVEMENTS [NOT STARTED]
169
+ 4a. Font family detection (still returns "sans-serif")
170
+ 4b. Rule engine: radius grid analysis
171
+ 4c. Rule engine: shadow elevation analysis
172
  ```
173
 
174
+ ### Part 2: Component Generation (RESEARCH COMPLETE)
175
 
176
+ **Decision:** Custom Figma Plugin (Option A)
177
+ **Scope:** 5 MVP components, ~86 variants, ~1400 lines new plugin code
178
+ **See:** `PART2_COMPONENT_GENERATION.md` for full details
179
 
180
  ---
181
 
182
+ ## GitHub
183
 
184
+ - **Repository:** https://github.com/hiriazmo/design-system-extractor-v3
185
+ - **Latest commit:** `6b43e51` (DTCG compliance + naming authority)
186
+ - **Tests:** 113 passing
187
 
188
  ---
189
 
190
+ *Last updated: 2026-02-23*
docs/IMAGE_GUIDE_EPISODE_6.md CHANGED
@@ -1,188 +1,252 @@
1
- # 📸 Image Guide for Episode 6 Article
2
 
3
- ## Required Images (8-10 total)
4
 
5
  ### 1. Hero Image
6
  **What:** Screenshot of the Gradio interface showing the full pipeline output
7
  **Where:** After title, before first section
8
  **Specs:** 1200x630px (LinkedIn preview size)
9
- **Content:** Show the Visual Previews section with colors, typography, and NEXUS synthesis visible
10
 
11
  ### 2. Complete Workflow Diagram
12
- **What:** The 8-step pipeline: Website → Agents → Figma → Compare
13
  **Where:** After "The Complete Workflow" section
14
  **Specs:** 1200x800px
15
  **Content:**
16
  ```
17
- 🌐 Website URL
18
-
19
- 🤖 AI Agents (7-source extraction)
20
-
21
- 📄 AS-IS JSON
22
-
23
- 🔌 Figma Plugin (Import)
24
-
25
- 📋 AS-IS Specimen (Review)
26
-
27
- 🧠 Rule Engine + 4 AI Agents (Stage 2)
28
-
29
- ☑️ Accept/Reject (Human Decision)
30
-
31
- 📄 TO-BE JSON → 🔌 Figma → 📋 TO-BE Specimen
32
- ```
33
-
34
- ### 3. Two-Layer Architecture Diagram
35
- **What:** Layer 1 (Deterministic, Free) + Layer 2 (4 Named Agents)
36
  **Where:** After "Architecture Overview" section
37
- **Specs:** 1200x600px
38
  **Content:**
39
  ```
40
- ┌─────────────────────────────────────────────────┐
40
- │  LAYER 1: DETERMINISTIC (Free — $0.00)          │
41
- │  ├─ Crawler + 7-Source Extractor + Normalizer   │
42
- │  ├─ Semantic Color Analyzer (rule-based)        │
43
- │  ├─ WCAG Contrast Checker (math)                │
44
- │  ├─ Type Scale Detection (ratio math)           │
45
- │  ├─ Spacing Grid Analysis (GCD math)            │
46
- │  └─ Color Statistics (deduplication)            │
47
- ├─────────────────────────────────────────────────┤
48
- │  LAYER 2: 4 AI AGENTS (~$0.003)                 │
49
- │  ├─ AURORA — Brand Color Analyst (Qwen 72B)     │
50
- │  ├─ ATLAS — Benchmark Advisor (Llama 70B)       │
51
- │  ├─ SENTINEL — Best Practices Auditor (Qwen 72B)│
52
- │  └─ NEXUS — Head Synthesizer (Llama 70B)        │
53
- └─────────────────────────────────────────────────┘
55
- ```
56
-
57
- ### 4. Agent Pipeline Flow
58
- **What:** Show the 4 named agents with their flow: parallel analysis → synthesis
59
- **Where:** After "Layer 2" section header
60
  **Specs:** 1200x500px
61
  **Content:**
62
  ```
63
- Rule Engine Results
64
-
65
- ┌────┼────────────────┐
66
- ↓ ↓ ↓
67
- ┌──────┐ ┌──────┐ ┌────────┐
68
- │AURORA│ │ATLAS │SENTINEL│
69
- │Brand │ │Bench │ │Audit │
70
- │Qwen │ │Llama │Qwen │
71
- └──┬───┘ └──┬───┘ └───┬────┘
72
- └────────┼──────────┘
73
-
74
- ┌──────────┐
75
- │ NEXUS │
76
- │Synthesis │
77
- │ Llama 70B│
78
- └──────────┘
79
-
80
  Final Recommendations
81
  ```
82
 
83
- ### 5. 7 Extraction Sources Visual
84
  **What:** Show the 7 different methods of extraction
85
- **Where:** After "Stage 1: Extraction" section
86
  **Specs:** 1000x600px
87
  **Content:**
88
  ```
89
- ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
90
- 1. Computed 2. CSS 3. Inline
91
- Styles Variables Styles
92
- └─────────────┘ └─────────────┘ └─────────────┘
93
 
94
- ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
95
- 4. SVG 5. External 6. Style
96
- Attrs CSS Files Blocks
97
- └─────────────┘ └─────────────┘ └─────────────┘
98
 
99
- ┌─────────────────────────────────────────────────┐
100
- 7. Firecrawl Deep Parser
101
- └─────────────────────────────────────────────────┘
102
  ```
103
 
104
- ### 6. Rule Engine Output Screenshot
105
- **What:** Screenshot of actual rule engine output in the Gradio logs panel
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
106
  **Where:** After "The Rule Engine" section
107
  **Specs:** 1200x600px
108
- **Content:** Show the actual emoji-formatted output:
109
- - 📐 TYPE SCALE ANALYSIS
110
- - ACCESSIBILITY CHECK
111
- - 📏 SPACING GRID
112
- - 📊 CONSISTENCY SCORE
113
-
114
- ### 7. NEXUS Synthesis Output
115
- **What:** Screenshot of the final synthesis with scores, top 3 actions, color recommendations
116
  **Where:** After "Agent 4: NEXUS" section
117
  **Specs:** 1200x700px
118
- **Content:** Show the final output with:
119
  - Executive summary
120
- - Scores dashboard (overall, accessibility, consistency, organization)
121
  - Top 3 actions with impact/effort
122
- - Color recommendations with accept/reject checkboxes
123
 
124
- ### 8. Benchmark Comparison Table
125
- **What:** Screenshot of the benchmark comparison showing match percentages
126
- **Where:** After "Agent 2: ATLAS" section
127
- **Specs:** 1000x400px
128
  **Content:** Show:
129
- - 🥇 Polaris: 87% match
130
- - 🥈 Material 3: 77% match
131
- - 🥉 Atlassian: 76% match
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
132
 
133
- ### 9. Before/After Comparison
 
 
 
 
 
 
 
 
 
 
134
  **What:** Side-by-side showing AS-IS vs TO-BE
135
  **Where:** After "Comparing AS-IS vs TO-BE" section
136
  **Specs:** 1200x500px
137
  **Content:**
138
  ```
139
  AS-IS TO-BE
140
- ───── ─────
141
- Type: ~1.18 (random) → 1.25 (Major Third)
- Brand: #06b2c4 (AA: 3.2) → #048391 (AA: 4.5)
- Spacing: Mixed → 8px grid
- Colors: 143 unique → ~20 semantic
145
- Score: 52/100 → 78/100
 
 
146
  ```
147
 
148
- ### 10. Cost Comparison Table
149
- **What:** Visual table comparing V1 vs V2 costs + model assignments
150
  **Where:** After "Cost & Model Strategy" section
151
  **Specs:** 1000x400px
152
  **Content:**
153
  ```
154
- Agent Model Cost
155
- ────────────────────────────
156
- Rule Engine None $0.00
157
- AURORA Qwen 72B ~Free (HF PRO)
158
- ATLAS Llama 70B ~Free (HF PRO)
159
- SENTINEL Qwen 72B ~Free (HF PRO)
160
- NEXUS Llama 70B ~$0.001
161
- ─────────────────────────────
162
- TOTAL ~$0.003
163
  ```
164
 
165
- ### 11. Figma Specimen (If Available)
166
- **What:** Screenshot of the Figma specimen page after JSON import
167
- **Where:** After "The Figma Bridge" section
168
- **Specs:** 1200x700px
169
- **Content:** Show Typography + Semantic Colors + Spacing display
170
-
171
  ---
172
 
173
  ## Image Creation Tools
174
 
175
  **Recommended:**
176
- 1. **Figma** - Architecture diagrams, pipeline flows, tech stack
- 2. **Screenshot tool** - Gradio interface captures (use dark mode)
- 3. **Excalidraw** - Quick hand-drawn style diagrams (for the architecture)
179
 
180
  **Tips:**
181
  - Use dark background screenshots (Gradio dark mode)
182
  - Add subtle drop shadows to screenshots
183
- - Keep consistent color scheme (blues + cyans match brand color #06b2c4)
184
  - Use the agent names (AURORA, ATLAS, SENTINEL, NEXUS) in diagram labels
185
- - Color-code: Layer 1 = green (free), Layer 2 = blue (AI)
 
 
186
 
187
  ---
188
 
@@ -191,15 +255,17 @@ TOTAL ~$0.003
191
  ```
192
  episode6-hero-dashboard.png
193
  episode6-workflow-8steps.png
194
- episode6-architecture-2layers.png
 
195
  episode6-agent-pipeline.png
196
  episode6-extraction-7sources.png
 
197
  episode6-rule-engine-output.png
198
  episode6-nexus-synthesis.png
199
- episode6-benchmark-comparison.png
 
200
  episode6-before-after.png
201
- episode6-cost-table.png
202
- episode6-figma-specimen.png
203
  ```
204
 
205
  ---
@@ -215,3 +281,6 @@ Before taking screenshots:
215
  - [ ] Set consistent window size (1440px wide)
216
  - [ ] Run a real analysis so outputs are populated
217
  - [ ] Ensure agent names (AURORA, ATLAS, etc.) are visible in logs
 
 
 
 
1
+ # Image Guide for Episode 6 Article (v3.2)
2
 
3
+ ## Required Images (10-12 total)
4
 
5
  ### 1. Hero Image
6
  **What:** Screenshot of the Gradio interface showing the full pipeline output
7
  **Where:** After title, before first section
8
  **Specs:** 1200x630px (LinkedIn preview size)
9
+ **Content:** Show the Visual Spec page in Figma with colors, typography, and agent synthesis visible
10
 
11
  ### 2. Complete Workflow Diagram
12
+ **What:** The 8-step pipeline: Website -> Agents -> Figma -> Compare
13
  **Where:** After "The Complete Workflow" section
14
  **Specs:** 1200x800px
15
  **Content:**
16
  ```
17
+ Website URL
18
+ |
19
+ 7-Source Extraction (Playwright + Firecrawl)
20
+ |
21
+ Normalizer (radius, shadows, colors)
22
+ |
23
+ Color Classifier (deterministic)
24
+ |
25
+ Rule Engine (WCAG, type scale, spacing)
26
+ |
27
+ DTCG JSON (AS-IS)
28
+ |
29
+ Figma Plugin -> Variables + Visual Spec
30
+ |
31
+ 4 AI Agents (AURORA, ATLAS, SENTINEL, NEXUS)
32
+ |
33
+ Accept/Reject -> DTCG JSON (TO-BE)
34
+ |
35
+ Figma Plugin -> Compare AS-IS vs TO-BE
36
+ ```
37
+
38
+ ### 3. Three-Layer Architecture Diagram
39
+ **What:** Layer 1 (Extraction) + Layer 2 (Classification + Rules) + Layer 3 (4 Agents)
40
  **Where:** After "Architecture Overview" section
41
+ **Specs:** 1200x700px
42
  **Content:**
43
  ```
44
+ +--------------------------------------------------+
45
+ | LAYER 1: EXTRACTION + NORMALIZATION (Free) |
46
+ | +- 7-Source Extractor + Normalizer |
47
+ | +- Radius/Shadow/Color normalization |
48
+ +--------------------------------------------------+
49
+ | LAYER 2: CLASSIFICATION + RULE ENGINE (Free) |
50
+ | +- Color Classifier (815 lines, deterministic) |
51
+ | +- WCAG + Type Scale + Spacing Grid |
52
+ +--------------------------------------------------+
53
+ | LAYER 3: 4 AI AGENTS (~$0.003) |
54
+ | +- AURORA -> ATLAS -> SENTINEL -> NEXUS |
55
+ +--------------------------------------------------+
56
+ ```
57
+
58
+ ### 4. Naming Authority Chain (NEW - V3 Key Innovation)
59
+ **What:** Diagram showing the V2 chaos vs V3 clean authority
60
+ **Where:** After "The Naming Authority Chain" section
 
 
 
61
  **Specs:** 1200x500px
62
  **Content:**
63
  ```
64
+ V2 (BROKEN): V3 (FIXED):
65
+ +----------+ +------------------+
66
+ |Normalizer| -> "blue.light" |Color Classifier | -> PRIMARY
67
+ +----------+ | (deterministic) |
68
+ +----------+ +------------------+
69
+ | Export | -> "blue.500" |
70
+ +----------+ +------------------+
71
+ +----------+ |AURORA (advisory) | -> SECONDARY
72
+ | AURORA | -> "brand.primary" | roles only |
73
+ +----------+ +------------------+
74
+ |
75
+ = CHAOS in Figma +------------------+
76
+ |Normalizer | -> FALLBACK
77
+ +------------------+
78
+
79
+ = CLEAN output
80
+ ```
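
The authority chain in the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's code: `resolve_token_name`, `filter_advisory_names`, the `SEMANTIC_PREFIXES` tuple, and the dict-based inputs are all hypothetical names chosen for the example, and the real `filter_aurora_naming_map()` may work differently.

```python
from __future__ import annotations

# Hypothetical semantic role prefixes an advisory agent is allowed to assign.
SEMANTIC_PREFIXES = ("brand.", "text.", "bg.", "border.", "feedback.")


def filter_advisory_names(suggestions: dict[str, str]) -> dict[str, str]:
    """Keep only advisory suggestions that assign a semantic role."""
    return {
        hex_value: name
        for hex_value, name in suggestions.items()
        if name.startswith(SEMANTIC_PREFIXES)
    }


def resolve_token_name(
    hex_value: str,
    classifier_names: dict[str, str],
    advisory_names: dict[str, str],
    fallback_names: dict[str, str],
) -> str:
    """Resolve one color name: classifier first, advisory second, fallback last."""
    if hex_value in classifier_names:            # PRIMARY: deterministic classifier
        return classifier_names[hex_value]
    allowed = filter_advisory_names(advisory_names)
    if hex_value in allowed:                     # SECONDARY: semantic roles only
        return allowed[hex_value]
    return fallback_names[hex_value]             # FALLBACK: hue + numeric shade


classifier = {"#06b2c4": "brand.primary"}
advisory = {"#06b2c4": "brand.accent", "#123456": "blue.light", "#abcdef": "feedback.info"}
fallback = {"#06b2c4": "cyan.500", "#123456": "blue.500", "#abcdef": "blue.300"}
```

In this sketch the advisory suggestion `blue.light` is discarded because it is a palette-style name, so `#123456` falls through to the normalizer's `blue.500`.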
81
+
82
+ ### 5. Agent Pipeline Flow
83
+ **What:** Show the 4 named agents with their flow: parallel analysis -> synthesis
84
+ **Where:** After "Layer 3" section header
85
+ **Specs:** 1200x500px
86
+ **Content:**
87
+ ```
88
+ Rule Engine + Classifier Results
89
+ |
90
+ +----+----------------+
91
+ v v v
92
+ +------+ +------+ +--------+
93
+ |AURORA| |ATLAS | |SENTINEL|
94
+ |Brand | |Bench | |Audit |
95
+ |Qwen | |Llama | |Qwen |
96
+ +--+---+ +--+---+ +---+----+
97
+ +--------+----------+
98
+ v
99
+ +----------+
100
+ | NEXUS |
101
+ |Synthesis |
102
+ | Llama 70B|
103
+ +----------+
104
+ v
105
  Final Recommendations
106
  ```
107
 
108
+ ### 6. 7 Extraction Sources Visual
109
  **What:** Show the 7 different methods of extraction
110
+ **Where:** After "Extraction: 7 Sources" section
111
  **Specs:** 1000x600px
112
  **Content:**
113
  ```
114
+ +-------------+ +-------------+ +-------------+
115
+ | 1. Computed | | 2. CSS | | 3. Inline |
116
+ | Styles | | Variables| | Styles |
117
+ +-------------+ +-------------+ +-------------+
118
 
119
+ +-------------+ +-------------+ +-------------+
120
+ | 4. SVG | | 5. External | | 6. Style |
121
+ | Attrs | | CSS Files| | Blocks |
122
+ +-------------+ +-------------+ +-------------+
123
 
124
+ +-------------------------------------------------+
125
+ | 7. Firecrawl Deep Parser |
126
+ +-------------------------------------------------+
127
  ```
128
 
129
+ ### 7. Color Classifier Output (NEW)
130
+ **What:** Show the classifier's evidence-based categorization
131
+ **Where:** After "The Color Classifier" section
132
+ **Specs:** 1200x600px
133
+ **Content:**
134
+ ```
135
+ [CLASSIFY] #06b2c4 -> BRAND
136
+ Evidence: background-color on <button> (freq=33)
137
+
138
+ [CLASSIFY] #373737 -> TEXT
139
+ Evidence: color on <p> (freq=120)
140
+
141
+ [CLASSIFY] #ffffff -> BG
142
+ Evidence: background-color on <body> (freq=1)
143
+
144
+ [DEDUP] #1a1a1a merged with #1b1b1b (dist=1.7)
145
+
146
+ Category Caps: brand(3) text(3) bg(3) border(3) feedback(4) palette(rest)
147
+ ```
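
To make the log above concrete, here is a rough Python sketch of evidence-based classification and near-duplicate detection. The `Evidence` dataclass, the rule table in `classify`, and the tag sets are assumptions for illustration; the actual classifier is described as 815 lines and certainly has more rules. The `dist=1.7` in the sample log is consistent with plain Euclidean distance in RGB, but the source does not say which metric is used, so treat `rgb_distance` as a guess.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    css_property: str  # e.g. "background-color" or "color"
    element: str       # tag name the declaration was found on
    frequency: int     # how many times this usage was seen


def classify(evidence: Evidence) -> str:
    """Map CSS usage evidence to a category (hypothetical rule set)."""
    prop, tag = evidence.css_property, evidence.element
    if prop == "background-color" and tag in {"button", "a"}:
        return "brand"
    if prop == "color" and tag in {"p", "span", "li", "h1", "h2", "h3"}:
        return "text"
    if prop == "background-color" and tag in {"body", "main", "section", "div"}:
        return "bg"
    if prop.startswith("border"):
        return "border"
    return "palette"  # everything without stronger evidence


def hex_to_rgb(hex_value: str) -> tuple[int, int, int]:
    h = hex_value.lstrip("#")
    return int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16)


def rgb_distance(a: str, b: str) -> float:
    """Euclidean distance between two hex colors in RGB space."""
    return math.dist(hex_to_rgb(a), hex_to_rgb(b))
```

A dedup pass could then merge any two colors whose distance falls below a chosen threshold; the threshold value itself is not given in the source.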
148
+
149
+ ### 8. Rule Engine Output
150
+ **What:** Screenshot of actual rule engine output
151
  **Where:** After "The Rule Engine" section
152
  **Specs:** 1200x600px
153
+ **Content:** Show the emoji-formatted output:
154
+ - TYPE SCALE ANALYSIS (ratio, variance, recommendation)
155
+ - ACCESSIBILITY CHECK (actual pairs, not just vs white)
156
+ - SPACING GRID (GCD, alignment %)
157
+ - CONSISTENCY SCORE
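
The spacing-grid and type-scale checks listed above are plain arithmetic. A minimal sketch, assuming integer pixel spacings and a non-empty list of font sizes (the function names are illustrative, not the rule engine's API):

```python
from __future__ import annotations

from functools import reduce
from math import gcd
from statistics import mean


def detect_grid_base(spacings: list[int]) -> int:
    """Greatest common divisor of all spacing values, in px."""
    return reduce(gcd, spacings)


def alignment_pct(spacings: list[int], base: int) -> float:
    """Percentage of spacing values that are multiples of the given base."""
    aligned = sum(1 for s in spacings if s % base == 0)
    return 100.0 * aligned / len(spacings)


def type_scale_ratio(font_sizes: list[float]) -> float:
    """Mean ratio between consecutive distinct font sizes."""
    sizes = sorted(set(font_sizes))
    ratios = [larger / smaller for smaller, larger in zip(sizes, sizes[1:])]
    return mean(ratios)
```

For example, spacings of 8, 16, 20, and 24 px share a GCD of 4, but only three of the four sit on an 8 px grid, which is the kind of gap an "alignment %" figure would surface.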
158
+
159
+ ### 9. NEXUS Synthesis Output
160
+ **What:** Screenshot of the final synthesis with scores and top 3 actions
161
  **Where:** After "Agent 4: NEXUS" section
162
  **Specs:** 1200x700px
163
+ **Content:** Show final output with:
164
  - Executive summary
165
+ - Scores (overall, accessibility, consistency, organization)
166
  - Top 3 actions with impact/effort
167
+ - Color recommendations with accept/reject
168
 
169
+ ### 10. DTCG JSON Example (NEW)
170
+ **What:** Code block showing the W3C DTCG format with $extensions
171
+ **Where:** After "W3C DTCG v1 Compliance" section
172
+ **Specs:** 1000x500px
173
  **Content:** Show:
174
+ ```json
175
+ {
176
+ "color": {
177
+ "brand": {
178
+ "primary": {
179
+ "$type": "color",
180
+ "$value": "#005aa3",
181
+ "$extensions": {
182
+ "com.design-system-extractor": {
183
+ "frequency": 47,
184
+ "confidence": "high"
185
+ }
186
+ }
187
+ }
188
+ }
189
+ }
190
+ }
191
+ ```
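
A token like the one above could be assembled with a small helper. This is a sketch only: `make_color_token` and `nest` are hypothetical names, and the `com.design-system-extractor` namespace is copied from the example rather than from the exporter's source.

```python
from __future__ import annotations

import json


def make_color_token(
    hex_value: str,
    frequency: int,
    confidence: str,
    namespace: str = "com.design-system-extractor",
) -> dict:
    """Build one DTCG-style color token with tool metadata under $extensions."""
    return {
        "$type": "color",
        "$value": hex_value,
        "$extensions": {namespace: {"frequency": frequency, "confidence": confidence}},
    }


def nest(path: str, token: dict) -> dict:
    """Turn a dotted path such as 'color.brand.primary' into nested groups."""
    result = token
    for key in reversed(path.split(".")):
        result = {key: result}
    return result


tree = nest("color.brand.primary", make_color_token("#005aa3", 47, "high"))
print(json.dumps(tree, indent=2))
```

Keeping the extraction metadata inside `$extensions` leaves `$type` and `$value` untouched for other DTCG-aware tools.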
192
 
193
+ ### 11. Figma Visual Spec Page (NEW)
194
+ **What:** Screenshot of the auto-generated visual spec in Figma
195
+ **Where:** After "The Custom Figma Plugin" section
196
+ **Specs:** 1200x700px
197
+ **Content:** Show:
198
+ - Typography frame (Desktop + Mobile) with font metadata
199
+ - Color frame organized by semantic role (brand/text/bg/border/feedback)
200
+ - AA compliance badges on each swatch
201
+ - Radius display, Spacing scale, Shadow elevation
202
+
203
+ ### 12. Before/After Comparison
204
  **What:** Side-by-side showing AS-IS vs TO-BE
205
  **Where:** After "Comparing AS-IS vs TO-BE" section
206
  **Specs:** 1200x500px
207
  **Content:**
208
  ```
209
  AS-IS TO-BE
210
+ ----- -----
211
+ Type: ~1.18 (random) -> 1.25 (Major Third)
212
+ Brand: #06b2c4 (AA: 3.2) -> #048391 (AA: 4.5)
213
+ Spacing: Mixed -> 8px grid
214
+ Colors: 143 unique -> ~20 semantic
215
+ Radius: raw CSS -> none/sm/md/lg/xl/full
216
+ Shadows: unsorted -> xs/sm/md/lg/xl
217
+ Score: 52/100 -> 78/100
218
  ```
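
The AA figures in the comparison come from the WCAG contrast formula, which needs no LLM. A minimal sketch of that formula follows; the function names are my own, and the ratios it produces depend on which foreground/background pair is checked, so it does not attempt to reproduce the 3.2 and 4.5 values shown above.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_value: str) -> float:
    h = hex_value.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: str, bg: str, large_text: bool = False) -> bool:
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

Running this over every observed foreground/background pair, rather than only against white, matches the "actual pairs" approach the guide describes.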
219
 
220
+ ### 13. V1 vs V2 vs V3 Evolution (NEW)
221
+ **What:** Table showing the version progression
222
  **Where:** After "Cost & Model Strategy" section
223
  **Specs:** 1000x400px
224
  **Content:**
225
  ```
226
+ Version Cost Naming LLM Role Output
227
+ ------- ------- ---------- ---------- --------
228
+ V1 $0.50 LLM decides Everything Unreliable
229
+ V2 $0.003 3 systems Split w/ rules Naming chaos
230
+ V3 $0.003 1 authority Advisory only Clean DTCG
 
 
 
 
231
  ```
232
 
 
 
 
 
 
 
233
  ---
234
 
235
  ## Image Creation Tools
236
 
237
  **Recommended:**
238
+ 1. **Figma** - Architecture diagrams, pipeline flows, tech stack
239
+ 2. **Screenshot tool** - Gradio interface captures (use dark mode)
240
+ 3. **Excalidraw** - Quick hand-drawn style diagrams
241
 
242
  **Tips:**
243
  - Use dark background screenshots (Gradio dark mode)
244
  - Add subtle drop shadows to screenshots
245
+ - Keep consistent color scheme (blues match brand)
246
  - Use the agent names (AURORA, ATLAS, SENTINEL, NEXUS) in diagram labels
247
+ - Color-code: Layer 1 = green (free), Layer 2 = blue (rules), Layer 3 = purple (AI)
248
+ - NEW: Include W3C DTCG logo/badge where format is mentioned
249
+ - NEW: Show the naming authority chain prominently - it's the V3 key story
250
 
251
  ---
252
 
 
255
  ```
256
  episode6-hero-dashboard.png
257
  episode6-workflow-8steps.png
258
+ episode6-architecture-3layers.png
259
+ episode6-naming-authority.png
260
  episode6-agent-pipeline.png
261
  episode6-extraction-7sources.png
262
+ episode6-color-classifier.png
263
  episode6-rule-engine-output.png
264
  episode6-nexus-synthesis.png
265
+ episode6-dtcg-json.png
266
+ episode6-figma-visual-spec.png
267
  episode6-before-after.png
268
+ episode6-v1-v2-v3-evolution.png
 
269
  ```
270
 
271
  ---
 
281
  - [ ] Set consistent window size (1440px wide)
282
  - [ ] Run a real analysis so outputs are populated
283
  - [ ] Ensure agent names (AURORA, ATLAS, etc.) are visible in logs
284
+ - [ ] Ensure color classifier evidence logs are visible
285
+ - [ ] Capture the Figma visual spec page with AA badges
286
+ - [ ] Show DTCG format in JSON export preview
docs/LINKEDIN_POST_EPISODE_6.md CHANGED
@@ -1,4 +1,4 @@
1
- # LinkedIn Post - Episode 6: Design System Extractor
2
 
3
  ## Main Post (Copy-Paste Ready)
4
 
@@ -6,145 +6,162 @@
6
 
7
  Every designer has done this: Open DevTools. Inspect element. Copy hex code. Paste to spreadsheet. Recreate in Figma. Repeat 200 times.
8
 
9
- I spent 3-5 days manually extracting design tokens from websites. Then more time recreating them in Figma as variables.
10
 
11
- So I built a semi-automated workflow with 4 named AI agents + a free rule engine 👇
12
 
13
- **The Architecture:**
14
 
15
- Layer 1 (FREE — $0.00, <1 second):
16
- 🔢 Rule Engine — WCAG contrast checker (pure math)
17
- 🔢 Type scale detection + spacing grid analysis
18
- 🔢 Color deduplication + statistics
19
 
20
- Layer 2 (~$0.003, 4 specialized agents):
21
- 🎨 AURORA identifies brand colors from usage context (Qwen 72B)
22
- 📊 ATLAS benchmarks against 8 industry design systems (Llama 70B)
23
- ✅ SENTINEL — prioritizes fixes by business impact (Qwen 72B)
24
- 🧠 NEXUS — synthesizes everything, resolves contradictions (Llama 70B)
25
 
26
- **The Complete Pipeline:**
 
 
 
 
27
 
28
- 🌐 Website URL → 🤖 AI Agents → 📄 AS-IS JSON → 🔌 Figma Plugin → Variables
29
-
30
- 🧠 AI Analysis (Stage 2)
31
-
32
- ☑️ Accept/Reject → 📄 TO-BE JSON → 🔌 Figma Plugin → Modernized Variables
33
 
34
- **My V1 used LLMs for everything.**
35
- Cost: $0.50–1.00/run
36
- LLMs hallucinate math
37
 
38
- **V2 flipped the approach:**
39
- ✅ Deterministic code handles certainty. LLMs handle ambiguity.
40
- ✅ ~100-300x cheaper. More accurate. Always produces output.
41
 
42
- The rule engine does 80% of the work for $0.
43
- The agents handle the 20% that requires judgment.
 
 
 
 
 
 
44
 
45
  **Real results:**
46
- 143 colors extracted (semantically categorized)
47
- 220 FG/BG pairs checked for AA compliance
48
- Benchmarked against Material 3, Polaris, Atlassian + 5 more
49
- Type scale: random → 1.25 Major Third
- Brand color: AA 3.2 → 4.5 (with my approval)
- Time: 3–5 days → ~15 minutes
52
- Cost: ~$0.003
 
 
53
 
54
  The key? **I stay in control.** AI recommends, I decide.
55
 
56
- 📄 Full workflow + architecture: [Medium link]
57
- 🚀 Try it: [HuggingFace Space link]
58
- 💻 Code: [GitHub link]
59
 
60
- This is Episode 6 of "AI in My Daily Work."
61
 
62
  What design workflows are you automating?
63
 
64
- #UXDesign #AIEngineering #DesignSystems #Figma #HuggingFace #Accessibility #WCAG #MultiAgent #DesignTokens #BuildInPublic
65
 
66
  ---
67
 
68
  ## First Comment (Post Immediately After)
69
 
70
- 🔗 **Resources:**
71
 
72
- 📄 **Medium Article:** [link]
73
- Complete architecture breakdown + Figma integration workflow
74
 
75
- 🚀 **Live Demo:** [HuggingFace Space link]
76
  Try it with any website URL
77
 
78
- 💻 **GitHub:** [link]
79
- Open source - star it if useful!
80
 
81
  ---
82
 
83
- **The 4 Named Agents:**
84
 
85
- 🎨 **AURORA**: "33 buttons + 12 CTAs using #06b2c4 = brand primary" (context LLMs understand, rules can't)

- 📊 **ATLAS**: "87% aligned to Polaris. Closing the type scale gap takes 1 hour." (trade-off reasoning)

- ✅ **SENTINEL**: "67 AA failures. Fix brand primary first: it affects 40% of interactions." (impact prioritization)

- 🧠 **NEXUS**: Synthesizes all 3 agents + rule engine → executive summary + top 3 actions
 
 
 
 
 
 
 
 
 
 
 
 
92
 
93
  ---
94
 
95
  **Previous Episodes:**
96
- Episode 5: UX Friction Analysis (7 agents + Databricks)
97
- Episode 4: UI Regression Testing
98
- Episode 3: Review Intelligence System
99
 
100
- What should I build for Episode 7? Drop ideas below 👇
101
 
102
  ---
103
 
104
- ## Alternative Version (Story-Driven)
105
 
106
  ---
107
 
108
  "Can you audit their design system and document it in Figma?"
109
 
110
- A 3-5 day task. I've done it dozens of times.
111
-
112
- DevTools → Inspect → Copy hex → Spreadsheet → Figma Variables → Repeat
113
 
114
- This time I built something different:
115
 
116
- A semi-automated workflow where:
117
- 🔢 A free rule engine checks WCAG, type scale, spacing (pure math — $0)
118
- 🎨 AURORA identifies brand colors from 143 extracted colors
119
- 📊 ATLAS benchmarks against 8 industry design systems
120
- ✅ SENTINEL prioritizes fixes by business impact
121
- 🧠 NEXUS synthesizes everything into a final action plan
122
- 🔌 A Figma plugin imports the JSON directly as variables
123
 
124
- The difference? **I stay in control.**
125
 
126
- AI doesn't auto-apply changes. It recommends:
127
- "Brand primary #06b2c4 fails AA (3.2:1). Suggest #048391 (4.5:1)."
128
 
129
- I decide if that's right for the brand.
 
 
 
 
 
 
130
 
131
- 15 minutes. $0.003. Full design system documented and in Figma.
132
 
133
- 📄 How I built it: [Medium link]
134
- 🚀 Demo: [HuggingFace link]
135
 
136
  Episode 6 of "AI in My Daily Work"
137
 
138
- #DesignSystems #AIAgents #UXDesign #Figma #Automation #HuggingFace #WCAG
139
 
140
  ---
141
 
142
  ## Image Suggestions
143
 
144
- 1. **Hero:** Architecture diagram (Layer 1 deterministic + Layer 2 four named agents)
145
- 2. **Before/After:** AS-IS specimen vs TO-BE specimen in Figma
146
- 3. **Agent Output:** Screenshot of NEXUS synthesis with scores
147
- 4. **Figma Specimen:** Typography + Semantic Colors display
 
148
 
149
  ---
150
 
@@ -154,21 +171,22 @@ Primary (always include):
154
  #UXDesign #AIEngineering #DesignSystems #Figma #HuggingFace
155
 
156
  Secondary (mix based on audience):
157
- #DesignTokens #MultiAgent #Accessibility #WCAG #BuildInPublic #Automation #LLM
158
 
159
  ---
160
 
161
  ## Posting Strategy
162
 
163
- **Best time:** Tuesday-Thursday, 8-10 AM your timezone
164
 
165
- **Key messages:**
166
- 1. Free rule engine does 80% of the work (cost optimization story)
167
- 2. 4 named agents with specific roles (not generic "LLM 1, LLM 2")
168
- 3. Semi-automation with human control (not full automation)
169
- 4. The Figma integration + specimen view sets this apart
 
170
 
171
  **Differentiation from Episode 5:**
172
  - Episode 5 = UX friction analysis (GA4 + Clarity + Databricks)
173
- - Episode 6 = Design system extraction (Playwright + Figma + HuggingFace)
174
  - Same philosophy: deterministic code for certainty, LLMs for ambiguity
 
1
+ # LinkedIn Post - Episode 6: Design System Extractor v3.2
2
 
3
  ## Main Post (Copy-Paste Ready)
4
 
 
6
 
7
  Every designer has done this: Open DevTools. Inspect element. Copy hex code. Paste to spreadsheet. Recreate in Figma. Repeat 200 times.
8
 
9
+ I spent 3-5 days manually extracting design tokens from websites. Then more time recreating them in Figma as variables.
10
 
11
+ So I built a 3-layer system: deterministic extraction + rule-based color classifier + 4 AI agents.
12
 
13
+ **The Architecture (v3.2):**
14
 
15
+ Layer 1 (FREE, <1 second):
16
+ - 7-source extraction (Playwright + Firecrawl)
17
+ - Normalizer: radius, shadows, colors all cleaned and named
18
+ - Color Classifier (815 lines, deterministic): CSS evidence -> category -> token name
19
 
20
+ Layer 2 (FREE, <1 second):
21
+ - Rule Engine: WCAG contrast (actual FG/BG pairs), type scale detection, spacing grid
22
+ - 113 tests passing, 100% reproducible
 
 
23
 
24
+ Layer 3 (~$0.003, 4 specialized agents):
25
+ - AURORA: brand color advisor (Qwen 72B) — advisory only, can't override classifier
26
+ - ATLAS: benchmarks against 8 industry design systems (Llama 70B)
27
+ - SENTINEL: prioritizes fixes by business impact (Qwen 72B)
28
+ - NEXUS: synthesizes everything, resolves contradictions (Llama 70B)
29
 
30
+ **The Pipeline:**
 
 
 
 
31
 
32
+ Website URL -> 7-Source Extraction -> Color Classifier -> Rule Engine -> DTCG JSON
33
+ -> Figma Plugin -> Variables + Styles + Auto-Generated Visual Spec Page
34
+ -> AI Analysis -> Accept/Reject -> TO-BE JSON -> Compare in Figma
35
 
36
+ **My biggest lesson building V1 -> V2 -> V3:**
 
 
37
 
38
+ V1: LLMs for everything. $0.50/run. Hallucinated contrast ratios.
39
+ V2: Rules + LLM split. $0.003/run. But 3 naming systems fighting in exports.
40
+ V3: Rules + Classifier + Advisory LLM. $0.003/run. ONE naming authority. Clean output.
41
+
42
+ The fix wasn't better AI. It was a clear authority chain:
43
+ 1. Color Classifier (PRIMARY) - deterministic, covers ALL colors
44
+ 2. AURORA LLM (SECONDARY) - can only suggest semantic roles
45
+ 3. Normalizer (FALLBACK) - hue + numeric shade
46
 
47
  **Real results:**
48
+ - 143 colors extracted, classified, and named (deterministically)
49
+ - 220 FG/BG pairs checked for AA compliance
50
+ - Radius: raw CSS garbage -> none/sm/md/lg/xl/full (normalized)
51
+ - Shadows: unsorted -> xs/sm/md/lg/xl (5 progressive levels)
52
+ - Benchmarked against Material 3, Polaris, Atlassian + 5 more
53
+ - Output: W3C DTCG v1 compliant JSON with $extensions metadata
54
+ - Figma: auto-generated visual spec with AA badges
55
+ - Time: 3-5 days -> ~15 minutes
56
+ - Cost: ~$0.003
57
 
58
  The key? **I stay in control.** AI recommends, I decide.
59
 
60
+ Full workflow + architecture: [Medium link]
61
+ Try it: [HuggingFace Space link]
62
+ Code: [GitHub link]
63
 
64
+ Episode 6 of "AI in My Daily Work"
65
 
66
  What design workflows are you automating?
67
 
68
+ #UXDesign #AIEngineering #DesignSystems #Figma #HuggingFace #Accessibility #WCAG #DesignTokens #W3CDTCG #BuildInPublic
69
 
70
  ---
71
 
72
  ## First Comment (Post Immediately After)
73
 
74
+ **Resources:**
75
 
76
+ Medium Article: [link]
77
+ Complete architecture breakdown + V1 -> V2 -> V3 evolution + Figma integration
78
 
79
+ Live Demo: [HuggingFace Space link]
80
  Try it with any website URL
81
 
82
+ GitHub: [link]
83
+ Open source - star it if useful!
84
 
85
  ---
86
 
87
+ **The Naming Authority Problem (V3's key insight):**
88
 
89
+ V2 had THREE competing systems naming colors:
90
+ - Normalizer: "color.blue.light" (word-based)
91
+ - Export layer: "color.blue.500" (numeric)
92
+ - AURORA LLM: "brand.primary" (whatever it wanted)
93
 
94
+ Result in Figma: blue.300, blue.dark, blue.light, blue.base ALL in the same export.
95
 
96
+ V3 fix: ONE authority. Color Classifier (deterministic) is PRIMARY. AURORA is advisory only: it can suggest "this blue should be brand.primary" but can't rename palette colors.
97
 
98
+ `filter_aurora_naming_map()` enforces the boundary. Clean Figma output, every time.
99
+
100
+ ---
101
+
102
+ **What's Next — Episode 7: Automated Component Generation**
103
+
104
+ Researched 30+ tools. Found a genuine market gap:
105
+ No production tool takes DTCG JSON and outputs Figma components with variants.
106
+
107
+ Building it. Button (60 variants), TextInput, Card, Toast, Checkbox/Radio.
108
+ Figma Plugin API supports everything: createComponent(), combineAsVariants(), setBoundVariable().
109
+
110
+ Same tokens in = same components out. Fully deterministic.
111
 
112
  ---
113
 
114
  **Previous Episodes:**
115
+ - Episode 5: UX Friction Analysis (7 agents + Databricks)
116
+ - Episode 4: UI Regression Testing
117
+ - Episode 3: Review Intelligence System
118
 
119
+ What should I build for Episode 7? Drop ideas below
120
 
121
  ---
122
 
123
+ ## Alternative Version (Shorter, Story-Driven)
124
 
125
  ---
126
 
127
  "Can you audit their design system and document it in Figma?"
128
 
129
+ 3-5 days of DevTools, spreadsheets, and manual Figma work.
 
 
130
 
131
+ I built something different. Three versions, actually.
132
 
133
+ V1: Used LLMs for everything. $0.50/run. They hallucinate math.
 
 
 
 
 
 
134
 
135
+ V2: Split into rules + AI. $0.003/run. But three systems fought over color names. Figma output was chaos.
136
 
137
+ V3: Clear authority chain. One color classifier (deterministic, 815 lines). LLMs are advisory only. W3C DTCG-compliant JSON. Auto-generated visual spec in Figma.
 
138
 
139
+ What it does now:
140
+ - 7-source extraction from any website
141
+ - Rule-based color classification (brand/text/bg/border/feedback)
142
+ - WCAG AA check on 220 actual FG/BG pairs
143
+ - 4 AI agents for brand analysis, benchmarking, audit, synthesis
144
+ - W3C standard JSON output
145
+ - Figma plugin: variables + styles + visual spec page
146
 
147
+ 15 minutes. $0.003. I stay in control.
148
 
149
+ Full architecture: [Medium link]
150
+ Demo: [HuggingFace link]
151
 
152
  Episode 6 of "AI in My Daily Work"
153
 
154
+ #DesignSystems #AIAgents #UXDesign #Figma #Automation #HuggingFace #WCAG #W3CDTCG
155
 
156
  ---
157
 
158
  ## Image Suggestions
159
 
160
+ 1. **Hero:** V1 vs V2 vs V3 comparison table showing the evolution
161
+ 2. **Architecture:** 3-layer diagram (Extraction -> Classification+Rules -> 4 Agents)
162
+ 3. **Naming Authority:** Before/after showing Figma chaos vs clean output
163
+ 4. **Figma Visual Spec:** Screenshot of auto-generated spec page
164
+ 5. **Agent Output:** NEXUS synthesis with scores + top 3 actions
165
 
166
  ---
167
 
 
171
  #UXDesign #AIEngineering #DesignSystems #Figma #HuggingFace
172
 
173
  Secondary (mix based on audience):
174
+ #DesignTokens #W3CDTCG #Accessibility #WCAG #BuildInPublic #Automation #MultiAgent
175
 
176
  ---
177
 
178
  ## Posting Strategy
179
 
180
+ **Best time:** Tuesday-Thursday, 8-10 AM your timezone
181
 
182
+ **Key messages for V3:**
183
+ 1. V1 -> V2 -> V3 evolution story (naming authority problem)
184
+ 2. Color Classifier (815 lines, deterministic) as key innovation
185
+ 3. W3C DTCG v1 compliance: standards over proprietary formats
186
+ 4. Figma visual spec auto-generation
187
+ 5. Component generation gap (Episode 7 teaser)
188
 
189
  **Differentiation from Episode 5:**
190
  - Episode 5 = UX friction analysis (GA4 + Clarity + Databricks)
191
+ - Episode 6 = Design system extraction (Playwright + Classifier + Figma + HuggingFace)
192
  - Same philosophy: deterministic code for certainty, LLMs for ambiguity
docs/MEDIUM_ARTICLE_EPISODE_6.md CHANGED
@@ -1,10 +1,10 @@
1
- # 🚅 AI in My Daily Work — Episode 6: Reverse-Engineering Design Systems with 4 AI Agents + a Free Rule Engine
2
 
3
- ## A Semi-Automated Workflow: From Website URL to Figma-Ready Design System
4
 
5
- *How I built a system that extracts any website's design tokens and audits them like a senior design team — for ~$0.003 per run.*
6
 
7
- [IMAGE: Hero - Complete workflow showing Website → AI Agents → Figma]
8
 
9
  ---
10
 
@@ -22,23 +22,25 @@ Whether it's analyzing a competitor, inheriting a legacy project, or bringing co
22
  6. Repeat for spacing, shadows, border radius...
23
  7. Spend days organizing into a coherent system
24
  8. Manually recreate in Figma as variables
 
25
 
26
- I've done this dozens of times. It takes **3-5 days** for a single website. And by the time you're done, something has already changed.
27
 
28
  I wanted a system that could think like a design team:
29
 
30
- - a **data engineer** validating extraction quality
 
31
  - an **analyst** identifying brand colors and patterns
32
  - a **senior reviewer** benchmarking against industry standards
33
  - and a **chief architect** synthesizing everything into action
34
 
35
- So I built one.
36
 
37
  ---
38
 
39
  ## The Solution (In One Sentence)
40
 
41
- I built a 4-agent system backed by a free rule engine that acts like an entire design audit team: data extraction + WCAG compliance + benchmark comparison + brand analysis + prioritized recommendations. It runs on HuggingFace Spaces, costs ~$0.003 per analysis, and feeds directly into Figma via a custom plugin.
42
 
43
  ---
44
 
@@ -49,292 +51,373 @@ I built a 4-agent system backed by a free rule engine that acts like an entire d
49
  Here's the end-to-end process I now use:
50
 
51
  ```
52
- ┌──────────────────────────────────────────────────────────────┐
53
- MY DESIGN SYSTEM WORKFLOW
54
- ├──────────────────────────────────────────────────────────────┤
55
-
56
- STEP 1: Extract AS-IS (AI Agent App)
57
- ────────────────────────────────────── │
58
- Enter website URL
59
- AI auto-discovers pages
60
- Extracts colors, typography, spacing, shadows, radius
61
- Rule Engine checks WCAG + type scale + spacing grid │
62
- Download AS-IS JSON file │
63
- │ │
64
- │ ↓ │
65
-
66
- │ STEP 2: Import to Figma (My Plugin) │
67
- │ ──────────────────────────────────── │
68
- │ • Open Figma │
69
- Upload AS-IS JSON via custom plugin │
70
- • Plugin creates Variables automatically
71
- │ │
72
- │ ↓ │
73
- │ │
74
- STEP 3: View AS-IS Specimen (Figma) │
75
- ──────────────────────────────────── │
76
- │ • Visual display of current design system │
77
- │ • Typography (Desktop + Mobile), Colors, Spacing, etc. │
78
- │ • Review what exists before modernizing │
79
-
80
- │ ↓ │
81
- │ │
82
- STEP 4: AI Analysis (AI Agent App - Stage 2) │
83
- ───────────────────────────────────────────── │
84
- Free Rule Engine: WCAG, type scale, spacing grid │
85
- AURORA: Brand color identification │
86
- │ • ATLAS: Industry benchmark comparison (8 systems) │
87
- │ • SENTINEL: Best practices audit with priorities │
88
- │ • NEXUS: Final synthesis resolving all contradictions │
89
-
90
- │ ↓ │
91
- │ │
92
- STEP 5: Accept/Reject Suggestions (AI Agent App) │
93
- ───────────────────────────────────────────────── │
94
- Review each recommendation │
95
- Accept ☑️ or Reject individually │
96
- I stay in control of what changes │
97
-
98
-
99
- │ │
100
- │ STEP 6: Export TO-BE (AI Agent App - Stage 3) │
101
- ───────────────────────────────────────────── │
102
- • Generate modernized TO-BE JSON │
103
- Contains accepted improvements │
104
- Download new JSON file │
105
- │ │
106
- │ ↓ │
107
- │ │
108
- │ STEP 7: Import TO-BE to Figma (My Plugin) │
109
- │ ────────────────────────────────────────── │
110
- Upload TO-BE JSON via same plugin │
111
- • Figma Variables update with new values
112
- │ │
113
- │ ↓ │
114
- │ │
115
- │ STEP 8: View TO-BE Specimen (Figma) │
116
- │ ──────────────────────────────────── │
117
- │ • Visual display of modernized design system │
118
- │ • Compare AS-IS vs TO-BE │
119
- Ready to use in production │
120
- │ │
121
- └──────────────────────────────────────────────────────────────┘
122
  ```
123
 
124
- **Total time:** ~15 minutes (vs 35 days manual)
125
 
126
  ---
127
 
128
- ## Architecture Overview: Two Layers, Four Agents
129
 
130
  My first attempt (V1) made a classic mistake:
131
  **I used a large language model for everything.**
132
 
133
- ### Why Two Layers?
134
 
135
- My V1 mistake: Used LLMs for everything
136
- ❌ Cost: $0.50–1.00 per run
137
- ❌ Speed: 15+ seconds for basic math
138
- ❌ Accuracy: LLMs hallucinate contrast ratios
139
 
140
- The fix: **Not every task needs AI. Some need good engineering.**
141
 
142
- V2 flipped the approach.
143
 
144
- > **Deterministic code handles certainty. LLMs handle ambiguity.**
145
 
146
- This led to a two-layer architecture.
147
 
148
- [IMAGE: Architecture diagram: Layer 1 (Deterministic) → Layer 2 (4 Named Agents)]
149
 
150
  ```
151
- ┌─────────────────────────────────────────────────┐
152
- │ LAYER 1: DETERMINISTIC (Free $0.00) │
153
- │ ├─ Crawler + 7-Source Extractor + Normalizer │
154
- │ ├─ Semantic Color Analyzer (rule-based) │
155
- │ ├─ WCAG Contrast Checker (math)
156
- │ ├─ Type Scale Detection (ratio math) │
157
- │ ├─ Spacing Grid Analysis (GCD math) │
158
- │ └─ Color Statistics (deduplication) │
159
- ├─────────────────────────────────────────────────┤
160
- │ LAYER 2: 4 AI AGENTS (~$0.003) │
161
- │ ├─ AURORA Brand Color Analyst (Qwen 72B)
162
- │ ├─ ATLAS — Benchmark Advisor (Llama 70B) │
163
- │ ├─ SENTINEL — Best Practices Auditor (Qwen 72B)│
164
- │ └─ NEXUS — Head Synthesizer (Llama 70B) │
165
- └─────────────────────────────────────────────────┘
166
  ```
167
 
168
- ---
169
 
170
- ## Layer 1: Deterministic Intelligence (No LLM)
171
 
172
- These agents do the heavy lifting, with no LLMs involved.
173
 
174
- ### Stage 1: Extraction
175
 
176
- A Playwright-powered browser visits each page at **two viewports** (1440px desktop + 375px mobile) and extracts every design token from **7 sources**:
177
 
178
- [IMAGE: 7 Extraction Sources diagram]
179
 
180
  ```
181
- Source 1: Computed Styles → What the browser actually renders
182
- Source 2: CSS Variables → --primary-color, --spacing-md
183
- Source 3: Inline Styles → style="color: #06b2c4"
184
- Source 4: SVG Attributes → fill, stroke colors
185
- Source 5: Stylesheets → External .css files
186
- Source 6: Style Blocks → <style> tags
187
- Source 7: Firecrawl → Deep CSS parsing (bypasses CORS)
 
 
 
 
188
  ```
189
 
190
- A **Normalizer** then deduplicates (exact match + Delta-E color distance), infers semantic roles from frequency, and assigns suggested names like `brand.primary`, `text.secondary`.
 
 
191
 
192
- A **Semantic Analyzer** categorizes every color by *actual CSS usage*:
193
 
194
- | Role | Detection Method |
195
- |------|------------------|
196
- | Brand | Saturated colors on buttons, CTAs, links |
197
- | Text | Low saturation with `color` property |
198
- | Background | Used with `background-color` on containers |
199
- | Border | Used with `border-color` properties |
200
- | Feedback | Red=error, Green=success, Yellow=warning |
 
 
 
 
201
 
202
  **Cost: $0.00 | Runtime: ~90 seconds**
203
 
204
- The user reviews these tokens before anything touches an LLM.
 
 
 
 
 
 
 
 
 
 
 
 
205
 
206
- ### The Rule Engine (The Single Biggest Optimization)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
207
 
208
- After extraction, a rule engine runs every check that can be done with pure math:
 
 
 
 
209
 
210
  ```
211
- 📐 TYPE SCALE ANALYSIS
212
- ├─ Detected Ratio: 1.167
213
- ├─ Closest Standard: Minor Third (1.2)
214
- ├─ Consistent: ⚠️ No (variance: 0.24)
215
- └─ 💡 Recommendation: 1.25 (Major Third)
216
-
217
- ACCESSIBILITY CHECK (WCAG AA/AAA)
218
- ├─ Colors Analyzed: 210
219
- ├─ FG/BG Pairs Checked: 220
220
- ├─ AA Pass: 143
221
- ├─ AA Fail (real FG/BG pairs): 67
222
- ├─ fg:#06b2c4 on bg:#ffffff 💡 Fix: #048391 (4.5:1)
223
- ├─ fg:#999999 on bg:#ffffff 💡 Fix: #757575 (4.6:1)
224
- └─ ... and 62 more
225
-
226
- 📏 SPACING GRID
227
- ├─ Detected Base: 1px (GCD)
228
- ├─ Grid Aligned: ⚠️ 0%
229
- └─ 💡 Recommendation: 8px grid
230
-
231
- 📊 CONSISTENCY SCORE: 52/100
232
  ```
233
 
234
- Not just "color vs white" — it tests **actual foreground/background pairs** found on the page. And algorithmically generates compliant alternatives.
235
 
236
- This entire layer runs **in under 1 second** and costs nothing beyond compute — the single biggest cost optimization in the system.
237
 
238
  ---
239
 
240
- ## Layer 2: AI Analysis & Interpretation (4 Named Agents)
241
 
242
- This is where language models actually add value — tasks that require **context, reasoning, and judgment**.
243
 
244
- [IMAGE: Agent pipeline diagram: AURORA → ATLAS → SENTINEL → NEXUS]
245
 
246
  ---
247
 
248
- ### Agent 1: AURORA — Brand Color Analyst
249
  **Model:** Qwen 72B (HuggingFace PRO)
250
- **Cost:** Free within PRO subscription ($9/month)
251
- **Temperature:** 0.4
252
 
253
- **The Challenge:** The rule engine found 143 colors. Which one is the *brand* primary?
254
 
255
- A rule engine can count that `#06b2c4` appears in 33 buttons. But it can't reason: "33 buttons + 12 CTAs + dominant accent positioning = this is almost certainly the brand primary." That requires **context understanding**.
 
 
 
 
256
 
257
- **Sample Output:**
258
 
259
  ```
260
  AURORA's Analysis:
261
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
262
- 🎨 Brand Primary: #06b2c4 (confidence: HIGH)
263
- └─ 33 buttons, 12 CTAs, dominant accent
 
264
 
265
- 🎨 Brand Secondary: #c1df1f (confidence: MEDIUM)
266
- └─ 15 accent elements, secondary CTA
267
 
268
  Palette Strategy: Complementary
269
  Cohesion Score: 7/10
270
- └─ "Clear hierarchy, accent colors differentiated"
271
-
272
- Self-Evaluation: confidence=8/10, data=good
273
  ```
274
 
275
  ---
276
 
277
  ### Agent 2: ATLAS — Benchmark Advisor
278
  **Model:** Llama 3.3 70B (128K context)
279
- **Cost:** Free within PRO subscription
280
- **Temperature:** 0.25
281
 
282
  **Unique Capability:** Industry benchmarking against **8 design systems** (Material 3, Polaris, Atlassian, Carbon, Apple HIG, Tailwind, Ant, Chakra).
283
 
284
  [IMAGE: Benchmark comparison table from the UI]
285
 
286
- This agent doesn't just pick the closest match — it reasons about **effort vs. value**:
287
 
288
  ```
289
  ATLAS's Recommendation:
290
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
291
- 🥇 Shopify Polaris: 87% match
292
 
293
  Alignment Changes:
294
- ├─ Type scale: 1.17 → 1.25 (effort: medium)
295
- ├─ Spacing grid: mixed → 4px (effort: high)
296
- └─ Base size: 16px → 16px (already aligned)
297
 
298
  Pros: Closest match, e-commerce proven, well-documented
299
  Cons: Spacing migration is significant effort
300
 
301
- 🥈 Alternative: Material 3 (77% match)
302
- └─ "Stronger mobile patterns, but 8px grid
303
  requires more restructuring"
304
  ```
305
 
306
- ATLAS's Value Add:
307
 
308
- > "You're 87% aligned to Polaris already. Closing the gap on type scale takes ~1 hour and makes your system industry-standard. **Priority: MEDIUM.**"
309
 
310
  ---
311
 
312
  ### Agent 3: SENTINEL — Best Practices Auditor
313
  **Model:** Qwen 72B
314
- **Cost:** Free within PRO subscription
315
- **Temperature:** 0.2 (strict, consistent)
316
-
317
- **The Challenge:** The rule engine says "67 AA failures." But which ones matter most?
318
 
319
  SENTINEL prioritizes by **business impact** — not just severity:
320
 
321
  ```
322
  SENTINEL's Audit:
323
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
324
  Overall Score: 68/100
325
 
326
  Checks:
327
- ├─ Type Scale Standard (1.25 ratio)
328
- ├─ ⚠️ Type Scale Consistency (variance 0.18)
329
- ├─ Base Size Accessible (16px)
330
- ├─ AA Compliance (67 failures)
331
- ├─ ⚠️ Spacing Grid (0% aligned)
332
- └─ Near-Duplicates (351 pairs)
333
 
334
  Priority Fixes:
335
  #1 Fix brand color AA compliance
336
  Impact: HIGH | Effort: 5 min
337
- "Affects 40% of interactive elements"
338
 
339
  #2 Consolidate near-duplicate colors
340
  Impact: MEDIUM | Effort: 2 hours
@@ -343,95 +426,135 @@ Priority Fixes:
343
  Impact: MEDIUM | Effort: 1 hour
344
  ```
345
 
 
 
346
  ---
347
 
348
- ### Agent 4: NEXUS — Head Synthesizer (Final Output)
349
  **Model:** Llama 3.3 70B (128K context)
350
- **Cost:** ~$0.001
351
- **Temperature:** 0.3
352
 
353
- NEXUS is the senior architect. It takes outputs from **all three agents + the rule engine** and synthesizes a final recommendation **resolving contradictions**, weighting scores, and producing the executive summary the user sees.
 
 
 
354
 
355
- If ATLAS says "close to Polaris" but SENTINEL says "spacing misaligned," NEXUS reconciles: *"Align to Polaris type scale now (low effort) but defer spacing migration (high effort)."*
356
 
357
  ```
358
  NEXUS Final Synthesis:
359
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
360
- 📝 Executive Summary:
361
  "Your design system scores 68/100. Critical:
362
  67 color pairs fail AA. Top action: fix brand
363
  primary contrast (5 min, high impact)."
364
 
365
- 📊 Scores:
366
- ├─ Overall: 68/100
367
- ├─ Accessibility: 45/100
368
- ├─ Consistency: 75/100
369
- └─ Organization: 70/100
370
 
371
- 🎯 Top 3 Actions:
372
- 1. Fix brand color AA (#06b2c4 → #048391)
373
  Impact: HIGH | Effort: 5 min
374
  2. Align type scale to 1.25
375
  Impact: MEDIUM | Effort: 1 hour
376
- 3. Consolidate 143 → ~20 semantic colors
377
  Impact: MEDIUM | Effort: 2 hours
378
 
379
- 🎨 Color Recommendations:
380
- ├─ brand.primary: #06b2c4 → #048391 (auto-accept)
381
- ├─ text.secondary: #999999 → #757575 (auto-accept)
382
- └─ brand.accent: #FF6B35 → #E65100 (user decides)
383
  ```
384
 
385
  ---
386
 
387
- ## The Figma Bridge: JSON → Variables → Specimen
388
 
389
  [IMAGE: Figma plugin UI showing import options]
390
 
391
- I built a custom Figma plugin that closes the loop:
392
 
393
- 1. **Imports JSON** → Creates Figma Variables
394
- 2. **Maps token types:**
395
- - Colors → Color Variables
396
- - Typography → Text Styles
397
- - Spacing → Number Variables
398
- - Radius → Number Variables
399
- - Shadows → Effect Styles
400
- 3. **Generates a Specimen Page** — visual display of the entire system
401
 
402
- The plugin handles both AS-IS and TO-BE imports identically — just different JSON files.
403
 
404
- ### Viewing the Specimen
 
 
 
405
 
406
- [IMAGE: Figma specimen page showing all tokens visually]
407
 
408
  ```
409
- ┌─────────────────────────────────────────────────────────────┐
410
- 🎨 BRAND 📝 TEXT 🖼️ BACKGROUND 🚨 FEEDBACK
411
- ├─────────────────────────────────────────────────────────────┤
412
- ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │
413
- Prim Sec Prim Sec Prim Sec Err │ │
414
- └────┘ └────┘ └────┘ └────┘ └────┘ └────┘ └────┘ │
415
- #06b2c4 #c1df1f #373737 #666666 #fff #f5f5f5 #dc2626
416
- AA:⚠️ AA:⚠️ AA:AA:AA:✓ │
417
- └─────────────────────────────────────────────────────────────┘
418
  ```
419
 
 
 
420
  ---
421
 
422
  ## Comparing AS-IS vs TO-BE
423
 
424
  [IMAGE: Side-by-side comparison of AS-IS and TO-BE specimens]
425
 
426
- The real power is seeing the transformation:
427
-
428
  | Token | AS-IS | TO-BE | Change |
429
  |-------|-------|-------|--------|
430
- | Type Scale | ~1.18 (random) | 1.25 (Major Third) | Consistent |
431
- | brand.primary | #06b2c4 | #048391 | AA: 3.2 → 4.5 |
432
- | Spacing Grid | Mixed | 8px base | Standardized |
433
- | Color Ramps | None | 50-950 | Generated |
434
- | Unique Colors | 143 | ~20 semantic | Consolidated |
 
 
435
 
436
  ---
437
 
@@ -439,13 +562,16 @@ The real power is seeing the transformation:
439
 
440
  | Metric | Manual Process | My Workflow |
441
  |--------|---------------|-------------|
442
- | Time | 35 days | ~15 minutes |
443
  | Cost | Designer salary | ~$0.003 |
444
- | Coverage | ~50 colors | 143 colors (7 sources) |
445
  | Accuracy | Human error | Computed styles (exact) |
446
  | Accessibility | Manual spot checks | Full AA/AAA (all 220 pairs) |
447
  | Benchmarking | Subjective | 8 industry systems compared |
448
- | Figma Ready | Hours more | Instant (JSON plugin) |
 
 
 
449
 
450
  ---
451
 
@@ -457,8 +583,10 @@ Different agents use different models — intentionally.
457
 
458
  | Agent | Model | Why This Model | Cost |
459
  |-------|-------|---------------|------|
 
 
460
  | Rule Engine | None | Math doesn't need AI | $0.00 |
461
- | AURORA | Qwen 72B | Creative color reasoning | ~Free (HF PRO) |
462
  | ATLAS | Llama 3.3 70B | 128K context for benchmarks | ~Free (HF PRO) |
463
  | SENTINEL | Qwen 72B | Strict, consistent evaluation | ~Free (HF PRO) |
464
  | NEXUS | Llama 3.3 70B | 128K context for synthesis | ~$0.001 |
@@ -466,10 +594,10 @@ Different agents use different models — intentionally.
466
 
467
  For designer-scale usage (weekly runs), inference costs are effectively negligible, with HuggingFace PRO ($9/month) covering most models.
468
 
469
- Compared to V1 (LLM-for-everything):
470
- - **~100–300x cost reduction**
471
- - **Faster execution** (rule engine: <1s vs LLM: 15s for the same math)
472
- - **Better accuracy** (LLMs hallucinate math; rule engines don't)
473
 
474
  ---
475
 
@@ -479,11 +607,12 @@ The system **always produces output**, even when components fail:
479
 
480
  | If This Fails... | What Happens |
481
  |-------------------|-------------|
482
- | LLM agents down | Rule engine analysis still works (free) |
483
  | Firecrawl unavailable | DOM-only extraction (slightly fewer tokens) |
484
  | Benchmark fetch fails | Hardcoded fallback data from 8 systems |
485
  | NEXUS synthesis fails | `create_fallback_synthesis()` from rule engine |
486
- | **Entire AI layer** | **Full rule-engine-only report still useful** |
 
487
 
488
  ---
489
 
@@ -492,18 +621,21 @@ The system **always produces output**, even when components fail:
492
  [IMAGE: Tech stack diagram with logos]
493
 
494
  **AI Agent App:**
495
- - Playwright (browser automation, 7-source extraction)
496
  - Firecrawl (deep CSS parsing)
497
  - Gradio (UI framework)
498
  - Qwen/Qwen2.5-72B-Instruct (AURORA + SENTINEL)
499
  - meta-llama/Llama-3.3-70B-Instruct (ATLAS + NEXUS)
500
  - HuggingFace Spaces (hosting) + HF Inference API
501
  - Docker (containerized deployment)
 
502
 
503
  **Figma Integration:**
504
- - Custom Figma Plugin
505
- - Variables API
506
- - Tokens Studio compatible JSON
 
 
507
 
508
  ---
509
 
@@ -513,17 +645,23 @@ The system **always produces output**, even when components fail:
513
 
514
  If rules can do it faster and cheaper — use rules. My WCAG checker is 100% accurate. An LLM's contrast ratio calculation? Maybe 85% accurate, and 100x slower.
515
 
516
- The rule engine does 80% of the work for $0.
 
 
517
 
518
- ### 2. Industry Benchmarks Are Gold
519
 
520
- Without benchmarks: "Your type scale is inconsistent" → *PM nods*
521
- With benchmarks: "You're 87% aligned to Shopify Polaris. Closing the gap takes 1 hour and makes your system industry-standard." → *PM schedules meeting*
 
 
 
 
522
 
523
  Time to build benchmark database: 1 day.
524
  Value: Transforms analysis into prioritized action.
525
 
526
- ### 3. Semi-Automation > Full Automation
527
 
528
  I don't want AI to make all decisions. The workflow has human checkpoints:
529
  - Review AS-IS in Figma before modernizing
@@ -532,21 +670,17 @@ I don't want AI to make all decisions. The workflow has human checkpoints:
532
 
533
  AI as **copilot**, not autopilot.
534
 
535
- ### 4. Specialized Agents > One Big Prompt
536
 
537
  One mega-prompt doing brand analysis + benchmark comparison + accessibility audit + synthesis = confused, unfocused output. Four agents, each with a single responsibility = sharp, reliable analysis.
538
 
539
- ### 5. The JSON Bridge Works
540
 
541
- JSON is the perfect interchange format:
542
- - AI agents export JSON
543
- - Figma plugin imports JSON
544
- - No direct integration needed
545
- - Each tool does what it's best at
546
 
547
- ### 6. Semantic Context Changes Everything
548
 
549
- Raw hex values are useless. Knowing that `#06b2c4` is the **brand primary used on 33 buttons** changes how you evaluate it and how agents reason about it.
550
 
551
  ---
552
 
@@ -554,22 +688,24 @@ Raw hex values are useless. Knowing that `#06b2c4` is the **brand primary used o
554
 
555
  **On HuggingFace Spaces:** I'm using HF Spaces as the hosting platform with a Gradio frontend running in Docker. The LLM models (Qwen 72B, Llama 3.3 70B) are called via HuggingFace Inference API. Browser automation (Playwright + Chromium) runs inside the container.
556
 
557
- **On the Data:** This system works on **live websites** — point it at any URL and it extracts real design tokens from the actual DOM. No synthetic data. The architecture, LLM integrations, and rule engine are production-ready.
 
 
558
 
559
  ---
560
 
561
  ## Try It Yourself
562
 
563
  **AI Agent App:**
564
- - 🚀 Live Demo: [HuggingFace Space link]
565
- - 💻 GitHub: [Repository link]
566
 
567
  **Workflow:**
568
- 1. Enter website URL → Extract AS-IS
569
- 2. Download JSON → Import to Figma
570
- 3. Review specimen → Run AI analysis
571
- 4. Accept suggestions → Export TO-BE
572
- 5. Import to Figma → Compare specimens
573
 
574
  ---
575
 
@@ -579,21 +715,36 @@ AI engineering isn't about fancy models or complex architecture. It's about know
579
 
580
  It's **compression** — compressing days of manual audit, multiple expert perspectives, and industry benchmarking into something a team can act on Monday morning.
581
 
582
- Instead of 35 days reviewing DevTools, your team gets:
583
- > "Top 3 issues, ranked by impact, with specific fixes, benchmark alignment, and a Figma-ready specimen to compare before and after."
584
 
585
  That's AI amplifying design systems impact.
586
 
587
- 🔗 Full code on GitHub: [link]
588
 
589
  ---
590
 
591
- ## What's Next
 
 
 
 
 
 
 
 
 
 
 
 
 
 
592
 
593
  **Coming in Episode 7:**
594
- - Auto-generating Figma components from tokens
595
- - Component pattern detection (buttons, cards, forms)
596
- - Design system documentation generation
 
597
 
598
  ---
599
 
@@ -620,8 +771,8 @@ I'm Riaz, a UX Design Manager with 10+ years of experience in consumer apps. I c
620
 
621
  ---
622
 
623
- #AIAgents #DesignSystems #UXDesign #Figma #MultiAgentSystems #DesignTokens #Automation #AIEngineering #HuggingFace #WCAG
624
 
625
  ---
626
 
627
- *Published on Medium ~10 min read*
 
1
+ # AI in My Daily Work — Episode 6: Reverse-Engineering Design Systems with 4 AI Agents, a Rule-Based Color Classifier & a Free Rule Engine
2
 
3
+ ## A Semi-Automated Workflow: From Website URL to Figma-Ready Design System (v3.2)
4
 
5
+ *How I built a system that extracts any website's design tokens, classifies colors deterministically, audits them like a senior design team, and generates a visual spec in Figma — for ~$0.003 per run.*
6
 
7
+ [IMAGE: Hero - Complete workflow showing Website -> AI Agents -> Figma Visual Spec]
8
 
9
  ---
10
 
 
22
  6. Repeat for spacing, shadows, border radius...
23
  7. Spend days organizing into a coherent system
24
  8. Manually recreate in Figma as variables
25
+ 9. Manually build a visual spec page
26
 
27
+ I've done this dozens of times. It takes **3-5 days** for a single website. And by the time you're done, something has already changed.
28
 
29
  I wanted a system that could think like a design team:
30
 
31
+ - a **data engineer** extracting and normalizing every token
32
+ - a **color scientist** classifying colors by actual CSS usage (not guessing)
33
  - an **analyst** identifying brand colors and patterns
34
  - a **senior reviewer** benchmarking against industry standards
35
  - and a **chief architect** synthesizing everything into action
36
 
37
+ So I built one. Three versions later, here's what works.
38
 
39
  ---
40
 
41
  ## The Solution (In One Sentence)
42
 
43
+ I built a 3-layer system (deterministic extraction + rule-based color classification + 4 AI agents) that acts like an entire design audit team. It outputs W3C DTCG-compliant JSON that feeds directly into Figma via a custom plugin, which auto-generates a visual spec page. Cost: ~$0.003 per analysis.
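To make the output format concrete, here is a minimal sketch of how flat token names could be nested into DTCG-style groups, where each leaf carries `$type` and `$value`. The `to_dtcg` helper and the sample tokens are hypothetical, not the app's actual exporter, and the exact shape of `$value` for each type depends on which revision of the spec is targeted (hex strings are used here for brevity).

```python
import json


def to_dtcg(flat_tokens):
    """Nest dotted token names into DTCG-style groups with $type/$value leaves."""
    root = {}
    for name, (token_type, value) in flat_tokens.items():
        *groups, leaf = name.split(".")
        node = root
        for group in groups:
            node = node.setdefault(group, {})
        node[leaf] = {"$type": token_type, "$value": value}
    return root


# Hypothetical sample values, not real extraction output.
tokens = to_dtcg({
    "color.brand.primary": ("color", "#06b2c4"),
    "color.text.primary": ("color", "#373737"),
})
print(json.dumps(tokens, indent=2))
```

Keeping names flat internally and nesting only at export time means the naming logic never has to know about the JSON layout.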
44
 
45
  ---
46
 
 
51
  Here's the end-to-end process I now use:
52
 
53
  ```
54
+ +--------------------------------------------------------------+
55
+ | MY DESIGN SYSTEM WORKFLOW |
56
+ +--------------------------------------------------------------+
57
+ | |
58
+ | STEP 1: Extract AS-IS (AI Agent App) |
59
+ | ---------------------------------------- |
60
+ | * Enter website URL |
61
+ | * AI auto-discovers pages |
62
+ | * Extracts colors, typography, spacing, shadows, radius |
63
+ | * Normalizes: dedup, sort, name (radius, shadows, colors) |
64
+ | * Color Classifier: deterministic role assignment |
65
+ | * Rule Engine: WCAG + type scale + spacing grid |
66
+ | * Download AS-IS JSON (W3C DTCG v1 format) |
67
+ | |
68
+ | | |
69
+ | v |
70
+ | |
71
+ | STEP 2: Import to Figma (My Plugin) |
72
+ | ---------------------------------------- |
73
+ | * Open Figma |
74
+ | * Upload AS-IS JSON via custom plugin |
75
+ | * Plugin auto-detects DTCG format |
76
+ | * Creates Variables + Paint/Text/Effect Styles |
77
+ | * Auto-generates Visual Spec Page |
78
+ | |
79
+ | | |
80
+ | v |
81
+ | |
82
+ | STEP 3: View AS-IS Visual Spec (Figma) |
83
+ | ---------------------------------------- |
84
+ | * Typography (Desktop + Mobile) with AA badges |
85
+ | * Colors organized by semantic role |
86
+ | * Spacing scale, Radius display, Shadow elevation |
87
+ | * Review what exists before modernizing |
88
+ | |
89
+ | | |
90
+ | v |
91
+ | |
92
+ | STEP 4: AI Analysis (AI Agent App - Stage 2) |
93
+ | ---------------------------------------- |
94
+ | * Free Rule Engine: WCAG, type scale, spacing grid |
95
+ | * AURORA: Brand color identification (advisory) |
96
+ | * ATLAS: Industry benchmark comparison (8 systems) |
97
+ | * SENTINEL: Best practices audit with priorities |
98
+ | * NEXUS: Final synthesis resolving all contradictions |
99
+ | |
100
+ | | |
101
+ | v |
102
+ | |
103
+ | STEP 5: Accept/Reject Suggestions (AI Agent App) |
104
+ | ---------------------------------------- |
105
+ | * Review each recommendation |
106
+ | * Accept or Reject individually |
107
+ | * I stay in control of what changes |
108
+ | |
109
+ | | |
110
+ | v |
111
+ | |
112
+ | STEP 6: Export TO-BE (AI Agent App - Stage 3) |
113
+ | ---------------------------------------- |
114
+ | * Generate modernized TO-BE JSON (DTCG compliant) |
115
+ | * Contains accepted improvements |
116
+ | * Download new JSON file |
117
+ | |
118
+ | | |
119
+ | v |
120
+ | |
121
+ | STEP 7: Import TO-BE to Figma (My Plugin) |
122
+ | ---------------------------------------- |
123
+ | * Upload TO-BE JSON via same plugin |
124
+ | * Figma Variables update with new values |
125
+ | * New Visual Spec generated for comparison |
126
+ | |
127
+ | | |
128
+ | v |
129
+ | |
130
+ | STEP 8: Compare AS-IS vs TO-BE (Figma) |
131
+ | ---------------------------------------- |
132
+ | * Side-by-side visual spec pages |
133
+ | * See exactly what changed and why |
134
+ | * Ready to use in production |
135
+ | |
136
+ +--------------------------------------------------------------+
137
  ```
138
 
139
+ **Total time:** ~15 minutes (vs 3-5 days manual)
140
 
141
  ---
142
 
143
+ ## Architecture Overview: Three Layers, One Clear Authority Chain
144
 
145
  My first attempt (V1) made a classic mistake:
146
  **I used a large language model for everything.**
147
 
148
+ V1 cost $0.50-1.00 per run, took 15+ seconds for basic math, and LLMs hallucinated contrast ratios.
149
 
150
+ V2 split the work into rules vs AI. Better, but a new problem emerged: **three competing naming systems** for colors. The normalizer used word-based shades ("blue.light"), the export layer used numeric shades ("blue.500"), and the LLM agent used whatever it felt like ("brand.primary"). The output in Figma was chaos.
 
 
 
151
 
152
+ V3 fixed this with a clear authority chain and a dedicated color classifier:
153
 
154
+ > **Rule-based code handles certainty. LLMs handle ambiguity. And there's ONE naming authority.**
155
 
156
+ [IMAGE: Architecture diagram - Layer 1 (Extraction) -> Layer 2 (Classification + Analysis) -> Layer 3 (4 Named Agents)]
157
 
158
+ ```
159
+ +--------------------------------------------------+
160
+ | LAYER 1: EXTRACTION + NORMALIZATION (Free) |
161
+ | +- Crawler + 7-Source Extractor (Playwright) |
162
+ | +- Normalizer: colors, radius, shadows, typo |
163
+ | | +- Radius: parse, deduplicate, sort, name |
164
+ | | +- Shadows: parse, sort by blur, name |
165
+ | | +- Colors: hue + numeric shade (50-900) |
166
+ | +- Firecrawl: deep CSS parsing (bypass CORS) |
167
+ +--------------------------------------------------+
168
+ | LAYER 2: CLASSIFICATION + RULE ENGINE (Free) |
169
+ | +- Color Classifier (815 lines, deterministic) |
170
+ | | +- CSS evidence -> category -> token name |
171
+ | | +- Capped: brand(3), text(3), bg(3), etc. |
172
+ | | +- Every decision logged with evidence |
173
+ | +- WCAG Contrast Checker (actual FG/BG pairs) |
174
+ | +- Type Scale Detection (ratio math) |
175
+ | +- Spacing Grid Analysis (GCD math) |
176
+ | +- Color Statistics (deduplication) |
177
+ +--------------------------------------------------+
178
+ | LAYER 3: 4 AI AGENTS (~$0.003) |
179
+ | +- AURORA - Brand Advisor (Qwen 72B) |
180
+ | +- ATLAS - Benchmark Advisor (Llama 70B) |
181
+ | +- SENTINEL - Best Practices Audit (Qwen 72B) |
182
+ | +- NEXUS - Head Synthesizer (Llama 70B) |
183
+ +--------------------------------------------------+
184
+ ```
185
+
186
+ ### The Naming Authority Chain (V3's Key Innovation)
187
+
188
+ This was the single hardest problem to solve. In V2, three systems produced color names:
189
+
190
+ | System | Convention | Example | Problem |
191
+ |--------|-----------|---------|---------|
192
+ | Normalizer | Word shades | `color.blue.light` | Inconsistent |
193
+ | Export function | Numeric shades | `color.blue.500` | Conflicts |
194
+ | AURORA LLM | Whatever it wants | `brand.primary` | Unpredictable |
195
+
196
+ **Result in Figma: `blue.300`, `blue.dark`, `blue.light`, `blue.base` in the same export. Unusable.**
197
 
198
+ V3 established a clear chain:
199
 
200
  ```
201
+ 1. Color Classifier (PRIMARY) - deterministic, covers ALL colors
202
+ +- Rule-based: CSS evidence -> category -> token name
203
+ +- 100% reproducible, logged with evidence
204
+
205
+ 2. AURORA LLM (SECONDARY) - semantic role enhancer ONLY
206
+ +- Can promote "color.blue.500" -> "color.brand.primary"
207
+ +- CANNOT rename palette colors
208
+ +- Only brand/text/bg/border/feedback roles accepted
209
+
210
+ 3. Normalizer (FALLBACK) - preliminary hue+shade names
211
+ +- Only used if classifier hasn't run yet
 
 
 
 
212
  ```
213
 
214
+ One naming authority. No conflicts. Clean Figma output every time.
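The chain above can be sketched as a single resolution function. This is an illustration under assumptions, not the project's code: `resolve_name`, the dict-based inputs, and the rule that an AURORA suggestion is simply ignored when its role is not in the allowed set are all hypothetical.

```python
# Roles AURORA is allowed to promote a color into (from the chain above).
SEMANTIC_ROLES = {"brand", "text", "bg", "border", "feedback"}


def resolve_name(hex_value, classifier_names, aurora_suggestions, normalizer_names):
    """Pick the final token name for one color, following the authority chain."""
    # 1. Classifier is primary; 3. normalizer is only a fallback.
    base = classifier_names.get(hex_value) or normalizer_names.get(hex_value)

    # 2. AURORA may only promote to an accepted semantic role.
    suggestion = aurora_suggestions.get(hex_value)
    if suggestion:
        parts = suggestion.split(".")
        if len(parts) >= 3 and parts[0] == "color" and parts[1] in SEMANTIC_ROLES:
            return suggestion
    return base


promoted = resolve_name(
    "#06b2c4",
    {"#06b2c4": "color.blue.500"},
    {"#06b2c4": "color.brand.primary"},
    {},
)
print(promoted)
```

A suggestion such as `color.ocean.deep` falls through to the classifier's name, which is what keeps free-form LLM naming out of the export.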
215
 
216
+ ---
217
 
218
+ ## Layer 1: Extraction + Normalization (No LLM)
219
 
220
+ ### Extraction: 8 Sources
221
 
222
+ A Playwright-powered browser visits each page at **two viewports** (1440px desktop + 375px mobile) and extracts every design token from **8 sources**:
223
 
224
+ [IMAGE: 8 Extraction Sources diagram]
225
 
226
  ```
227
+ --- Playwright (7 internal sources) ---
228
+ Source 1: Computed Styles -> What the browser actually renders
229
+ Source 2: CSS Variables -> --primary-color, --spacing-md
230
+ Source 3: Inline Styles -> style="color: #06b2c4"
231
+ Source 4: SVG Attributes -> fill, stroke colors
232
+ Source 5: Stylesheets -> CSS rules, hover states, pseudo-elements
233
+ Source 6: External CSS -> Fetched & parsed CSS files
234
+ Source 7: Page Scan -> Brute-force regex on style blocks
235
+
236
+ --- Separate deep extraction ---
237
+ Source 8: Firecrawl -> Deep CSS parsing (bypasses CORS)
238
  ```
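As an illustration of the simplest source in that list, a brute-force page scan can be little more than two regular expressions. This sketch only covers hex colors inside `<style>` blocks; the patterns and the `scan_style_blocks` name are assumptions, and the real extractor runs inside Playwright rather than on a raw HTML string.

```python
import re

# Matches 3- or 6-digit hex colors; 4- and 8-digit (alpha) forms are skipped.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})\b")
STYLE_BLOCK = re.compile(r"<style[^>]*>(.*?)</style>", re.IGNORECASE | re.DOTALL)


def scan_style_blocks(html):
    """Count hex colors found in <style> blocks, normalized to lowercase #rrggbb."""
    counts = {}
    for css in STYLE_BLOCK.findall(html):
        for match in HEX_COLOR.findall(css):
            color = match.lower()
            if len(color) == 4:  # expand #abc to #aabbcc
                color = "#" + "".join(ch * 2 for ch in color[1:])
            counts[color] = counts.get(color, 0) + 1
    return counts


sample = "<style>.btn{color:#06B2C4;background:#fff}</style><style>a{color:#06b2c4}</style>"
print(scan_style_blocks(sample))
```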
239
 
240
+ ### Normalization: Not Just Dedup
241
+
242
+ The normalizer in V2 was a major pain point. Colors got named, but radius and shadows were passed through raw. Multi-value CSS like `"0px 0px 16px 16px"` became garbage tokens. Percentage values like `"50%"` couldn't be used in Figma.
243
 
244
+ V3's normalizer actually processes everything:
245
 
246
+ **Colors:** Deduplicate by exact hex + RGB distance < 30. Assign hue family + numeric shade (50-900). Never use words like "light" or "dark" for shades. Add role hints from CSS context for the classifier.
247
+
248
+ **Radius:** Parse multi-value shorthand (take max), convert rem/em/% to px, deduplicate by resolved value, sort by size, name semantically (none/sm/md/lg/xl/2xl/full). A raw extraction of `["8px", "0px 0px 16px 16px", "50%", "0.25rem"]` becomes:
249
+ ```
250
+ radius.sm = 4px (from 0.25rem context)
251
+ radius.md = 8px
252
+ radius.xl = 16px (max of 0 0 16 16)
253
+ radius.full = 9999px (from 50%)
254
+ ```
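A minimal sketch of that radius pipeline, assuming a 16px root font size, treating any percentage of 50% or more as "full", and using size thresholds picked purely for illustration. None of these constants come from the actual normalizer.

```python
import re

ROOT_FONT_PX = 16.0  # assumed root font size for rem/em
FULL_PX = 9999.0     # sentinel for fully rounded corners


def to_px(token):
    """Convert one CSS length to px, or return None if it cannot be resolved."""
    m = re.fullmatch(r"(-?\d*\.?\d+)(px|rem|em|%)?", token.strip())
    if not m:
        return None
    number, unit = float(m.group(1)), m.group(2) or "px"
    if unit in ("rem", "em"):
        return number * ROOT_FONT_PX
    if unit == "%":
        return FULL_PX if number >= 50 else None  # assumption: small % values are skipped
    return number


def name_for(px):
    """Map a px value to a semantic size name (thresholds are illustrative)."""
    if px >= FULL_PX:
        return "full"
    for limit, name in ((0, "none"), (4, "sm"), (8, "md"), (12, "lg"), (16, "xl")):
        if px <= limit:
            return name
    return "2xl"


def normalize_radii(raw_values):
    resolved = set()  # the set deduplicates by resolved value
    for raw in raw_values:
        parts = [p for p in (to_px(t) for t in raw.split()) if p is not None]
        if parts:
            resolved.add(max(parts))  # multi-value shorthand: take the max
    tokens = {}
    for px in sorted(resolved):
        tokens.setdefault(f"radius.{name_for(px)}", f"{px:g}px")  # smallest value wins a name
    return tokens


print(normalize_radii(["8px", "0px 0px 16px 16px", "50%", "0.25rem"]))
```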
255
+
256
+ **Shadows:** Parse CSS shadow strings into components (offset, blur, spread, color). Filter out spread-only (border simulation) and inset shadows. Sort by blur radius. Deduplicate by blur bucket. Name by elevation (xs/sm/md/lg/xl). If fewer than 5 shadows extracted, interpolate to always produce 5 elevation levels.
257
 
258
  **Cost: $0.00 | Runtime: ~90 seconds**
259
 
260
+ ---
261
+
262
+ ## Layer 2: Color Classification + Rule Engine (No LLM)
263
+
264
+ ### The Color Classifier (V3's Biggest Addition)
265
+
266
+ This is 815 lines of deterministic code that replaced what AURORA used to do badly.
267
+
268
+ **The problem it solves:** Given 30+ extracted colors, which is the brand primary? Which are text colors? Which are backgrounds?
269
+
270
+ An LLM can reason about this, but inconsistently. The same color might be called "brand.primary" in one run and "accent.main" in the next. And it only named 10 colors, leaving the rest in chaos.
271
+
272
+ The classifier uses CSS evidence:
273
 
274
+ ```
275
+ CSS Evidence -> Category:
276
+ background-color on <button> + saturated + freq>5 -> BRAND
277
+ color on <p>/<span> + low saturation -> TEXT
278
+ background-color on <div>/<body> + neutral -> BG
279
+ border-color + low saturation -> BORDER
280
+ red hue + sat>0.6 + low freq -> FEEDBACK (error)
281
+ everything else -> PALETTE (by hue.shade)
282
+ ```
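Those rules translate almost directly into code. The sketch below is an assumption-heavy illustration rather than the 815-line classifier: the `Evidence` record, the saturation cutoffs (0.5 for "saturated", 0.2 for "low saturation" and "neutral"), the red hue window, and the "low frequency" limit of 5 are all invented for the example.

```python
import colorsys
from dataclasses import dataclass


@dataclass
class Evidence:
    hex: str            # "#rrggbb"
    css_property: str   # e.g. "background-color"
    element: str        # e.g. "button"
    frequency: int      # how often the color was seen


def hue_saturation(hex_value):
    """Return (hue in degrees, HLS saturation 0..1) for a #rrggbb color."""
    r, g, b = (int(hex_value[i:i + 2], 16) / 255 for i in (1, 3, 5))
    h, _lightness, s = colorsys.rgb_to_hls(r, g, b)
    return h * 360, s


def classify(ev):
    hue, sat = hue_saturation(ev.hex)
    if ev.css_property == "background-color" and ev.element == "button" and sat > 0.5 and ev.frequency > 5:
        return "brand"
    if ev.css_property == "color" and ev.element in ("p", "span") and sat < 0.2:
        return "text"
    if ev.css_property == "background-color" and ev.element in ("div", "body") and sat < 0.2:
        return "bg"
    if ev.css_property == "border-color" and sat < 0.2:
        return "border"
    if (hue < 15 or hue > 345) and sat > 0.6 and ev.frequency <= 5:
        return "feedback"
    return "palette"  # named later by hue.shade


print(classify(Evidence("#06b2c4", "background-color", "button", 33)))
```

Because the rules are checked in a fixed order, the same evidence always yields the same category.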
283
+
284
+ **Key features:**
285
+ - **Aggressive deduplication**: Colors within RGB distance < 30 AND same category get merged (13 text grays become 3)
286
+ - **Capped categories**: brand (max 3), text (max 3), bg (max 3), border (max 3), feedback (max 4), palette (rest)
287
+ - **User-selectable naming convention**: semantic, tailwind, or material
288
+ - **Every decision logged with evidence**: `[DEDUP] merged #1a1a1a with #1b1b1b (dist=1.7)`, `[CLASSIFY] #06b2c4 -> brand (background-color on <button>, freq=33)`
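The deduplication rule is easy to sketch. Euclidean distance in RGB space is an assumption here, chosen because it reproduces the logged `dist=1.7` for `#1a1a1a` vs `#1b1b1b`; letting the most frequent color act as the representative is also a guess.

```python
import math


def rgb(hex_value):
    return tuple(int(hex_value[i:i + 2], 16) for i in (1, 3, 5))


def rgb_distance(a, b):
    """Euclidean distance between two #rrggbb colors in RGB space."""
    return math.dist(rgb(a), rgb(b))


def dedupe(colors, threshold=30):
    """Merge colors closer than `threshold` within the same category.

    `colors` is a list of (hex, category, frequency); the most frequent
    color in each cluster is kept as the representative.
    """
    kept = []
    for hex_value, category, freq in sorted(colors, key=lambda c: -c[2]):
        for item in kept:
            if item["category"] == category and rgb_distance(item["hex"], hex_value) < threshold:
                item["frequency"] += freq
                break
        else:  # no close match in the same category
            kept.append({"hex": hex_value, "category": category, "frequency": freq})
    return kept


merged = dedupe([
    ("#1b1b1b", "text", 3),
    ("#1a1a1a", "text", 40),
    ("#1c1c1c", "bg", 5),
])
print(merged)
```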
289
 
290
+ **Cost: $0.00 | Reproducible: 100% | Runtime: <100ms**
291
+
292
+ ### The Rule Engine
293
+
294
+ After classification, the rule engine runs every check that can be done with pure math:
295
 
296
  ```
297
+ TYPE SCALE ANALYSIS
298
+ +- Detected Ratio: 1.167
299
+ +- Closest Standard: Minor Third (1.2)
300
+ +- Consistent: Warning (variance: 0.24)
301
+ +- Recommendation: 1.25 (Major Third)
302
+
303
+ ACCESSIBILITY CHECK (WCAG AA/AAA)
304
+ +- Colors Analyzed: 210
305
+ +- FG/BG Pairs Checked: 220
306
+ +- AA Pass: 143
307
+ +- AA Fail (real FG/BG pairs): 67
308
+ | +- fg:#06b2c4 on bg:#ffffff -> Fix: #048391 (4.5:1)
309
+ | +- fg:#999999 on bg:#ffffff -> Fix: #757575 (4.6:1)
310
+ | +- ... and 65 more
311
+
312
+ SPACING GRID
313
+ +- Detected Base: 1px (GCD)
314
+ +- Grid Aligned: Warning 0%
315
+ +- Recommendation: 8px grid
316
+
317
+ CONSISTENCY SCORE: 52/100
318
  ```
319
 
320
+ The accessibility check goes beyond "color vs. white": it tests the **actual foreground/background pairs** found on the page, then algorithmically generates AA-compliant alternatives for the pairs that fail.
321
 
322
+ This entire layer runs **in under 1 second** and costs nothing — the single biggest cost optimization in the system.
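As an example of why this layer is so cheap, the spacing check from the report reduces to a greatest common divisor and a modulo. The function names below are illustrative assumptions, not the project's API.

```python
from functools import reduce
from math import gcd


def detect_base(spacings_px: list[int]) -> int:
    """Largest unit that divides every positive spacing value (the GCD)."""
    values = [v for v in spacings_px if v > 0]
    return reduce(gcd, values) if values else 0


def grid_alignment(spacings_px: list[int], grid: int = 8) -> float:
    """Percentage of positive spacing values that are multiples of the grid."""
    values = [v for v in spacings_px if v > 0]
    if not values:
        return 0.0
    return 100 * sum(v % grid == 0 for v in values) / len(values)
```

A detected base of 1px, as in the sample report, simply means at least two spacing values share no common factor, which is a quick signal that no grid was enforced.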
323
 
324
  ---
325
 
326
+ ## Layer 3: AI Analysis & Interpretation (4 Named Agents)
327
 
328
+ This is where language models actually add value — tasks that require **context, reasoning, and judgment**. But in V3, they're advisory only. They don't control naming.
329
 
330
+ [IMAGE: Agent pipeline diagram - AURORA -> ATLAS -> SENTINEL -> NEXUS]
331
 
332
  ---
333
 
334
+ ### Agent 1: AURORA — Brand Color Advisor
335
  **Model:** Qwen 72B (HuggingFace PRO)
336
+ **Role change in V3:** Advisory only. Cannot rename colors. Can promote palette colors to semantic roles.
 
337
 
338
+ **What AURORA does now:**
339
 
340
+ The color classifier handles the naming. AURORA's job shifted to:
341
+ - Identify brand strategy (complementary? analogous? monochrome?)
342
+ - Suggest which palette colors deserve semantic roles (e.g., "color.blue.500 should be color.brand.primary")
343
+ - Assess palette cohesion (score 1-10)
344
+ - Provide reasoning that helps designers understand the brand's color story
345
 
346
+ **The key constraint:** `filter_aurora_naming_map()` strips any non-semantic names from AURORA's output. If AURORA tries to rename `color.blue.500` to `color.ocean.primary`, it's rejected. Only `brand.`, `text.`, `bg.`, `border.`, `feedback.` role assignments pass through.
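A filter of this kind can be very small. The sketch below is not the real `filter_aurora_naming_map()`; it is a hypothetical stand-in that assumes the naming map is a plain dict from current token name to proposed name, and that proposals carry a `color.` prefix as in the example above.

```python
# Role prefixes the article lists as allowed to pass through.
ALLOWED_ROLE_PREFIXES = ("brand.", "text.", "bg.", "border.", "feedback.")


def filter_naming_map_sketch(naming_map: dict[str, str]) -> dict[str, str]:
    """Keep only proposals that assign one of the allowed semantic roles."""
    accepted = {}
    for current_name, proposed_name in naming_map.items():
        role = proposed_name.removeprefix("color.")
        if role.startswith(ALLOWED_ROLE_PREFIXES):
            accepted[current_name] = proposed_name
    return accepted
```

An allow-list is the safer design here: anything the LLM invents that is not explicitly permitted is dropped, rather than trying to enumerate every bad name.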
347
 
348
  ```
349
  AURORA's Analysis:
350
+ ------------------------------------------
351
+ Brand Primary: #06b2c4 (confidence: HIGH)
352
+ +- 33 buttons, 12 CTAs, dominant accent
353
+ +- Classifier already tagged as brand
354
 
355
+ Brand Secondary: #c1df1f (confidence: MEDIUM)
356
+ +- 15 accent elements, secondary CTA
357
 
358
  Palette Strategy: Complementary
359
  Cohesion Score: 7/10
360
+ +- "Clear hierarchy, accent colors differentiated"
 
 
361
  ```
362
 
363
  ---
364
 
365
  ### Agent 2: ATLAS — Benchmark Advisor
366
  **Model:** Llama 3.3 70B (128K context)
 
 
367
 
368
  **Unique Capability:** Industry benchmarking against **8 design systems** (Material 3, Polaris, Atlassian, Carbon, Apple HIG, Tailwind, Ant, Chakra).
369
 
370
  [IMAGE: Benchmark comparison table from the UI]
371
 
372
+ This agent reasons about **effort vs. value**:
373
 
374
  ```
375
  ATLAS's Recommendation:
376
+ ------------------------------------------
377
+ 1st: Shopify Polaris (87% match)
378
 
379
  Alignment Changes:
380
+ +- Type scale: 1.17 -> 1.25 (effort: medium)
381
+ +- Spacing grid: mixed -> 4px (effort: high)
382
+ +- Base size: 16px -> 16px (already aligned)
383
 
384
  Pros: Closest match, e-commerce proven, well-documented
385
  Cons: Spacing migration is significant effort
386
 
387
+ 2nd: Material 3 (77% match)
388
+ +- "Stronger mobile patterns, but 8px grid
389
  requires more restructuring"
390
  ```
391
 
392
+ ATLAS adds the context that turns analysis into action:
393
 
394
+ > "You're 87% aligned to Polaris already. Closing the gap on type scale takes ~1 hour and makes your system industry-standard."
395
 
396
  ---
397
 
398
  ### Agent 3: SENTINEL — Best Practices Auditor
399
  **Model:** Qwen 72B
400
+ **V3 improvement:** Must cite specific data from rule engine. Cross-reference critic validates that scores match actual data.
 
 
 
401
 
402
  SENTINEL prioritizes by **business impact** — not just severity:
403
 
404
  ```
405
  SENTINEL's Audit:
406
+ ------------------------------------------
407
  Overall Score: 68/100
408
 
409
  Checks:
410
+ +- PASS: Type Scale Standard (1.25 ratio)
411
+ +- WARNING: Type Scale Consistency (variance 0.18)
412
+ +- PASS: Base Size Accessible (16px)
413
+ +- FAIL: AA Compliance (67 failures)
414
+ +- WARNING: Spacing Grid (0% aligned)
415
+ +- FAIL: Near-Duplicates (351 pairs)
416
 
417
  Priority Fixes:
418
  #1 Fix brand color AA compliance
419
  Impact: HIGH | Effort: 5 min
420
+ -> "Affects 40% of interactive elements"
421
 
422
  #2 Consolidate near-duplicate colors
423
  Impact: MEDIUM | Effort: 2 hours
 
426
  Impact: MEDIUM | Effort: 1 hour
427
  ```
428
 
429
+ **V3's grounding rule:** If the rule engine says 67 AA failures, SENTINEL's AA check **must** be "fail." A cross-reference critic catches contradictions.
430
+
431
  ---
432
 
433
+ ### Agent 4: NEXUS — Head Synthesizer
434
  **Model:** Llama 3.3 70B (128K context)
 
 
435
 
436
+ NEXUS takes outputs from **all three agents + the rule engine** and synthesizes a final recommendation using a two-perspective evaluation:
437
+
438
+ - **Perspective A (Accessibility-First):** Weights AA compliance at 40%
439
+ - **Perspective B (Balanced):** Equal weights across dimensions
440
 
441
+ It evaluates both, then picks the perspective that best reflects the actual data.
442
 
443
  ```
444
  NEXUS Final Synthesis:
445
+ ------------------------------------------
446
+ Executive Summary:
447
  "Your design system scores 68/100. Critical:
448
  67 color pairs fail AA. Top action: fix brand
449
  primary contrast (5 min, high impact)."
450
 
451
+ Scores:
452
+ +- Overall: 68/100
453
+ +- Accessibility: 45/100
454
+ +- Consistency: 75/100
455
+ +- Organization: 70/100
456
 
457
+ Top 3 Actions:
458
+ 1. Fix brand color AA (#06b2c4 -> #048391)
459
  Impact: HIGH | Effort: 5 min
460
  2. Align type scale to 1.25
461
  Impact: MEDIUM | Effort: 1 hour
462
+ 3. Consolidate 143 -> ~20 semantic colors
463
  Impact: MEDIUM | Effort: 2 hours
464
 
465
+ Color Recommendations:
466
+ +- PASS: brand.primary: #06b2c4 -> #048391 (auto-accept)
467
+ +- PASS: text.secondary: #999999 -> #757575 (auto-accept)
468
+ +- REJECT: brand.accent: #FF6B35 -> #E65100 (user decides)
469
  ```
470
 
471
  ---
472
 
473
+ ## The Figma Bridge: DTCG JSON -> Variables -> Visual Spec
474
 
475
  [IMAGE: Figma plugin UI showing import options]
476
 
477
+ ### W3C DTCG v1 Compliance
478
+
479
+ V3's export follows the W3C Design Tokens Community Group specification (stable October 2025):
480
+
481
+ ```json
482
+ {
483
+ "color": {
484
+ "brand": {
485
+ "primary": {
486
+ "$type": "color",
487
+ "$value": "#005aa3",
488
+ "$description": "[classifier] brand: primary_action",
489
+ "$extensions": {
490
+ "com.design-system-extractor": {
491
+ "frequency": 47,
492
+ "confidence": "high",
493
+ "category": "brand",
494
+ "evidence": ["background-color on <a>", "background-color on <button>"]
495
+ }
496
+ }
497
+ }
498
+ }
499
+ },
500
+ "radius": {
501
+ "md": { "$type": "dimension", "$value": "8px" }
502
+ },
503
+ "shadow": {
504
+ "sm": {
505
+ "$type": "shadow",
506
+ "$value": {
507
+ "offsetX": "0px", "offsetY": "2px",
508
+ "blur": "8px", "spread": "0px",
509
+ "color": "#00000026"
510
+ }
511
+ }
512
+ }
513
+ }
514
+ ```
515
+
516
+ Every token carries `$type` and `$value`. Color tokens add a `$description` plus `$extensions` with extraction metadata (frequency, confidence, category, evidence), namespaced under `com.design-system-extractor` so other tools can safely ignore it. This means any DTCG-compatible tool can consume our output.
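To show what consuming this format looks like, here is a minimal sketch of a reader that walks the nested groups and yields one dotted name per token. It relies only on the convention visible in the export above (a node with `$value` is a token, other keys are groups); the function name is an assumption, and group-level `$type` inheritance and alias resolution are deliberately not handled.

```python
from typing import Any, Iterator


def flatten_tokens(
    node: Any, path: tuple[str, ...] = ()
) -> Iterator[tuple[str, dict]]:
    """Yield (dotted.name, token) pairs from a DTCG-style nested dict."""
    if not isinstance(node, dict):
        return
    if "$value" in node:
        yield ".".join(path), node  # a token: stop descending
        return
    for key, child in node.items():
        if not key.startswith("$"):  # skip group metadata such as $description
            yield from flatten_tokens(child, path + (key,))


if __name__ == "__main__":
    export = {
        "color": {"brand": {"primary": {"$type": "color", "$value": "#005aa3"}}},
        "radius": {"md": {"$type": "dimension", "$value": "8px"}},
    }
    for name, token in flatten_tokens(export):
        print(name, token["$type"], token["$value"])
```

The dotted names produced this way (`color.brand.primary`, `radius.md`) are the same paths used when referring to tokens elsewhere in this article.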
517
 
518
+ ### The Custom Figma Plugin
 
 
 
 
 
 
 
519
 
520
+ The plugin closes the loop:
521
 
522
+ 1. **Auto-detects DTCG format** (vs legacy JSON)
523
+ 2. **Creates Figma Variables** — Color, Number, and String variable collections
524
+ 3. **Creates Styles** — Paint styles, Text styles, Effect styles
525
+ 4. **Generates Visual Spec Page** — Separate frames for typography, colors, spacing, radius, shadows
526
 
527
+ [IMAGE: Figma visual spec page showing all tokens]
528
 
529
  ```
530
+ +-------------------------------------------------------------+
531
+ | BRAND TEXT BACKGROUND FEEDBACK |
532
+ +-------------------------------------------------------------+
533
+ | +----+ +----+ +----+ +----+ +----+ +----+ +----+ |
534
+ | |Prim| |Sec | |Prim| |Sec | |Prim| |Sec | |Err | |
535
+ | +----+ +----+ +----+ +----+ +----+ +----+ +----+ |
536
+ | #005aa3 #c1df1f #373737 #666666 #fff #f5f5f5 #dc2626 |
537
+ | AA:Pass AA:Warn AA:Pass AA:Pass AA:Pass |
538
+ +-------------------------------------------------------------+
539
  ```
540
 
541
+ The visual spec uses horizontal auto-layout with AA compliance badges on every color swatch. Typography renders in the actual detected font family with size, weight, and line-height metadata.
542
+
543
  ---
544
 
545
  ## Comparing AS-IS vs TO-BE
546
 
547
  [IMAGE: Side-by-side comparison of AS-IS and TO-BE specimens]
548
 
 
 
549
  | Token | AS-IS | TO-BE | Change |
550
  |-------|-------|-------|--------|
551
+ | Type Scale | ~1.18 (random) | 1.25 (Major Third) | Consistent |
552
+ | brand.primary | #06b2c4 | #048391 | AA: 3.2 -> 4.5 |
553
+ | Spacing Grid | Mixed | 8px base | Standardized |
554
+ | Color Ramps | None | 50-950 | Generated |
555
+ | Unique Colors | 143 | ~20 semantic | Consolidated |
556
+ | Radius | Raw CSS garbage | none/sm/md/lg/xl/full | Normalized |
557
+ | Shadows | Unsorted, unnamed | xs/sm/md/lg/xl (5 levels) | Progressive |
558
 
559
  ---
560
 
 
562
 
563
  | Metric | Manual Process | My Workflow |
564
  |--------|---------------|-------------|
565
+ | Time | 3-5 days | ~15 minutes |
566
  | Cost | Designer salary | ~$0.003 |
567
+ | Coverage | ~50 colors | 143 colors (8 sources) |
568
  | Accuracy | Human error | Computed styles (exact) |
569
  | Accessibility | Manual spot checks | Full AA/AAA (all 220 pairs) |
570
  | Benchmarking | Subjective | 8 industry systems compared |
571
+ | Color naming | Manual | Deterministic classifier (100% reproducible) |
572
+ | Radius/shadows | Copy raw CSS | Normalized, sorted, named |
573
+ | Figma ready | Hours more | Instant (DTCG plugin + visual spec) |
574
+ | Format | Proprietary | W3C DTCG v1 standard |
575
 
576
  ---
577
 
 
583
 
584
  | Agent | Model | Why This Model | Cost |
585
  |-------|-------|---------------|------|
586
+ | Normalizer | None | Math doesn't need AI | $0.00 |
587
+ | Color Classifier | None (815 lines) | Deterministic, reproducible | $0.00 |
588
  | Rule Engine | None | Math doesn't need AI | $0.00 |
589
+ | AURORA | Qwen 72B | Creative brand reasoning | ~Free (HF PRO) |
590
  | ATLAS | Llama 3.3 70B | 128K context for benchmarks | ~Free (HF PRO) |
591
  | SENTINEL | Qwen 72B | Strict, consistent evaluation | ~Free (HF PRO) |
592
  | NEXUS | Llama 3.3 70B | 128K context for synthesis | ~$0.001 |
 
594
 
595
  For designer-scale usage (weekly runs), inference costs are effectively negligible, with HuggingFace PRO ($9/month) covering most models.
596
 
597
+ The V1-to-V3 journey:
598
+ - **V1:** LLM for everything. $0.50-1.00/run. Hallucinated contrast ratios.
599
+ - **V2:** Rules + LLM split. $0.003/run. But 3 naming systems fighting.
600
+ - **V3:** Rules + Classifier + Advisory LLM. $0.003/run. One naming authority. Clean output.
601
 
602
  ---
603
 
 
607
 
608
  | If This Fails... | What Happens |
609
  |-------------------|-------------|
610
+ | LLM agents down | Color classifier + rule engine still works (free) |
611
  | Firecrawl unavailable | DOM-only extraction (slightly fewer tokens) |
612
  | Benchmark fetch fails | Hardcoded fallback data from 8 systems |
613
  | NEXUS synthesis fails | `create_fallback_synthesis()` from rule engine |
614
+ | AURORA returns garbage | `filter_aurora_naming_map()` strips invalid names |
615
+ | **Entire AI layer** | **Full classifier + rule-engine-only report - still useful** |
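The degradation pattern in this table boils down to wrapping each LLM call and falling back to deterministic data. The sketch below is illustrative only: `synthesize_with_fallback`, the `aa_failures` key, and the shape of the returned dict are assumptions, not the behavior of the project's `create_fallback_synthesis()`.

```python
from typing import Callable


def synthesize_with_fallback(
    rule_results: dict,
    run_llm_synthesis: Callable[[dict], dict],
) -> dict:
    """Try the LLM synthesis; on any failure, build a report from rule data."""
    try:
        return {"source": "nexus", **run_llm_synthesis(rule_results)}
    except Exception as err:  # network errors, timeouts, malformed output
        return {
            "source": "rule_engine",
            "summary": f"{rule_results['aa_failures']} color pairs fail AA",
            "error": str(err),
        }
```

Recording which path produced the report (`source`) lets the UI tell the user when they are looking at a rules-only result.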
616
 
617
  ---
618
 
 
621
  [IMAGE: Tech stack diagram with logos]
622
 
623
  **AI Agent App:**
624
+ - Playwright (browser automation, 8-source extraction)
625
  - Firecrawl (deep CSS parsing)
626
  - Gradio (UI framework)
627
  - Qwen/Qwen2.5-72B-Instruct (AURORA + SENTINEL)
628
  - meta-llama/Llama-3.3-70B-Instruct (ATLAS + NEXUS)
629
  - HuggingFace Spaces (hosting) + HF Inference API
630
  - Docker (containerized deployment)
631
+ - 148 tests (82 deterministic + 27 agent evals + 35 live evals + 4 pipeline)
632
 
633
  **Figma Integration:**
634
+ - Custom Figma Plugin (v7)
635
+ - W3C DTCG v1 compliant JSON
636
+ - Variables API + Paint/Text/Effect Styles
637
+ - Auto-generated Visual Spec pages
638
+ - Tokens Studio compatible
639
 
640
  ---
641
 
 
645
 
646
  If rules can do it faster and cheaper — use rules. My WCAG checker is 100% accurate. An LLM's contrast ratio calculation? Maybe 85% accurate, and 100x slower.
647
 
648
+ The rule engine + color classifier do 90% of the work for $0.
649
+
650
+ ### 2. The Naming Authority Problem Is Real
651
 
652
+ V2's biggest failure wasn't technical — it was organizational. Three systems producing color names with no clear hierarchy. The fix wasn't better AI, it was a clear authority chain: classifier is PRIMARY, LLM is SECONDARY (advisory only), normalizer is FALLBACK.
653
 
654
+ **Lesson:** When multiple systems touch the same data, establish ONE authority. Don't merge competing outputs.
655
+
656
+ ### 3. Industry Benchmarks Are Gold
657
+
658
+ Without benchmarks: "Your type scale is inconsistent" -- *PM nods*
659
+ With benchmarks: "You're 87% aligned to Shopify Polaris. Closing the gap takes 1 hour and makes your system industry-standard." -- *PM schedules meeting*
660
 
661
  Time to build benchmark database: 1 day.
662
  Value: Transforms analysis into prioritized action.
663
 
664
+ ### 4. Semi-Automation > Full Automation
665
 
666
  I don't want AI to make all decisions. The workflow has human checkpoints:
667
  - Review AS-IS in Figma before modernizing
 
670
 
671
  AI as **copilot**, not autopilot.
672
 
673
+ ### 5. Specialized Agents > One Big Prompt
674
 
675
  One mega-prompt doing brand analysis + benchmark comparison + accessibility audit + synthesis = confused, unfocused output. Four agents, each with a single responsibility = sharp, reliable analysis.
676
 
677
+ ### 6. W3C Standards Matter
678
 
679
+ Adopting the DTCG v1 spec (October 2025) means our JSON output works with Tokens Studio, Style Dictionary v4, and any tool that follows the standard. Custom formats create lock-in. Standards create ecosystems.
 
 
 
 
680
 
681
+ ### 7. Deterministic Classification Beats LLM Classification
682
 
683
+ AURORA (LLM) named 10 colors per run, inconsistently. The color classifier names ALL colors, every time, with logged evidence. For categorization tasks where you have structured input data (CSS properties, element types, frequency), rules beat LLMs on accuracy, speed, cost, and reproducibility.
684
 
685
  ---
686
 
 
688
 
689
  **On HuggingFace Spaces:** I'm using HF Spaces as the hosting platform with a Gradio frontend running in Docker. The LLM models (Qwen 72B, Llama 3.3 70B) are called via HuggingFace Inference API. Browser automation (Playwright + Chromium) runs inside the container.
690
 
691
+ **On the Data:** This system works on **live websites** — point it at any URL and it extracts real design tokens from the actual DOM. No synthetic data. The architecture, LLM integrations, and rule engine are production-ready with 148 passing tests.
692
+
693
+ **On the Standard:** The W3C DTCG specification reached stable v1 in October 2025. Our output includes `$type`, `$value`, `$description`, and `$extensions` with namespaced metadata. Any DTCG-compatible tool can consume it.
694
 
695
  ---
696
 
697
  ## Try It Yourself
698
 
699
  **AI Agent App:**
700
+ - Live Demo: [HuggingFace Space link]
701
+ - GitHub: [Repository link]
702
 
703
  **Workflow:**
704
+ 1. Enter website URL -> Extract AS-IS
705
+ 2. Download DTCG JSON -> Import to Figma
706
+ 3. Review visual spec -> Run AI analysis
707
+ 4. Accept suggestions -> Export TO-BE
708
+ 5. Import to Figma -> Compare visual specs
709
 
710
  ---
711
 
 
715
 
716
  It's **compression** — compressing days of manual audit, multiple expert perspectives, and industry benchmarking into something a team can act on Monday morning.
717
 
718
+ Instead of 3-5 days reviewing DevTools, your team gets:
719
+ > "Top 3 issues, ranked by impact, with specific fixes, benchmark alignment, and a Figma-ready visual spec to compare before and after."
720
 
721
  That's AI amplifying design systems impact.
722
 
723
+ Full code on GitHub: [link]
724
 
725
  ---
726
 
727
+ ## What's Next: Automated Component Generation (Part 2)
728
+
729
+ The token extraction and analysis story is complete. But design systems aren't just tokens — they're **components**.
730
+
731
+ After exhaustive research into 30+ tools (Tokens Studio, Figr Identity, Figma Make, MCP bridges, story.to.design, and more), I found a genuine market gap:
732
+
733
+ **No production tool takes DTCG JSON and outputs Figma components with proper variants.**
734
+
735
+ Every tool either:
736
+ - Imports tokens as variables (but doesn't create components)
737
+ - Creates components from brand config (but can't consume YOUR tokens)
738
+ - Uses AI to write to Figma (but is non-deterministic)
739
+ - Needs a full Storybook pipeline as intermediary
740
+
741
+ So I'm building it. The Figma Plugin API supports everything needed: `createComponent()`, `combineAsVariants()`, `setBoundVariable()`. Our existing plugin already imports tokens and creates variables.
742
 
743
  **Coming in Episode 7:**
744
+ - Auto-generating Figma components from extracted tokens
745
+ - Button (60 variants), TextInput (8), Card, Toast, Checkbox/Radio
746
+ - Token-to-component binding: `color.brand.primary` -> Button fill, `radius.md` -> Button corners
747
+ - Fully deterministic: same tokens in = same components out
748
 
749
  ---
750
 
 
771
 
772
  ---
773
 
774
+ #AIAgents #DesignSystems #UXDesign #Figma #MultiAgentSystems #DesignTokens #Automation #AIEngineering #HuggingFace #WCAG #W3CDTCG
775
 
776
  ---
777
 
778
+ *Published on Medium - ~12 min read*