riazmo committed on
Commit
64ebfd5
·
verified ·
1 Parent(s): efdcf76

Upload CONTEXT.md

Files changed (1)
  1. docs/CONTEXT.md +111 -38
docs/CONTEXT.md CHANGED
@@ -10,10 +10,12 @@
10
 
11
  | File | What Changed |
12
  |------|--------------|
13
- | `agents/extractor.py` | Enhanced 5-source extraction (DOM, CSS vars, SVG, inline, stylesheets) |
14
- | `core/preview_generator.py` | Added AS-IS previews for Stage 1 (colors, spacing, radius, shadows) |
15
- | `app.py` | Stage 1 UI now has 5 preview tabs, enhanced logging shows extraction sources |
16
- | `docs/CONTEXT.md` | Updated with all changes, enhanced architecture diagrams |
 
 
17
 
18
  ---
19
 
@@ -314,17 +316,79 @@ Shows detailed extraction progress:
314
  🔀 Merging Firecrawl colors with Playwright extraction...
315
  ✅ Added 12 new colors from Firecrawl
316
  📊 Total colors now: 44
317
  ```
318
 
319
- ### Stage 2 LLM Analysis Logs
320
 
321
- Shows detailed reasoning from each agent:
322
- - What they analyzed
323
- - Scores per category (Typography, Colors, AA, Spacing)
324
- - Specific findings and recommendations
325
- - Confidence levels
326
- - How HEAD resolved disagreements
327
- - Cost per call and total
328
 
329
  ---
330
 
@@ -347,7 +411,7 @@ Shows detailed reasoning from each agent:
347
  7. Page content scan (brute-force regex on HTML)
348
  - **Output:** Raw tokens with frequency, context, confidence, source type
349
 
350
- ### Agent 1B: Firecrawl CSS Deep Diver (NEW)
351
  - **Persona:** CSS Deep Diver
352
  - **Tool:** Firecrawl / httpx fallback
353
  - **Job:**
@@ -357,6 +421,22 @@ Shows detailed reasoning from each agent:
357
  - Find colors missed by DOM inspection
358
  - **Output:** Additional colors merged into main extraction
359
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
360
  ### Agent 2: Token Normalizer & Structurer
361
  - **Persona:** Design System Librarian
362
  - **Job:**
@@ -389,12 +469,13 @@ Shows detailed reasoning from each agent:
389
  - **Purpose:** Trust building - show exactly what was extracted
390
  - **Shows:**
391
  - Token tables (colors, typography, spacing)
392
- - **5 Visual Preview Tabs (AS-IS, no enhancements):**
393
  1. 🔀 Typography - actual font rendered
394
- 2. 🎨 Colors - simple swatches (no ramps)
395
- 3. 📏 Spacing - visual bars
396
- 4. 🔘 Radius - rounded boxes
397
- 5. 🌑 Shadows - shadow cards
 
398
  - **Human Actions:** Accept/reject tokens, flag anomalies, toggle Desktop↔Mobile
399
 
400
  ### Stage 2: Upgrade Playground (MOST IMPORTANT)
@@ -403,6 +484,7 @@ Shows detailed reasoning from each agent:
403
  - Side-by-side option selector + live preview
404
  - **Color Ramps (50-950 shades with AA compliance)**
405
  - Type scale options (1.2, 1.25, 1.333)
 
406
  - **Human Actions:** Select type scale A/B/C, spacing system, color ramps - preview updates instantly
407
 
408
  ### Stage 3: Final Review & Export
@@ -429,44 +511,35 @@ design-system-extractor/
429
  │   ├── __init__.py
430
  │   ├── state.py # LangGraph state definitions
431
  │   ├── graph.py # LangGraph workflow orchestration
432
- │   ├── crawler.py # Agent 1: Website crawler
433
- │   ├── extractor.py # Agent 1: Token extraction
434
  │   ├── normalizer.py # Agent 2: Token normalization
435
  │   ├── advisor.py # Agent 3: Best practices
436
  │   └── generator.py # Agent 4: JSON generator
437
  │
438
  ├── core/
439
  │   ├── __init__.py
440
- │   ├── browser.py # Playwright browser management
441
- │   ├── css_parser.py # CSS/computed style extraction
442
  │   ├── color_utils.py # Color analysis, contrast, ramps
443
- │   ├── typography_utils.py # Type scale detection & generation
444
- │   ├── spacing_utils.py # Spacing pattern detection
445
  │   └── token_schema.py # Token data structures (Pydantic)
446
  │
447
  ├── ui/
448
- │   ├── __init__.py
449
- │   ├── components.py # Reusable Gradio components
450
- │   ├── stage1_extraction.py # Stage 1 UI
451
- │   ├── stage2_upgrade.py # Stage 2 UI
452
- │   ├── stage3_export.py # Stage 3 UI
453
- │   └── preview_generator.py # HTML preview generation
454
  │
455
  ├── templates/
456
- │   ├── preview.html # Live preview base template
457
- │   └── specimen.html # Design system specimen template
458
  │
459
  ├── storage/
460
- │   └── persistence.py # HF Spaces storage management
461
  │
462
  ├── tests/
463
- │   ├── test_crawler.py
464
- │   ├── test_extractor.py
465
- │   └── test_normalizer.py
466
  │
467
  └── docs/
468
-     ├── CONTEXT.md # THIS FILE - upload for context refresh
469
-     └── API.md # API documentation
470
  ```
471
 
472
  ---
 
10
 
11
  | File | What Changed |
12
  |------|--------------|
13
+ | `agents/extractor.py` | Enhanced 7-source extraction (DOM, CSS vars, SVG, inline, stylesheets, external CSS, page scan) |
14
+ | `agents/firecrawl_extractor.py` | **NEW** Agent 1B for deep CSS parsing |
15
+ | `agents/semantic_analyzer.py` | **NEW** Agent 1C for semantic color categorization (brand/text/bg/border) |
16
+ | `core/preview_generator.py` | AS-IS previews + Color Ramps sorted by brand priority |
17
+ | `app.py` | Stage 1 UI now has 6 preview tabs including Semantic Colors |
18
+ | `docs/CONTEXT.md` | Updated with semantic analyzer, full architecture diagrams |
19
 
20
  ---
21
 
 
316
  🔀 Merging Firecrawl colors with Playwright extraction...
317
  ✅ Added 12 new colors from Firecrawl
318
  📊 Total colors now: 44
319
+
320
+ ============================================================
321
+ 🧠 SEMANTIC COLOR ANALYSIS
322
+ ============================================================
323
+
324
+ 📊 Analyzing 143 colors...
325
+ Using rule-based analysis (no LLM)
326
+
327
+ 📊 SEMANTIC ANALYSIS RESULTS:
328
+
329
+ 🎨 BRAND COLORS:
330
+ primary: #06b2c4 (high)
331
+ └─ Most frequent saturated color on interactive elements (freq: 33)
332
+ secondary: #c1df1f (medium)
333
+ └─ Second most frequent brand color (freq: 15)
334
+
335
+ πŸ“ TEXT COLORS:
336
+ primary: #373737 (high)
337
+ secondary: #666666 (medium)
338
+
339
+ πŸ–ΌοΈ BACKGROUND COLORS:
340
+ primary: #ffffff (high)
341
+ secondary: #f5f5f5 (medium)
342
+
343
+ 📈 SUMMARY:
344
+ Total colors analyzed: 143
345
+ Brand colors found: 2
346
+ Clear hierarchy: Yes
347
+ Analysis method: rule-based
348
  ```
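The merge step reported in the log above ("Added 12 new colors from Firecrawl") can be sketched roughly as follows. This is a minimal illustration, not the project's code: `normalize_hex`, `merge_colors`, and the `{hex: frequency}` input shape are hypothetical names and assumptions chosen for the example.

```python
def normalize_hex(color: str) -> str:
    """Lowercase a hex color and expand 3-digit shorthand (#abc -> #aabbcc)."""
    value = color.strip().lower().lstrip("#")
    if len(value) == 3:
        value = "".join(ch * 2 for ch in value)
    return f"#{value}"


def merge_colors(
    primary: dict[str, int], extra: dict[str, int]
) -> tuple[dict[str, int], int]:
    """Merge two {hex: frequency} maps.

    Returns the merged map and the number of colors that appear only in `extra`.
    """
    merged: dict[str, int] = {}
    for color, freq in primary.items():
        key = normalize_hex(color)
        merged[key] = merged.get(key, 0) + freq

    added = 0
    for color, freq in extra.items():
        key = normalize_hex(color)
        if key not in merged:
            added += 1  # color was missed by the first extraction pass
        merged[key] = merged.get(key, 0) + freq
    return merged, added


# Hypothetical sample data: the first map stands in for the Playwright pass,
# the second for the Firecrawl pass.
playwright_colors = {"#06B2C4": 33, "#fff": 120}
firecrawl_colors = {"#06b2c4": 4, "#c1df1f": 15}
merged, added = merge_colors(playwright_colors, firecrawl_colors)
print(f"Added {added} new colors; total colors now: {len(merged)}")
```

Normalizing before merging keeps `#FFF` and `#ffffff` from being counted as two different colors, which would otherwise inflate the total.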
349
 
350
+ ### Stage 2 LLM Analysis Logs (With Semantic Context)
351
 
352
+ Shows detailed reasoning from each agent WITH semantic context:
353
+
354
+ ```
355
+ ============================================================
356
+ 🧠 STAGE 2: MULTI-AGENT ANALYSIS
357
+ ============================================================
358
+
359
+ 🧠 SEMANTIC CONTEXT FROM STAGE 1:
360
+ Brand Primary: #06b2c4
361
+ Text Primary: #373737
362
+ Analysis Method: rule-based
363
+
364
+ =======================================================
365
+ 🤖 LLM 1: meta-llama/Llama-3.1-70B-Instruct
366
+ =======================================================
367
+ Provider: novita
368
+ 💰 Cost: $0.29/M in, $0.59/M out
369
+ 📝 Task: Typography, Colors, AA, Spacing analysis
370
+ 🧠 Semantic context: Yes ← NEW: LLM knows color roles!
371
+
372
+ 📊 LLM 1 FINDINGS:
373
+
374
+ COLORS (with semantic context):
375
+ ├─ Brand Primary (#06b2c4): "Fails AA on white (3.2:1)"
376
+ ├─ Suggested fix: "#0891a8 (4.6:1)"
377
+ └─ Score: 6/10
378
+
379
+ =======================================================
380
+ 🎯 HEAD: Compiling final recommendations...
381
+ =======================================================
382
+
383
+ 📥 INPUT: Analyzing outputs from LLM 1 + LLM 2 + Rules + Semantic...
384
+
385
+ 📊 HEAD SYNTHESIS:
386
+
387
+ COLOR RECOMMENDATIONS (per semantic role):
388
+ ├─ brand.primary: #06b2c4 → Keep for branding, use #0891a8 for text
389
+ ├─ text.primary: #373737 → Keep (passes AA)
390
+ └─ Generate ramps for: brand.primary, brand.secondary, neutral
391
+ ```
392
 
393
  ---
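The AA findings in the Stage 2 log depend on contrast ratios. Because the ratios in the sample log are reported by the LLM, a deterministic check can be useful alongside them, and its numbers may differ. The sketch below follows the WCAG 2.x relative-luminance and contrast-ratio definitions; the function names are hypothetical and are not taken from `core/color_utils.py`.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG relative luminance of an sRGB color given as #rrggbb."""
    value = hex_color.lstrip("#")
    channels = [int(value[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize each sRGB channel (0.04045 is the sRGB threshold;
    # older WCAG text uses 0.03928, which gives the same result for 8-bit values).
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


for color in ("#06b2c4", "#0891a8", "#373737"):
    ratio = contrast_ratio(color, "#ffffff")
    print(f"{color} on white: {ratio:.2f}:1, AA normal text: {passes_aa(color, '#ffffff')}")
```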
394
 
 
411
  7. Page content scan (brute-force regex on HTML)
412
  - **Output:** Raw tokens with frequency, context, confidence, source type
413
 
414
+ ### Agent 1B: Firecrawl CSS Deep Diver
415
  - **Persona:** CSS Deep Diver
416
  - **Tool:** Firecrawl / httpx fallback
417
  - **Job:**
 
421
  - Find colors missed by DOM inspection
422
  - **Output:** Additional colors merged into main extraction
423
 
424
+ ### Agent 1C: Semantic Color Analyzer (NEW - LLM)
425
+ - **Persona:** Design System Semanticist
426
+ - **Tool:** Rule-based analysis (LLM optional)
427
+ - **Job:**
428
+ - Analyze colors based on actual CSS usage (not guessing)
429
+ - Categorize into semantic roles:
430
+ - **Brand Colors:** Used on buttons, CTAs, links (interactive elements)
431
+ - **Text Colors:** Used with `color` property on p, span, h1-h6
432
+ - **Background Colors:** Used with `background-color` on containers
433
+ - **Border Colors:** Used with `border-color` properties
434
+ - **Feedback Colors:** Error (red), success (green), warning (yellow)
435
+ - Detect color hierarchy (primary → secondary → muted)
436
+ - **Input:** Colors WITH context data (css_properties, elements, frequency)
437
+ - **Output:** Semantic categorization with confidence levels
438
+ - **Why:** Stage 2 LLMs can now give SPECIFIC recommendations per role
439
+
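As a rough illustration of the rule-based path described for Agent 1C, the sketch below assigns a role from the CSS properties and elements a color was seen on. It is not the contents of `agents/semantic_analyzer.py`: the `ColorUsage` structure, the element sets, the rule order, and the 0.3 saturation threshold are all assumptions made for the example, and feedback colors are left out.

```python
import colorsys
from dataclasses import dataclass, field

# Assumed element groupings, based on the roles listed above.
INTERACTIVE_ELEMENTS = {"button", "a"}
TEXT_ELEMENTS = {"p", "span", "h1", "h2", "h3", "h4", "h5", "h6"}


@dataclass
class ColorUsage:
    """Hypothetical input record: a color plus the context it was found in."""
    hex: str
    frequency: int
    css_properties: set[str] = field(default_factory=set)
    elements: set[str] = field(default_factory=set)


def saturation(hex_color: str) -> float:
    """HSV saturation (0.0 to 1.0) of a #rrggbb color."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return colorsys.rgb_to_hsv(r, g, b)[1]


def classify(usage: ColorUsage, min_saturation: float = 0.3) -> str:
    """Return one semantic role for a color, checking rules in priority order."""
    on_interactive = bool(usage.elements & INTERACTIVE_ELEMENTS)
    if on_interactive and saturation(usage.hex) >= min_saturation:
        return "brand"
    if "color" in usage.css_properties and usage.elements & TEXT_ELEMENTS:
        return "text"
    if "background-color" in usage.css_properties:
        return "background"
    if any(p.startswith("border") and p.endswith("color") for p in usage.css_properties):
        return "border"
    return "unknown"


def categorize(usages: list[ColorUsage]) -> dict[str, list[str]]:
    """Group colors by role, most frequent first (primary, then secondary, ...)."""
    roles: dict[str, list[str]] = {}
    for usage in sorted(usages, key=lambda u: u.frequency, reverse=True):
        roles.setdefault(classify(usage), []).append(usage.hex)
    return roles


sample = [
    ColorUsage("#06b2c4", 33, {"background-color"}, {"button", "a"}),
    ColorUsage("#c1df1f", 15, {"background-color"}, {"a"}),
    ColorUsage("#373737", 80, {"color"}, {"p", "h1"}),
    ColorUsage("#ffffff", 120, {"background-color", "color"}, {"div", "button"}),
]
print(categorize(sample))
```

The saturation check keeps neutral colors such as white button text from being labeled as brand colors just because they appear on interactive elements.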
440
  ### Agent 2: Token Normalizer & Structurer
441
  - **Persona:** Design System Librarian
442
  - **Job:**
 
469
  - **Purpose:** Trust building - show exactly what was extracted
470
  - **Shows:**
471
  - Token tables (colors, typography, spacing)
472
+ - **6 Visual Preview Tabs (AS-IS, no enhancements):**
473
  1. 🔀 Typography - actual font rendered
474
+ 2. 🎨 Colors - simple swatches sorted by frequency (no ramps)
475
+ 3. 🧠 Semantic Colors - colors organized by usage (brand/text/bg/border)
476
+ 4. 📏 Spacing - visual bars
477
+ 5. 🔘 Radius - rounded boxes
478
+ 6. 🌑 Shadows - shadow cards
479
  - **Human Actions:** Accept/reject tokens, flag anomalies, toggle Desktop↔Mobile
480
 
481
  ### Stage 2: Upgrade Playground (MOST IMPORTANT)
 
484
  - Side-by-side option selector + live preview
485
  - **Color Ramps (50-950 shades with AA compliance)**
486
  - Type scale options (1.2, 1.25, 1.333)
487
+ - **Semantic-aware recommendations:** "Your brand primary #06b2c4 fails AA, consider #0891a8"
488
  - **Human Actions:** Select type scale A/B/C, spacing system, color ramps - preview updates instantly
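The type scale options listed for Stage 2 (1.2, 1.25, 1.333) are modular-scale ratios. A minimal sketch of how such a scale could be generated is shown below; the 16px base, the step counts, and the `type_scale` helper are assumptions for illustration, not values taken from the project.

```python
def type_scale(
    base_px: float = 16.0,
    ratio: float = 1.25,
    steps_down: int = 2,
    steps_up: int = 5,
) -> list[float]:
    """Return font sizes in px, smallest first, where each step multiplies by `ratio`."""
    return [
        round(base_px * ratio ** step, 2)
        for step in range(-steps_down, steps_up + 1)
    ]


# Compare the three ratios offered as options A/B/C.
for ratio in (1.2, 1.25, 1.333):
    print(ratio, type_scale(ratio=ratio))
```

A larger ratio spreads headings further from body text, so the same number of steps covers a wider range of sizes.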
489
 
490
  ### Stage 3: Final Review & Export
 
511
  │   ├── __init__.py
512
  │   ├── state.py # LangGraph state definitions
513
  │   ├── graph.py # LangGraph workflow orchestration
514
+ │   ├── crawler.py # Agent 1A: Website crawler
515
+ │   ├── extractor.py # Agent 1A: Token extraction (7 sources)
516
+ │   ├── firecrawl_extractor.py # Agent 1B: Deep CSS parsing
517
+ │   ├── semantic_analyzer.py # Agent 1C: Semantic color categorization
518
  │   ├── normalizer.py # Agent 2: Token normalization
519
  │   ├── advisor.py # Agent 3: Best practices
520
+ │   ├── stage2_graph.py # Stage 2 multi-agent LLM workflow
521
  │   └── generator.py # Agent 4: JSON generator
522
  │
523
  ├── core/
524
  │   ├── __init__.py
525
  │   ├── color_utils.py # Color analysis, contrast, ramps
526
+ │   ├── preview_generator.py # HTML preview generation
527
+ │   ├── hf_inference.py # HuggingFace LLM inference
528
  │   └── token_schema.py # Token data structures (Pydantic)
529
  │
530
  ├── ui/
531
+ │   └── __init__.py
532
  │
533
  ├── templates/
534
  │
535
  ├── storage/
536
+ │   └── __init__.py
537
  │
538
  ├── tests/
539
+ │   └── __init__.py
540
  │
541
  └── docs/
542
+     └── CONTEXT.md # THIS FILE - upload for context refresh
 
543
  ```
544
 
545
  ---