# Design System Extractor v2 – Project Context

## Architecture Overview

```
Stage 0: Configuration      Stage 1: Discovery & Extraction     Stage 2: AI Analysis           Stage 3: Export
┌────────────────────┐      ┌────────────────────────────┐      ┌───────────────────────────┐   ┌──────────────┐
│ HF Token Setup     │ ───> │ URL Discovery (sitemap/    │ ───> │ Layer 1: Rule Engine      │──>│ Figma Tokens │
│ Benchmark Select   │      │ crawl) + Token Extraction  │      │ Layer 2: Benchmarks       │   │ JSON Export  │
└────────────────────┘      │ (Desktop + Mobile CSS)     │      │ Layer 3: LLM Agents (x3)  │   └──────────────┘
                            └────────────────────────────┘      │ Layer 4: HEAD Synthesizer │
                                                                └───────────────────────────┘
```

### Stage 1: Discovery & Extraction (Rule-Based, Free)
- **Discover Pages**: Fetches sitemap.xml or crawls the site to find pages
- **Extract Tokens**: Playwright visits each page at two viewports (Desktop 1440px, Mobile 375px) and extracts computed CSS for colors, typography, spacing, radius, and shadows
- **User Review**: Interactive tables with Accept/Reject checkboxes and visual previews
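
Conceptually, extraction harvests raw computed-CSS values per viewport (via Playwright) and collapses repeats into candidate tokens. A minimal sketch of the collapsing step, assuming a simple frequency cutoff to drop one-off noise (the helper name is illustrative, not the actual `core/extractor.py` API):

```python
from collections import Counter

def collapse_tokens(raw_values: list[str]) -> list[tuple[str, int]]:
    """Collapse raw computed-CSS values into candidate tokens,
    most frequent first, dropping values seen only once."""
    counts = Counter(v.strip() for v in raw_values if v and v != "none")
    return [(value, n) for value, n in counts.most_common() if n >= 2]

# e.g. font sizes observed across the desktop + mobile passes:
raw = ["16px", "16px", "24px", "16px", "24px", "13.37px"]
print(collapse_tokens(raw))  # [('16px', 3), ('24px', 2)]
```

The cutoff is what keeps a stray `13.37px` out of the review tables while letting genuinely repeated values surface as tokens.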

### Stage 2: AI-Powered Analysis (4 Layers)

| Layer | Type | What It Does | Cost |
|-------|------|--------------|------|
| **Layer 1** | Rule Engine | Type scale detection, AA contrast checking, spacing grid analysis, color statistics | FREE |
| **Layer 2** | Benchmark Research | Compares against Material Design 3, Apple HIG, Tailwind, etc. | ~$0.001 |
| **Layer 3** | LLM Agents (x3) | AURORA (Brand ID) + ATLAS (Benchmark) + SENTINEL (Best Practices) | ~$0.002 |
| **Layer 4** | HEAD Synthesizer | NEXUS combines all outputs into final recommendations | ~$0.001 |
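
Layer 1's AA contrast checking follows the WCAG 2.x formula: relative luminance from linearized sRGB channels, then `(L1 + 0.05) / (L2 + 0.05)`. A self-contained sketch (the actual function names in `core/rule_engine.py` may differ):

```python
def _channel(c8: int) -> float:
    # Linearize one 8-bit sRGB channel per WCAG 2.x
    c = c8 / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def _luminance(hex_color: str) -> float:
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _channel(r) + 0.7152 * _channel(g) + 0.0722 * _channel(b)

def contrast_ratio(fg: str, bg: str) -> float:
    l1, l2 = sorted((_luminance(fg), _luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

def passes_aa(fg: str, bg: str, large_text: bool = False) -> bool:
    # AA requires 4.5:1 for normal text, 3:1 for large text
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)

print(round(contrast_ratio("#ffffff", "#000000"), 1))  # 21.0
```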

### Stage 3: Export
- Apply/reject individual color, typography, spacing recommendations
- Export Figma Tokens Studio-compatible JSON
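
For reference, Tokens Studio JSON nests tokens into groups whose leaves carry a `value` and a `type`; the set name and token names below are illustrative, not the tool's literal output:

```json
{
  "global": {
    "color": {
      "brand-primary": { "value": "#0a5cff", "type": "color" }
    },
    "font-size": {
      "base": { "value": "16px", "type": "fontSizes" }
    },
    "spacing": {
      "md": { "value": "16px", "type": "spacing" }
    }
  }
}
```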

---

## Agent Roster

| Agent | Codename | Model | Temp | Input | Output | Specialty |
|-------|----------|-------|------|-------|--------|-----------|
| Brand Identifier | **AURORA** | Qwen/Qwen2.5-72B-Instruct | 0.4 | Color tokens + semantic CSS analysis | Brand primary/secondary/accent, palette strategy, cohesion score, semantic names | Creative/visual reasoning, color harmony assessment |
| Benchmark Advisor | **ATLAS** | meta-llama/Llama-3.3-70B-Instruct | 0.25 | User's type scale, spacing, font sizes + benchmark comparison data | Recommended benchmark, alignment changes, pros/cons | 128K context for large benchmark data, comparative reasoning |
| Best Practices Validator | **SENTINEL** | Qwen/Qwen2.5-72B-Instruct | 0.2 | Rule Engine results (typography, accessibility, spacing, color stats) | Overall score (0-100), check results, prioritized fix list | Methodical rule-following, precise judgment |
| HEAD Synthesizer | **NEXUS** | meta-llama/Llama-3.3-70B-Instruct | 0.3 | All 3 agent outputs + Rule Engine facts | Executive summary, scores, top 3 actions, color/type/spacing recs | 128K context for combined inputs, synthesis capability |

### Why These Models

- **Qwen 72B** (AURORA, SENTINEL): Strong creative reasoning for brand analysis; methodical structured output for best practices. Available on HF serverless without gated access.
- **Llama 3.3 70B** (ATLAS, NEXUS): 128K context window handles large combined inputs from multiple agents. Excellent comparative and synthesis reasoning.
- **Fallback**: Qwen/Qwen2.5-7B-Instruct (free tier, used when the primary models fail)

### Temperature Rationale

- **0.4** (AURORA): Allows creative interpretation of color stories and palette harmony
- **0.25** (ATLAS): Analytical comparison needs consistency, with some flexibility for trade-off reasoning
- **0.2** (SENTINEL): Strict rule evaluation; consistency is critical for compliance scoring
- **0.3** (NEXUS): Balanced; needs to synthesize creatively while staying grounded in agent data

---

## Evaluation & Scoring

### Self-Evaluation (All Agents)
Each agent includes a `self_evaluation` block in its JSON output:
```json
{
  "confidence": 8,          // 1-10: How confident the agent is
  "reasoning": "Clear usage patterns with 20+ colors",
  "data_quality": "good",   // good | fair | poor
  "flags": []               // e.g., ["insufficient_context", "ambiguous_data"]
}
```
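
One way such a block might map onto a dataclass (a sketch; the actual dataclasses in `agents/llm_agents.py` may be shaped differently), with clamping so a malformed agent reply degrades gracefully instead of failing the run:

```python
from dataclasses import dataclass, field

@dataclass
class SelfEvaluation:
    confidence: int                           # 1-10
    reasoning: str
    data_quality: str = "fair"                # good | fair | poor
    flags: list[str] = field(default_factory=list)

    @classmethod
    def from_json(cls, d: dict) -> "SelfEvaluation":
        # Clamp out-of-range confidence rather than raising
        return cls(
            confidence=min(10, max(1, int(d.get("confidence", 5)))),
            reasoning=str(d.get("reasoning", "")),
            data_quality=d.get("data_quality", "fair"),
            flags=list(d.get("flags", [])),
        )

ev = SelfEvaluation.from_json({"confidence": 12, "reasoning": "ok"})
print(ev.confidence)  # 10 (clamped)
```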

### AURORA Scoring Rubric (Cohesion 1-10)
- **9-10**: Clear harmony rule, distinct brand colors, consistent palette
- **7-8**: Mostly harmonious, clear brand identity
- **5-6**: Some relationships visible but not systematic
- **3-4**: Random palette, no clear strategy
- **1-2**: Conflicting colors, no brand identity

### SENTINEL Scoring Rubric (Overall 0-100)
Weighted checks:
- AA Compliance: 25 points
- Type Scale Consistency: 15 points
- Base Size Accessible: 15 points
- Spacing Grid: 15 points
- Type Scale Standard Ratio: 10 points
- Color Count: 10 points
- No Near-Duplicates: 10 points
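
The weighted total reduces to a sum over passing checks; this sketch assumes boolean pass/fail results (the real checks may award partial credit):

```python
# Weights mirror the rubric above; they sum to 100.
WEIGHTS = {
    "aa_compliance": 25,
    "type_scale_consistency": 15,
    "base_size_accessible": 15,
    "spacing_grid": 15,
    "type_scale_standard_ratio": 10,
    "color_count": 10,
    "no_near_duplicates": 10,
}

def overall_score(checks: dict[str, bool]) -> int:
    """Sum the weights of passing checks; missing checks count as failed."""
    return sum(w for name, w in WEIGHTS.items() if checks.get(name, False))

print(overall_score({"aa_compliance": True, "spacing_grid": True}))  # 40
```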

### NEXUS Scoring Rubric (Overall 0-100)
- **90-100**: Production-ready, minor polishing only
- **75-89**: Solid foundation, 2-3 targeted improvements
- **60-74**: Functional but needs focused attention
- **40-59**: Significant gaps requiring systematic improvement
- **20-39**: Major rework needed
- **0-19**: Fundamental redesign recommended

### Evaluation Summary (Logged After Analysis)
```
───────────────────────────────────────────────────
 AGENT EVALUATION SUMMARY
───────────────────────────────────────────────────
 AURORA   (Brand ID):   confidence=8/10, data=good
 ATLAS    (Benchmark):  confidence=7/10, data=good
 SENTINEL (Practices):  confidence=9/10, data=good, score=72/100
 NEXUS    (Synthesis):  confidence=8/10, data=good, overall=65/100
───────────────────────────────────────────────────
```

---

## User Journey

1. **Enter HF Token** – Required for LLM inference (free tier works)
2. **Enter Website URL** – The site to extract design tokens from
3. **Discover Pages** – Auto-finds pages via sitemap or crawling
4. **Select Pages** – Check/uncheck pages to include (max 10)
5. **Extract Tokens** – Scans selected pages at Desktop + Mobile viewports
6. **Review Stage 1** – Interactive tables: Colors, Typography, Spacing, Radius, Shadows, Semantic Colors. Each tab has a data table + visual preview accordion. Accept/reject individual tokens.
7. **Proceed to Stage 2** – Select benchmarks to compare against
8. **Run AI Analysis** – 4-layer pipeline executes (Rule Engine -> Benchmarks -> LLM Agents -> Synthesis)
9. **Review Analysis** – Dashboard with scores, recommendations, benchmark comparison, color recs
10. **Apply Upgrades** – Accept/reject individual recommendations
11. **Export JSON** – Download Figma Tokens Studio-compatible JSON

---

## File Structure

| File | Responsibility |
|------|----------------|
| `app.py` | Main Gradio UI – all stages, CSS, event bindings, formatting functions |
| `agents/llm_agents.py` | 4 LLM agent classes (AURORA, ATLAS, SENTINEL, NEXUS) + dataclasses |
| `agents/semantic_analyzer.py` | Semantic color categorization (brand, text, background, etc.) |
| `config/settings.py` | Model routing, env var loading, agent-to-model mapping |
| `core/hf_inference.py` | HF Inference API client, model registry, temperature mapping |
| `core/preview_generator.py` | HTML preview generators for Stage 1 visual previews |
| `core/rule_engine.py` | Layer 1: type scale, AA contrast, spacing grid, color stats |
| `core/benchmarks.py` | Benchmark definitions (Material Design 3, Apple HIG, etc.) |
| `core/extractor.py` | Playwright-based CSS token extraction |
| `core/discovery.py` | Page discovery via sitemap.xml / crawling |

---

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `HF_TOKEN` | (required) | HuggingFace API token |
| `BRAND_IDENTIFIER_MODEL` | `Qwen/Qwen2.5-72B-Instruct` | Model for AURORA |
| `BENCHMARK_ADVISOR_MODEL` | `meta-llama/Llama-3.3-70B-Instruct` | Model for ATLAS |
| `BEST_PRACTICES_MODEL` | `Qwen/Qwen2.5-72B-Instruct` | Model for SENTINEL |
| `HEAD_SYNTHESIZER_MODEL` | `meta-llama/Llama-3.3-70B-Instruct` | Model for NEXUS |
| `FALLBACK_MODEL` | `Qwen/Qwen2.5-7B-Instruct` | Fallback when the primary model fails |
| `HF_MAX_NEW_TOKENS` | `2048` | Max tokens per LLM response |
| `HF_TEMPERATURE` | `0.3` | Global default temperature |
| `MAX_PAGES` | `20` | Max pages to discover |
| `BROWSER_TIMEOUT` | `30000` | Playwright timeout (ms) |
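
Agent-to-model routing then reduces to an env-var lookup with per-agent defaults. A sketch of how `config/settings.py` might resolve it (the helper and mapping names are illustrative):

```python
import os

# Per-agent defaults; each is overridable via its environment variable.
AGENTS = {
    "AURORA":   {"env": "BRAND_IDENTIFIER_MODEL", "default": "Qwen/Qwen2.5-72B-Instruct",          "temperature": 0.4},
    "ATLAS":    {"env": "BENCHMARK_ADVISOR_MODEL", "default": "meta-llama/Llama-3.3-70B-Instruct", "temperature": 0.25},
    "SENTINEL": {"env": "BEST_PRACTICES_MODEL",    "default": "Qwen/Qwen2.5-72B-Instruct",          "temperature": 0.2},
    "NEXUS":    {"env": "HEAD_SYNTHESIZER_MODEL",  "default": "meta-llama/Llama-3.3-70B-Instruct", "temperature": 0.3},
}

def model_for(agent: str) -> str:
    """Return the env-var override for this agent, else its default model."""
    cfg = AGENTS[agent]
    return os.environ.get(cfg["env"], cfg["default"])

print(model_for("SENTINEL"))  # Qwen/Qwen2.5-72B-Instruct unless overridden
```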

### Model Override Examples
```bash
# Use Llama for all agents
export BRAND_IDENTIFIER_MODEL="meta-llama/Llama-3.3-70B-Instruct"
export BEST_PRACTICES_MODEL="meta-llama/Llama-3.3-70B-Instruct"

# Use budget models
export BRAND_IDENTIFIER_MODEL="Qwen/Qwen2.5-7B-Instruct"
export BENCHMARK_ADVISOR_MODEL="mistralai/Mixtral-8x7B-Instruct-v0.1"
```