---
title: Design System Extractor v2
emoji: 🎨
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
license: mit
---
# Design System Extractor v2
> 🎨 A semi-automated, human-in-the-loop agentic system that reverse-engineers design systems from live websites.
## 🎯 What It Does
When you have a website but no design system documentation (common when the original Sketch/Figma files are lost), this tool helps you:
1. **Crawl** your website to discover pages
2. **Extract** design tokens (colors, typography, spacing, shadows)
3. **Review** and validate extracted tokens with visual previews
4. **Upgrade** your system with modern best practices (optional)
5. **Export** production-ready JSON tokens for Figma/code
## 🧠 Philosophy
This is **not a magic button**; it's a design-aware co-pilot.
- **Agents propose → Humans decide**
- **Every action is visible, reversible, and previewed**
- **No irreversible automation**
## 🏗️ Architecture
```
┌──────────────────────────────────────────────────────────────┐
│                          TECH STACK                          │
├──────────────────────────────────────────────────────────────┤
│  Frontend:       Gradio (interactive UI with live preview)   │
│  Orchestration:  LangGraph (agent workflow management)       │
│  Models:         Claude API (reasoning) + Rule-based         │
│  Browser:        Playwright (crawling & extraction)          │
│  Hosting:        Hugging Face Spaces                         │
└──────────────────────────────────────────────────────────────┘
```
### Agent Personas
| Agent | Persona | Job |
|-------|---------|-----|
| **Agent 1** | Design Archaeologist | Discover pages, extract raw tokens |
| **Agent 2** | Design System Librarian | Normalize, dedupe, structure tokens |
| **Agent 3** | Senior DS Architect | Recommend upgrades (type scales, spacing, a11y) |
| **Agent 4** | Automation Engineer | Generate final JSON for Figma/code |
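
The hand-off between the four agents can be pictured as a sequential pipeline with a human approval gate after each stage. The following is a minimal plain-Python sketch of that idea, not the project's LangGraph graph; `Stage`, `run_pipeline`, the `approve` callback, and the lambda stand-ins for the agents are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]


@dataclass
class Stage:
    """One agent step: a name plus a function that proposes a new state."""
    name: str
    propose: Callable[[State], State]


def run_pipeline(stages: list[Stage], state: State,
                 approve: Callable[[str, State], bool]) -> State:
    """Run stages in order; a proposal replaces the state only if approved."""
    for stage in stages:
        proposal = stage.propose(dict(state))  # copy, so a rejection costs nothing
        if approve(stage.name, proposal):
            state = proposal
    return state


# Hypothetical stand-ins for the four agents (the real logic lives in agents/).
stages = [
    Stage("archaeologist", lambda s: {**s, "raw": ["#007BFF", "#007bff", "#fff"]}),
    Stage("librarian", lambda s: {**s, "tokens": sorted({c.lower() for c in s["raw"]})}),
    Stage("architect", lambda s: {**s, "advice": ["use an 8px spacing grid"]}),
    Stage("engineer", lambda s: {**s, "export": {"colors": s.get("tokens", [])}}),
]

# The human accepts every proposal except the optional upgrade step.
result = run_pipeline(stages, {}, approve=lambda name, _proposal: name != "architect")
print(result)
```

Because each stage works on a copy of the state, a rejected proposal leaves the previous state untouched, which mirrors the "no irreversible automation" principle.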
## 🚀 Quick Start
### Prerequisites
- Python 3.11+
- Node.js (for some dependencies)
### Installation
```bash
# Clone the repository
git clone <repo-url>
cd design-system-extractor
# Create virtual environment
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
# Install dependencies
pip install -r requirements.txt
# Install Playwright browsers
playwright install chromium
# Copy environment file
cp config/.env.example config/.env
# Edit config/.env and add your ANTHROPIC_API_KEY
```
### Running
```bash
python app.py
```
Open `http://localhost:7860` in your browser.
## 📖 Usage Guide
### Stage 1: Discovery
1. Enter your website URL (e.g., `https://example.com`)
2. Click "Discover Pages"
3. Review discovered pages and select which to extract from
4. Ensure you have a mix of page types (homepage, listing, detail, etc.)
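
Page discovery comes down to collecting links and keeping the ones that stay on the same site. The app does this with Playwright against live pages; the sketch below only illustrates the filtering step on a static HTML string using the standard library. `LinkCollector` and `same_origin_links` are hypothetical names, not part of the project.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def same_origin_links(base_url: str, html: str) -> list[str]:
    """Return absolute, de-duplicated http(s) links on base_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    origin = urlparse(base_url).netloc
    found: list[str] = []
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" so anchors do not
        # count as separate pages.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parts = urlparse(absolute)
        if (parts.scheme in ("http", "https")
                and parts.netloc == origin
                and absolute not in found):
            found.append(absolute)
    return found


sample = """
<a href="/products">Listing</a>
<a href="/products/42#reviews">Detail</a>
<a href="https://other.example.org/">External</a>
<a href="/products">Duplicate</a>
"""
pages = same_origin_links("https://example.com/", sample)
print(pages)
```

For the sample above, only the listing and detail pages remain; the external link and the duplicate are dropped.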
### Stage 2: Extraction
1. Choose viewport (Desktop 1440px or Mobile 375px)
2. Click "Extract Tokens"
3. Review the extracted tokens:
   - **Colors**: With frequency, context, and WCAG AA compliance
   - **Typography**: Font families, sizes, weights
   - **Spacing**: Values with 8px grid fit indicators
4. Accept or reject individual tokens
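
The AA compliance and 8px grid indicators mentioned above can be computed from the token values alone. The sketch below follows the WCAG 2.x contrast-ratio definition; the function names (`relative_luminance`, `contrast_ratio`, `passes_aa`, `fits_grid`) are illustrative assumptions and may not match what `core/color_utils.py` actually exposes.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color such as '#007bff'."""
    h = hex_color.lstrip("#")
    if len(h) == 3:  # expand shorthand such as 'fff' to 'ffffff'
        h = "".join(ch * 2 for ch in h)
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve. WCAG's text uses 0.03928 as the cutoff,
    # the sRGB standard uses 0.04045; both agree for 8-bit channel values.
    r, g, b = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
               for c in channels]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """WCAG AA asks for 4.5:1 on normal text and 3:1 on large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


def fits_grid(px: float, base: int = 8) -> bool:
    """True when a spacing value is an exact multiple of the grid base."""
    return px % base == 0


print(round(contrast_ratio("#007bff", "#ffffff"), 2))  # 3.98
print(passes_aa("#007bff", "#ffffff"))                 # False
print(fits_grid(16), fits_grid(12))                    # True False
```

A common brand blue such as `#007bff` on white falls just short of AA for normal text, which is exactly the kind of finding a reviewer would want flagged before accepting a token.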
### Stage 3: Export
1. Review final token set
2. Export as JSON
3. Import into Figma via Tokens Studio or your own plugin
## 📁 Project Structure
```
design-system-extractor/
├── app.py                      # Main Gradio application
├── requirements.txt
├── README.md
│
├── config/
│   ├── .env.example            # Environment template
│   ├── agents.yaml             # Agent personas & settings
│   └── settings.py             # Configuration loader
│
├── agents/
│   ├── state.py                # LangGraph state definitions
│   ├── graph.py                # Workflow orchestration
│   ├── crawler.py              # Agent 1: Page discovery
│   ├── extractor.py            # Agent 1: Token extraction
│   ├── normalizer.py           # Agent 2: Normalization
│   ├── advisor.py              # Agent 3: Best practices
│   └── generator.py            # Agent 4: JSON generation
│
├── core/
│   ├── token_schema.py         # Pydantic data models
│   └── color_utils.py          # Color analysis utilities
│
├── ui/
│   └── (Gradio components)
│
└── docs/
    └── CONTEXT.md              # Context file for AI assistance
```
## 🔧 Configuration
### Environment Variables
```env
# Required
ANTHROPIC_API_KEY=your_key_here
# Optional
DEBUG=false
LOG_LEVEL=INFO
BROWSER_HEADLESS=true
```
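
As a rough illustration of how these variables might be consumed, the sketch below reads them from a mapping (the process environment by default) and applies the defaults shown above. It is an assumption about the shape of a loader, not the contents of `config/settings.py`; `Settings`, `load_settings`, and `_as_bool` are hypothetical names, and the sketch does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    debug: bool = False
    log_level: str = "INFO"
    browser_headless: bool = True


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from env, failing early if the required key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        debug=_as_bool(env.get("DEBUG", "false")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        browser_headless=_as_bool(env.get("BROWSER_HEADLESS", "true")),
    )


# Passing a dict instead of os.environ keeps the example self-contained.
settings = load_settings({"ANTHROPIC_API_KEY": "your_key_here", "DEBUG": "true"})
print(settings.debug, settings.log_level, settings.browser_headless)
```

Failing at startup when the key is missing is usually friendlier than a failed API call halfway through an extraction run.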
### Agent Configuration
Agent personas and behavior are defined in `config/agents.yaml`. This includes:
- Extraction targets (colors, typography, spacing)
- Naming conventions
- Confidence thresholds
- Upgrade options
## 🛠️ Development
### Running Tests
```bash
pytest tests/
```
### Adding New Features
1. Update token schema in `core/token_schema.py`
2. Add agent logic in `agents/`
3. Update UI in `app.py`
4. Update `docs/CONTEXT.md` for AI assistance
## 📦 Output Format
Tokens are exported in a platform-agnostic JSON format:
```json
{
  "metadata": {
    "source_url": "https://example.com",
    "version": "v1-recovered",
    "viewport": "desktop"
  },
  "colors": {
    "primary-500": {
      "value": "#007bff",
      "source": "detected",
      "contrast_white": 3.98
    }
  },
  "typography": {
    "heading-lg": {
      "fontFamily": "Inter",
      "fontSize": "24px",
      "fontWeight": 700
    }
  },
  "spacing": {
    "md": {
      "value": "16px",
      "source": "detected"
    }
  }
}
```
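
Since the export is plain JSON, it can be consumed from code without any special tooling. As one possible example, the sketch below turns a token file of the shape shown above into CSS custom properties. `tokens_to_css` and the `--color-`, `--spacing-`, and `--font-` prefixes are naming assumptions made for this illustration; the tool itself does not necessarily ship such a converter.

```python
import json


def tokens_to_css(tokens: dict) -> str:
    """Render color, spacing, and typography tokens as a :root CSS block."""
    lines = [":root {"]
    for name, token in tokens.get("colors", {}).items():
        lines.append(f"  --color-{name}: {token['value']};")
    for name, token in tokens.get("spacing", {}).items():
        lines.append(f"  --spacing-{name}: {token['value']};")
    for name, token in tokens.get("typography", {}).items():
        lines.append(f"  --font-{name}-family: {token['fontFamily']};")
        lines.append(f"  --font-{name}-size: {token['fontSize']};")
        lines.append(f"  --font-{name}-weight: {token['fontWeight']};")
    lines.append("}")
    return "\n".join(lines)


# A trimmed copy of the example export above.
exported = json.loads("""
{
  "colors": {"primary-500": {"value": "#007bff", "source": "detected"}},
  "typography": {
    "heading-lg": {"fontFamily": "Inter", "fontSize": "24px", "fontWeight": 700}
  },
  "spacing": {"md": {"value": "16px", "source": "detected"}}
}
""")

css = tokens_to_css(exported)
print(css)
```

Sections that are missing from the export are skipped, so the same function works on a colors-only file.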
## 🤝 Contributing
Contributions are welcome! Please read the contribution guidelines first.
## 📄 License
MIT
---
Built with ❤️ for designers who've lost their source files.