metadata
title: Design System Extractor v2
emoji: 🎨
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
license: mit
Design System Extractor v2
🎨 A semi-automated, human-in-the-loop agentic system that reverse-engineers design systems from live websites.
🎯 What It Does
When you have a website but no design system documentation (common when the original Sketch/Figma files are lost), this tool helps you:
- Crawl your website to discover pages
- Extract design tokens (colors, typography, spacing, shadows)
- Review and validate extracted tokens with visual previews
- Upgrade your system with modern best practices (optional)
- Export production-ready JSON tokens for Figma/code
🧠 Philosophy
This is not a magic button; it's a design-aware co-pilot.
- Agents propose → Humans decide
- Every action is visible, reversible, and previewed
- No irreversible automation
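One way to picture "agents propose, humans decide" with reversible actions is a review session that only applies accepted proposals and keeps enough history to undo them. This is a minimal sketch; the `Proposal` and `ReviewSession` names are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Proposal:
    """A change suggested by an agent; nothing is applied until a human accepts it."""
    token_name: str
    old_value: Optional[str]  # None means the token did not exist before
    new_value: str


@dataclass
class ReviewSession:
    """Hypothetical review state: accepted proposals are applied and can be undone."""
    tokens: Dict[str, str] = field(default_factory=dict)
    history: List[Proposal] = field(default_factory=list)

    def accept(self, proposal: Proposal) -> None:
        # Record the proposal before applying it so the change stays reversible.
        self.history.append(proposal)
        self.tokens[proposal.token_name] = proposal.new_value

    def undo(self) -> None:
        # Revert the most recently accepted proposal, if there is one.
        if not self.history:
            return
        last = self.history.pop()
        if last.old_value is None:
            self.tokens.pop(last.token_name, None)
        else:
            self.tokens[last.token_name] = last.old_value
```

Rejecting a proposal needs no code in this sketch: a proposal that is never passed to `accept` never touches `tokens`.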
🏗️ Architecture
┌──────────────────────────────────────────────────────────────┐
│                          TECH STACK                          │
├──────────────────────────────────────────────────────────────┤
│ Frontend:      Gradio (interactive UI with live preview)     │
│ Orchestration: LangGraph (agent workflow management)         │
│ Models:        Claude API (reasoning) + Rule-based           │
│ Browser:       Playwright (crawling & extraction)            │
│ Hosting:       Hugging Face Spaces                           │
└──────────────────────────────────────────────────────────────┘
Agent Personas
| Agent | Persona | Job |
|---|---|---|
| Agent 1 | Design Archaeologist | Discover pages, extract raw tokens |
| Agent 2 | Design System Librarian | Normalize, dedupe, structure tokens |
| Agent 3 | Senior DS Architect | Recommend upgrades (type scales, spacing, a11y) |
| Agent 4 | Automation Engineer | Generate final JSON for Figma/code |
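The hand-off between agents in the table can be pictured as a chain of plain functions. The sketch below is only an illustration: it does not use LangGraph or the Claude API, Agent 1's output is stood in for by a list of raw color strings, and the function names, the `min_uses` threshold, and the `color-N` naming are all assumptions.

```python
import json
from collections import Counter
from typing import Dict, List


def normalize_colors(raw_colors: List[str]) -> Dict[str, int]:
    """Agent 2 (sketch): lowercase hex values, expand #abc shorthand, count duplicates."""
    counts = Counter()
    for color in raw_colors:
        value = color.strip().lower()
        if len(value) == 4 and value.startswith("#"):
            # Expand shorthand such as #fff to #ffffff.
            value = "#" + "".join(ch * 2 for ch in value[1:])
        counts[value] += 1
    return dict(counts)


def advise(colors: Dict[str, int], min_uses: int = 2) -> List[str]:
    """Agent 3 (sketch): flag rarely used colors for human review."""
    return [
        f"Consider merging or dropping {value} (used {uses}x)"
        for value, uses in colors.items()
        if uses < min_uses
    ]


def generate(colors: Dict[str, int]) -> str:
    """Agent 4 (sketch): serialize colors to JSON, most frequent first."""
    ordered = sorted(colors.items(), key=lambda item: (-item[1], item[0]))
    tokens = {
        f"color-{index + 1}": {"value": value, "source": "detected"}
        for index, (value, _uses) in enumerate(ordered)
    }
    return json.dumps({"colors": tokens}, indent=2)
```

In the real system a human review step sits between these stages; here `advise` only returns suggestions and never modifies the token set.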
🚀 Quick Start
Prerequisites
- Python 3.11+
- Node.js (for some dependencies)
Installation
# Clone the repository
git clone <repo-url>
cd design-system-extractor
# Create virtual environment
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
# Install dependencies
pip install -r requirements.txt
# Install Playwright browsers
playwright install chromium
# Copy environment file
cp config/.env.example config/.env
# Edit config/.env and add your ANTHROPIC_API_KEY
Running
python app.py
Open http://localhost:7860 in your browser.
📖 Usage Guide
Stage 1: Discovery
- Enter your website URL (e.g., https://example.com)
- Click "Discover Pages"
- Review discovered pages and select which to extract from
- Ensure you have a mix of page types (homepage, listing, detail, etc.)
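The core of page discovery is collecting links that stay on the same site. The tool itself crawls with Playwright; the sketch below instead works on a static HTML string using only the standard library, and `LinkCollector` and `discover_pages` are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def discover_pages(base_url: str, html: str) -> List[str]:
    """Return unique same-origin page URLs found in `html`, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    origin = urlparse(base_url).netloc
    pages: List[str] = []
    for href in collector.hrefs:
        # Resolve relative links and drop #fragments so each page is listed once.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == origin:
            if absolute not in pages:
                pages.append(absolute)
    return pages
```

A static parse like this misses links rendered by JavaScript, which is one reason a real browser is the better fit for the actual crawl.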
Stage 2: Extraction
- Choose viewport (Desktop 1440px or Mobile 375px)
- Click "Extract Tokens"
- Review the extracted tokens:
  - Colors: frequency, usage context, and WCAG AA contrast compliance
  - Typography: font families, sizes, weights
  - Spacing: values with 8px grid fit indicators
- Accept or reject individual tokens
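The AA compliance and 8px grid indicators shown during review can be sketched as follows. The contrast formula is the WCAG 2.x relative-luminance definition; the function names and the exact-multiple rule for "grid fit" are assumptions, not the project's implementation.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color given as #rrggbb."""
    digits = hex_color.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each gamma-encoded channel to linear light.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """WCAG AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


def fits_grid(value_px: float, base: int = 8) -> bool:
    """Assumed rule: a spacing value fits the grid if it is an exact multiple of `base`."""
    return value_px % base == 0
```

With this rule, 16px and 24px fit an 8px grid while 12px does not; a looser variant could accept multiples of 4px as a half-step.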
Stage 3: Export
- Review final token set
- Export as JSON
- Import into Figma via Tokens Studio or your plugin
📁 Project Structure
design-system-extractor/
├── app.py                  # Main Gradio application
├── requirements.txt
├── README.md
│
├── config/
│   ├── .env.example        # Environment template
│   ├── agents.yaml         # Agent personas & settings
│   └── settings.py         # Configuration loader
│
├── agents/
│   ├── state.py            # LangGraph state definitions
│   ├── graph.py            # Workflow orchestration
│   ├── crawler.py          # Agent 1: Page discovery
│   ├── extractor.py        # Agent 1: Token extraction
│   ├── normalizer.py       # Agent 2: Normalization
│   ├── advisor.py          # Agent 3: Best practices
│   └── generator.py        # Agent 4: JSON generation
│
├── core/
│   ├── token_schema.py     # Pydantic data models
│   └── color_utils.py      # Color analysis utilities
│
├── ui/
│   └── (Gradio components)
│
└── docs/
    └── CONTEXT.md          # Context file for AI assistance
🔧 Configuration
Environment Variables
# Required
ANTHROPIC_API_KEY=your_key_here
# Optional
DEBUG=false
LOG_LEVEL=INFO
BROWSER_HEADLESS=true
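A loader for these variables might look like the sketch below. It is not the contents of config/settings.py; the `Settings` class, `load_settings` function, and the set of strings treated as true are assumptions, and only the variable names and defaults come from the listing above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    debug: bool = False
    log_level: str = "INFO"
    browser_headless: bool = True


def _as_bool(value: str) -> bool:
    # Assumed convention: these strings (case-insensitive) count as true.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key:
        # The only required variable; fail early with a clear message.
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        debug=_as_bool(env.get("DEBUG", "false")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        browser_headless=_as_bool(env.get("BROWSER_HEADLESS", "true")),
    )
```

Accepting an explicit mapping keeps the loader easy to test without touching the real environment.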
Agent Configuration
Agent personas and behavior are defined in config/agents.yaml. This includes:
- Extraction targets (colors, typography, spacing)
- Naming conventions
- Confidence thresholds
- Upgrade options
🛠️ Development
Running Tests
pytest tests/
Adding New Features
- Update token schema in `core/token_schema.py`
- Add agent logic in `agents/`
- Update UI in `app.py`
- Update `docs/CONTEXT.md` for AI assistance
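As a rough picture of the first step, a token model validates its value on construction so that bad data is caught before it reaches the agents or the UI. The project's schema uses Pydantic; the sketch below uses standard-library dataclasses to stay dependency-free, and the class names and validation rules are assumptions. The field names follow the exported JSON (`value`, `source`).

```python
import re
from dataclasses import asdict, dataclass

HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")
PX_LENGTH = re.compile(r"\d+(\.\d+)?px")


@dataclass
class ColorToken:
    value: str
    source: str = "detected"

    def __post_init__(self) -> None:
        # Reject anything that is not a full #rrggbb hex color.
        if not HEX_COLOR.fullmatch(self.value):
            raise ValueError(f"not a #rrggbb color: {self.value!r}")


@dataclass
class SpacingToken:
    value: str
    source: str = "detected"

    def __post_init__(self) -> None:
        # Reject anything that is not a pixel length such as "16px".
        if not PX_LENGTH.fullmatch(self.value):
            raise ValueError(f"not a pixel length: {self.value!r}")
```

`asdict(ColorToken("#007bff"))` yields a plain dictionary with `value` and `source` keys, ready for JSON serialization.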
📦 Output Format
Tokens are exported in a platform-agnostic JSON format:
{
"metadata": {
"source_url": "https://example.com",
"version": "v1-recovered",
"viewport": "desktop"
},
"colors": {
"primary-500": {
"value": "#007bff",
"source": "detected",
      "contrast_white": 3.98
}
},
"typography": {
"heading-lg": {
"fontFamily": "Inter",
"fontSize": "24px",
"fontWeight": 700
}
},
"spacing": {
"md": {
"value": "16px",
"source": "detected"
}
}
}
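Because the export is plain JSON, it can be consumed by a few lines of code. As one example, the sketch below turns the `colors` and `spacing` groups into CSS custom properties; the `tokens_to_css` function and the `--color-` and `--spacing-` prefixes are assumptions, not something the tool emits.

```python
import json


def tokens_to_css(tokens_json: str) -> str:
    """Render color and spacing tokens from the exported JSON as a :root block."""
    tokens = json.loads(tokens_json)
    lines = [":root {"]
    for name, token in tokens.get("colors", {}).items():
        lines.append(f"  --color-{name}: {token['value']};")
    for name, token in tokens.get("spacing", {}).items():
        lines.append(f"  --spacing-{name}: {token['value']};")
    lines.append("}")
    return "\n".join(lines)
```

Typography tokens are left out here because each one bundles several properties and would need a naming decision per property.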
🤝 Contributing
Contributions are welcome! Please read the contribution guidelines first.
📄 License
MIT
Built with ❤️ for designers who've lost their source files.