| title: Design System Extractor v2 | |
| emoji: 🎨 | |
| colorFrom: purple | |
| colorTo: blue | |
| sdk: docker | |
| pinned: false | |
| license: mit | |
| # Design System Extractor v2 | |
| > 🎨 A semi-automated, human-in-the-loop agentic system that reverse-engineers design systems from live websites. | |
| ## 🎯 What It Does | |
| When you have a website but no design system documentation (common when the original Sketch/Figma files are lost), this tool helps you: | |
| 1. **Crawl** your website to discover pages | |
| 2. **Extract** design tokens (colors, typography, spacing, shadows) | |
| 3. **Review** and validate extracted tokens with visual previews | |
| 4. **Upgrade** your system with modern best practices (optional) | |
| 5. **Export** production-ready JSON tokens for Figma/code | |
| ## 🧠 Philosophy | |
| This is **not a magic button**; it's a design-aware co-pilot. | |
| - **Agents propose, humans decide** | |
| - **Every action is visible, reversible, and previewed** | |
| - **No irreversible automation** | |
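One way to picture "agents propose, humans decide" is a review log in which every human decision can be undone. The sketch below is illustrative only: `ReviewLog`, `decide`, and `undo` are hypothetical names and are not taken from the project's code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReviewLog:
    """Hypothetical record of human decisions on agent-proposed tokens."""

    decisions: dict[str, str] = field(default_factory=dict)
    # Each history entry stores the token name and its verdict before the change.
    history: list[tuple[str, str | None]] = field(default_factory=list)

    def decide(self, token: str, verdict: str) -> None:
        """Record 'accepted' or 'rejected', remembering the previous state."""
        if verdict not in {"accepted", "rejected"}:
            raise ValueError(f"unknown verdict: {verdict}")
        self.history.append((token, self.decisions.get(token)))
        self.decisions[token] = verdict

    def undo(self) -> None:
        """Revert the most recent decision; do nothing if there is none."""
        if not self.history:
            return
        token, previous = self.history.pop()
        if previous is None:
            self.decisions.pop(token, None)
        else:
            self.decisions[token] = previous


log = ReviewLog()
log.decide("primary-500", "accepted")
log.decide("primary-500", "rejected")
log.undo()  # back to "accepted"
print(log.decisions)
```

Keeping the previous verdict alongside each change is what makes every step reversible without re-running an agent.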
| ## 🏗️ Architecture | |
| ``` | |
| ┌────────────────────────────────────────────────────────────┐ | |
| │                         TECH STACK                         │ | |
| ├────────────────────────────────────────────────────────────┤ | |
| │  Frontend:      Gradio (interactive UI with live preview)  │ | |
| │  Orchestration: LangGraph (agent workflow management)      │ | |
| │  Models:        Claude API (reasoning) + Rule-based        │ | |
| │  Browser:       Playwright (crawling & extraction)         │ | |
| │  Hosting:       Hugging Face Spaces                        │ | |
| └────────────────────────────────────────────────────────────┘ | |
| ``` | |
| ### Agent Personas | |
| | Agent | Persona | Job | | |
| |-------|---------|-----| | |
| | **Agent 1** | Design Archaeologist | Discover pages, extract raw tokens | | |
| | **Agent 2** | Design System Librarian | Normalize, dedupe, structure tokens | | |
| | **Agent 3** | Senior DS Architect | Recommend upgrades (type scales, spacing, a11y) | | |
| | **Agent 4** | Automation Engineer | Generate final JSON for Figma/code | | |
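The four agents in the table can be read as a linear pipeline that passes a shared state from one stage to the next. The sketch below uses plain Python rather than LangGraph, and every function name, state key, and canned value in it is a hypothetical stand-in; the project's real state definitions live in `agents/state.py`.

```python
from typing import Callable, TypedDict


class PipelineState(TypedDict, total=False):
    """Hypothetical shared state handed from agent to agent."""

    url: str
    raw_tokens: list[str]
    tokens: list[str]
    recommendations: list[str]
    export: dict[str, list[str]]


def archaeologist(state: PipelineState) -> PipelineState:
    # Agent 1: stand-in for crawling and extraction; returns canned raw values.
    return {**state, "raw_tokens": ["#007BFF", "#007bff", "16px"]}


def librarian(state: PipelineState) -> PipelineState:
    # Agent 2: normalize case and drop duplicates while keeping first-seen order.
    unique = dict.fromkeys(t.lower() for t in state["raw_tokens"])
    return {**state, "tokens": list(unique)}


def architect(state: PipelineState) -> PipelineState:
    # Agent 3: propose upgrades; a fixed placeholder recommendation here.
    return {**state, "recommendations": ["consider an 8px spacing grid"]}


def engineer(state: PipelineState) -> PipelineState:
    # Agent 4: assemble the final export structure.
    return {**state, "export": {"tokens": state["tokens"]}}


STAGES: list[Callable[[PipelineState], PipelineState]] = [
    archaeologist,
    librarian,
    architect,
    engineer,
]


def run_pipeline(url: str) -> PipelineState:
    state: PipelineState = {"url": url}
    for stage in STAGES:
        state = stage(state)
    return state


result = run_pipeline("https://example.com")
print(result["export"])
```

In the tool itself a human review step sits between stages; the loop above leaves that out to keep the data flow visible.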
| ## 🚀 Quick Start | |
| ### Prerequisites | |
| - Python 3.11+ | |
| - Node.js (for some dependencies) | |
| ### Installation | |
| ```bash | |
| # Clone the repository | |
| git clone <repo-url> | |
| cd design-system-extractor | |
| # Create virtual environment | |
| python -m venv venv | |
| source venv/bin/activate # or `venv\Scripts\activate` on Windows | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Install Playwright browsers | |
| playwright install chromium | |
| # Copy environment file | |
| cp config/.env.example config/.env | |
| # Edit config/.env and add your ANTHROPIC_API_KEY | |
| ``` | |
| ### Running | |
| ```bash | |
| python app.py | |
| ``` | |
| Open `http://localhost:7860` in your browser. | |
| ## 📖 Usage Guide | |
| ### Stage 1: Discovery | |
| 1. Enter your website URL (e.g., `https://example.com`) | |
| 2. Click "Discover Pages" | |
| 3. Review discovered pages and select which to extract from | |
| 4. Ensure you have a mix of page types (homepage, listing, detail, etc.) | |
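Page discovery comes down to collecting same-site links and removing duplicates. The tool does this in a Playwright-driven browser; the sketch below is a standard-library stand-in that works on a static HTML string, and `LinkCollector` and `discover_pages` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover_pages(base_url: str, html: str) -> list[str]:
    """Return unique same-host http(s) URLs linked from `html`, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    pages: dict[str, None] = {}
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" so anchors do not create duplicates.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in {"http", "https"} and parsed.netloc == host:
            pages[absolute] = None
    return list(pages)


sample_html = """
<a href="/products">Listing</a>
<a href="/products/42#reviews">Detail</a>
<a href="https://other.example.org/">External</a>
<a href="mailto:hi@example.com">Mail</a>
<a href="/products">Duplicate</a>
"""
print(discover_pages("https://example.com/", sample_html))
```

Links rendered by JavaScript would not appear in static HTML, which is one reason to drive a real browser for this step.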
| ### Stage 2: Extraction | |
| 1. Choose viewport (Desktop 1440px or Mobile 375px) | |
| 2. Click "Extract Tokens" | |
| 3. Review the extracted tokens: | |
| - **Colors**: With frequency, context, and WCAG AA compliance | |
| - **Typography**: Font families, sizes, weights | |
| - **Spacing**: Values with 8px grid fit indicators | |
| 4. Accept or reject individual tokens | |
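The AA compliance and 8px grid-fit indicators from this stage can be approximated as follows. This is a minimal sketch, not the project's `core/color_utils.py`: it assumes "AA" refers to the WCAG 2.x contrast thresholds (4.5:1 for normal text, 3:1 for large text) and that "grid fit" means divisibility by 8. All function names are hypothetical.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color such as '#007bff' (WCAG 2.x formula)."""
    digits = hex_color.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize each channel. WCAG 2.x text uses 0.03928 as the cutoff;
    # 0.04045 gives the same result for 8-bit channel values.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """True if the pair meets the assumed AA threshold for the given text size."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


def fits_grid(pixels: int, base: int = 8) -> bool:
    """True if a spacing value lands exactly on the base grid."""
    return pixels % base == 0


print(round(contrast_ratio("#007bff", "#ffffff"), 2))
print(passes_aa("#007bff", "#ffffff"), passes_aa("#007bff", "#ffffff", large_text=True))
print(fits_grid(16), fits_grid(12))
```

Reporting the ratio rather than only a pass/fail flag lets a reviewer see how close a rejected color is to the threshold.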
| ### Stage 3: Export | |
| 1. Review final token set | |
| 2. Export as JSON | |
| 3. Import into Figma via Tokens Studio or a plugin of your choice | |
| ## 📁 Project Structure | |
| ``` | |
| design-system-extractor/ | |
| ├── app.py                  # Main Gradio application | |
| ├── requirements.txt | |
| ├── README.md | |
| │ | |
| ├── config/ | |
| │   ├── .env.example        # Environment template | |
| │   ├── agents.yaml         # Agent personas & settings | |
| │   └── settings.py         # Configuration loader | |
| │ | |
| ├── agents/ | |
| │   ├── state.py            # LangGraph state definitions | |
| │   ├── graph.py            # Workflow orchestration | |
| │   ├── crawler.py          # Agent 1: Page discovery | |
| │   ├── extractor.py        # Agent 1: Token extraction | |
| │   ├── normalizer.py       # Agent 2: Normalization | |
| │   ├── advisor.py          # Agent 3: Best practices | |
| │   └── generator.py        # Agent 4: JSON generation | |
| │ | |
| ├── core/ | |
| │   ├── token_schema.py     # Pydantic data models | |
| │   └── color_utils.py      # Color analysis utilities | |
| │ | |
| ├── ui/ | |
| │   └── (Gradio components) | |
| │ | |
| └── docs/ | |
|     └── CONTEXT.md          # Context file for AI assistance | |
| ``` | |
| ## 🔧 Configuration | |
| ### Environment Variables | |
| ```env | |
| # Required | |
| ANTHROPIC_API_KEY=your_key_here | |
| # Optional | |
| DEBUG=false | |
| LOG_LEVEL=INFO | |
| BROWSER_HEADLESS=true | |
| ``` | |
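For illustration, these variables could be read into a typed settings object along the following lines. This is a sketch under assumptions, not the contents of `config/settings.py`: the `Settings` class, `load_settings`, and the accepted boolean spellings are hypothetical, and loading `config/.env` into the process environment is left out.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    debug: bool = False
    log_level: str = "INFO"
    browser_headless: bool = True


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key:
        # The only required variable: fail early with a clear message.
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        debug=_as_bool(env.get("DEBUG", "false")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        browser_headless=_as_bool(env.get("BROWSER_HEADLESS", "true")),
    )


settings = load_settings({"ANTHROPIC_API_KEY": "your_key_here", "DEBUG": "true"})
print(settings.debug, settings.log_level, settings.browser_headless)
```

Passing the mapping explicitly, as in the last lines, makes the loader easy to test without touching real environment variables.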
| ### Agent Configuration | |
| Agent personas and behavior are defined in `config/agents.yaml`. This includes: | |
| - Extraction targets (colors, typography, spacing) | |
| - Naming conventions | |
| - Confidence thresholds | |
| - Upgrade options | |
| ## 🛠️ Development | |
| ### Running Tests | |
| ```bash | |
| pytest tests/ | |
| ``` | |
| ### Adding New Features | |
| 1. Update token schema in `core/token_schema.py` | |
| 2. Add agent logic in `agents/` | |
| 3. Update UI in `app.py` | |
| 4. Update `docs/CONTEXT.md` for AI assistance | |
| ## 📦 Output Format | |
| Tokens are exported in a platform-agnostic JSON format: | |
| ```json | |
| { | |
| "metadata": { | |
| "source_url": "https://example.com", | |
| "version": "v1-recovered", | |
| "viewport": "desktop" | |
| }, | |
| "colors": { | |
| "primary-500": { | |
| "value": "#007bff", | |
| "source": "detected", | |
| "contrast_white": 3.98 | |
| } | |
| }, | |
| "typography": { | |
| "heading-lg": { | |
| "fontFamily": "Inter", | |
| "fontSize": "24px", | |
| "fontWeight": 700 | |
| } | |
| }, | |
| "spacing": { | |
| "md": { | |
| "value": "16px", | |
| "source": "detected" | |
| } | |
| } | |
| } | |
| ``` | |
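Because the export is plain JSON, it can be consumed with a few lines of code. The sketch below turns a subset of the structure shown above into CSS custom properties; the `to_css_variables` function and the `--color-`, `--spacing-`, and `--font-` prefixes are assumptions made for this example, not a format the tool itself emits.

```python
import json

# A subset of the export format, inlined so the example is self-contained.
EXPORT = json.loads("""
{
  "colors": {"primary-500": {"value": "#007bff", "source": "detected"}},
  "typography": {
    "heading-lg": {"fontFamily": "Inter", "fontSize": "24px", "fontWeight": 700}
  },
  "spacing": {"md": {"value": "16px", "source": "detected"}}
}
""")


def to_css_variables(tokens: dict) -> str:
    """Render color, spacing, and typography tokens as a CSS :root block."""
    lines = [":root {"]
    for name, token in tokens.get("colors", {}).items():
        lines.append(f"  --color-{name}: {token['value']};")
    for name, token in tokens.get("spacing", {}).items():
        lines.append(f"  --spacing-{name}: {token['value']};")
    for name, token in tokens.get("typography", {}).items():
        lines.append(f"  --font-{name}-family: {token['fontFamily']};")
        lines.append(f"  --font-{name}-size: {token['fontSize']};")
        lines.append(f"  --font-{name}-weight: {token['fontWeight']};")
    lines.append("}")
    return "\n".join(lines)


css = to_css_variables(EXPORT)
print(css)
```

The same loop structure would serve for other targets, such as a Sass map or a Tailwind theme object, by changing only the line templates.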
| ## 🤝 Contributing | |
| Contributions are welcome! Please read the contribution guidelines first. | |
| ## 📄 License | |
| MIT | |
| --- | |
| Built with ❤️ for designers who've lost their source files. | |