metadata
title: Design System Extractor v2
emoji: 🎨
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
license: mit

Design System Extractor v2

🎨 A semi-automated, human-in-the-loop agentic system that reverse-engineers design systems from live websites.

🎯 What It Does

When you have a website but no design system documentation (common when the original Sketch/Figma files are lost), this tool helps you:

  1. Crawl your website to discover pages
  2. Extract design tokens (colors, typography, spacing, shadows)
  3. Review and validate extracted tokens with visual previews
  4. Upgrade your system with modern best practices (optional)
  5. Export production-ready JSON tokens for Figma/code

🧠 Philosophy

This is not a magic button: it's a design-aware co-pilot.

  • Agents propose → Humans decide
  • Every action is visible, reversible, and previewed
  • No irreversible automation
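The propose-then-decide loop can be illustrated with a small sketch. This is not the project's code: `Proposal`, `ReviewSession`, and their methods are hypothetical names, chosen only to show how a change can be previewed, applied on explicit acceptance, and undone.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Proposal:
    """A change suggested by an agent; nothing is applied until a human accepts it."""
    token: str
    old_value: Optional[str]
    new_value: str


@dataclass
class ReviewSession:
    """Hypothetical review state: current tokens plus a history of accepted changes."""
    tokens: Dict[str, str] = field(default_factory=dict)
    history: List[Proposal] = field(default_factory=list)

    def propose(self, token: str, new_value: str) -> Proposal:
        # Record the current value so the change can be reverted later.
        return Proposal(token, self.tokens.get(token), new_value)

    def preview(self, proposal: Proposal) -> str:
        # Visible: show the before/after pair without mutating anything.
        return f"{proposal.token}: {proposal.old_value} -> {proposal.new_value}"

    def accept(self, proposal: Proposal) -> None:
        # Applied only on an explicit human decision.
        self.tokens[proposal.token] = proposal.new_value
        self.history.append(proposal)

    def undo(self) -> None:
        # Reversible: restore the value recorded when the change was proposed.
        proposal = self.history.pop()
        if proposal.old_value is None:
            del self.tokens[proposal.token]
        else:
            self.tokens[proposal.token] = proposal.old_value
```

A rejected proposal is simply never passed to `accept`, so the token set stays untouched.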

🏗️ Architecture

┌──────────────────────────────────────────────────────────────┐
│                        TECH STACK                            │
├──────────────────────────────────────────────────────────────┤
│  Frontend:       Gradio (interactive UI with live preview)   │
│  Orchestration:  LangGraph (agent workflow management)       │
│  Models:         Claude API (reasoning) + Rule-based         │
│  Browser:        Playwright (crawling & extraction)          │
│  Hosting:        Hugging Face Spaces                         │
└──────────────────────────────────────────────────────────────┘

Agent Personas

| Agent   | Persona                 | Job                                             |
|---------|-------------------------|-------------------------------------------------|
| Agent 1 | Design Archaeologist    | Discover pages, extract raw tokens              |
| Agent 2 | Design System Librarian | Normalize, dedupe, structure tokens             |
| Agent 3 | Senior DS Architect     | Recommend upgrades (type scales, spacing, a11y) |
| Agent 4 | Automation Engineer     | Generate final JSON for Figma/code              |
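The hand-off between the four agents can be pictured as a shared state passed through a chain of steps. The sketch below deliberately uses plain Python functions instead of LangGraph so that it runs without dependencies; the function names, the state keys, and the `frequency` field are assumptions made for illustration, and the hard-coded raw colors stand in for a real browser extraction.

```python
import json
from collections import Counter
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def archaeologist(state: State) -> State:
    # Agent 1 stand-in: a real run would extract these values from rendered pages.
    raw = ["#FFF", "#ffffff", "#007BFF", "#007bff", "#007bff"]
    return {**state, "raw_colors": raw}


def normalize_hex(value: str) -> str:
    # Lowercase and expand 3-digit shorthand so equivalent colors compare equal.
    digits = value.strip().lower().lstrip("#")
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits


def librarian(state: State) -> State:
    # Agent 2 stand-in: normalize, then dedupe while keeping usage counts.
    counts = Counter(normalize_hex(c) for c in state["raw_colors"])
    return {**state, "colors": dict(counts)}


def architect(state: State) -> State:
    # Agent 3 stand-in: only proposes; nothing is applied automatically.
    notes = [f"{color} is used once; consider merging it"
             for color, n in state["colors"].items() if n == 1]
    return {**state, "recommendations": notes}


def engineer(state: State) -> State:
    # Agent 4 stand-in: serialize the reviewed tokens to JSON.
    payload = {"colors": {color: {"value": color, "frequency": n}
                          for color, n in state["colors"].items()}}
    return {**state, "export": json.dumps(payload, indent=2, sort_keys=True)}


PIPELINE: List[Callable[[State], State]] = [archaeologist, librarian, architect, engineer]


def run(state: State) -> State:
    # Each step returns a new state dict, leaving the previous one untouched.
    for step in PIPELINE:
        state = step(state)
    return state
```

Returning a fresh dict from every step keeps earlier states available, which fits the goal of keeping each action visible and reversible.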

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • Node.js (for some dependencies)

Installation

# Clone the repository
git clone <repo-url>
cd design-system-extractor

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers
playwright install chromium

# Copy environment file
cp config/.env.example config/.env
# Edit config/.env and add your ANTHROPIC_API_KEY

Running

python app.py

Open http://localhost:7860 in your browser.

📖 Usage Guide

Stage 1: Discovery

  1. Enter your website URL (e.g., https://example.com)
  2. Click "Discover Pages"
  3. Review discovered pages and select which to extract from
  4. Ensure you have a mix of page types (homepage, listing, detail, etc.)
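Page discovery comes down to collecting links and keeping only those on the same site. The sketch below shows that filtering step on a static HTML string using only the standard library. It is an assumption-laden illustration: the project itself crawls with Playwright, and `LinkCollector` and `discover_pages` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in the document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover_pages(base_url: str, html: str) -> List[str]:
    """Return unique same-host page URLs found in html, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(base_url).netloc
    seen = set()
    pages: List[str] = []
    for href in parser.hrefs:
        # Resolve relative links and drop #fragments so anchors do not create duplicates.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parts = urlparse(absolute)
        # Skip mailto:, javascript:, and links that leave the site.
        if parts.scheme not in ("http", "https") or parts.netloc != host:
            continue
        if absolute not in seen:
            seen.add(absolute)
            pages.append(absolute)
    return pages
```

This only sees links present in the HTML it is given; links injected by JavaScript require a real browser, which is one reason to crawl with Playwright.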

Stage 2: Extraction

  1. Choose a viewport (Desktop 1440px or Mobile 375px)
  2. Click "Extract Tokens"
  3. Review the extracted tokens:
    • Colors: with frequency, usage context, and WCAG AA contrast compliance
    • Typography: font families, sizes, weights
    • Spacing: values with 8px grid fit indicators
  4. Accept or reject individual tokens
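Two of the indicators above have simple definitions that can be sketched directly: the WCAG contrast ratio behind the AA check, and the test for whether a spacing value sits on an 8px grid. The function names are hypothetical and this is not necessarily how `core/color_utils.py` implements them; the sketch assumes 6-digit hex colors.

```python
def _linearize(channel_8bit: int) -> float:
    # Convert an 8-bit sRGB channel to linear light, as in the WCAG luminance formula.
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    # Assumes a 6-digit hex color such as "#1a2b3c".
    digits = hex_color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    # Ratio of the lighter to the darker luminance, each offset by 0.05.
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    # WCAG AA requires 4.5:1 for normal text and 3:1 for large text.
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


def fits_grid(px: float, base: int = 8) -> bool:
    # True when the value is an exact multiple of the grid base.
    return px % base == 0


def nearest_grid_value(px: float, base: int = 8) -> int:
    # Snap to the closest multiple; exact ties follow Python's round-half-to-even.
    return base * round(px / base)
```

For example, `passes_aa("#767676", "#ffffff")` is true while `passes_aa("#777777", "#ffffff")` is false, since those two greys fall just either side of the 4.5:1 threshold on white.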

Stage 3: Export

  1. Review final token set
  2. Export as JSON
  3. Import into Figma via Tokens Studio or another token plugin

📁 Project Structure

design-system-extractor/
├── app.py                          # Main Gradio application
├── requirements.txt
├── README.md
│
├── config/
│   ├── .env.example                # Environment template
│   ├── agents.yaml                 # Agent personas & settings
│   └── settings.py                 # Configuration loader
│
├── agents/
│   ├── state.py                    # LangGraph state definitions
│   ├── graph.py                    # Workflow orchestration
│   ├── crawler.py                  # Agent 1: Page discovery
│   ├── extractor.py                # Agent 1: Token extraction
│   ├── normalizer.py               # Agent 2: Normalization
│   ├── advisor.py                  # Agent 3: Best practices
│   └── generator.py                # Agent 4: JSON generation
│
├── core/
│   ├── token_schema.py             # Pydantic data models
│   └── color_utils.py              # Color analysis utilities
│
├── ui/
│   └── (Gradio components)
│
└── docs/
    └── CONTEXT.md                  # Context file for AI assistance

🔧 Configuration

Environment Variables

# Required
ANTHROPIC_API_KEY=your_key_here

# Optional
DEBUG=false
LOG_LEVEL=INFO
BROWSER_HEADLESS=true
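One way to read these variables into typed settings is sketched below. It is an assumption about how a loader could look, not a copy of `config/settings.py`: the `Settings` class and `load_settings` function are hypothetical, and the sketch reads from the process environment only, without parsing the `config/.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    debug: bool = False
    log_level: str = "INFO"
    browser_headless: bool = True


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    # Treat common truthy spellings as True; fall back to the default when unset.
    raw = env.get(name)
    return default if raw is None else raw.strip().lower() in _TRUE_VALUES


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Fail early when the one required variable is missing or blank.
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=key,
        debug=_flag(env, "DEBUG", False),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        browser_headless=_flag(env, "BROWSER_HEADLESS", True),
    )
```

Passing the environment in as a mapping keeps the loader easy to test with a plain dict.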

Agent Configuration

Agent personas and behavior are defined in config/agents.yaml. This includes:

  • Extraction targets (colors, typography, spacing)
  • Naming conventions
  • Confidence thresholds
  • Upgrade options

🛠️ Development

Running Tests

pytest tests/

Adding New Features

  1. Update token schema in core/token_schema.py
  2. Add agent logic in agents/
  3. Update UI in app.py
  4. Update docs/CONTEXT.md for AI assistance

📦 Output Format

Tokens are exported in a platform-agnostic JSON format:

{
  "metadata": {
    "source_url": "https://example.com",
    "version": "v1-recovered",
    "viewport": "desktop"
  },
  "colors": {
    "primary-500": {
      "value": "#007bff",
      "source": "detected",
      "contrast_white": 3.98
    }
  },
  "typography": {
    "heading-lg": {
      "fontFamily": "Inter",
      "fontSize": "24px",
      "fontWeight": 700
    }
  },
  "spacing": {
    "md": {
      "value": "16px",
      "source": "detected"
    }
  }
}
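Because the export is plain JSON, it is straightforward to consume in code. As an illustration, the sketch below turns an export shaped like the one above into CSS custom properties. The `tokens_to_css` function and the `--color-`, `--spacing-`, and `--font-` prefixes are assumptions made for this example, not part of the tool.

```python
import json
from typing import Any, Dict


def tokens_to_css(tokens: Dict[str, Any]) -> str:
    """Render color, spacing, and typography tokens as a :root block of CSS variables."""
    lines = [":root {"]
    for name, token in tokens.get("colors", {}).items():
        lines.append(f"  --color-{name}: {token['value']};")
    for name, token in tokens.get("spacing", {}).items():
        lines.append(f"  --spacing-{name}: {token['value']};")
    for name, token in tokens.get("typography", {}).items():
        # Each typography token expands into one variable per property.
        lines.append(f"  --font-{name}-family: {token['fontFamily']};")
        lines.append(f"  --font-{name}-size: {token['fontSize']};")
        lines.append(f"  --font-{name}-weight: {token['fontWeight']};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Inline sample mirroring the export structure; a real run would json.load a file.
    exported = json.loads("""
    {
      "colors": {"primary-500": {"value": "#007bff", "source": "detected"}},
      "typography": {"heading-lg": {"fontFamily": "Inter", "fontSize": "24px", "fontWeight": 700}},
      "spacing": {"md": {"value": "16px", "source": "detected"}}
    }
    """)
    print(tokens_to_css(exported))
```

Sections missing from the export are skipped, so a colors-only file still produces a valid block.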

🀝 Contributing

Contributions are welcome! Please read the contribution guidelines first.

📄 License

MIT


Built with ❤️ for designers who've lost their source files.