	---
	title: Design System Extractor v2
	emoji: 🎨
	colorFrom: purple
	colorTo: blue
	sdk: docker
	pinned: false
	license: mit
	---

	# Design System Extractor v2

	> 🎨 A semi-automated, human-in-the-loop agentic system that reverse-engineers design systems from live websites.

	## 🎯 What It Does

	When you have a website but no design system documentation (common when the original Sketch/Figma files are lost), this tool helps you:

	1. Crawl your website to discover pages
	2. Extract design tokens (colors, typography, spacing, shadows)
	3. Review and validate extracted tokens with visual previews
	4. Upgrade your system with modern best practices (optional)
	5. Export production-ready JSON tokens for Figma/code

	## 🧠 Philosophy

	This is not a magic button — it's a design-aware co-pilot.

	- Agents propose → Humans decide
	- Every action is visible, reversible, and previewed
	- No irreversible automation
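The propose/decide split above can be illustrated with a small review log. This is only a sketch under assumptions: `ReviewSession`, `decide`, `undo`, and `accepted` are hypothetical names chosen for this example and are not taken from the project code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReviewSession:
    """Hypothetical review log: agents fill `proposals`, a human records decisions."""

    proposals: dict[str, str]  # token name -> proposed value
    decisions: dict[str, bool] = field(default_factory=dict)
    # Each history entry stores the decision that was in place before a change.
    history: list[tuple[str, Optional[bool]]] = field(default_factory=list)

    def decide(self, name: str, accept: bool) -> None:
        """Record a human decision, remembering the previous state for undo."""
        if name not in self.proposals:
            raise KeyError(name)
        self.history.append((name, self.decisions.get(name)))
        self.decisions[name] = accept

    def undo(self) -> None:
        """Revert the most recent decision; do nothing if there is none."""
        if not self.history:
            return
        name, previous = self.history.pop()
        if previous is None:
            self.decisions.pop(name, None)
        else:
            self.decisions[name] = previous

    def accepted(self) -> dict[str, str]:
        """Return only the proposals a human explicitly accepted."""
        return {n: v for n, v in self.proposals.items() if self.decisions.get(n)}
```

Nothing reaches `accepted()` without an explicit decision, and every decision can be rolled back in reverse order.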

	## 🏗️ Architecture

	```
	┌──────────────────────────────────────────────────────────────┐
	│                          TECH STACK                          │
	├──────────────────────────────────────────────────────────────┤
	│  Frontend:      Gradio (interactive UI with live preview)    │
	│  Orchestration: LangGraph (agent workflow management)        │
	│  Models:        Claude API (reasoning) + Rule-based          │
	│  Browser:       Playwright (crawling & extraction)           │
	│  Hosting:       Hugging Face Spaces                          │
	└──────────────────────────────────────────────────────────────┘
	```

	### Agent Personas

	| Agent | Persona | Job |
	|-------|---------|-----|
	| Agent 1 | Design Archaeologist | Discover pages, extract raw tokens |
	| Agent 2 | Design System Librarian | Normalize, dedupe, structure tokens |
	| Agent 3 | Senior DS Architect | Recommend upgrades (type scales, spacing, a11y) |
	| Agent 4 | Automation Engineer | Generate final JSON for Figma/code |
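As an illustration of the "normalize, dedupe" job listed for Agent 2, here is a minimal sketch for hex colors. It assumes raw colors arrive as hex strings; `normalize_hex` and `dedupe_colors` are hypothetical helpers, not functions from `agents/normalizer.py`.

```python
from collections import Counter


def normalize_hex(color: str) -> str:
    """Lower-case a hex color and expand 3-digit shorthand to 6 digits."""
    value = color.strip().lower().lstrip("#")
    if len(value) == 3:
        value = "".join(ch * 2 for ch in value)
    if len(value) != 6 or any(ch not in "0123456789abcdef" for ch in value):
        raise ValueError(f"not a hex color: {color!r}")
    return f"#{value}"


def dedupe_colors(raw_colors: list[str]) -> list[tuple[str, int]]:
    """Return unique normalized colors with usage counts, most frequent first."""
    counts = Counter(normalize_hex(c) for c in raw_colors)
    return counts.most_common()


if __name__ == "__main__":
    print(dedupe_colors(["#FFF", "#ffffff", "#007BFF", "#fff "]))
```

Keeping the usage count next to each color is one way to support the frequency information shown during review.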

	## 🚀 Quick Start

	### Prerequisites

	- Python 3.11+
	- Node.js (for some dependencies)

	### Installation

	```bash
	# Clone the repository
	git clone <repo-url>
	cd design-system-extractor

	# Create virtual environment
	python -m venv venv
	source venv/bin/activate # or `venv\Scripts\activate` on Windows

	# Install dependencies
	pip install -r requirements.txt

	# Install Playwright browsers
	playwright install chromium

	# Copy environment file
	cp config/.env.example config/.env
	# Edit config/.env and add your ANTHROPIC_API_KEY
	```

	### Running

	```bash
	python app.py
	```

	Open `http://localhost:7860` in your browser.

	## 📖 Usage Guide

	### Stage 1: Discovery

	1. Enter your website URL (e.g., `https://example.com`)
	2. Click "Discover Pages"
	3. Review discovered pages and select which to extract from
	4. Ensure you have a mix of page types (homepage, listing, detail, etc.)
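The discovery step can be approximated as collecting same-site links from a page. The sketch below works on a static HTML string with the standard library only; the tool itself drives a browser through Playwright, so treat `LinkCollector` and `discover_links` as hypothetical stand-ins rather than the crawler's actual code.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect unique same-site links from anchor tags, in document order."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative URLs and drop "#fragment" so anchors do not create duplicates.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        same_site = urlparse(absolute).netloc == urlparse(self.base_url).netloc
        if same_site and absolute not in self.links:
            self.links.append(absolute)


def discover_links(base_url: str, html: str) -> list[str]:
    """Return the same-site pages linked from `html`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

External links and `mailto:` addresses are dropped because their host differs from the base URL's host.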

	### Stage 2: Extraction

	1. Choose viewport (Desktop 1440px or Mobile 375px)
	2. Click "Extract Tokens"
	3. Review the extracted tokens:
	   - Colors: with frequency, context, and WCAG AA contrast compliance
	   - Typography: font families, sizes, weights
	   - Spacing: values with 8px grid fit indicators
	4. Accept or reject individual tokens
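The AA compliance and 8px grid indicators mentioned above can be computed roughly as follows. The contrast formula is the WCAG 2.x definition of relative luminance and contrast ratio; the function names are hypothetical and this is not the code in `core/color_utils.py`.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a 6-digit hex color such as '#007bff'."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """AA requires at least 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


def fits_grid(pixels: int, base: int = 8) -> bool:
    """True when a spacing value is a whole multiple of the grid base."""
    return pixels % base == 0
```

For example, `fits_grid(16)` is true and `fits_grid(12)` is false, which is the kind of signal a grid fit indicator could surface.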

	### Stage 3: Export

	1. Review final token set
	2. Export as JSON
	3. Import into Figma via Tokens Studio or your own plugin

	## 📁 Project Structure

	```
	design-system-extractor/
	├── app.py                  # Main Gradio application
	├── requirements.txt
	├── README.md
	│
	├── config/
	│   ├── .env.example        # Environment template
	│   ├── agents.yaml         # Agent personas & settings
	│   └── settings.py         # Configuration loader
	│
	├── agents/
	│   ├── state.py            # LangGraph state definitions
	│   ├── graph.py            # Workflow orchestration
	│   ├── crawler.py          # Agent 1: Page discovery
	│   ├── extractor.py        # Agent 1: Token extraction
	│   ├── normalizer.py       # Agent 2: Normalization
	│   ├── advisor.py          # Agent 3: Best practices
	│   └── generator.py        # Agent 4: JSON generation
	│
	├── core/
	│   ├── token_schema.py     # Pydantic data models
	│   └── color_utils.py      # Color analysis utilities
	│
	├── ui/
	│   └── (Gradio components)
	│
	└── docs/
	    └── CONTEXT.md          # Context file for AI assistance
	```

	## 🔧 Configuration

	### Environment Variables

	```env
	# Required
	ANTHROPIC_API_KEY=your_key_here

	# Optional
	DEBUG=false
	LOG_LEVEL=INFO
	BROWSER_HEADLESS=true
	```
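For orientation, these variables could be read into a typed settings object along the following lines. This is a sketch under assumptions, not the contents of `config/settings.py`: the `Settings` class, `load_settings`, and the set of accepted truthy strings are all hypothetical, and the defaults simply mirror the values in the template above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes", "on"}  # assumption: accepted spellings of "true"


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    debug: bool = False
    log_level: str = "INFO"
    browser_headless: bool = True


def _as_bool(raw: str) -> bool:
    return raw.strip().lower() in _TRUTHY


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not api_key:
        # Fail early: the README lists this variable as required.
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        debug=_as_bool(env.get("DEBUG", "false")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        browser_headless=_as_bool(env.get("BROWSER_HEADLESS", "true")),
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the loader easy to test.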

	### Agent Configuration

	Agent personas and behavior are defined in `config/agents.yaml`. This includes:

	- Extraction targets (colors, typography, spacing)
	- Naming conventions
	- Confidence thresholds
	- Upgrade options

	## 🛠️ Development

	### Running Tests

	```bash
	pytest tests/
	```

	### Adding New Features

	1. Update token schema in `core/token_schema.py`
	2. Add agent logic in `agents/`
	3. Update UI in `app.py`
	4. Update `docs/CONTEXT.md` for AI assistance

	## 📦 Output Format

	Tokens are exported in a platform-agnostic JSON format:

	```json
	{
	  "metadata": {
	    "source_url": "https://example.com",
	    "version": "v1-recovered",
	    "viewport": "desktop"
	  },
	  "colors": {
	    "primary-500": {
	      "value": "#007bff",
	      "source": "detected",
	      "contrast_white": 3.98
	    }
	  },
	  "typography": {
	    "heading-lg": {
	      "fontFamily": "Inter",
	      "fontSize": "24px",
	      "fontWeight": 700
	    }
	  },
	  "spacing": {
	    "md": {
	      "value": "16px",
	      "source": "detected"
	    }
	  }
	}
	```
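One way to consume this export in code is to turn it into CSS custom properties. The sketch below assumes the key layout shown in the sample above; `tokens_to_css` and the `--color-`, `--spacing-`, and `--font-` prefixes are hypothetical choices for this example, not part of the tool.

```python
import json


def tokens_to_css(tokens: dict) -> str:
    """Render exported tokens as a CSS :root block of custom properties."""
    lines = [":root {"]
    for name, token in tokens.get("colors", {}).items():
        lines.append(f"  --color-{name}: {token['value']};")
    for name, token in tokens.get("spacing", {}).items():
        lines.append(f"  --spacing-{name}: {token['value']};")
    for name, token in tokens.get("typography", {}).items():
        lines.append(f"  --font-{name}-family: {token['fontFamily']};")
        lines.append(f"  --font-{name}-size: {token['fontSize']};")
        lines.append(f"  --font-{name}-weight: {token['fontWeight']};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # A trimmed copy of the sample export, inlined so the script runs on its own.
    exported = json.loads("""
    {
      "colors": {"primary-500": {"value": "#007bff", "source": "detected"}},
      "typography": {"heading-lg": {"fontFamily": "Inter", "fontSize": "24px", "fontWeight": 700}},
      "spacing": {"md": {"value": "16px", "source": "detected"}}
    }
    """)
    print(tokens_to_css(exported))
```

Sections missing from the export are skipped, so a colors-only file still produces a valid block.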

	## 🤝 Contributing

	Contributions are welcome! Please read the contribution guidelines first.

	## 📄 License

	MIT

	---

	Built with ❤️ for designers who've lost their source files.