	---
	title: SCoDA
	emoji: 🎨
	colorFrom: indigo
	colorTo: indigo
	sdk: gradio
	sdk_version: 6.5.1
	app_file: app.py
	pinned: false
	license: mit
	---

	# CoDA: Collaborative Data Visualization Agents

	A production-grade multi-agent system for automated data visualization from natural language queries.

	[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces)
	[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
	[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

	## Overview

	CoDA reframes data visualization as a collaborative multi-agent problem. Instead of treating it as a monolithic task, CoDA employs specialized LLM agents that work together:

	- Query Analyzer - Interprets natural language and extracts visualization intent
	- Data Processor - Extracts metadata without token-heavy data loading
	- VizMapping Agent - Maps semantics to visualization primitives
	- Search Agent - Retrieves relevant code patterns
	- Design Explorer - Generates aesthetic specifications
	- Code Generator - Synthesizes executable Python code
	- Debug Agent - Executes code and fixes errors
	- Visual Evaluator - Assesses quality and triggers refinement
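
The division of labor above can be pictured as a chain of stages that each read and extend a shared context. The following is an illustrative sketch only: the `Context` dictionary, the two stage functions, and their keyword rules are hypothetical stand-ins rather than CoDA's actual classes, and no LLM is called.

```python
from typing import Callable, Dict, List

Context = Dict[str, object]
Stage = Callable[[Context], Context]


def query_analyzer(ctx: Context) -> Context:
    # Hypothetical stand-in: a real agent would ask an LLM to extract intent.
    query = str(ctx["query"]).lower()
    ctx["intent"] = "trend" if "over time" in query else "comparison"
    return ctx


def viz_mapping(ctx: Context) -> Context:
    # Toy rule mapping the extracted intent to a chart type.
    ctx["chart_type"] = "line" if ctx["intent"] == "trend" else "bar"
    return ctx


def run_pipeline(ctx: Context, stages: List[Stage]) -> Context:
    # Each stage receives the context produced by the previous one.
    for stage in stages:
        ctx = stage(ctx)
    return ctx


result = run_pipeline(
    {"query": "Show sales trends over time"},
    [query_analyzer, viz_mapping],
)
print(result["chart_type"])
```

Passing one context object through every stage is one simple way to let later agents build on earlier output; the real system may structure this differently.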

	## Quick Start

	### Installation

	```bash
	# Clone the repository
	git clone https://github.com/yourusername/CoDA.git
	cd CoDA

	# Install dependencies
	pip install -r requirements.txt

	# Configure API key
	cp .env.example .env
	# Edit .env and add your GROQ_API_KEY
	```

	### Usage

	#### Web Interface (Gradio)

	```bash
	python app.py
	```

	Open http://localhost:7860 in your browser.

	#### Command Line

	```bash
	python main.py --query "Create a bar chart of sales by category" --data sales.csv
	```

	Options:
	- `-q, --query`: Visualization query (required)
	- `-d, --data`: Data file path(s) (required)
	- `-o, --output`: Output directory (default: outputs)
	- `--max-iterations`: Refinement iterations (default: 3)
	- `--min-score`: Quality threshold (default: 7.0)
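
For reference, the options listed above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the option list, not the contents of `main.py`; the `build_parser` name and the use of `nargs="+"` for `--data` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate a visualization from a query")
    parser.add_argument("-q", "--query", required=True, help="Visualization query")
    # Assumption: "path(s)" means one or more files may be passed.
    parser.add_argument("-d", "--data", required=True, nargs="+", help="Data file path(s)")
    parser.add_argument("-o", "--output", default="outputs", help="Output directory")
    parser.add_argument("--max-iterations", type=int, default=3, help="Refinement iterations")
    parser.add_argument("--min-score", type=float, default=7.0, help="Quality threshold")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["-q", "Create a bar chart", "-d", "sales.csv"])
print(args.output, args.max_iterations, args.min_score)
```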

	#### Python API

	```python
	from coda.orchestrator import CodaOrchestrator

	orchestrator = CodaOrchestrator()
	result = orchestrator.run(
	    query="Show sales trends over time",
	    data_paths=["sales_data.csv"],
	)

	if result.success:
	    print(f"Visualization saved to: {result.output_file}")
	    print(f"Quality Score: {result.scores['overall']}/10")
	```

	## Hugging Face Spaces Deployment

	1. Create a new Space on [Hugging Face](https://huggingface.co/new-space)
	2. Select "Gradio" as the SDK
	3. Upload all files from this repository
	4. Add `GROQ_API_KEY` as a Secret in Space Settings
	5. The Space will automatically build and deploy

	## Architecture

	```
	Natural Language Query + Data Files
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ Query Analyzer    │ ─── Extracts intent, TODO list
	       └───────────────────┘
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ Data Processor    │ ─── Metadata extraction (no full load)
	       └───────────────────┘
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ VizMapping        │ ─── Chart type, encodings
	       └───────────────────┘
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ Search Agent      │ ─── Code examples
	       └───────────────────┘
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ Design Explorer   │ ─── Colors, layout, styling
	       └───────────────────┘
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ Code Generator    │ ─── Python visualization code
	       └───────────────────┘
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ Debug Agent       │ ─── Execute & fix errors
	       └───────────────────┘
	                 │
	                 ▼
	       ┌───────────────────┐
	       │ Visual Evaluator  │ ─── Quality assessment
	       └───────────────────┘
	                 │
	          ───────┴───────
	         ↓ Feedback Loop ↓
	      (if quality < threshold)
	```
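
The feedback loop at the bottom of the diagram can be sketched as a bounded retry: regenerate while the evaluator's score is below the threshold, and stop once the iteration budget is spent. The `refine` function, its callbacks, and the toy scoring below are hypothetical; whether CoDA counts the first attempt as an iteration is an assumption made here.

```python
from typing import Callable, Tuple


def refine(
    generate: Callable[[str], str],
    evaluate: Callable[[str], Tuple[float, str]],
    min_score: float = 7.0,
    max_iterations: int = 3,
) -> Tuple[str, float, int]:
    """Regenerate until the score reaches min_score or iterations run out."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback = ""
    best_code, best_score = "", float("-inf")
    for iteration in range(1, max_iterations + 1):
        code = generate(feedback)
        score, feedback = evaluate(code)
        # Keep the best attempt seen so far in case no attempt passes.
        if score > best_score:
            best_code, best_score = code, score
        if score >= min_score:
            break
    return best_code, best_score, iteration


# Toy stand-ins: every round of feedback adds one "#fix" and raises the score.
def toy_generate(feedback: str) -> str:
    return "plot()" + feedback


def toy_evaluate(code: str) -> Tuple[float, str]:
    score = 5.0 + 1.5 * code.count("#fix")
    return score, code[len("plot()"):] + "#fix"


code, score, rounds = refine(toy_generate, toy_evaluate)
print(score, rounds)
```

The defaults mirror the documented `--min-score` and `--max-iterations` values.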

	## Configuration

	Environment variables (in `.env`):

	| Variable | Default | Description |
	|----------|---------|-------------|
	| `GROQ_API_KEY` | Required | Your Groq API key |
	| `CODA_DEFAULT_MODEL` | llama-3.3-70b-versatile | Text model |
	| `CODA_VISION_MODEL` | llama-3.2-90b-vision-preview | Vision model |
	| `CODA_MIN_OVERALL_SCORE` | 7.0 | Quality threshold |
	| `CODA_MAX_ITERATIONS` | 3 | Max refinement loops |
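
As an illustration of how these variables and defaults might be consumed, the sketch below reads them from a mapping such as `os.environ`. The `Settings` dataclass and `load_settings` function are hypothetical names, not CoDA's configuration module, and the sketch assumes the variables are already exported (it does not parse `.env` itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    default_model: str
    vision_model: str
    min_overall_score: float
    max_iterations: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    api_key = env.get("GROQ_API_KEY")
    if not api_key:
        # The table marks this variable as required, so fail early.
        raise RuntimeError("GROQ_API_KEY is required")
    return Settings(
        groq_api_key=api_key,
        default_model=env.get("CODA_DEFAULT_MODEL", "llama-3.3-70b-versatile"),
        vision_model=env.get("CODA_VISION_MODEL", "llama-3.2-90b-vision-preview"),
        min_overall_score=float(env.get("CODA_MIN_OVERALL_SCORE", "7.0")),
        max_iterations=int(env.get("CODA_MAX_ITERATIONS", "3")),
    )


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"GROQ_API_KEY": "example-key", "CODA_MAX_ITERATIONS": "5"})
print(settings.default_model, settings.max_iterations)
```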

	## Supported Data Formats

	- CSV (`.csv`)
	- JSON (`.json`)
	- Excel (`.xlsx`, `.xls`)
	- Parquet (`.parquet`)
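
A loader could pick the right reader by file extension. The sketch below only classifies paths using the extensions listed above; the `SUPPORTED_FORMATS` table, the format labels, and `detect_format` are hypothetical and do not reflect how the Data Processor is implemented.

```python
from pathlib import Path

# Extensions come from the list above; the labels are illustrative.
SUPPORTED_FORMATS = {
    ".csv": "csv",
    ".json": "json",
    ".xlsx": "excel",
    ".xls": "excel",
    ".parquet": "parquet",
}


def detect_format(path: str) -> str:
    # Compare case-insensitively so "DATA.CSV" is treated like "data.csv".
    suffix = Path(path).suffix.lower()
    try:
        return SUPPORTED_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported data format: {suffix or path!r}") from None


print(detect_format("sales_data.csv"))
```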

	## Requirements

	- Python 3.10+
	- Groq API key ([Get one free](https://console.groq.com))

	## License

	MIT License - See LICENSE for details.

	## Citation

	If you use CoDA in your research, please cite:

	```bibtex
	@article{chen2025coda,
	  title={CoDA: Agentic Systems for Collaborative Data Visualization},
	  author={Chen, Zichen and Chen, Jiefeng and Arik, Sercan {\"O}. and Sra, Misha and Pfister, Tomas and Yoon, Jinsung},
	  journal={arXiv preprint arXiv:2510.03194},
	  year={2025},
	  url={https://arxiv.org/abs/2510.03194},
	  doi={10.48550/arXiv.2510.03194}
	}
	```

	Paper: [arXiv:2510.03194](https://arxiv.org/abs/2510.03194)