open-navigator / README.md
jcbowyer's picture
Clean HuggingFace deployment without binary files
61d29fc
---
title: Open Navigator
emoji: πŸ›οΈ
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
---
# πŸ›οΈ Open Navigator
> **CommunityOne: The open path to everything local**
>
> AI-powered civic engagement platform with React + FastAPI web interface
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![React](https://img.shields.io/badge/React-18.2-61DAFB.svg)](https://reactjs.org)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.109.0-009688.svg)](https://fastapi.tiangolo.com)
## οΏ½ Quick Links
**[βš›οΈ Open Navigator β†’](https://www.communityone.com)** - **LIVE APPLICATION** (search, filters, heatmap, data exploration)
**[πŸ“– Documentation β†’](https://www.communityone.com/docs)** - Complete guides, architecture, and feature details
The documentation site includes:
- Features and capabilities
- Data sources and integrations
- Architecture and deployment options
- Policy topics and advocacy tools
- API reference and examples
---
## Quick Start
### Three Services
This project runs three separate services:
| Service | Port (Local) | Live URL | Description |
|---------|------|----------|-------------|
| **βš›οΈ Open Navigator** πŸš€ | 5173 | [www.communityone.com](https://www.communityone.com) | **MAIN APPLICATION** - Search, filters, heatmap, data exploration |
| **πŸ“š Documentation** | 3000 | [www.communityone.com/docs](https://www.communityone.com/docs) | Docusaurus site with complete guides and tutorials |
| **πŸ”₯ API Backend** | 8000 | [www.communityone.com/api](https://www.communityone.com/api) | FastAPI server with AI agents |
> **πŸ’‘ LIVE DEMO:** Visit **[www.communityone.com](https://www.communityone.com)** to use the application!
>
> **πŸ’» LOCAL DEV:** After running `./start-all.sh`, visit **http://localhost:5173**
## πŸš€ Deployment
**Deploy to Hugging Face Spaces** in 3 commands:
```bash
echo "HF_USERNAME=your_username" >> .env
./deploy-huggingface.sh
# Configure hardware and secrets at https://huggingface.co/spaces/YOUR_USERNAME/www.communityone.com
```
**Full deployment guides:**
- **[Hugging Face Spaces](website/docs/deployment/huggingface-spaces.md)** - Docker deployment (~$22/month)
- **[Databricks Apps](website/docs/deployment/databricks-apps.md)** - Enterprise deployment
- **[Local Development](website/docs/deployment/)** - Complete deployment documentation
The `deploy-huggingface.sh` script automatically:
- βœ… Tests builds locally (catches errors before pushing)
- βœ… Creates the Space on Hugging Face
- βœ… Pushes code and triggers automatic build (~10-15 min)
### Prerequisites
- Python 3.11+
- Node.js 18+
- Docker (optional)
- OpenAI API key
### Installation
**Option 1: Start Everything at Once (Recommended)**
```bash
# Clone repository
git clone https://github.com/getcommunityone/open-navigator.git
cd open-navigator
# Install dependencies
./install.sh # Python backend
cd frontend && npm install && cd .. # React app
cd website && npm install && cd .. # Documentation
# Setup git hooks for build protection (one-time)
./setup-git-hooks.sh
# Start all services in tmux
./start-all.sh
```
**Option 2: Using Makefile**
```bash
# Install
make install
make install-frontend
make install-docs
# Start all services
make start-all
# Or individually:
make dev # API only
make dev-frontend # React app only
make dev-docs # Docs only
```
**Option 3: Manual Setup**
```bash
# Python backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# React app
cd frontend && npm install && cd ..
# Documentation
cd website && npm install && cd ..
# Configure environment
cp .env.example .env
# Edit .env with your API keys
# Start services (separate terminals)
source .venv/bin/activate && python main.py serve # Terminal 1
cd frontend && npm run dev # Terminal 2
cd website && npm start # Terminal 3
```
### Access Points
**🌐 LIVE APPLICATION:**
- **πŸš€ Open Navigator:** https://www.communityone.com - Main application
- πŸ“š **Documentation:** https://www.communityone.com/docs - Guides and API reference
- πŸ”₯ **API Docs:** https://www.communityone.com/api/docs - FastAPI interactive documentation
**πŸ’» LOCAL DEVELOPMENT:**
- **πŸš€ Main App:** http://localhost:5173
- πŸ“š **Documentation:** http://localhost:3000
- πŸ”₯ **API Docs:** http://localhost:8000/docs
### Stop Services
```bash
./stop-all.sh
# or
make stop-all
```
---
## Usage
### Command Line Interface
Always activate the virtual environment first:
```bash
source .venv/bin/activate
```
**API Server**
```bash
python main.py serve --host 0.0.0.0 --port 8000
```
**Jurisdiction Discovery**
```bash
# Test run
python main.py discover-jurisdictions --limit 100
# Single state
python main.py discover-jurisdictions --state CA
# Full discovery (~30k jurisdictions)
python main.py discover-jurisdictions
# View statistics
python main.py discovery-stats
```
**Data Ingestion**
```bash
# Census data (90,000+ jurisdictions)
python -m discovery.census_ingestion
# NCES school districts (13,000+)
python -m discovery.nces_ingestion
# Pre-built meeting datasets
python discovery/meetingbank_ingestion.py
python discovery/city_scrapers_urls.py
python discovery/openstates_sources.py
# LocalView (requires Dataverse API key)
python discovery/localview_ingestion.py
```
**Scraping & Analysis**
```bash
# Scrape batch from discovered sites
python main.py scrape-batch --source discovered --limit 50
# Scrape single source
python main.py scrape --url "https://city.legistar.com" \
--state "CA" \
--municipality "San Francisco"
# Run analysis pipeline
python main.py analyze --targets-file examples/targets.json
# Generate heatmap
python main.py generate-heatmap --output heatmap.html
```
**Publishing Datasets**
```bash
# Publish to HuggingFace (requires HUGGINGFACE_TOKEN in .env)
python main.py publish-to-hf --dataset all
python main.py publish-to-hf --dataset discovered-urls
python main.py publish-to-hf --dataset census --sample
```
### API Usage
**Start a workflow:**
```bash
curl -X POST "http://localhost:8000/workflow/start" \
-H "Content-Type: application/json" \
-d '{
"scrape_targets": [
{
"url": "https://example-city.legistar.com",
"municipality": "Example City",
"state": "CA",
"platform": "legistar"
}
]
}'
```
**Query opportunities:**
```bash
curl "http://localhost:8000/opportunities?state=CA&urgency=critical"
```
**Get heatmap:**
```bash
curl "http://localhost:8000/heatmap" > heatmap.html
```
### Python API
```python
import asyncio
from agents.orchestrator import OrchestratorAgent
from agents.scraper import ScraperAgent
from agents.parser import ParserAgent
from agents.classifier import ClassifierAgent
# Initialize orchestrator
orchestrator = OrchestratorAgent()
# Register agents
orchestrator.register_agent(ScraperAgent())
orchestrator.register_agent(ParserAgent())
orchestrator.register_agent(ClassifierAgent())
# Execute pipeline
targets = [
{
"url": "https://city.legistar.com",
"municipality": "Example City",
"state": "CA",
"platform": "legistar"
}
]
results = await orchestrator.execute_pipeline(targets)
```
---
## Project Structure
```
open-navigator/
β”œβ”€β”€ agents/ # Multi-agent AI system
β”œβ”€β”€ api/ # FastAPI application
β”œβ”€β”€ frontend/ # React application (Open Navigator)
β”œβ”€β”€ website/ # Docusaurus documentation
β”œβ”€β”€ discovery/ # Data discovery modules
β”œβ”€β”€ extraction/ # Document extraction
β”œβ”€β”€ pipeline/ # Data pipeline components
β”œβ”€β”€ visualization/ # Heatmap and charts
β”œβ”€β”€ config/ # Configuration
β”œβ”€β”€ tests/ # Test suite
β”œβ”€β”€ main.py # CLI entry point
└── requirements.txt # Python dependencies
```
---
## Deployment Options
### 1. Databricks Apps (Production)
```bash
export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi...
export OPENAI_API_KEY=sk-...
./scripts/deploy-databricks-app.sh
```
See [DATABRICKS_APP_GUIDE.md](DATABRICKS_APP_GUIDE.md) for details.
### 2. Docker
```bash
docker-compose up -d
```
Starts:
- API server (port 8000)
- Qdrant vector database (port 6333)
- Jupyter notebook (port 8888)
### 3. Local Development
See [Quick Start](#quick-start) above.
---
## ⚑ Intel Arc GPU Optimization
**Run Llama 4 at NVIDIA-like speeds on Intel Arc integrated graphics!**
If you have **Intel Core Ultra 7** (or similar) with Arc Graphics + NPU, you can use **DuckDB + VSS** for 10-50x faster legislative analysis:
```bash
# Setup Intel-optimized environment
./scripts/intel_llm_setup.sh
source .venv-intel/bin/activate
# Run DuckDB vector search demo
python scripts/duckdb_vss_demo.py
# Run legislative analysis with LLM
python scripts/legislative_analysis_intel.py
```
**Why DuckDB for Local AI?**
- ⚑ **10-50x faster** than Postgres for context injection
- 🎯 **< 20ms** vector similarity search across 10K bills
- 🧠 **Embedded** - no server needed, runs locally
- πŸ€— **Hugging Face Integration** - query HF datasets directly
**Performance:**
- **Context Injection**: 20ms vs 500ms (Postgres) = **25x faster**
- **LLM Inference**: 1,200 tok/s (Arc GPU) vs 350 tok/s (CPU) = **3.4x faster**
- **Vector Search**: 18ms vs 800ms = **44x faster**
**Features:**
- Extract interest groups from legislative testimony
- Identify lobbyists and their positions
- Analyze support/oppose scores with confidence
- Detect tradeoffs and compromises
**See full guide:** [Intel Arc Optimization Guide](website/docs/guides/intel-arc-optimization.md)
---
## πŸ€– AI Integration (MCP Server)
**Connect your civic data to Claude and other AI assistants!**
Open Navigator includes a **Model Context Protocol (MCP)** server that lets AI assistants directly access your data:
```bash
# Install MCP dependencies
pip install mcp anthropic-mcp-sdk
# Run the server
python scripts/mcp/open_navigator_server.py
```
**What AI assistants can do:**
- πŸ›οΈ Search 90,000+ jurisdictions by name or location
- 🏒 Query 1.8M nonprofits with Form 990 data
- πŸ“œ Semantic search across 4.5M+ legislative documents
- πŸ“Š Get real-time statistics and analytics
- πŸ” Vector search meetings and bills with natural language
**Example queries to Claude:**
> "Find all cities named Springfield in the database"
> "Show me 501c3 nonprofits in San Francisco focused on education"
> "What bills related to oral health were introduced in California?"
**Configure Claude Desktop:**
Add to `~/.config/Claude/claude_desktop_config.json`:
```json
{
"mcpServers": {
"open-navigator": {
"command": "python",
"args": ["/path/to/open-navigator/scripts/mcp/open_navigator_server.py"],
"env": {
"DATABASE_URL": "postgresql://postgres:password@localhost:5433/open_navigator"
}
}
}
}
```
**See full guide:** [MCP Server Documentation](website/docs/integrations/mcp-server.md)
---
## Testing
```bash
# Run all tests
pytest
# With coverage
pytest --cov=agents --cov=pipeline --cov=visualization
# Specific test file
pytest tests/test_agents.py
```
---
## Configuration
Create `.env` file:
```bash
# OpenAI
OPENAI_API_KEY=sk-...
# Databricks (optional)
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=dapi...
# HuggingFace (optional)
HUGGINGFACE_TOKEN=hf_...
# Dataverse (optional)
DATAVERSE_API_KEY=...
```
---
## Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request
See [CONTRIBUTING.md](CONTRIBUTING.md) for details.
---
## Documentation
- **[Full Documentation](http://localhost:3000)** - Complete guides and API reference
- **[Architecture](http://localhost:3000/docs/architecture)** - System architecture overview
- **[Quick Start](http://localhost:3000/docs/quickstart)** - Detailed setup instructions
- **[Quick Reference](http://localhost:3000/docs/quick-reference)** - Command reference card
- **[MCP Server](http://localhost:3000/docs/integrations/mcp-server)** - AI assistant integration guide
- **[Deployment](http://localhost:3000/docs/deployment/databricks-apps)** - Production deployment guides
- **[Case Studies](http://localhost:3000/docs/case-studies/tuscaloosa-complete)** - Real-world examples
- [CONTRIBUTING.md](CONTRIBUTING.md) - How to contribute
---
## Citations
This project uses several open datasets and research contributions. **Please see [CITATIONS.md](CITATIONS.md) for complete citation information.**
**Key Dataset:**
- **MeetingBank**: Hu et al., "MeetingBank: A Benchmark Dataset for Meeting Summarization", ACL 2023
- Used for meeting discovery and analysis
- 1,366 city council meetings from 6 U.S. cities
- See [CITATIONS.md](CITATIONS.md) for full citation and BibTeX
---
## License
Apache License 2.0 - see [LICENSE](LICENSE) file for details.
---
## Support
- GitHub Issues: [github.com/getcommunityone/open-navigator-for-engagement/issues](https://github.com/getcommunityone/open-navigator-for-engagement/issues)
- Email: johnbowyer@communityone.com
---
**Note**: This system is designed to support advocacy efforts. All generated content should be reviewed by humans before use.