Spaces:

CommunityOne
/

open-navigator

Running on CPU Upgrade

App Files Files Community

open-navigator / README.md

jcbowyer

Clean HuggingFace deployment without binary files

61d29fc 28 days ago

preview code

raw

history blame contribute delete

13.8 kB

metadata

title: Open Navigator
emoji: 🏛️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0

🏛️ Open Navigator

CommunityOne: The open path to everything local

AI-powered civic engagement platform with React + FastAPI web interface

� Quick Links

⚛️ Open Navigator → - LIVE APPLICATION (search, filters, heatmap, data exploration)

📖 Documentation → - Complete guides, architecture, and feature details

The documentation site includes:

Features and capabilities
Data sources and integrations
Architecture and deployment options
Policy topics and advocacy tools
API reference and examples

Quick Start

Three Services

This project runs three separate services:

Service	Port (Local)	Live URL	Description
⚛️ Open Navigator 🚀	5173	www.communityone.com	MAIN APPLICATION - Search, filters, heatmap, data exploration
📚 Documentation	3000	www.communityone.com/docs	Docusaurus site with complete guides and tutorials
🔥 API Backend	8000	www.communityone.com/api	FastAPI server with AI agents

💡 LIVE DEMO: Visit www.communityone.com to use the application!

💻 LOCAL DEV: After running ./start-all.sh, visit http://localhost:5173

🚀 Deployment

Deploy to Hugging Face Spaces in 3 commands:

echo "HF_USERNAME=your_username" >> .env
./deploy-huggingface.sh
# Configure hardware and secrets at https://huggingface.co/spaces/YOUR_USERNAME/www.communityone.com

Full deployment guides:

Hugging Face Spaces - Docker deployment (~$22/month)
Databricks Apps - Enterprise deployment
Local Development - Complete deployment documentation

The deploy-huggingface.sh script automatically:

✅ Tests builds locally (catches errors before pushing)
✅ Creates the Space on Hugging Face
✅ Pushes code and triggers automatic build (~10-15 min)

Prerequisites

Python 3.11+
Node.js 18+
Docker (optional)
OpenAI API key

Installation

Option 1: Start Everything at Once (Recommended)

# Clone repository
git clone https://github.com/getcommunityone/open-navigator.git
cd open-navigator

# Install dependencies
./install.sh                          # Python backend
cd frontend && npm install && cd ..   # React app
cd website && npm install && cd ..    # Documentation

# Setup git hooks for build protection (one-time)
./setup-git-hooks.sh

# Start all services in tmux
./start-all.sh

Option 2: Using Makefile

# Install
make install
make install-frontend
make install-docs

# Start all services
make start-all

# Or individually:
make dev           # API only
make dev-frontend  # React app only
make dev-docs      # Docs only

Option 3: Manual Setup

# Python backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# React app
cd frontend && npm install && cd ..

# Documentation
cd website && npm install && cd ..

# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Start services (separate terminals)
source .venv/bin/activate && python main.py serve  # Terminal 1
cd frontend && npm run dev                          # Terminal 2
cd website && npm start                             # Terminal 3

Access Points

🌐 LIVE APPLICATION:

🚀 Open Navigator: https://www.communityone.com - Main application
📚 Documentation: https://www.communityone.com/docs - Guides and API reference
🔥 API Docs: https://www.communityone.com/api/docs - FastAPI interactive documentation

💻 LOCAL DEVELOPMENT:

🚀 Main App: http://localhost:5173
📚 Documentation: http://localhost:3000
🔥 API Docs: http://localhost:8000/docs

Stop Services

./stop-all.sh
# or
make stop-all

Usage

Command Line Interface

Always activate the virtual environment first:

source .venv/bin/activate

API Server

python main.py serve --host 0.0.0.0 --port 8000

Jurisdiction Discovery

# Test run
python main.py discover-jurisdictions --limit 100

# Single state
python main.py discover-jurisdictions --state CA

# Full discovery (~30k jurisdictions)
python main.py discover-jurisdictions

# View statistics
python main.py discovery-stats

Data Ingestion

# Census data (90,000+ jurisdictions)
python -m discovery.census_ingestion

# NCES school districts (13,000+)
python -m discovery.nces_ingestion

# Pre-built meeting datasets
python discovery/meetingbank_ingestion.py
python discovery/city_scrapers_urls.py
python discovery/openstates_sources.py

# LocalView (requires Dataverse API key)
python discovery/localview_ingestion.py

Scraping & Analysis

# Scrape batch from discovered sites
python main.py scrape-batch --source discovered --limit 50

# Scrape single source
python main.py scrape --url "https://city.legistar.com" \
                      --state "CA" \
                      --municipality "San Francisco"

# Run analysis pipeline
python main.py analyze --targets-file examples/targets.json

# Generate heatmap
python main.py generate-heatmap --output heatmap.html

Publishing Datasets

# Publish to HuggingFace (requires HUGGINGFACE_TOKEN in .env)
python main.py publish-to-hf --dataset all
python main.py publish-to-hf --dataset discovered-urls
python main.py publish-to-hf --dataset census --sample

API Usage

Start a workflow:

curl -X POST "http://localhost:8000/workflow/start" \
     -H "Content-Type: application/json" \
     -d '{
       "scrape_targets": [
         {
           "url": "https://example-city.legistar.com",
           "municipality": "Example City",
           "state": "CA",
           "platform": "legistar"
         }
       ]
     }'

Query opportunities:

curl "http://localhost:8000/opportunities?state=CA&urgency=critical"

Get heatmap:

curl "http://localhost:8000/heatmap" > heatmap.html

Python API

import asyncio
from agents.orchestrator import OrchestratorAgent
from agents.scraper import ScraperAgent
from agents.parser import ParserAgent
from agents.classifier import ClassifierAgent

# Initialize orchestrator
orchestrator = OrchestratorAgent()

# Register agents
orchestrator.register_agent(ScraperAgent())
orchestrator.register_agent(ParserAgent())
orchestrator.register_agent(ClassifierAgent())

# Execute pipeline
targets = [
    {
        "url": "https://city.legistar.com",
        "municipality": "Example City",
        "state": "CA",
        "platform": "legistar"
    }
]

results = await orchestrator.execute_pipeline(targets)

Project Structure

open-navigator/
├── agents/                 # Multi-agent AI system
├── api/                   # FastAPI application
├── frontend/             # React application (Open Navigator)
├── website/              # Docusaurus documentation
├── discovery/            # Data discovery modules
├── extraction/           # Document extraction
├── pipeline/             # Data pipeline components
├── visualization/        # Heatmap and charts
├── config/               # Configuration
├── tests/                # Test suite
├── main.py              # CLI entry point
└── requirements.txt     # Python dependencies

Deployment Options

1. Databricks Apps (Production)

export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi...
export OPENAI_API_KEY=sk-...

./scripts/deploy-databricks-app.sh

See DATABRICKS_APP_GUIDE.md for details.

2. Docker

docker-compose up -d

Starts:

API server (port 8000)
Qdrant vector database (port 6333)
Jupyter notebook (port 8888)

3. Local Development

See Quick Start above.

⚡ Intel Arc GPU Optimization

Run Llama 4 at NVIDIA-like speeds on Intel Arc integrated graphics!

If you have Intel Core Ultra 7 (or similar) with Arc Graphics + NPU, you can use DuckDB + VSS for 10-50x faster legislative analysis:

# Setup Intel-optimized environment
./scripts/intel_llm_setup.sh
source .venv-intel/bin/activate

# Run DuckDB vector search demo
python scripts/duckdb_vss_demo.py

# Run legislative analysis with LLM
python scripts/legislative_analysis_intel.py

Why DuckDB for Local AI?

⚡ 10-50x faster than Postgres for context injection
🎯 < 20ms vector similarity search across 10K bills
🧠 Embedded - no server needed, runs locally
🤗 Hugging Face Integration - query HF datasets directly

Performance:

Context Injection: 20ms vs 500ms (Postgres) = 25x faster
LLM Inference: 1,200 tok/s (Arc GPU) vs 350 tok/s (CPU) = 3.4x faster
Vector Search: 18ms vs 800ms = 44x faster

Features:

Extract interest groups from legislative testimony
Identify lobbyists and their positions
Analyze support/oppose scores with confidence
Detect tradeoffs and compromises

See full guide: Intel Arc Optimization Guide

🤖 AI Integration (MCP Server)

Connect your civic data to Claude and other AI assistants!

Open Navigator includes a Model Context Protocol (MCP) server that lets AI assistants directly access your data:

# Install MCP dependencies
pip install mcp anthropic-mcp-sdk

# Run the server
python scripts/mcp/open_navigator_server.py

What AI assistants can do:

🏛️ Search 90,000+ jurisdictions by name or location
🏢 Query 1.8M nonprofits with Form 990 data
📜 Semantic search across 4.5M+ legislative documents
📊 Get real-time statistics and analytics
🔍 Vector search meetings and bills with natural language

Example queries to Claude:

"Find all cities named Springfield in the database"

"Show me 501c3 nonprofits in San Francisco focused on education"

"What bills related to oral health were introduced in California?"

Configure Claude Desktop:

Add to ~/.config/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "open-navigator": {
      "command": "python",
      "args": ["/path/to/open-navigator/scripts/mcp/open_navigator_server.py"],
      "env": {
        "DATABASE_URL": "postgresql://postgres:password@localhost:5433/open_navigator"
      }
    }
  }
}

See full guide: MCP Server Documentation

Testing

# Run all tests
pytest

# With coverage
pytest --cov=agents --cov=pipeline --cov=visualization

# Specific test file
pytest tests/test_agents.py

Configuration

Create .env file:

# OpenAI
OPENAI_API_KEY=sk-...

# Databricks (optional)
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=dapi...

# HuggingFace (optional)
HUGGINGFACE_TOKEN=hf_...

# Dataverse (optional)
DATAVERSE_API_KEY=...

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes
Add tests
Submit a pull request

See CONTRIBUTING.md for details.

Documentation

Full Documentation - Complete guides and API reference
Architecture - System architecture overview
Quick Start - Detailed setup instructions
Quick Reference - Command reference card
MCP Server - AI assistant integration guide
Deployment - Production deployment guides
Case Studies - Real-world examples
CONTRIBUTING.md - How to contribute

Citations

This project uses several open datasets and research contributions. Please see CITATIONS.md for complete citation information.

Key Dataset:

MeetingBank: Hu et al., "MeetingBank: A Benchmark Dataset for Meeting Summarization", ACL 2023
- Used for meeting discovery and analysis
- 1,366 city council meetings from 6 U.S. cities
- See CITATIONS.md for full citation and BibTeX

License

Apache License 2.0 - see LICENSE file for details.

Support

GitHub Issues: github.com/getcommunityone/open-navigator-for-engagement/issues
Email: johnbowyer@communityone.com

Note: This system is designed to support advocacy efforts. All generated content should be reviewed by humans before use.