metadata
title: Open Navigator
emoji: 🏛️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0

🏛️ Open Navigator

CommunityOne: The open path to everything local

AI-powered civic engagement platform with React + FastAPI web interface

License: Apache 2.0 | Python 3.11+ | React | FastAPI

Quick Links

⚛️ Open Navigator → - LIVE APPLICATION (search, filters, heatmap, data exploration)

📖 Documentation → - Complete guides, architecture, and feature details

The documentation site includes:

  • Features and capabilities
  • Data sources and integrations
  • Architecture and deployment options
  • Policy topics and advocacy tools
  • API reference and examples

Quick Start

Three Services

This project runs three separate services:

| Service | Port (Local) | Live URL | Description |
|---|---|---|---|
| ⚛️ Open Navigator 🚀 | 5173 | www.communityone.com | MAIN APPLICATION - Search, filters, heatmap, data exploration |
| 📚 Documentation | 3000 | www.communityone.com/docs | Docusaurus site with complete guides and tutorials |
| 🔥 API Backend | 8000 | www.communityone.com/api | FastAPI server with AI agents |

💡 LIVE DEMO: Visit www.communityone.com to use the application!

💻 LOCAL DEV: After running ./start-all.sh, visit http://localhost:5173
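To confirm that the three local services came up, a small port check can help. The sketch below uses only the standard library and the local ports from the table above; the helper names (`is_port_open`, `service_status`) are illustrative and not part of the project.

```python
import socket

# Local ports taken from the services table above.
LOCAL_SERVICES = {
    "Open Navigator (React app)": 5173,
    "Documentation (Docusaurus)": 3000,
    "API Backend (FastAPI)": 8000,
}


def is_port_open(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def service_status(services: dict = LOCAL_SERVICES) -> dict:
    """Map each service name to whether its local port is reachable."""
    return {name: is_port_open(port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in service_status().items():
        print(f"{name}: {'up' if up else 'down'}")
```

A port that accepts connections only shows that some process is listening there; it does not prove the process is the expected service.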

🚀 Deployment

Deploy to Hugging Face Spaces in three steps:

echo "HF_USERNAME=your_username" >> .env
./deploy-huggingface.sh
# Configure hardware and secrets at https://huggingface.co/spaces/YOUR_USERNAME/www.communityone.com

Full deployment guides:

The deploy-huggingface.sh script automatically:

  • ✅ Tests builds locally (catches errors before pushing)
  • ✅ Creates the Space on Hugging Face
  • ✅ Pushes code and triggers automatic build (~10-15 min)

Prerequisites

  • Python 3.11+
  • Node.js 18+
  • Docker (optional)
  • OpenAI API key

Installation

Option 1: Start Everything at Once (Recommended)

# Clone repository
git clone https://github.com/getcommunityone/open-navigator.git
cd open-navigator

# Install dependencies
./install.sh                          # Python backend
cd frontend && npm install && cd ..   # React app
cd website && npm install && cd ..    # Documentation

# Setup git hooks for build protection (one-time)
./setup-git-hooks.sh

# Start all services in tmux
./start-all.sh

Option 2: Using Makefile

# Install
make install
make install-frontend
make install-docs

# Start all services
make start-all

# Or individually:
make dev           # API only
make dev-frontend  # React app only
make dev-docs      # Docs only

Option 3: Manual Setup

# Python backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# React app
cd frontend && npm install && cd ..

# Documentation
cd website && npm install && cd ..

# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Start services (separate terminals)
source .venv/bin/activate && python main.py serve  # Terminal 1
cd frontend && npm run dev                          # Terminal 2
cd website && npm start                             # Terminal 3

Access Points

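🌐 LIVE APPLICATION: www.communityone.com

💻 LOCAL DEVELOPMENT: http://localhost:5173 (application), http://localhost:3000 (documentation), http://localhost:8000 (API)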

Stop Services

./stop-all.sh
# or
make stop-all

Usage

Command Line Interface

Always activate the virtual environment first:

source .venv/bin/activate

API Server

python main.py serve --host 0.0.0.0 --port 8000

Jurisdiction Discovery

# Test run
python main.py discover-jurisdictions --limit 100

# Single state
python main.py discover-jurisdictions --state CA

# Full discovery (~30k jurisdictions)
python main.py discover-jurisdictions

# View statistics
python main.py discovery-stats

Data Ingestion

# Census data (90,000+ jurisdictions)
python -m discovery.census_ingestion

# NCES school districts (13,000+)
python -m discovery.nces_ingestion

# Pre-built meeting datasets
python discovery/meetingbank_ingestion.py
python discovery/city_scrapers_urls.py
python discovery/openstates_sources.py

# LocalView (requires Dataverse API key)
python discovery/localview_ingestion.py

Scraping & Analysis

# Scrape batch from discovered sites
python main.py scrape-batch --source discovered --limit 50

# Scrape single source
python main.py scrape --url "https://city.legistar.com" \
                      --state "CA" \
                      --municipality "San Francisco"

# Run analysis pipeline
python main.py analyze --targets-file examples/targets.json

# Generate heatmap
python main.py generate-heatmap --output heatmap.html

Publishing Datasets

# Publish to HuggingFace (requires HUGGINGFACE_TOKEN in .env)
python main.py publish-to-hf --dataset all
python main.py publish-to-hf --dataset discovered-urls
python main.py publish-to-hf --dataset census --sample

API Usage

Start a workflow:

curl -X POST "http://localhost:8000/workflow/start" \
     -H "Content-Type: application/json" \
     -d '{
       "scrape_targets": [
         {
           "url": "https://example-city.legistar.com",
           "municipality": "Example City",
           "state": "CA",
           "platform": "legistar"
         }
       ]
     }'

Query opportunities:

curl "http://localhost:8000/opportunities?state=CA&urgency=critical"

Get heatmap:

curl "http://localhost:8000/heatmap" > heatmap.html
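The same calls can be made from Python. The sketch below uses only the standard library and mirrors the endpoints and fields shown in the curl examples; the helper names (`build_workflow_request`, `opportunities_url`) are hypothetical and not part of the project. It builds the request objects without sending them, so it runs even when the API is not up.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # local API port from the Quick Start table


def build_workflow_request(targets: list, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /workflow/start request."""
    body = json.dumps({"scrape_targets": targets}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/workflow/start",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def opportunities_url(base_url: str = BASE_URL, **filters: str) -> str:
    """Build the GET /opportunities URL with optional query filters."""
    query = urllib.parse.urlencode(filters)
    url = f"{base_url}/opportunities"
    return f"{url}?{query}" if query else url


request = build_workflow_request([
    {
        "url": "https://example-city.legistar.com",
        "municipality": "Example City",
        "state": "CA",
        "platform": "legistar",
    }
])
print(request.get_method(), request.full_url)
print(opportunities_url(state="CA", urgency="critical"))
```

To send a request against a running server, pass it to `urllib.request.urlopen(request)`; the response format is not documented here, so inspect it before relying on specific fields.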

Python API

import asyncio

from agents.orchestrator import OrchestratorAgent
from agents.scraper import ScraperAgent
from agents.parser import ParserAgent
from agents.classifier import ClassifierAgent


async def main():
    # Initialize orchestrator
    orchestrator = OrchestratorAgent()

    # Register agents
    orchestrator.register_agent(ScraperAgent())
    orchestrator.register_agent(ParserAgent())
    orchestrator.register_agent(ClassifierAgent())

    # Execute pipeline
    targets = [
        {
            "url": "https://city.legistar.com",
            "municipality": "Example City",
            "state": "CA",
            "platform": "legistar"
        }
    ]

    # execute_pipeline is awaited, so it has to run inside a coroutine
    return await orchestrator.execute_pipeline(targets)


# Run the coroutine from synchronous code
results = asyncio.run(main())

Project Structure

open-navigator/
├── agents/              # Multi-agent AI system
├── api/                 # FastAPI application
├── frontend/            # React application (Open Navigator)
├── website/             # Docusaurus documentation
├── discovery/           # Data discovery modules
├── extraction/          # Document extraction
├── pipeline/            # Data pipeline components
├── visualization/       # Heatmap and charts
├── config/              # Configuration
├── tests/               # Test suite
├── main.py              # CLI entry point
└── requirements.txt     # Python dependencies

Deployment Options

1. Databricks Apps (Production)

export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi...
export OPENAI_API_KEY=sk-...

./scripts/deploy-databricks-app.sh

See DATABRICKS_APP_GUIDE.md for details.

2. Docker

docker-compose up -d

Starts:

  • API server (port 8000)
  • Qdrant vector database (port 6333)
  • Jupyter notebook (port 8888)

3. Local Development

See Quick Start above.


⚡ Intel Arc GPU Optimization

Run Llama 4 at NVIDIA-like speeds on Intel Arc integrated graphics!

If you have an Intel Core Ultra 7 (or similar) with Arc Graphics and an NPU, you can use DuckDB + VSS for 10-50x faster legislative analysis:

# Setup Intel-optimized environment
./scripts/intel_llm_setup.sh
source .venv-intel/bin/activate

# Run DuckDB vector search demo
python scripts/duckdb_vss_demo.py

# Run legislative analysis with LLM
python scripts/legislative_analysis_intel.py

Why DuckDB for Local AI?

  • ⚡ 10-50x faster than Postgres for context injection
  • 🎯 < 20ms vector similarity search across 10K bills
  • 🧠 Embedded - no server needed, runs locally
  • 🤗 Hugging Face Integration - query HF datasets directly
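For readers new to the term, vector similarity search ranks stored embeddings by how close they are to a query embedding. The sketch below is a brute-force, pure-Python illustration of that idea using cosine similarity. It is not how DuckDB's VSS extension is implemented, and the bill IDs and two-dimensional vectors are made-up toy data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, documents, k=3):
    """Return the IDs of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
bills = {
    "bill-a": [1.0, 0.0],
    "bill-b": [0.0, 1.0],
    "bill-c": [0.9, 0.1],
}
print(top_k([1.0, 0.0], bills, k=2))  # ['bill-a', 'bill-c']
```

A brute-force scan like this compares the query against every stored vector, which is why larger collections usually rely on an index instead.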

Performance:

  • Context Injection: 20ms vs 500ms (Postgres) = 25x faster
  • LLM Inference: 1,200 tok/s (Arc GPU) vs 350 tok/s (CPU) = 3.4x faster
  • Vector Search: 18ms vs 800ms = 44x faster
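The speedup factors above follow directly from the quoted figures. As a quick check, the sketch below recomputes them: latency speedup is baseline time divided by optimized time, and throughput speedup is optimized rate divided by baseline rate. The function names are illustrative.

```python
def latency_speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Lower is better, so speedup = baseline / optimized."""
    return baseline_ms / optimized_ms


def throughput_speedup(baseline_rate: float, optimized_rate: float) -> float:
    """Higher is better, so speedup = optimized / baseline."""
    return optimized_rate / baseline_rate


figures = {
    "Context Injection": latency_speedup(500, 20),     # 500ms (Postgres) vs 20ms
    "LLM Inference": throughput_speedup(350, 1200),    # 350 tok/s (CPU) vs 1,200 tok/s (Arc GPU)
    "Vector Search": latency_speedup(800, 18),         # 800ms vs 18ms
}

for name, ratio in figures.items():
    print(f"{name}: {ratio:.1f}x")
# Context Injection: 25.0x
# LLM Inference: 3.4x
# Vector Search: 44.4x
```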

Features:

  • Extract interest groups from legislative testimony
  • Identify lobbyists and their positions
  • Analyze support/oppose scores with confidence
  • Detect tradeoffs and compromises

See full guide: Intel Arc Optimization Guide


🤖 AI Integration (MCP Server)

Connect your civic data to Claude and other AI assistants!

Open Navigator includes a Model Context Protocol (MCP) server that lets AI assistants directly access your data:

# Install MCP dependencies
pip install mcp anthropic-mcp-sdk

# Run the server
python scripts/mcp/open_navigator_server.py

What AI assistants can do:

  • 🏛️ Search 90,000+ jurisdictions by name or location
  • 🏢 Query 1.8M nonprofits with Form 990 data
  • 📜 Semantic search across 4.5M+ legislative documents
  • 📊 Get real-time statistics and analytics
  • 🔍 Vector search meetings and bills with natural language

Example queries to Claude:

"Find all cities named Springfield in the database"

"Show me 501c3 nonprofits in San Francisco focused on education"

"What bills related to oral health were introduced in California?"

Configure Claude Desktop:

Add to ~/.config/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "open-navigator": {
      "command": "python",
      "args": ["/path/to/open-navigator/scripts/mcp/open_navigator_server.py"],
      "env": {
        "DATABASE_URL": "postgresql://postgres:password@localhost:5433/open_navigator"
      }
    }
  }
}
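If the Claude Desktop config file already lists other MCP servers, editing it by hand risks overwriting them. The sketch below merges the entry shown above into an existing config instead. The helper names are hypothetical, and the path and `DATABASE_URL` are the placeholders from this README, so adjust them to your setup.

```python
import json
from pathlib import Path


def mcp_entry(server_script: str, database_url: str) -> dict:
    """Build the server entry in the shape shown in the JSON above."""
    return {
        "command": "python",
        "args": [server_script],
        "env": {"DATABASE_URL": database_url},
    }


def merge_mcp_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with the server added, keeping any other servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, entry: dict) -> None:
    """Read the config if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_mcp_server(existing, name, entry), indent=2) + "\n")


entry = mcp_entry(
    "/path/to/open-navigator/scripts/mcp/open_navigator_server.py",
    "postgresql://postgres:password@localhost:5433/open_navigator",
)
print(json.dumps(merge_mcp_server({}, "open-navigator", entry), indent=2))
```

To apply it, call `update_config_file(Path.home() / ".config/Claude/claude_desktop_config.json", "open-navigator", entry)` and restart Claude Desktop.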

See full guide: MCP Server Documentation


Testing

# Run all tests
pytest

# With coverage
pytest --cov=agents --cov=pipeline --cov=visualization

# Specific test file
pytest tests/test_agents.py

Configuration

Create a .env file in the project root:

# OpenAI
OPENAI_API_KEY=sk-...

# Databricks (optional)
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=dapi...

# HuggingFace (optional)
HUGGINGFACE_TOKEN=hf_...

# Dataverse (optional)
DATAVERSE_API_KEY=...
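A quick pre-flight check can catch a missing key before a long-running command fails. The sketch below parses simple `KEY=VALUE` lines and reports which of the variables listed above are unset. It assumes `OPENAI_API_KEY` is the only required one, as the Prerequisites and the "(optional)" labels suggest; the function names are illustrative and not part of the project.

```python
from typing import Mapping

REQUIRED = ("OPENAI_API_KEY",)
OPTIONAL = (
    "DATABRICKS_HOST",
    "DATABRICKS_TOKEN",
    "HUGGINGFACE_TOKEN",
    "DATAVERSE_API_KEY",
)


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(env: Mapping[str, str]) -> dict:
    """Report which required and optional variables are missing or empty."""
    return {
        "missing_required": [k for k in REQUIRED if not env.get(k)],
        "missing_optional": [k for k in OPTIONAL if not env.get(k)],
    }


sample = "# OpenAI\nOPENAI_API_KEY=sk-test\n\n# HuggingFace (optional)\nHUGGINGFACE_TOKEN=hf_abc\n"
print(check_env(parse_env_file(sample)))
```

To check the real file, pass `Path(".env").read_text()` to `parse_env_file`. This parser handles only plain `KEY=VALUE` lines, not multi-line values or `export` prefixes.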

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

See CONTRIBUTING.md for details.


Documentation


Citations

This project uses several open datasets and research contributions. Please see CITATIONS.md for complete citation information.

Key Dataset:

  • MeetingBank: Hu et al., "MeetingBank: A Benchmark Dataset for Meeting Summarization", ACL 2023
    • Used for meeting discovery and analysis
    • 1,366 city council meetings from 6 U.S. cities
    • See CITATIONS.md for full citation and BibTeX

License

Apache License 2.0 - see LICENSE file for details.


Support


Note: This system is designed to support advocacy efforts. All generated content should be reviewed by humans before use.