
GitHub Copilot Instructions for Open Navigator

🚨 CRITICAL: Documentation Standards

⚠️ ALWAYS Use Docusaurus Format - NO EXCEPTIONS

MANDATORY RULE: When creating ANY documentation, guides, or markdown files:

✅ DO THIS:

  • Create ALL documentation in website/docs/ subdirectories
  • Add YAML frontmatter to every documentation file
  • Use kebab-case filenames
  • Place in appropriate subdirectory

❌ NEVER DO THIS:

  • ❌ Create .md files in project root (except README.md, LICENSE, CONTRIBUTING.md)
  • ❌ Create files like VARIABLE_MIGRATION.md, DOCKER_BUILD_TROUBLESHOOTING.md in root
  • ❌ Create UPPERCASE_FILE.md files anywhere
  • ❌ Skip frontmatter in documentation files

Documentation File Location Rules

When creating or editing documentation:

  1. Location: ALWAYS place documentation in website/docs/ with appropriate subdirectories

    • Deployment guides → website/docs/deployment/
    • How-to guides → website/docs/guides/
    • Data sources → website/docs/data-sources/
    • Case studies → website/docs/case-studies/
    • Integration docs → website/docs/integrations/
    • Development guides → website/docs/development/
  2. Frontmatter: ALWAYS include YAML frontmatter at the top:

    ```markdown
    ---
    sidebar_position: 1
    ---

    # Document Title
    ```

  3. File naming: ALWAYS use kebab-case (lowercase with hyphens)

    • ✅ huggingface-spaces.md
    • ✅ variable-migration.md
    • ✅ docker-troubleshooting.md
    • ❌ HUGGINGFACE_DEPLOYMENT.md
    • ❌ HuggingFaceSpaces.md
    • ❌ VARIABLE_MIGRATION.md
  4. Root directory: Keep root directory clean

    • ✅ Only keep these in root: README.md, LICENSE, CONTRIBUTING.md
    • ✅ Move ALL other docs to website/docs/
    • ❌ Don't create new .md files in project root
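
The location, naming, and frontmatter rules above can be checked mechanically. The following is a minimal sketch, not part of the project: the function names check_doc_path and has_frontmatter are hypothetical, and the checker assumes paths are given relative to the repository root. README.md files inside scripts/ subdirectories are out of scope for it.

```python
import re
from pathlib import Path

# Assumption: these constants mirror the rules listed above; the names are illustrative.
DOCS_ROOT = Path("website/docs")
ROOT_ALLOWED = {"README.md", "LICENSE", "CONTRIBUTING.md"}
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\.md$")


def check_doc_path(path: str) -> list[str]:
    """Return rule violations for a documentation path relative to the repo root."""
    p = Path(path)
    problems: list[str] = []
    if len(p.parts) == 1:
        # A file directly in the project root: only the allow-list is accepted.
        if p.name not in ROOT_ALLOWED:
            problems.append("markdown file in project root")
        return problems
    if p.parts[:2] != DOCS_ROOT.parts:
        problems.append("not under website/docs/")
    if not KEBAB_CASE.match(p.name):
        problems.append("filename is not kebab-case")
    return problems


def has_frontmatter(text: str) -> bool:
    """True if the text opens with a '---' delimited frontmatter block."""
    lines = [line.strip() for line in text.splitlines()]
    return len(lines) >= 2 and lines[0] == "---" and "---" in lines[1:]
```

An empty list from check_doc_path means the path passes all three path rules; has_frontmatter only checks for the delimiters, not for valid YAML between them.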

Examples

When asked to create troubleshooting documentation:

# ❌ WRONG
/home/developer/projects/open-navigator/DOCKER_BUILD_TROUBLESHOOTING.md

# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/deployment/docker-troubleshooting.md

When asked to create a migration guide:

# ❌ WRONG
/home/developer/projects/open-navigator/VARIABLE_MIGRATION.md

# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/deployment/variable-migration.md

When asked to document a new feature:

# ❌ WRONG
/home/developer/projects/open-navigator/NEW_FEATURE.md

# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/guides/new-feature.md
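
The mapping in these examples (an UPPERCASE root file becoming a kebab-case file under website/docs/) can be sketched as a small helper. This is an illustration only: suggest_doc_path is a hypothetical name, and the section argument still has to be chosen by the author.

```python
from pathlib import Path


def suggest_doc_path(root_filename: str, section: str) -> Path:
    """Map a root-level name such as VARIABLE_MIGRATION.md to a website/docs/ path."""
    stem = Path(root_filename).stem
    # Lowercase, then treat underscores and spaces as word separators.
    words = stem.lower().replace(" ", "_").split("_")
    kebab = "-".join(word for word in words if word)
    return Path("website/docs") / section / f"{kebab}.md"
```

The helper converts names literally, so DOCKER_BUILD_TROUBLESHOOTING.md would become docker-build-troubleshooting.md; shortening it to docker-troubleshooting.md, as in the example above, remains an editorial choice.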

Sidebar Organization

The documentation uses audience-based navigation in website/sidebars.ts:

  • 🚀 Getting Started: Landing pages (intro, dashboard)
  • 📊 For Policy Makers & Advocates: Non-technical content
  • 🛠️ For Developers & Technical Users: Technical content including:
    • Setup & Installation
    • Data Sources (Technical)
    • How-To Guides
    • Integrations
    • Deployment (uses autogenerated for deployment/ directory)
    • Development

Docs created in a directory that uses an autogenerated sidebar entry (such as deployment/) appear in the sidebar automatically.

Scripts Organization

⚠️ ALWAYS Organize Scripts into Logical Folders

MANDATORY RULE: When creating ANY scripts in the scripts/ directory:

✅ DO THIS:

  • Organize scripts into logical subdirectories by function
  • Use clear, descriptive folder names
  • Keep the root scripts/ directory clean
  • Add README.md to each subdirectory explaining its purpose

❌ NEVER DO THIS:

  • ❌ Create scripts directly in scripts/ root (except core workflow scripts)
  • ❌ Mix unrelated scripts together
  • ❌ Recreate scripts that already exist - search first!

Scripts Directory Structure

scripts/
├── data/                    # Data processing and migration
│   ├── aggregate_bills_from_postgres.py
│   ├── create_all_gold_tables.py
│   ├── migrate_to_events_naming.py
│   └── README.md
├── deployment/              # Deployment and setup
│   ├── deploy-databricks-app.sh
│   ├── setup-local.sh
│   ├── setup_openstates_db.sh
│   └── README.md
├── enrichment/              # Data enrichment (990s, nonprofits)
│   ├── enrich_nonprofits_async.py
│   ├── batch_download_990s.py
│   ├── extract_990_zips.sh
│   └── README.md
├── huggingface/             # HuggingFace dataset management
│   ├── upload_to_huggingface.py
│   ├── reorganize_for_huggingface.py
│   ├── finalize_huggingface_structure.py
│   └── README.md
├── maintenance/             # Cleanup and maintenance
│   ├── cleanup_disk_space.sh
│   ├── cleanup_frontend_junk.sh
│   └── README.md
└── README.md                # Overview of all script categories
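
The rule that every subdirectory carries its own README.md is easy to verify. A minimal sketch, assuming the layout shown above; subdirs_missing_readme is a hypothetical helper, not an existing project script.

```python
from pathlib import Path


def subdirs_missing_readme(scripts_dir: Path) -> list[str]:
    """Return names of immediate subdirectories of scripts_dir that lack a README.md."""
    return sorted(
        child.name
        for child in scripts_dir.iterdir()
        if child.is_dir() and not (child / "README.md").is_file()
    )
```

Calling subdirs_missing_readme(Path("scripts")) from the project root would list any folder that still needs a README. Generated folders such as __pycache__ would also be reported, so filter them out if they exist.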

Before Creating a New Script

  1. Search first: Use grep or file_search to find existing scripts
  2. Check for duplicates: Scripts like aggregate_bills_from_postgres.py already exist
  3. Use existing: Prefer modifying existing scripts over creating new ones
  4. Organize: If creating new, place in appropriate subdirectory
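
The "search first" step can be sketched with pathlib as an alternative to grep. This assumes scripts end in .py or .sh, as in the tree above; find_existing_scripts is a hypothetical name.

```python
from pathlib import Path


def find_existing_scripts(scripts_dir: Path, keyword: str) -> list[Path]:
    """Return scripts under scripts_dir whose filename contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(
        path
        for path in scripts_dir.rglob("*")
        if path.is_file()
        and path.suffix in {".py", ".sh"}
        and needle in path.name.lower()
    )
```

For example, find_existing_scripts(Path("scripts"), "aggregate") would surface aggregate_bills_from_postgres.py before a duplicate is written. It matches filenames only, so a content search with grep is still worthwhile.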

Code Style Preferences

Python

  • Use type hints for function parameters and return values
  • Follow PEP 8 naming conventions
  • Add docstrings to all public functions and classes
  • Prefer pathlib over os.path for file operations
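
A short function that follows all four Python preferences might look like the sketch below. count_lines is an illustrative example, not a function from the codebase.

```python
from pathlib import Path


def count_lines(path: Path) -> int:
    """Return the number of lines in a UTF-8 text file.

    Args:
        path: Location of the file to read.

    Returns:
        The line count; an empty file yields 0.
    """
    # pathlib's Path.open replaces os.path string handling plus the open() builtin.
    with path.open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)
```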

TypeScript/React

  • Use functional components with hooks
  • Prefer named exports over default exports
  • Use TypeScript interfaces for props
  • Follow the existing Tailwind CSS patterns

Documentation

  • Use emoji headers sparingly and consistently (🚀, 📊, 🛠️, etc.)
  • Include code examples with syntax highlighting
  • Add "Prerequisites" section for setup guides
  • Include "Next Steps" at the end of tutorials

Project Context

This is Open Navigator - a civic engagement platform that:

  • Tracks 90,000+ jurisdictions (cities, counties, states)
  • Monitors 1.8M nonprofit organizations
  • Analyzes meeting minutes and public records
  • Provides oral health policy tracking

Three Services Architecture

Always mention all three services when documenting deployment:

  1. Documentation (Docusaurus) - Port 3000
  2. Main Application (React + Vite) - Port 5173 (MAIN APP)
  3. API Backend (FastAPI) - Port 8000
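
A deployment guide can include a quick check that all three ports answer. The sketch below uses only the standard library; the ports come from the list above, while the host 127.0.0.1, the one-second timeout, and the function names are assumptions for local development.

```python
import socket

# Ports as listed above; service names are used only as labels.
SERVICES: dict[str, int] = {
    "Documentation (Docusaurus)": 3000,
    "Main Application (React + Vite)": 5173,
    "API Backend (FastAPI)": 8000,
}


def is_port_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_status() -> dict[str, bool]:
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(port) for name, port in SERVICES.items()}
```

An open port only shows that something is listening; it does not confirm that the expected service is healthy.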

Common Patterns

When suggesting deployment or setup:

  • Use start-all.sh to launch all services
  • Reference environment variables from .env.example
  • Mention that secrets go in .env (gitignored)
  • Include verification steps to test deployment
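
One possible verification step is to confirm that .env defines every variable named in .env.example. The sketch below assumes both files use plain KEY=value lines (optionally prefixed with export); the helper names are hypothetical.

```python
from pathlib import Path


def env_keys(text: str) -> set[str]:
    """Extract variable names from KEY=value lines, skipping blanks and comments."""
    keys: set[str] = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        keys.add(name)
    return keys


def missing_env_keys(example: Path, actual: Path) -> set[str]:
    """Return names defined in the example file but absent from the actual file."""
    example_keys = env_keys(example.read_text(encoding="utf-8"))
    actual_keys = env_keys(actual.read_text(encoding="utf-8"))
    return example_keys - actual_keys
```

Only names are compared, so secret values in .env are never printed or logged.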

Data Management Rules

CRITICAL - DO NOT DELETE APPLICATION CACHE:

  • ❌ NEVER recommend deleting /home/developer/projects/open-navigator/data/cache/
  • ❌ NEVER suggest rm -rf data/cache or similar commands
  • This directory contains critical application data from data processing pipelines
  • Deleting it will cause data loss and require expensive reprocessing
  • If disk space cleanup is needed, suggest cleaning:
    • Docker images/volumes: docker system prune
    • System caches: ~/.cache/pip, ~/.npm (npm's default cache location), ~/.cache/huggingface
    • Build artifacts: frontend/dist, website/build
    • NOT the application data cache
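
A cleanup script can enforce this rule with a guard that runs before any delete. The sketch below is one way to do it: would_touch_cache is a hypothetical helper, and it assumes data/cache is resolved relative to the project root passed in.

```python
from pathlib import Path

# Assumption: the protected directory is data/cache under the project root.
PROTECTED = Path("data/cache")


def would_touch_cache(target: Path, project_root: Path) -> bool:
    """True if deleting target would remove the application cache or part of it."""
    protected = (project_root / PROTECTED).resolve()
    resolved = (project_root / target).resolve()
    return (
        resolved == protected              # the cache directory itself
        or protected in resolved.parents   # something inside the cache
        or resolved in protected.parents   # a parent directory such as data/
    )
```

A cleanup routine would call this for each candidate path and skip, or abort on, any path for which it returns True. Build artifacts such as frontend/dist and website/build pass the check.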

File Organization Rules

What Goes Where

Root directory (minimal):

  • README.md (developer quick start)
  • LICENSE, CONTRIBUTING.md
  • Configuration files (Dockerfile, docker-compose.yml, requirements.txt, etc.)
  • Shell scripts (start-all.sh, deploy-huggingface.sh, etc.)

Documentation (website/docs/):

  • All markdown documentation
  • Organized by topic and audience
  • Automatically included in Docusaurus sidebar

Code (src/, api/, agents/, etc.):

  • Python modules and packages
  • Organized by functionality

When Creating New Features

  1. Code first: Implement the feature
  2. Tests: Add tests if applicable
  3. Documentation: Create docs in website/docs/ with proper frontmatter
  4. README: Update root README.md only if it affects quick start
  5. Examples: Add usage examples to documentation

Deployment Targets

When suggesting deployment options, consider:

  • Hugging Face Spaces: Full Docker deployment (all 3 apps)
  • Databricks Apps: React + FastAPI for enterprise
  • Local Development: Using start-all.sh with tmux

Always provide complete deployment instructions in website/docs/deployment/.