# GitHub Copilot Instructions for Open Navigator

## 🚨 CRITICAL: Documentation Standards

### ⚠️ ALWAYS Use Docusaurus Format - NO EXCEPTIONS

**MANDATORY RULE:** When creating ANY documentation, guides, or markdown files:

**✅ DO THIS:**
- Create ALL documentation in `website/docs/` subdirectories
- Add YAML frontmatter to every documentation file
- Use kebab-case filenames
- Place in appropriate subdirectory

**❌ NEVER DO THIS:**
- ❌ Create `.md` files in project root (except README.md, LICENSE, CONTRIBUTING.md)
- ❌ Create files like `VARIABLE_MIGRATION.md`, `DOCKER_BUILD_TROUBLESHOOTING.md` in root
- ❌ Create `UPPERCASE_FILE.md` files anywhere
- ❌ Skip frontmatter in documentation files

### Documentation File Location Rules

When creating or editing documentation:

1. **Location**: ALWAYS place documentation in `website/docs/` with appropriate subdirectories
   - Deployment guides → `website/docs/deployment/`
   - How-to guides → `website/docs/guides/`
   - Data sources → `website/docs/data-sources/`
   - Case studies → `website/docs/case-studies/`
   - Integration docs → `website/docs/integrations/`
   - Development guides → `website/docs/development/`

2. **Frontmatter**: ALWAYS include YAML frontmatter at the top:
   ```markdown
   ---
   sidebar_position: 1
   ---

   # Document Title
   ```

3. **File naming**: ALWAYS use kebab-case (lowercase with hyphens)
   - ✅ `huggingface-spaces.md`
   - ✅ `variable-migration.md`
   - ✅ `docker-troubleshooting.md`
   - ❌ `HUGGINGFACE_DEPLOYMENT.md`
   - ❌ `HuggingFaceSpaces.md`
   - ❌ `VARIABLE_MIGRATION.md`

4. **Root directory**: Keep root directory clean
   - ✅ Only keep these in root: README.md, LICENSE, CONTRIBUTING.md
   - ✅ Move ALL other docs to `website/docs/`
   - ❌ Don't create new `.md` files in project root

### Examples

**When asked to create troubleshooting documentation:**
```bash
# ❌ WRONG
/home/developer/projects/open-navigator/DOCKER_BUILD_TROUBLESHOOTING.md

# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/deployment/docker-troubleshooting.md
```

**When asked to create a migration guide:**
```bash
# ❌ WRONG
/home/developer/projects/open-navigator/VARIABLE_MIGRATION.md

# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/deployment/variable-migration.md
```

**When asked to document a new feature:**
```bash
# ❌ WRONG
/home/developer/projects/open-navigator/NEW_FEATURE.md

# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/guides/new-feature.md
```

### Sidebar Organization

The documentation uses audience-based navigation in `website/sidebars.ts`:

- **🚀 Getting Started**: Landing pages (intro, dashboard)
- **📊 For Policy Makers & Advocates**: Non-technical content
- **🛠️ For Developers & Technical Users**: Technical content including:
  - Setup & Installation
  - Data Sources (Technical)
  - How-To Guides
  - Integrations
  - Deployment (uses `autogenerated` for `deployment/` directory)
  - Development

When creating docs in a directory with `autogenerated`, they'll automatically appear in sidebar.
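
The location, naming, and frontmatter rules in this section can be checked mechanically before a documentation file is created. The following is a minimal sketch, not part of the project: the helper names `check_doc_path` and `has_frontmatter` are hypothetical, and it assumes paths are given relative to the repository root and refer to documentation files only (not, for example, a `README.md` inside a `scripts/` subdirectory).

```python
import re
from pathlib import PurePosixPath

# Root-level files explicitly allowed by the rules above.
ROOT_ALLOWED = {"README.md", "LICENSE", "CONTRIBUTING.md"}
# Lowercase words or digits separated by single hyphens, ending in .md.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\.md$")


def check_doc_path(relative_path: str) -> list[str]:
    """Return rule violations for a repo-relative documentation path (hypothetical helper)."""
    path = PurePosixPath(relative_path)
    if len(path.parts) == 1:
        # A file directly in the project root is only fine if it is allow-listed.
        return [] if path.name in ROOT_ALLOWED else ["markdown file in project root"]
    problems: list[str] = []
    if path.parts[:2] != ("website", "docs"):
        problems.append("not under website/docs/")
    if not KEBAB_CASE.match(path.name):
        problems.append("filename is not kebab-case")
    return problems


def has_frontmatter(text: str) -> bool:
    """True when the text opens with a '---' line and has a later closing '---' line."""
    lines = [line.strip() for line in text.splitlines()]
    return bool(lines) and lines[0] == "---" and "---" in lines[1:]
```

For example, `check_doc_path("website/docs/deployment/docker-troubleshooting.md")` returns an empty list, while `check_doc_path("VARIABLE_MIGRATION.md")` reports a root-level markdown file.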
## Scripts Organization

### ⚠️ ALWAYS Organize Scripts into Logical Folders

**MANDATORY RULE:** When creating ANY scripts in the `scripts/` directory:

**✅ DO THIS:**
- Organize scripts into logical subdirectories by function
- Use clear, descriptive folder names
- Keep the root `scripts/` directory clean
- Add README.md to each subdirectory explaining its purpose

**❌ NEVER DO THIS:**
- ❌ Create scripts directly in `scripts/` root (except core workflow scripts)
- ❌ Mix unrelated scripts together
- ❌ Recreate scripts that already exist - search first!

### Scripts Directory Structure

```
scripts/
├── data/                   # Data processing and migration
│   ├── aggregate_bills_from_postgres.py
│   ├── create_all_gold_tables.py
│   ├── migrate_to_events_naming.py
│   └── README.md
├── deployment/             # Deployment and setup
│   ├── deploy-databricks-app.sh
│   ├── setup-local.sh
│   ├── setup_openstates_db.sh
│   └── README.md
├── enrichment/             # Data enrichment (990s, nonprofits)
│   ├── enrich_nonprofits_async.py
│   ├── batch_download_990s.py
│   ├── extract_990_zips.sh
│   └── README.md
├── huggingface/            # HuggingFace dataset management
│   ├── upload_to_huggingface.py
│   ├── reorganize_for_huggingface.py
│   ├── finalize_huggingface_structure.py
│   └── README.md
├── maintenance/            # Cleanup and maintenance
│   ├── cleanup_disk_space.sh
│   ├── cleanup_frontend_junk.sh
│   └── README.md
└── README.md               # Overview of all script categories
```

### Before Creating a New Script

1. **Search first**: Use `grep` or `file_search` to find existing scripts
2. **Check for duplicates**: Scripts like `aggregate_bills_from_postgres.py` already exist
3. **Use existing**: Prefer modifying existing scripts over creating new ones
4. **Organize**: If creating new, place in appropriate subdirectory

## Code Style Preferences

### Python
- Use type hints for function parameters and return values
- Follow PEP 8 naming conventions
- Add docstrings to all public functions and classes
- Prefer pathlib over os.path for file operations

### TypeScript/React
- Use functional components with hooks
- Prefer named exports over default exports
- Use TypeScript interfaces for props
- Follow the existing Tailwind CSS patterns

### Documentation
- Use emoji headers sparingly and consistently (🚀, 📊, 🛠️, etc.)
- Include code examples with syntax highlighting
- Add "Prerequisites" section for setup guides
- Include "Next Steps" at the end of tutorials

## Project Context

This is **Open Navigator** - a civic engagement platform that:
- Tracks 90,000+ jurisdictions (cities, counties, states)
- Monitors 1.8M nonprofit organizations
- Analyzes meeting minutes and public records
- Provides oral health policy tracking

### Three Services Architecture

Always mention all three services when documenting deployment:

1. **Documentation** (Docusaurus) - Port 3000
2. **Main Application** (React + Vite) - Port 5173 (MAIN APP)
3. **API Backend** (FastAPI) - Port 8000

### Common Patterns

When suggesting deployment or setup:
- Use `start-all.sh` to launch all services
- Reference environment variables from `.env.example`
- Mention that secrets go in `.env` (gitignored)
- Include verification steps to test deployment

### Data Management Rules

**CRITICAL - DO NOT DELETE APPLICATION CACHE:**
- ❌ **NEVER** recommend deleting `/home/developer/projects/open-navigator/data/cache/`
- ❌ **NEVER** suggest `rm -rf data/cache` or similar commands
- This directory contains critical application data from data processing pipelines
- Deleting it will cause data loss and require expensive reprocessing
- If disk space cleanup is needed, suggest cleaning:
  - Docker images/volumes: `docker system prune`
  - System caches: `~/.cache/pip`, `~/.cache/npm`, `~/.cache/huggingface`
  - Build artifacts: `frontend/dist`, `website/build`
  - NOT the application data cache

## File Organization Rules

### What Goes Where

**Root directory** (minimal):
- README.md (developer quick start)
- LICENSE, CONTRIBUTING.md
- Configuration files (Dockerfile, docker-compose.yml, requirements.txt, etc.)
- Shell scripts (start-all.sh, deploy-huggingface.sh, etc.)

**Documentation** (`website/docs/`):
- All markdown documentation
- Organized by topic and audience
- Automatically included in Docusaurus sidebar

**Code** (`src/`, `api/`, `agents/`, etc.):
- Python modules and packages
- Organized by functionality

## When Creating New Features

1. **Code first**: Implement the feature
2. **Tests**: Add tests if applicable
3. **Documentation**: Create docs in `website/docs/` with proper frontmatter
4. **README**: Update root README.md only if it affects quick start
5. **Examples**: Add usage examples to documentation

## Deployment Targets

When suggesting deployment options, consider:
- **Hugging Face Spaces**: Full Docker deployment (all 3 apps)
- **Databricks Apps**: React + FastAPI for enterprise
- **Local Development**: Using start-all.sh with tmux

Always provide complete deployment instructions in `website/docs/deployment/`.
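
Deployment guides are expected to include verification steps covering all three services. One possible shape for such a step is sketched below. The ports come from the Three Services Architecture list. The host `127.0.0.1`, the helper names `is_port_open` and `check_services`, and the choice of a plain TCP check are assumptions made for illustration. An open port only shows that something is listening, not that the application behind it is healthy.

```python
import socket

# Ports from the Three Services Architecture list in this file.
SERVICES = {
    "Documentation (Docusaurus)": 3000,
    "Main Application (React + Vite)": 5173,
    "API Backend (FastAPI)": 8000,
}


def is_port_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True when a TCP connection to host:port succeeds (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def check_services(host: str = "127.0.0.1") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(port, host) for name, port in SERVICES.items()}


if __name__ == "__main__":
    # Assumed local host; adjust for a remote deployment target.
    for name, up in check_services().items():
        print(f"{'OK  ' if up else 'DOWN'} {name}")
```

Running the script after `start-all.sh` prints one status line per service. A guide could follow it with an HTTP request against each service for a stronger check.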