# GitHub Copilot Instructions for Open Navigator
## 🚨 CRITICAL: Documentation Standards
### ⚠️ ALWAYS Use Docusaurus Format - NO EXCEPTIONS
**MANDATORY RULE:** When creating ANY documentation, guides, or markdown files:
**✅ DO THIS:**
- Create ALL documentation in `website/docs/` subdirectories
- Add YAML frontmatter to every documentation file
- Use kebab-case filenames
- Place in appropriate subdirectory
**❌ NEVER DO THIS:**
- ❌ Create `.md` files in project root (except README.md, LICENSE, CONTRIBUTING.md)
- ❌ Create files like `VARIABLE_MIGRATION.md`, `DOCKER_BUILD_TROUBLESHOOTING.md` in root
- ❌ Create `UPPERCASE_FILE.md` files anywhere
- ❌ Skip frontmatter in documentation files
### Documentation File Location Rules
When creating or editing documentation:
1. **Location**: ALWAYS place documentation in `website/docs/` with appropriate subdirectories
- Deployment guides → `website/docs/deployment/`
- How-to guides → `website/docs/guides/`
- Data sources → `website/docs/data-sources/`
- Case studies → `website/docs/case-studies/`
- Integration docs → `website/docs/integrations/`
- Development guides → `website/docs/development/`
2. **Frontmatter**: ALWAYS include YAML frontmatter at the top:
```markdown
---
sidebar_position: 1
---
# Document Title
```
3. **File naming**: ALWAYS use kebab-case (lowercase with hyphens)
- ✅ `huggingface-spaces.md`
- ✅ `variable-migration.md`
- ✅ `docker-troubleshooting.md`
- ❌ `HUGGINGFACE_DEPLOYMENT.md`
- ❌ `HuggingFaceSpaces.md`
- ❌ `VARIABLE_MIGRATION.md`
4. **Root directory**: Keep root directory clean
- ✅ Only keep these in root: README.md, LICENSE, CONTRIBUTING.md
- ✅ Move ALL other docs to `website/docs/`
- ❌ Don't create new `.md` files in project root
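The four rules above can be checked mechanically. The following is a minimal sketch, not part of the repository: the function name `check_doc_path`, the regular expression, and the violation messages are all assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# Assumed constants mirroring the rules stated above.
DOCS_ROOT = PurePosixPath("website/docs")
ROOT_ALLOWED = {"README.md", "LICENSE", "CONTRIBUTING.md"}
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\.md$")


def check_doc_path(path: str, text: str = "") -> list[str]:
    """Return the rule violations for a documentation file (empty if none)."""
    p = PurePosixPath(path)
    # README.md, LICENSE and CONTRIBUTING.md may live in the project root.
    if len(p.parts) == 1 and p.name in ROOT_ALLOWED:
        return []
    problems: list[str] = []
    if not p.is_relative_to(DOCS_ROOT):
        problems.append("not under website/docs/")
    if not KEBAB_CASE.match(p.name):
        problems.append("filename is not kebab-case")
    # Frontmatter must open on the first line and be closed by a second '---'.
    if not (text.startswith("---\n") and "\n---" in text[3:]):
        problems.append("missing YAML frontmatter")
    return problems
```

A call such as `check_doc_path("VARIABLE_MIGRATION.md")` would report all three problems, while a kebab-case file under `website/docs/deployment/` that starts with frontmatter returns an empty list.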
### Examples
**When asked to create troubleshooting documentation:**
```bash
# ❌ WRONG
/home/developer/projects/open-navigator/DOCKER_BUILD_TROUBLESHOOTING.md
# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/deployment/docker-troubleshooting.md
```
**When asked to create a migration guide:**
```bash
# ❌ WRONG
/home/developer/projects/open-navigator/VARIABLE_MIGRATION.md
# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/deployment/variable-migration.md
```
**When asked to document a new feature:**
```bash
# ❌ WRONG
/home/developer/projects/open-navigator/NEW_FEATURE.md
# ✅ CORRECT
/home/developer/projects/open-navigator/website/docs/guides/new-feature.md
```
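The renames in these examples follow a mechanical pattern: lowercase the name, swap underscores for hyphens, and move the file under a `website/docs/` subdirectory. A rough sketch of that mapping is below; `suggest_docs_path` is a hypothetical helper, and the caller still has to choose the subdirectory.

```python
import re
from pathlib import PurePosixPath


def suggest_docs_path(filename: str, section: str) -> str:
    """Suggest a kebab-case location under website/docs/<section>/."""
    stem = PurePosixPath(filename).stem
    # Insert a hyphen at camelCase boundaries (e.g. "NewFeature" -> "New-Feature").
    stem = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", stem)
    # Lowercase and collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return str(PurePosixPath("website/docs") / section / f"{slug}.md")
```

The result is only a starting point. For `DOCKER_BUILD_TROUBLESHOOTING.md` it yields `docker-build-troubleshooting.md`, whereas the example above prefers the shorter `docker-troubleshooting.md`, so a manual trim may still be appropriate.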
### Sidebar Organization
The documentation uses audience-based navigation in `website/sidebars.ts`:
- **🚀 Getting Started**: Landing pages (intro, dashboard)
- **📊 For Policy Makers & Advocates**: Non-technical content
- **🛠️ For Developers & Technical Users**: Technical content including:
- Setup & Installation
- Data Sources (Technical)
- How-To Guides
- Integrations
- Deployment (uses `autogenerated` for `deployment/` directory)
- Development
Docs created in a directory that the sidebar configures as `autogenerated` appear in the sidebar automatically.
## Scripts Organization
### ⚠️ ALWAYS Organize Scripts into Logical Folders
**MANDATORY RULE:** When creating ANY scripts in the `scripts/` directory:
**✅ DO THIS:**
- Organize scripts into logical subdirectories by function
- Use clear, descriptive folder names
- Keep the root `scripts/` directory clean
- Add README.md to each subdirectory explaining its purpose
**❌ NEVER DO THIS:**
- ❌ Create scripts directly in `scripts/` root (except core workflow scripts)
- ❌ Mix unrelated scripts together
- ❌ Recreate scripts that already exist - search first!
### Scripts Directory Structure
```
scripts/
├── data/                  # Data processing and migration
│   ├── aggregate_bills_from_postgres.py
│   ├── create_all_gold_tables.py
│   ├── migrate_to_events_naming.py
│   └── README.md
├── deployment/            # Deployment and setup
│   ├── deploy-databricks-app.sh
│   ├── setup-local.sh
│   ├── setup_openstates_db.sh
│   └── README.md
├── enrichment/            # Data enrichment (990s, nonprofits)
│   ├── enrich_nonprofits_async.py
│   ├── batch_download_990s.py
│   ├── extract_990_zips.sh
│   └── README.md
├── huggingface/           # HuggingFace dataset management
│   ├── upload_to_huggingface.py
│   ├── reorganize_for_huggingface.py
│   ├── finalize_huggingface_structure.py
│   └── README.md
├── maintenance/           # Cleanup and maintenance
│   ├── cleanup_disk_space.sh
│   ├── cleanup_frontend_junk.sh
│   └── README.md
└── README.md              # Overview of all script categories
```
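Each subdirectory in this layout carries its own `README.md`. One way to confirm that rule still holds is a small check like the one below. It is a sketch only: `subdirs_missing_readme` is a hypothetical helper, and skipping hidden and `__pycache__`-style directories is an assumption.

```python
from pathlib import Path


def subdirs_missing_readme(scripts_dir: Path) -> list[str]:
    """Return names of immediate subdirectories that lack a README.md."""
    return sorted(
        d.name
        for d in scripts_dir.iterdir()
        # Ignore hidden and dunder directories such as .git or __pycache__.
        if d.is_dir()
        and not d.name.startswith((".", "__"))
        and not (d / "README.md").is_file()
    )
```

Calling `subdirs_missing_readme(Path("scripts"))` from the project root should return an empty list when the structure above is intact.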
### Before Creating a New Script
1. **Search first**: Use `grep` or `file_search` to find existing scripts
2. **Check for duplicates**: Scripts like `aggregate_bills_from_postgres.py` already exist
3. **Use existing**: Prefer modifying existing scripts over creating new ones
4. **Organize**: If creating new, place in appropriate subdirectory
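The "search first" step can also be done programmatically. This sketch assumes scripts end in `.py` or `.sh`, as they do in the tree above; `find_existing_scripts` is a hypothetical name, not an existing project utility.

```python
from pathlib import Path


def find_existing_scripts(keyword: str, scripts_dir: Path = Path("scripts")) -> list[Path]:
    """Return scripts under scripts_dir whose filename contains keyword."""
    needle = keyword.lower()
    return sorted(
        p
        for p in scripts_dir.rglob("*")
        # Match case-insensitively on the filename only, not the directory.
        if p.is_file() and p.suffix in {".py", ".sh"} and needle in p.name.lower()
    )
```

For example, `find_existing_scripts("aggregate")` would surface `scripts/data/aggregate_bills_from_postgres.py` before a duplicate gets written.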
## Code Style Preferences
### Python
- Use type hints for function parameters and return values
- Follow PEP 8 naming conventions
- Add docstrings to all public functions and classes
- Prefer pathlib over os.path for file operations
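As an illustration of these four preferences together, here is a short function written in that style. The function itself, `read_env_keys`, is hypothetical and does not exist in the project; it only demonstrates type hints, PEP 8 naming, a docstring, and `pathlib`.

```python
from pathlib import Path


def read_env_keys(env_file: Path) -> list[str]:
    """Return the variable names declared in a dotenv-style file.

    Blank lines, comment lines starting with ``#``, and lines without
    an ``=`` sign are skipped.
    """
    keys: list[str] = []
    for line in env_file.read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        # Everything before the first '=' is the variable name.
        keys.append(stripped.split("=", 1)[0].strip())
    return keys
```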
### TypeScript/React
- Use functional components with hooks
- Prefer named exports over default exports
- Use TypeScript interfaces for props
- Follow the existing Tailwind CSS patterns
### Documentation
- Use emoji headers sparingly and consistently (🚀, 📊, 🛠️, etc.)
- Include code examples with syntax highlighting
- Add "Prerequisites" section for setup guides
- Include "Next Steps" at the end of tutorials
## Project Context
This is **Open Navigator** - a civic engagement platform that:
- Tracks 90,000+ jurisdictions (cities, counties, states)
- Monitors 1.8M nonprofit organizations
- Analyzes meeting minutes and public records
- Provides oral health policy tracking
### Three-Service Architecture
Always mention all three services when documenting deployment:
1. **Documentation** (Docusaurus) - Port 3000
2. **Main Application** (React + Vite) - Port 5173 (MAIN APP)
3. **API Backend** (FastAPI) - Port 8000
### Common Patterns
When suggesting deployment or setup:
- Use `start-all.sh` to launch all services
- Reference environment variables from `.env.example`
- Mention that secrets go in `.env` (gitignored)
- Include verification steps to test deployment
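A verification step can be as simple as confirming that each of the three ports accepts a TCP connection. The sketch below assumes the services run on `127.0.0.1`. The `Service` class and both function names are illustrative, and an open port does not prove the service behind it is healthy.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    port: int


# Ports as listed in the three-service architecture above.
SERVICES = (
    Service("Documentation (Docusaurus)", 3000),
    Service("Main Application (React + Vite)", 5173),
    Service("API Backend (FastAPI)", 8000),
)


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services: tuple[Service, ...] = SERVICES) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {service.name: is_listening(service.port) for service in services}
```

After running `start-all.sh`, `check_services()` should map all three names to `True`. A `False` entry points at the service to inspect first.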
### Data Management Rules
**CRITICAL - DO NOT DELETE APPLICATION CACHE:**
- ❌ **NEVER** recommend deleting `/home/developer/projects/open-navigator/data/cache/`
- ❌ **NEVER** suggest `rm -rf data/cache` or similar commands
- This directory contains critical application data from data processing pipelines
- Deleting it will cause data loss and require expensive reprocessing
- If disk space cleanup is needed, suggest cleaning:
- Docker images/volumes: `docker system prune`
- System caches: `~/.cache/pip`, `~/.cache/npm`, `~/.cache/huggingface`
- Build artifacts: `frontend/dist`, `website/build`
- NOT the application data cache
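Cleanup tooling can enforce this rule with a guard that refuses any target overlapping the application cache. The following is a sketch under assumptions: `is_protected` is a hypothetical helper, and it also treats parent directories as protected, because deleting `data/` would take `data/cache/` with it.

```python
from pathlib import Path


def is_protected(target: Path, project_root: Path) -> bool:
    """Return True if deleting target would remove data/cache or part of it."""
    cache = (project_root / "data" / "cache").resolve()
    # An absolute target overrides project_root; a relative one is joined to it.
    resolved = (project_root / target).resolve()
    return (
        resolved == cache
        or cache in resolved.parents      # target is inside the cache
        or resolved in cache.parents      # target contains the cache
    )
```

A cleanup script could call this before every removal and skip any path for which it returns `True`, leaving build artifacts such as `frontend/dist` and `website/build` eligible for deletion.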
## File Organization Rules
### What Goes Where
**Root directory** (minimal):
- README.md (developer quick start)
- LICENSE, CONTRIBUTING.md
- Configuration files (Dockerfile, docker-compose.yml, requirements.txt, etc.)
- Shell scripts (start-all.sh, deploy-huggingface.sh, etc.)
**Documentation** (`website/docs/`):
- All markdown documentation
- Organized by topic and audience
- Included in the Docusaurus sidebar automatically when the directory uses `autogenerated`
**Code** (`src/`, `api/`, `agents/`, etc.):
- Python modules and packages
- Organized by functionality
## When Creating New Features
1. **Code first**: Implement the feature
2. **Tests**: Add tests if applicable
3. **Documentation**: Create docs in `website/docs/` with proper frontmatter
4. **README**: Update root README.md only if it affects quick start
5. **Examples**: Add usage examples to documentation
## Deployment Targets
When suggesting deployment options, consider:
- **Hugging Face Spaces**: Full Docker deployment (all 3 apps)
- **Databricks Apps**: React + FastAPI for enterprise
- **Local Development**: Using start-all.sh with tmux
Always provide complete deployment instructions in `website/docs/deployment/`.