# Enrichment Scripts
Scripts for enriching nonprofit data with additional metadata from various sources.
## 990 Forms Processing
### batch_download_990s.py
Downloads IRS 990 forms in bulk for offline processing.
### extract_990_zips.sh
Extracts downloaded 990 ZIP files into organized directories.
### build_990_local_index.py
Builds a searchable index of downloaded 990 forms.
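As a rough illustration of what a local index of this kind could look like, the sketch below walks a directory of extracted filings and maps each EIN to the files that mention it. It is not the actual logic of `build_990_local_index.py`. It assumes the extracted filings are XML files containing an `EIN` element, and the names `find_ein`, `build_index`, and `save_index` are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def find_ein(xml_path):
    """Return the text of the first element named EIN, or None.

    Assumption: each filing is an XML file with an <EIN> element.
    Any XML namespace prefix on the tag is ignored.
    """
    try:
        with open(xml_path, "rb") as fh:
            for _, elem in ET.iterparse(fh, events=("end",)):
                local_name = elem.tag.rsplit("}", 1)[-1]
                if local_name == "EIN" and elem.text:
                    return elem.text.strip()
    except ET.ParseError:
        # Skip files that are not well-formed XML.
        return None
    return None


def build_index(root_dir):
    """Map each EIN to the list of XML file paths found under root_dir."""
    index = {}
    for path in sorted(Path(root_dir).rglob("*.xml")):
        ein = find_ein(path)
        if ein:
            index.setdefault(ein, []).append(str(path))
    return index


def save_index(index, out_path):
    """Write the index as JSON so later runs can look up filings by EIN."""
    Path(out_path).write_text(json.dumps(index, indent=2, sort_keys=True))
```

A JSON file keyed by EIN keeps the sketch dependency-free. A real index over many filings would more likely use SQLite or a similar store.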
## Nonprofit Enrichment
### enrich_nonprofits_async.py
**Main enrichment script** - enriches nonprofits asynchronously from multiple sources.
**Usage:**
```bash
python scripts/enrichment/enrich_nonprofits_async.py
```
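The general pattern behind asynchronous, multi-source enrichment can be sketched with `asyncio`. This is a minimal illustration rather than the script's real implementation: the fetcher functions are hypothetical stand-ins that return canned data instead of calling any API, and `enrich_one`, `enrich_all`, and `max_concurrency` are names invented for the example.

```python
import asyncio


async def fetch_propublica(ein):
    # Hypothetical stand-in: a real fetcher would make an HTTP request here.
    await asyncio.sleep(0)
    return {"source": "propublica", "ein": ein}


async def fetch_everyorg(ein):
    # Hypothetical stand-in for a second data source.
    await asyncio.sleep(0)
    return {"source": "everyorg", "ein": ein}


async def enrich_one(ein, sources, limit):
    """Query every source for one nonprofit and merge the successful results."""
    async with limit:  # cap how many nonprofits are processed at once
        results = await asyncio.gather(
            *(source(ein) for source in sources),
            return_exceptions=True,  # one failing source should not sink the rest
        )
    merged = {"ein": ein, "sources": []}
    for result in results:
        if isinstance(result, Exception):
            continue
        merged["sources"].append(result["source"])
    return merged


async def enrich_all(eins, max_concurrency=5):
    """Enrich many nonprofits concurrently, preserving input order."""
    limit = asyncio.Semaphore(max_concurrency)
    sources = [fetch_propublica, fetch_everyorg]
    return await asyncio.gather(*(enrich_one(ein, sources, limit) for ein in eins))


if __name__ == "__main__":
    print(asyncio.run(enrich_all(["000000001", "000000002"])))
```

The semaphore bounds concurrency so that a large input list does not open too many requests at once, and `return_exceptions=True` lets a record keep whatever data the working sources returned.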
### Source-Specific Enrichment
- `enrich_nonprofits_propublica.py` - ProPublica Nonprofit Explorer
- `enrich_nonprofits_everyorg.py` - Every.org API
- `enrich_nonprofits_form990.py` - IRS Form 990 data
- `enrich_nonprofits_bigquery.py` - Google BigQuery IRS data
- `enrich_nonprofits_gt990.py` - GT990 API
- `enrich_nonprofits_logodev.py` - Logo enrichment
### Batch Processing
- `auto_enrich_nonprofits.sh` - Automated enrichment pipeline
- `enrich_all_states_local.sh` - State-by-state enrichment
- `enrich_nonprofits_no_auth.sh` - Enrichment without API authentication
- `enrich_alabama_nonprofits.sh` - Alabama-specific nonprofit enrichment
## Utilities
- `cleanup_nonprofit_files.py` - Clean up temporary enrichment files
- `discover_tuscaloosa_nonprofits.py` - Example discovery pipeline
- `run_tuscaloosa_pipeline.sh` - Full pipeline for Tuscaloosa, AL