Enrichment Scripts
Scripts for enriching nonprofit data with additional metadata from various sources.
990 Forms Processing
batch_download_990s.py
Downloads IRS 990 forms in bulk for offline processing.
extract_990_zips.sh
Extracts downloaded 990 ZIP files into organized directories.
build_990_local_index.py
Builds a searchable index of downloaded 990 forms.
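The index format used by build_990_local_index.py is not documented here, so the following is only a minimal sketch of one way such an index could be built. It assumes the extracted forms are XML files that contain an EIN element, and it maps each EIN to the files that mention it. The function names (extract_ein, build_index, save_index) and the JSON output are hypothetical, chosen for illustration.

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Optional, Union


def extract_ein(xml_data: Union[str, bytes]) -> Optional[str]:
    """Return the text of the first element whose local name is EIN, if any."""
    try:
        root = ET.fromstring(xml_data)
    except ET.ParseError:
        return None  # skip files that are not well-formed XML
    for element in root.iter():
        # Drop a "{namespace}" prefix so the match works with or without one.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "EIN" and element.text:
            return element.text.strip()
    return None


def build_index(forms_dir: Path) -> Dict[str, List[str]]:
    """Map each EIN to the relative paths of the XML files that contain it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(forms_dir.rglob("*.xml")):
        ein = extract_ein(path.read_bytes())
        if ein is not None:
            index[ein].append(path.relative_to(forms_dir).as_posix())
    return dict(index)


def save_index(index: Dict[str, List[str]], out_path: Path) -> None:
    """Write the index as JSON (an assumed format) so later lookups skip parsing."""
    out_path.write_text(json.dumps(index, indent=2, sort_keys=True), encoding="utf-8")
```

Keying the index by EIN is a design assumption: it lets an enrichment step look up all local filings for one organization without rescanning the directory tree.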
Nonprofit Enrichment
enrich_nonprofits_async.py
Main enrichment script - enriches nonprofits asynchronously from multiple sources.
Usage:
python scripts/enrichment/enrich_nonprofits_async.py
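To illustrate what enriching asynchronously from multiple sources can look like, here is a minimal sketch using asyncio. It is not the script's actual implementation: the source functions (fetch_registry, fetch_logo) are hypothetical stubs that make no network calls, and the record fields and the concurrency limit are assumptions.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


# Hypothetical stand-ins for real source clients; each returns a partial record.
async def fetch_registry(ein: str) -> Dict[str, Any]:
    await asyncio.sleep(0)  # placeholder for a network request
    return {"name": f"Org {ein}"}


async def fetch_logo(ein: str) -> Dict[str, Any]:
    await asyncio.sleep(0)  # placeholder for a network request
    if ein.endswith("0"):
        raise RuntimeError("source unavailable")  # simulate a failing source
    return {"logo_url": f"https://example.invalid/{ein}.png"}


Source = Callable[[str], Awaitable[Dict[str, Any]]]
SOURCES: List[Source] = [fetch_registry, fetch_logo]


async def enrich_one(ein: str, limiter: asyncio.Semaphore) -> Dict[str, Any]:
    """Query every source concurrently and merge whatever succeeded."""
    async with limiter:
        results = await asyncio.gather(
            *(source(ein) for source in SOURCES), return_exceptions=True
        )
    record: Dict[str, Any] = {"ein": ein}
    for result in results:
        if isinstance(result, BaseException):
            continue  # one failing source should not discard the others
        record.update(result)
    return record


async def enrich_all(eins: List[str], max_concurrency: int = 5) -> List[Dict[str, Any]]:
    """Enrich many organizations, at most max_concurrency at a time."""
    limiter = asyncio.Semaphore(max_concurrency)
    return await asyncio.gather(*(enrich_one(ein, limiter) for ein in eins))


records = asyncio.run(enrich_all(["111", "220"]))
```

Passing return_exceptions=True to asyncio.gather keeps a single unavailable source from aborting the whole batch, and the semaphore caps how many organizations are in flight at once, which matters when the real sources enforce rate limits.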
Source-Specific Enrichment
enrich_nonprofits_propublica.py - ProPublica Nonprofit Explorer
enrich_nonprofits_everyorg.py - Every.org API
enrich_nonprofits_form990.py - IRS Form 990 data
enrich_nonprofits_bigquery.py - Google BigQuery IRS data
enrich_nonprofits_gt990.py - GT990 API
enrich_nonprofits_logodev.py - Logo enrichment
Batch Processing
auto_enrich_nonprofits.sh - Automated enrichment pipeline
enrich_all_states_local.sh - State-by-state enrichment
enrich_nonprofits_no_auth.sh - Enrichment without API authentication
enrich_alabama_nonprofits.sh - Alabama-specific nonprofit enrichment
Utilities
cleanup_nonprofit_files.py - Clean up temporary enrichment files
discover_tuscaloosa_nonprofits.py - Example discovery pipeline
run_tuscaloosa_pipeline.sh - Full pipeline for Tuscaloosa, AL