Nonprofit Discovery Module
Automated discovery and enrichment of nonprofits and churches using 100% FREE open data APIs.
Why This Matters
When government says "no" to a policy (e.g., "We can't do dental screenings - legal risk"), you can instantly show citizens the nonprofits already doing it. This:
- Bypasses the technocratic veto: shows direct alternatives
- Creates social pressure: exposes inefficiency ("$5K legal review vs. $25 screening")
- Mobilizes citizens: provides volunteer and donation pathways
Data Sources (All Free)
1. ProPublica Nonprofit Explorer API (PRIMARY SOURCE)
What it provides:
- Financial data (revenue, expenses, assets) from IRS Form 990
- NTEE codes (standardized classification)
- EIN (tax ID) for verification
- 1.8+ million organizations, 10+ years of data
Coverage: All nonprofits with >$50K revenue or >$250K assets
API Docs: https://projects.propublica.org/nonprofits/api
Example Usage:
from discovery.nonprofit_discovery import NonprofitDiscovery

discovery = NonprofitDiscovery()

# Search all health organizations in Tuscaloosa
health_orgs = discovery.search_propublica(
    state="AL",
    city="Tuscaloosa",
    ntee_code="E"  # E = Health
)

# Get detailed financials for a specific org
details = discovery.get_propublica_org_details("63-0123456")
print(f"Revenue: ${details['filings'][0]['total_revenue']:,}")
Rate Limits: Free, unlimited. Be respectful: ~1 request/second suggested.
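To make the wire format concrete, here is a minimal sketch of the request URLs a wrapper such as `search_propublica` might build. It is not this module's implementation. It assumes the public v2 endpoints (`search.json` and `organizations/<ein>.json`) and the `state[id]` / `ntee[id]` query parameters as described in ProPublica's API docs. In that API, `ntee[id]` is a numbered major group (for example, 4 for Health) rather than a letter code such as `E32`, and there is no dedicated city parameter, so this sketch puts the city in the free-text `q` field and leaves finer filtering to the client. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the public ProPublica Nonprofit Explorer API (v2)
API_BASE = "https://projects.propublica.org/nonprofits/api/v2"


def build_search_url(query, state=None, ntee_group=None, page=0):
    """Build a search URL (hypothetical helper, not part of the module).

    query:      free-text search term, e.g. a city or organization name
    state:      two-letter postal code, sent as state[id]
    ntee_group: numbered NTEE major group, sent as ntee[id]
    page:       zero-indexed results page
    """
    params = {"q": query, "page": page}
    if state:
        params["state[id]"] = state
    if ntee_group is not None:
        params["ntee[id]"] = ntee_group
    return f"{API_BASE}/search.json?{urlencode(params)}"


def build_org_url(ein):
    """Build the detail URL for one organization; the EIN is sent without a hyphen."""
    digits = "".join(ch for ch in str(ein) if ch.isdigit())
    return f"{API_BASE}/organizations/{digits}.json"


url = build_search_url("Tuscaloosa", state="AL", ntee_group=4)
print(url)
print(build_org_url("63-0123456"))
```

Building the URL separately from sending it keeps the parameter mapping easy to inspect and to unit-test without touching the network.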
2. IRS Tax-Exempt Organization Search (TEOS)
What it provides:
- Official tax-exempt status
- Pub 78 verification (deductibility)
- Bulk download of all U.S. nonprofits
Source: https://www.irs.gov/charities-non-profits/tax-exempt-organization-search-bulk-data-downloads
Note: The ProPublica API already includes this data, so direct IRS access is only needed for bulk downloads.
3. Every.org Charity API
What it provides:
- Human-readable mission statements
- Organization logos and images
- Cause categories
- Cleaner data than raw IRS filings
API Docs: https://www.every.org/nonprofit-api
Note: May require API key for full access. Free tier available.
Example Usage:
# Search by location and cause
orgs = discovery.search_everyorg(
    location="Tuscaloosa, AL",
    causes=["health", "education", "youth"]
)
4. Local Service Directories (Manual Enrichment)
Findhelp.org (Aunt Bertha):
- Most comprehensive directory of local social services
- Includes specific services, hours, eligibility
- Search: https://www.findhelp.org/search?query=dental&location=Tuscaloosa,%20AL
- API access varies (request from Findhelp.org)
211 Alabama:
- Regional social services directory
- More detailed than IRS data (days/hours, languages, insurance)
- Search: https://www.211connects.org
Strategy: Use ProPublica for financial backbone, then manually enrich with Findhelp/211 for specific service details.
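One way to keep manual enrichment separate from the API-derived backbone is to store hand-collected details in their own dictionary and overlay them at export time. The sketch below assumes records are plain dicts shaped like the frontend format in this README; `apply_manual_enrichment` is a hypothetical helper, not a function the module is known to provide.

```python
def apply_manual_enrichment(record, manual):
    """Return a copy of `record` with non-empty manual fields overlaid.

    Empty manual values (None, "", [], {}) are ignored so they never
    erase data that came from the API. Nested dicts such as "contact"
    are merged key by key instead of being replaced wholesale.
    """
    enriched = dict(record)
    for key, value in manual.items():
        if value in (None, "", [], {}):
            continue
        if isinstance(value, dict) and isinstance(enriched.get(key), dict):
            enriched[key] = {**enriched[key], **value}
        else:
            enriched[key] = value
    return enriched


backbone = {
    "name": "Tuscaloosa County Interfaith Dental Initiative",
    "ein": "63-0345678",
    "contact": {"website": "https://tuscaloosainterfaithdental.org"},
}
from_directory = {
    "services": ["Mobile dental unit serving Title I schools"],
    "contact": {"phone": "(205) 555-0300"},
}
print(apply_manual_enrichment(backbone, from_directory))
```

Because the function returns a new dict, cached API data stays untouched and the manual layer can be re-applied after every refresh.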
NTEE Code Classification
NTEE = National Taxonomy of Exempt Entities (IRS classification system)
Key Codes for Oral Health Policy
| Code | Category | Description | Example Orgs |
|---|---|---|---|
| E | Health | General and rehabilitative health | Community health centers |
| E20 | Hospitals | Primary medical care facilities | County hospitals |
| E32 | School Health | School-based health care | Mobile dental clinics in schools |
| E40 | Reproductive Health | Reproductive health care facilities and allied services | Family planning clinics |
| E80 | Health Other | Health N.E.C. | Health advocacy groups |
| F | Mental Health | Crisis intervention | Counseling centers |
| K | Food/Nutrition | Food, agriculture, nutrition | Food banks |
| K30 | Food Service | Free food distribution | School meal programs |
| K34 | Congregate Meals | Community dining programs | Senior nutrition sites |
| N | Recreation | Sports, leisure, athletics | Community rec centers |
| O | Youth Dev | Youth development programs | After-school programs |
| O50 | Youth Other | Youth development N.E.C. | Mentoring programs |
| P | Human Services | Multipurpose human services | Family support centers |
| X | Religion | Religious organizations | Churches, synagogues |
| X20 | Christian | Christian orgs | Church health ministries |
| W | Public Benefit | Society benefit programs | Water advocacy groups |
NTEE Hierarchy
E (Health)
├── E20 (Hospitals)
├── E30 (Ambulatory Health)
│   └── E32 (School-Based Health) ← Mobile dental units
├── E40 (Reproductive Health)
└── E80 (Health N.E.C.)
X (Religion)
├── X20 (Christian) ← Church health ministries
├── X30 (Jewish)
└── X40 (Islamic)
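Because a code's first letter is its major group and longer codes refine it, the hierarchy can be approximated with simple prefix matching. The helpers below are a hypothetical sketch, not part of the module. They treat a missing code as "no match" rather than raising an error.

```python
def ntee_major_group(code):
    """Return the single-letter major group of an NTEE code, or None."""
    if not code:
        return None
    return code.strip().upper()[:1] or None


def ntee_matches(org_code, query_code):
    """True if `org_code` falls under `query_code` by prefix.

    "E" matches every health code, "E3" matches E30 and E32,
    and "E32" matches only E32 itself.
    """
    if not org_code or not query_code:
        return False
    return org_code.strip().upper().startswith(query_code.strip().upper())


print(ntee_major_group("x20"))
print(ntee_matches("E32", "E"))
print(ntee_matches("E", "E32"))
```

Prefix matching is only an approximation of the tree: it works here because child codes share their parent's leading characters (E32 under E30 shares "E3").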
Quick Start
1. Discover All Tuscaloosa Nonprofits
source .venv/bin/activate
python scripts/discover_tuscaloosa_nonprofits.py
Output: frontend/policy-dashboards/src/data/tuscaloosa_nonprofits.json
2. Search Specific NTEE Codes
from discovery.nonprofit_discovery import NonprofitDiscovery

discovery = NonprofitDiscovery()

# Just dental/school health
dental = discovery.search_propublica(
    state="AL",
    city="Tuscaloosa",
    ntee_code="E32"
)

# Churches with health ministries
churches = discovery.search_propublica(
    state="AL",
    city="Tuscaloosa",
    ntee_code="X20"
)

# Food/nutrition programs
food = discovery.search_propublica(
    state="AL",
    city="Tuscaloosa",
    ntee_code="K"
)

# Merge and export
all_orgs = discovery.merge_nonprofit_data(dental, churches)
all_orgs.extend(food)
discovery.export_to_frontend(all_orgs)
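When several search results are combined, the same organization can appear more than once. Whether `merge_nonprofit_data` already handles that is not stated here, so the sketch below shows one possible way to deduplicate a combined list by EIN before export. Both helper names are hypothetical, and records without an EIN are kept rather than guessed at.

```python
def normalize_ein(ein):
    """Reduce an EIN to nine digits so "63-0345678" and 630345678 compare equal.

    Integer EINs lose leading zeros, so the result is left-padded with zeros.
    Returns "" when no digits are present.
    """
    digits = "".join(ch for ch in str(ein) if ch.isdigit())
    return digits.zfill(9) if digits else ""


def dedupe_by_ein(orgs):
    """Keep the first record seen for each EIN, preserving input order."""
    seen = set()
    unique = []
    for org in orgs:
        key = normalize_ein(org.get("ein"))
        if key and key in seen:
            continue  # Duplicate of a record we already kept
        if key:
            seen.add(key)
        unique.append(org)
    return unique


combined = [
    {"ein": "63-0345678", "name": "Dental Initiative"},
    {"ein": 630345678, "name": "DENTAL INITIATIVE"},
    {"name": "Record without an EIN"},
]
print(len(dedupe_by_ein(combined)))
```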
3. Get Detailed Financials
# Get 5 years of 990 data for a specific org
details = discovery.get_propublica_org_details("63-0123456")
print(f"Organization: {details['name']}")
print(f"NTEE: {details['ntee_code']} - {details['ntee_description']}")
print("\nRecent Filings:")
for filing in details['filings']:
    revenue = filing['total_revenue']
    expenses = filing['total_expenses']
    year = filing['tax_period']
    print(f"  {year}: ${revenue:,} revenue, ${expenses:,} expenses")
Data Model
Nonprofit Record (Frontend Format)
{
  "name": "Tuscaloosa County Interfaith Dental Initiative",
  "ein": "63-0345678",
  "ntee_code": "E32",
  "ntee_description": "School-Based Health Care",
  "mission": "Multi-faith collaboration providing free dental care",
  "services": [
    "Mobile dental unit serving Title I schools",
    "Free toothbrush and fluoride programs",
    "Parent education workshops"
  ],
  "annual_budget": 125000,
  "students_served": 2400,
  "families_served": 0,
  "youth_served": 0,
  "contact": {
    "website": "https://tuscaloosainterfaithdental.org",
    "email": "contact@tuscaloosainterfaithdental.org",
    "phone": "(205) 555-0300"
  },
  "logo_url": "https://...",
  "volunteer_opportunities": true,
  "accepting_board_members": true
}
ProPublica API Response
{
  "organizations": [
    {
      "ein": "630345678",
      "name": "TUSCALOOSA COUNTY INTERFAITH DENTAL INITIATIVE",
      "city": "TUSCALOOSA",
      "state": "AL",
      "ntee_code": "E32",
      "revenue_amount": 125000,
      "asset_amount": 45000,
      "income_amount": 125000
    }
  ]
}
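The two formats differ in small but important ways: the API record carries an unhyphenated EIN and an upper-case name, while the frontend record uses `NN-NNNNNNN` and title case. The sketch below shows one possible conversion. Mapping `revenue_amount` to `annual_budget` is an assumption based on the matching sample values above, and `format_ein` / `to_frontend_record` are hypothetical names. Fields the API does not supply are left empty for manual enrichment.

```python
def format_ein(ein):
    """Format an EIN as NN-NNNNNNN; returns "" when no digits are present."""
    digits = "".join(ch for ch in str(ein) if ch.isdigit())
    if not digits:
        return ""
    digits = digits.zfill(9)  # Restore leading zeros lost by integer EINs
    return f"{digits[:2]}-{digits[2:]}"


def to_frontend_record(api_org):
    """Convert one entry of the API's "organizations" list to the frontend shape."""
    return {
        "name": (api_org.get("name") or "").title(),
        "ein": format_ein(api_org.get("ein", "")),
        "ntee_code": api_org.get("ntee_code"),
        # Assumption: reported revenue stands in for the annual budget
        "annual_budget": api_org.get("revenue_amount") or 0,
        # Not available from the API; filled in during manual enrichment
        "mission": "",
        "services": [],
        "contact": {},
    }


sample = {
    "ein": "630345678",
    "name": "TUSCALOOSA COUNTY INTERFAITH DENTAL INITIATIVE",
    "ntee_code": "E32",
    "revenue_amount": 125000,
}
print(to_frontend_record(sample))
```

Note that `str.title()` is a blunt tool: it will also lower-case acronyms (for example "YMCA" becomes "Ymca"), so names are worth a manual review.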
Architecture
Discovery Pipeline
1. Search ProPublica API
   ↓ (by state, city, NTEE code)
2. Get Financial Data
   ↓ (revenue, expenses, assets)
3. Enrich with Every.org
   ↓ (mission, logo, causes)
4. Match to Government Decisions
   ↓ (by NTEE code)
5. Export to Frontend
   ↓
   frontend/policy-dashboards/src/data/tuscaloosa_nonprofits.json
Caching Strategy
All API responses are cached in data/cache/nonprofits/:
data/cache/nonprofits/
├── propublica_AL_E_Tuscaloosa.json
├── propublica_AL_E32_Tuscaloosa.json
├── propublica_org_63-0345678.json
└── everyorg_Tuscaloosa_AL_health-education.json
Benefits:
- Faster subsequent runs (no API calls)
- Respectful to free APIs (no repeated requests)
- Offline development possible
- Manual review/editing of cached data
Cache Invalidation:
- Delete cache files to force fresh download
- Recommended refresh: Monthly (990 data updates annually)
Cost Comparison
Paid Services
| Service | Cost | Coverage |
|---|---|---|
| Candid/GuideStar Premium | $500-2,000/month | Deep services data |
| Charity Navigator API | $500+/month | Ratings + financials |
| GiveWell Data | Free (limited) | Top charities only |
Our Free Stack
| Service | Cost | Coverage |
|---|---|---|
| ProPublica API | $0 | 1.8M orgs, 10+ years |
| IRS TEOS | $0 | All U.S. nonprofits |
| Every.org API | $0 (basic) | Mission + logos |
| Total | $0/month | 95% of paid features |
What You Give Up:
- Real-time "services provided" updates (need manual enrichment)
- Phone numbers/emails (need scraping or manual entry)
- Volunteer opportunities feed (need manual verification)
What You Keep:
- All financial data (revenue, expenses, assets)
- NTEE classification (interoperable with paid services)
- Mission statements and descriptions
- Scalability to all 50 states
Advanced Usage
Bulk Download for All Alabama
import time

# Collect health nonprofits from Alabama's largest cities
# (extend the city list for fuller statewide coverage)
alabama_health = []
for city in ["Birmingham", "Montgomery", "Mobile", "Tuscaloosa", "Huntsville"]:
    orgs = discovery.search_propublica(
        state="AL",
        city=city,
        ntee_code="E"
    )
    alabama_health.extend(orgs)
    time.sleep(1)  # Rate limiting

print(f"Found {len(alabama_health)} health nonprofits in Alabama")
Find Nonprofits by Revenue
# Find large health orgs (>$1M revenue)
# `nonprofits` is a list of organization dicts from an earlier search
large_orgs = [
    org for org in nonprofits
    if (org.get('revenue_amount') or 0) > 1000000
]
print(f"Large organizations: {len(large_orgs)}")

# Show the ten largest by revenue
for org in sorted(large_orgs, key=lambda x: x['revenue_amount'], reverse=True)[:10]:
    print(f"  {org['name']}: ${org['revenue_amount']:,}")
Match to Government Decisions
import json

# Load government decisions with NTEE codes
with open('frontend/policy-dashboards/src/data/tuscaloosa_policies.json') as f:
    decisions = json.load(f)

# Find nonprofits for each deferred decision
for decision in decisions:
    if decision.get('outcome') in ['Tabled', 'Deferred']:
        ntee = decision.get('ntee_code')
        if not ntee:
            continue  # Nothing to match on without an NTEE code

        # Match orgs in the same major group (first letter), which
        # also covers exact code matches; orgs without a code are skipped
        matches = [
            org for org in nonprofits
            if (org.get('ntee_code') or '').startswith(ntee[0])
        ]
        if matches:
            print(f"\nDecision: {decision['decision_summary']}")
            print(f"Government said NO, but {len(matches)} nonprofits are doing it:")
            for org in matches[:3]:
                revenue = org.get('revenue_amount') or 0  # Treat null revenue as 0
                print(f"  • {org['name']}: ${revenue:,}/year")
Troubleshooting
ProPublica API Returns Empty Results
Possible causes:
- City name spelling (try "Tuscaloosa" vs "TUSCALOOSA")
- NTEE code doesn't exist in that location
- No nonprofits in that category
Solutions:
# Try broader search (remove city filter)
orgs = discovery.search_propublica(state="AL", ntee_code="E32")
# Try major category only (E vs E32)
orgs = discovery.search_propublica(state="AL", city="Tuscaloosa", ntee_code="E")
Every.org API Requires Authentication
Solution: Every.org is optional. ProPublica provides 90% of needed data.
# Skip Every.org if auth fails
try:
    everyorg_orgs = discovery.search_everyorg(...)
except Exception:  # Avoid a bare except, which would also swallow Ctrl+C
    everyorg_orgs = []  # Continue with ProPublica data only
Rate Limiting
Built-in protection: Module automatically spaces requests 1 second apart.
If you hit rate limits:
discovery.min_request_interval = 2.0 # Increase to 2 seconds
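For readers curious how request spacing of this kind can work, here is a minimal throttle sketch. It illustrates the general technique only and is not the module's internal code. The `Throttle` class is hypothetical, and only the `min_request_interval` name is borrowed from the setting above. The clock and sleep functions are injectable so the behavior can be tested without real delays.

```python
import time


class Throttle:
    """Space out calls so they are at least `min_request_interval` seconds apart."""

    def __init__(self, min_request_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_request_interval = min_request_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request = None  # Time of the previous call, if any

    def wait(self):
        """Call before each request; sleeps only for the time still owed."""
        now = self._clock()
        if self._last_request is not None:
            remaining = self.min_request_interval - (now - self._last_request)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_request = now


throttle = Throttle(min_request_interval=0.01)
for city in ["Birmingham", "Montgomery"]:
    throttle.wait()  # The API request for `city` would go here
    print(f"requesting {city}")
```

Using a monotonic clock avoids surprises if the system time is adjusted while a long discovery run is in progress.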
Next Steps
- Run discovery: python scripts/discover_tuscaloosa_nonprofits.py
- Review output: Check frontend/policy-dashboards/src/data/tuscaloosa_nonprofits.json
- Manual enrichment: Add phone/email from Findhelp.org or 211
- Verify services: Cross-check "services provided" with org websites
- Launch frontend: cd frontend/policy-dashboards && npm start
Resources
- ProPublica Nonprofit Explorer: https://projects.propublica.org/nonprofits/
- IRS Tax-Exempt Org Search: https://www.irs.gov/charities-non-profits/tax-exempt-organization-search
- NTEE Code Lookup: https://nccs.urban.org/publication/irs-activity-codes
- Findhelp.org: https://www.findhelp.org
- 211 Directory: https://www.211.org