# Nonprofit Discovery Module Automated discovery and enrichment of nonprofits and churches using **100% FREE** open data APIs. ## Why This Matters When government says "no" to a policy (e.g., "We can't do dental screenings - legal risk"), you can instantly show citizens the nonprofits **already doing it**. This: 1. **Bypasses the technocratic veto** - Shows direct alternatives 2. **Creates social pressure** - Exposes inefficiency ("$5K legal review vs $25 screening") 3. **Mobilizes citizens** - Provides volunteer/donation pathways ## Data Sources (All Free) ### 1. ProPublica Nonprofit Explorer API ⭐ PRIMARY SOURCE **What it provides:** - Financial data (revenue, expenses, assets) from IRS Form 990 - NTEE codes (standardized classification) - EIN (tax ID) for verification - 3+ million organizations, 10+ years of data **Coverage:** All nonprofits with >$50K revenue or >$250K assets **API Docs:** https://projects.propublica.org/nonprofits/api **Example Usage:** ```python from discovery.nonprofit_discovery import NonprofitDiscovery discovery = NonprofitDiscovery() # Search all health organizations in Tuscaloosa health_orgs = discovery.search_propublica( state="AL", city="Tuscaloosa", ntee_code="E" # E = Health ) # Get detailed financials for specific org details = discovery.get_propublica_org_details("63-0123456") print(f"Revenue: ${details['filings'][0]['total_revenue']:,}") ``` **Rate Limits:** Free, unlimited. Be respectful: ~1 request/second suggested. --- ### 2. IRS Tax-Exempt Organization Search (TEOS) **What it provides:** - Official tax-exempt status - Pub 78 verification (deductibility) - Bulk download of all U.S. nonprofits **Source:** https://www.irs.gov/charities-non-profits/tax-exempt-organization-search-bulk-data-downloads **Note:** ProPublica API already includes this data, so direct IRS access only needed for bulk downloads. --- ### 3. Every.org Charity API **What it provides:** - Human-readable mission statements - Organization logos and images - Cause categories - Cleaner data than raw IRS filings **API Docs:** https://www.every.org/nonprofit-api **Note:** May require API key for full access. Free tier available. **Example Usage:** ```python # Search by location and cause orgs = discovery.search_everyorg( location="Tuscaloosa, AL", causes=["health", "education", "youth"] ) ``` --- ### 4. Local Service Directories (Manual Enrichment) **Findhelp.org (Aunt Bertha):** - Most comprehensive directory of local social services - Includes specific services, hours, eligibility - Search: https://www.findhelp.org/search?query=dental&location=Tuscaloosa,%20AL - API access varies (request from Findhelp.org) **211 Alabama:** - Regional social services directory - More detailed than IRS data (days/hours, languages, insurance) - Search: https://www.211connects.org **Strategy:** Use ProPublica for financial backbone, then manually enrich with Findhelp/211 for specific service details. --- ## NTEE Code Classification NTEE = **National Taxonomy of Exempt Entities** (IRS classification system) ### Key Codes for Oral Health Policy | Code | Category | Description | Example Orgs | |------|----------|-------------|--------------| | **E** | Health | General and rehabilitative health | Community health centers | | **E20** | Hospitals | Primary medical care facilities | County hospitals | | **E32** | School Health | School-based health care | Mobile dental clinics in schools | | **E40** | Health General | Community clinics | Free clinics | | **E80** | Health Other | Health N.E.C. | Health advocacy groups | | **F** | Mental Health | Crisis intervention | Counseling centers | | **K** | Food/Nutrition | Food, agriculture, nutrition | Food banks | | **K30** | Food Service | Free food distribution | School meal programs | | **K34** | Congregate Meals | Community dining programs | Senior nutrition sites | | **N** | Recreation | Sports, leisure, athletics | Community rec centers | | **O** | Youth Dev | Youth development programs | After-school programs | | **O50** | Youth Other | Youth development N.E.C. | Mentoring programs | | **P** | Human Services | Multipurpose human services | Family support centers | | **X** | Religion | Religious organizations | Churches, synagogues | | **X20** | Christian | Christian orgs | Church health ministries | | **W** | Public Benefit | Society benefit programs | Water advocacy groups | ### NTEE Hierarchy ``` E (Health) ├── E20 (Hospitals) ├── E30 (Ambulatory Health) │ └── E32 (School-Based Health) ⭐ Mobile dental units ├── E40 (Reproductive Health) └── E80 (Health N.E.C.) X (Religion) ├── X20 (Christian) ⭐ Church health ministries ├── X30 (Jewish) └── X40 (Islamic) ``` ## Quick Start ### 1. Discover All Tuscaloosa Nonprofits ```bash source .venv/bin/activate python scripts/discover_tuscaloosa_nonprofits.py ``` **Output:** `frontend/policy-dashboards/src/data/tuscaloosa_nonprofits.json` ### 2. Search Specific NTEE Codes ```python from discovery.nonprofit_discovery import NonprofitDiscovery discovery = NonprofitDiscovery() # Just dental/school health dental = discovery.search_propublica( state="AL", city="Tuscaloosa", ntee_code="E32" ) # Churches with health ministries churches = discovery.search_propublica( state="AL", city="Tuscaloosa", ntee_code="X20" ) # Food/nutrition programs food = discovery.search_propublica( state="AL", city="Tuscaloosa", ntee_code="K" ) # Merge and export all_orgs = discovery.merge_nonprofit_data(dental, churches) all_orgs.extend(food) discovery.export_to_frontend(all_orgs) ``` ### 3. Get Detailed Financials ```python # Get 5 years of 990 data for a specific org details = discovery.get_propublica_org_details("63-0123456") print(f"Organization: {details['name']}") print(f"NTEE: {details['ntee_code']} - {details['ntee_description']}") print("\nRecent Filings:") for filing in details['filings']: revenue = filing['total_revenue'] expenses = filing['total_expenses'] year = filing['tax_period'] print(f" {year}: ${revenue:,} revenue, ${expenses:,} expenses") ``` ## Data Model ### Nonprofit Record (Frontend Format) ```json { "name": "Tuscaloosa County Interfaith Dental Initiative", "ein": "63-0345678", "ntee_code": "E32", "ntee_description": "School-Based Health Care", "mission": "Multi-faith collaboration providing free dental care", "services": [ "Mobile dental unit serving Title I schools", "Free toothbrush and fluoride programs", "Parent education workshops" ], "annual_budget": 125000, "students_served": 2400, "families_served": 0, "youth_served": 0, "contact": { "website": "https://tuscaloosainterfaithdental.org", "email": "contact@tuscaloosainterfaithdental.org", "phone": "(205) 555-0300" }, "logo_url": "https://...", "volunteer_opportunities": true, "accepting_board_members": true } ``` ### ProPublica API Response ```json { "organizations": [ { "ein": "630345678", "name": "TUSCALOOSA COUNTY INTERFAITH DENTAL INITIATIVE", "city": "TUSCALOOSA", "state": "AL", "ntee_code": "E32", "revenue_amount": 125000, "asset_amount": 45000, "income_amount": 125000 } ] } ``` ## Architecture ### Discovery Pipeline ``` 1. Search ProPublica API ↓ (by state, city, NTEE code) 2. Get Financial Data ↓ (revenue, expenses, assets) 3. Enrich with Every.org ↓ (mission, logo, causes) 4. Match to Government Decisions ↓ (by NTEE code) 5. Export to Frontend ↓ frontend/policy-dashboards/src/data/tuscaloosa_nonprofits.json ``` ### Caching Strategy All API responses are cached in `data/cache/nonprofits/`: ``` data/cache/nonprofits/ ├── propublica_AL_E_Tuscaloosa.json ├── propublica_AL_E32_Tuscaloosa.json ├── propublica_org_63-0345678.json └── everyorg_Tuscaloosa_AL_health-education.json ``` **Benefits:** - Faster subsequent runs (no API calls) - Respectful to free APIs (no repeated requests) - Offline development possible - Manual review/editing of cached data **Cache Invalidation:** - Delete cache files to force fresh download - Recommended refresh: Monthly (990 data updates annually) ## Cost Comparison ### Paid Services | Service | Cost | Coverage | |---------|------|----------| | **Candid/GuideStar Premium** | $500-2,000/month | Deep services data | | **Charity Navigator API** | $500+/month | Ratings + financials | | **GiveWell Data** | Free (limited) | Top charities only | ### Our Free Stack | Service | Cost | Coverage | |---------|------|----------| | **ProPublica API** | $0 | 1.8M orgs, 10+ years | | **IRS TEOS** | $0 | All U.S. nonprofits | | **Every.org API** | $0 (basic) | Mission + logos | | **Total** | **$0/month** | 95% of paid features | **What You Give Up:** - Real-time "services provided" updates (need manual enrichment) - Phone numbers/emails (need scraping or manual entry) - Volunteer opportunities feed (need manual verification) **What You Keep:** - All financial data (revenue, expenses, assets) - NTEE classification (interoperable with paid services) - Mission statements and descriptions - Scalability to all 50 states ## Advanced Usage ### Bulk Download for All Alabama ```python # Get ALL health nonprofits in Alabama alabama_health = [] for city in ["Birmingham", "Montgomery", "Mobile", "Tuscaloosa", "Huntsville"]: orgs = discovery.search_propublica( state="AL", city=city, ntee_code="E" ) alabama_health.extend(orgs) time.sleep(1) # Rate limiting print(f"Found {len(alabama_health)} health nonprofits in Alabama") ``` ### Find Nonprofits by Revenue ```python # Find large health orgs (>$1M revenue) large_orgs = [ org for org in nonprofits if (org.get('revenue_amount') or 0) > 1000000 ] print(f"Large organizations: {len(large_orgs)}") for org in sorted(large_orgs, key=lambda x: x['revenue_amount'], reverse=True)[:10]: print(f" {org['name']}: ${org['revenue_amount']:,}") ``` ### Match to Government Decisions ```python # Load government decisions with NTEE codes with open('frontend/policy-dashboards/src/data/tuscaloosa_policies.json') as f: decisions = json.load(f) # Find nonprofits for each deferred decision for decision in decisions: if decision.get('outcome') in ['Tabled', 'Deferred']: ntee = decision.get('ntee_code') # Find matching nonprofits matches = [ org for org in nonprofits if org['ntee_code'] == ntee or org['ntee_code'].startswith(ntee[0]) ] if matches: print(f"\nDecision: {decision['decision_summary']}") print(f"Government said NO, but {len(matches)} nonprofits are doing it:") for org in matches[:3]: revenue = org.get('revenue_amount', 0) print(f" • {org['name']}: ${revenue:,}/year") ``` ## Troubleshooting ### ProPublica API Returns Empty Results **Possible causes:** - City name spelling (try "Tuscaloosa" vs "TUSCALOOSA") - NTEE code doesn't exist in that location - No nonprofits in that category **Solutions:** ```python # Try broader search (remove city filter) orgs = discovery.search_propublica(state="AL", ntee_code="E32") # Try major category only (E vs E32) orgs = discovery.search_propublica(state="AL", city="Tuscaloosa", ntee_code="E") ``` ### Every.org API Requires Authentication **Solution:** Every.org is optional. ProPublica provides 90% of needed data. ```python # Skip Every.org if auth fails try: everyorg_orgs = discovery.search_everyorg(...) except: everyorg_orgs = [] # Continue with ProPublica data only ``` ### Rate Limiting **Built-in protection:** Module automatically spaces requests 1 second apart. If you hit rate limits: ```python discovery.min_request_interval = 2.0 # Increase to 2 seconds ``` ## Next Steps 1. **Run discovery:** `python scripts/discover_tuscaloosa_nonprofits.py` 2. **Review output:** Check `frontend/policy-dashboards/src/data/tuscaloosa_nonprofits.json` 3. **Manual enrichment:** Add phone/email from Findhelp.org or 211 4. **Verify services:** Cross-check "services provided" with org websites 5. **Launch frontend:** `cd frontend/policy-dashboards && npm start` ## Resources - **ProPublica Nonprofit Explorer:** https://projects.propublica.org/nonprofits/ - **IRS Tax-Exempt Org Search:** https://www.irs.gov/charities-non-profits/tax-exempt-organization-search - **NTEE Code Lookup:** https://nccs.urban.org/publication/irs-activity-codes - **Findhelp.org:** https://www.findhelp.org - **211 Directory:** https://www.211.org