Spaces:
Running on CPU Upgrade
Running on CPU Upgrade
| sidebar_position: 7 | |
| # Schema Migration Guide | |
| ## Overview | |
| CommunityOne has migrated from a domain-specific oral health schema to a **generic community engagement data platform**. This enables broader civic tech applications beyond health policy. | |
| ## What Changed | |
| ### File Rename | |
| - **Old:** `databricks/oral_health_schema.sql` | |
| - **New:** `databricks/communityone_schema.sql` | |
| - **Status:** Legacy file renamed to `.deprecated` suffix | |
| ### Table Renames | |
| | Old Name | New Name | Purpose | | |
| |----------|----------|---------| | |
| | `fact_oral_health_observation` | `fact_communityone_observation` | Generic community outcome measurements | | |
| | *(no oral health prefix in other tables)* | All dimensions remain the same | Geography, date, measure, etc. | | |
| ### Dimension Table Updates | |
| #### New: `dim_jurisdiction` | |
| Replaces inline jurisdiction data with proper dimension table: | |
| ```sql | |
| CREATE TABLE dim_jurisdiction ( | |
| jurisdiction_key string NOT NULL, | |
| jurisdiction_id string, -- OCD-ID format | |
| jurisdiction_name string, | |
| jurisdiction_type string, -- city, county, state, district | |
| geography_key string, | |
| ocd_id string, | |
| website_url string, | |
| population int, | |
| ... | |
| ) | |
| ``` | |
| #### New: `dim_organization` | |
| Nonprofit and foundation master dimension (IRS EO-BMF): | |
| ```sql | |
| CREATE TABLE dim_organization ( | |
| organization_key string NOT NULL, | |
| ein string, | |
| organization_name string, | |
| ntee_code string, | |
| foundation_code string, -- 10-13=Foundation, 15=Public Charity | |
| is_private_foundation boolean, -- 990-PF filers | |
| asset_amount decimal(18, 2), | |
| income_amount decimal(18, 2), | |
| ... | |
| ) | |
| ``` | |
| ### New Fact Tables (Previously Missing) | |
| #### 1. `fact_grant` - Grant Transactions | |
| **Purpose:** Track individual grants between funders and recipients | |
| **Data Sources:** | |
| - IRS Form 990 Schedule I (grants paid by nonprofits) | |
| - IRS Form 990-PF (private foundation giving) | |
| - USASpending.gov API (federal grants) | |
| - State grant databases | |
| ```sql | |
| CREATE TABLE fact_grant ( | |
| grant_key string NOT NULL, | |
| recipient_org_key string, -- FK to dim_organization | |
| recipient_jurisdiction_key string, -- FK to dim_jurisdiction | |
| funder_org_key string, -- FK to dim_organization | |
| funder_jurisdiction_key string, -- FK to dim_jurisdiction | |
| grant_amount decimal(18, 2), | |
| grant_purpose string, | |
| program_area string, | |
| award_date_key int, | |
| start_date_key int, | |
| end_date_key int, | |
| is_multi_year boolean, | |
| funding_source string, -- federal, state, foundation, corporate | |
| ... | |
| ) | |
| ``` | |
| **Example Queries:** | |
| ```sql | |
| -- Find all federal grants to dental nonprofits in Alabama | |
| SELECT | |
| g.grant_amount, | |
| g.grant_purpose, | |
| o.organization_name, | |
| j.jurisdiction_name | |
| FROM fact_grant g | |
| JOIN dim_organization o ON g.recipient_org_key = o.organization_key | |
| JOIN dim_jurisdiction j ON j.jurisdiction_key = g.recipient_jurisdiction_key | |
| WHERE o.ntee_code LIKE 'E%' -- Health services | |
| AND j.state_code = 'AL' | |
| AND g.funding_source = 'federal' | |
| AND g.grant_purpose LIKE '%dental%'; | |
| -- Track foundation giving patterns (990-PF data) | |
| SELECT | |
| funder.organization_name, | |
| COUNT(*) as grant_count, | |
| SUM(g.grant_amount) as total_giving, | |
| AVG(g.grant_amount) as avg_grant_size | |
| FROM fact_grant g | |
| JOIN dim_organization funder ON g.funder_org_key = funder.organization_key | |
| WHERE funder.is_private_foundation = TRUE | |
| GROUP BY funder.organization_name | |
| ORDER BY total_giving DESC; | |
| ``` | |
| #### 2. `fact_nonprofit_finance` - Annual 990 Filings | |
| **Purpose:** Detailed nonprofit financial health and revenue sources | |
| ```sql | |
| CREATE TABLE fact_nonprofit_finance ( | |
| filing_key string NOT NULL, | |
| organization_key string, | |
| ein string, | |
| tax_year int, | |
| total_revenue decimal(18, 2), | |
| total_expenses decimal(18, 2), | |
| grants_paid decimal(18, 2), | |
| government_grants decimal(18, 2), -- Revenue source | |
| foundation_grants decimal(18, 2), -- Revenue source | |
| corporate_donations decimal(18, 2), -- Revenue source | |
| individual_donations decimal(18, 2), -- Revenue source | |
| program_service_revenue decimal(18, 2), -- Earned income | |
| overhead_ratio decimal(8, 4), -- Calculated metric | |
| fundraising_efficiency decimal(8, 4), -- Calculated metric | |
| ... | |
| ) | |
| ``` | |
| **Example Queries:** | |
| ```sql | |
| -- Compare revenue sources for health vs education nonprofits | |
| SELECT | |
| SUBSTR(o.ntee_code, 1, 1) as sector, | |
| AVG(f.government_grants / f.total_revenue * 100) as govt_pct, | |
| AVG(f.foundation_grants / f.total_revenue * 100) as foundation_pct, | |
| AVG(f.individual_donations / f.total_revenue * 100) as individual_pct | |
| FROM fact_nonprofit_finance f | |
| JOIN dim_organization o ON f.organization_key = o.organization_key | |
| WHERE o.ntee_code IN ('E', 'B') -- Health, Education | |
| AND f.total_revenue > 0 | |
| GROUP BY SUBSTR(o.ntee_code, 1, 1); | |
| -- Find most efficient nonprofits | |
| SELECT | |
| o.organization_name, | |
| f.total_revenue, | |
| f.overhead_ratio, | |
| f.fundraising_efficiency | |
| FROM fact_nonprofit_finance f | |
| JOIN dim_organization o ON f.organization_key = o.organization_key | |
| WHERE f.tax_year = 2023 | |
| AND f.overhead_ratio < 0.25 -- Less than 25% overhead | |
| AND f.fundraising_efficiency > 4.0 -- $4+ raised per $1 spent | |
| ORDER BY f.total_revenue DESC; | |
| ``` | |
| #### 3. `fact_jurisdiction_budget` - Government Finances | |
| **Purpose:** Track government budgets and spending priorities | |
| ```sql | |
| CREATE TABLE fact_jurisdiction_budget ( | |
| budget_key string NOT NULL, | |
| jurisdiction_key string, | |
| fiscal_year int, | |
| total_revenue decimal(18, 2), | |
| total_expenditures decimal(18, 2), | |
| federal_grants decimal(18, 2), | |
| state_grants decimal(18, 2), | |
| property_tax_revenue decimal(18, 2), | |
| ... | |
| ) | |
| ``` | |
| #### 4. `fact_meeting` - Meetings & Public Hearings | |
| **Purpose:** Track government transparency and public engagement | |
| ```sql | |
| CREATE TABLE fact_meeting ( | |
| meeting_key string NOT NULL, | |
| jurisdiction_key string, | |
| meeting_date_key int, | |
| meeting_type string, | |
| has_agenda boolean, | |
| has_minutes boolean, | |
| has_video boolean, | |
| topic_tags array<string>, | |
| ... | |
| ) | |
| ``` | |
| #### 5. `bridge_grant_program_area` - Grant Multi-Purpose Support | |
| **Purpose:** Handle grants supporting multiple program areas | |
| ```sql | |
| CREATE TABLE bridge_grant_program_area ( | |
| grant_key string NOT NULL, | |
| program_area_code string NOT NULL, | |
| program_area_desc string, | |
| allocation_pct decimal(5, 2), -- % of grant to this area | |
| ... | |
| ) | |
| ``` | |
| ## Migration Steps | |
| ### 1. For Databricks Users | |
| Update your Unity Catalog schema creation scripts: | |
| ```sql | |
| -- Old approach (DEPRECATED) | |
| -- CREATE TABLE catalog.schema.fact_oral_health_observation ... | |
| -- New approach | |
| CREATE TABLE catalog.schema.fact_communityone_observation ...; | |
| CREATE TABLE catalog.schema.fact_grant ...; | |
| CREATE TABLE catalog.schema.fact_nonprofit_finance ...; | |
| CREATE TABLE catalog.schema.dim_organization ...; | |
| CREATE TABLE catalog.schema.dim_jurisdiction ...; | |
| ``` | |
| ### 2. For Existing Data | |
| If you have data in `fact_oral_health_observation`: | |
| ```sql | |
| -- Rename table | |
| ALTER TABLE catalog.schema.fact_oral_health_observation | |
| RENAME TO fact_communityone_observation; | |
| -- Or migrate data | |
| INSERT INTO fact_communityone_observation | |
| SELECT | |
| observation_key, | |
| measure_key, | |
| geography_key, | |
| NULL as jurisdiction_key, -- NEW column | |
| stratification_key, | |
| ... | |
| FROM fact_oral_health_observation; | |
| ``` | |
| ### 3. Update Application Code | |
| **Python/SQL queries:** | |
| ```python | |
| # Old | |
| df = spark.table("fact_oral_health_observation") | |
| # New | |
| df = spark.table("fact_communityone_observation") | |
| ``` | |
| **Documentation references:** | |
| - Update ERD diagrams | |
| - Update API documentation | |
| - Update data dictionary | |
| ## New Capabilities Enabled | |
| ### 1. Grant Flow Analysis | |
| Track money flow from funders to recipients: | |
| ```sql | |
| SELECT | |
| funder.organization_name as funder, | |
| recipient.organization_name as recipient, | |
| SUM(g.grant_amount) as total_grants, | |
| COUNT(*) as grant_count | |
| FROM fact_grant g | |
| JOIN dim_organization funder ON g.funder_org_key = funder.organization_key | |
| JOIN dim_organization recipient ON g.recipient_org_key = recipient.organization_key | |
| WHERE funder.is_private_foundation = TRUE | |
| GROUP BY funder.organization_name, recipient.organization_name; | |
| ``` | |
| ### 2. Nonprofit-Government Relationships | |
| Which nonprofits receive the most government funding? | |
| ```sql | |
| SELECT | |
| o.organization_name, | |
| SUM(CASE WHEN g.funding_source IN ('federal', 'state') | |
| THEN g.grant_amount ELSE 0 END) as govt_grants, | |
| COUNT(CASE WHEN g.funding_source IN ('federal', 'state') | |
| THEN 1 END) as govt_grant_count | |
| FROM dim_organization o | |
| LEFT JOIN fact_grant g ON o.organization_key = g.recipient_org_key | |
| GROUP BY o.organization_name | |
| HAVING govt_grants > 0 | |
| ORDER BY govt_grants DESC; | |
| ``` | |
| ### 3. Foundation Investment Patterns | |
| 990-PF Schedule I analysis: | |
| ```sql | |
| -- Where are private foundations investing? | |
| SELECT | |
| g.program_area, | |
| COUNT(DISTINCT funder.organization_key) as foundation_count, | |
| SUM(g.grant_amount) as total_investment, | |
| AVG(g.grant_amount) as avg_grant_size | |
| FROM fact_grant g | |
| JOIN dim_organization funder ON g.funder_org_key = funder.organization_key | |
| WHERE funder.is_private_foundation = TRUE | |
| AND g.program_area IS NOT NULL | |
| GROUP BY g.program_area | |
| ORDER BY total_investment DESC; | |
| ``` | |
| ### 4. Financial Health Benchmarking | |
| ```sql | |
| -- Compare your nonprofit to sector averages | |
| WITH sector_avg AS ( | |
| SELECT | |
| SUBSTR(o.ntee_code, 1, 1) as sector, | |
| AVG(f.overhead_ratio) as avg_overhead, | |
| AVG(f.fundraising_efficiency) as avg_efficiency | |
| FROM fact_nonprofit_finance f | |
| JOIN dim_organization o ON f.organization_key = o.organization_key | |
| WHERE f.tax_year = 2023 | |
| GROUP BY SUBSTR(o.ntee_code, 1, 1) | |
| ) | |
| SELECT | |
| o.organization_name, | |
| f.overhead_ratio, | |
| s.avg_overhead as sector_avg_overhead, | |
| f.fundraising_efficiency, | |
| s.avg_efficiency as sector_avg_efficiency | |
| FROM fact_nonprofit_finance f | |
| JOIN dim_organization o ON f.organization_key = o.organization_key | |
| JOIN sector_avg s ON SUBSTR(o.ntee_code, 1, 1) = s.sector | |
| WHERE f.tax_year = 2023 | |
| AND o.ein = 'YOUR-EIN-HERE'; | |
| ``` | |
| ## Backward Compatibility | |
| ### Deprecated Fields | |
| The following fields in `dim_measure` are renamed for generic use: | |
| | Old Field | New Field | Notes | | |
| |-----------|-----------|-------| | |
| | `nohss_indicator_nbr` | `indicator_nbr` | Generic indicator number | | |
| | `nohss_indicator_group_type` | `indicator_group_type` | Generic grouping | | |
| | `nohss_indicator_desc` | `indicator_desc` | Generic description | | |
| ### Views for Compatibility | |
| Create views to maintain old query compatibility: | |
| ```sql | |
| CREATE VIEW fact_oral_health_observation AS | |
| SELECT * FROM fact_communityone_observation | |
| WHERE measure_key IN ( | |
| SELECT measure_key FROM dim_measure | |
| WHERE indicator_group_type = 'oral_health' | |
| ); | |
| ``` | |
| ## Questions? | |
| - **Schema issues:** See [Data Model ERD](/docs/data-sources/data-model-erd) | |
| - **Grant data sources:** See [Nonprofit Data Sources](/docs/data-sources/nonprofit-sources) | |
| - **990-PF parsing:** See [Form 990 XML Guide](/docs/data-sources/form-990-xml) | |