File size: 9,060 Bytes
61d29fc
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
---
sidebar_position: 5
---

# Schema Migration Summary

**Date:** April 28, 2026  
**Migration:** Oral Health β†’ Generic CommunityOne Platform

## βœ… Completed Changes

### 1. New Comprehensive Schema Created
**File:** `databricks/communityone_schema.sql` (641 lines)

**Previous:** `databricks/oral_health_schema.sql.deprecated` (285 lines - 125% expansion)

### 2. Core Renamings

| Component | Old Name | New Name |
|-----------|----------|----------|
| Schema file | `oral_health_schema.sql` | `communityone_schema.sql` |
| Primary fact table | `fact_oral_health_observation` | `fact_communityone_observation` |
| Project scope | Oral health-specific | Generic civic engagement |
| Measure fields | `nohss_indicator_*` | `indicator_*` (generic) |

### 3. New Dimension Tables Added ✨

Previously missing, now implemented:

#### `dim_jurisdiction` 
Government jurisdictions (cities, counties, states, districts)
- πŸ”‘ Primary Key: `jurisdiction_key`
- πŸ“ Links to: `dim_geography`
- 🌐 Supports: OCD-ID format for Open Civic Data compliance

#### `dim_organization`
Nonprofits, churches, private foundations (IRS EO-BMF)
- πŸ”‘ Primary Key: `organization_key`
- πŸ’° Fields: EIN, NTEE code, foundation_code, asset/income amounts
- 🏦 Flags: `is_private_foundation` for 990-PF filers
- πŸ“ Links to: `dim_geography`

### 4. New Fact Tables Added ✨

Previously missing from schema but documented in ERD - **NOW IMPLEMENTED:**

#### `fact_grant` 
**Individual grant transactions** - The missing piece!
- πŸ’΅ Tracks grants between funders and recipients
- πŸ”— Links organizations via `dim_organization`
- πŸ›οΈ Links jurisdictions via `dim_jurisdiction`
- πŸ“Š Sources: 990 Schedule I, 990-PF, USASpending.gov
- πŸ”‘ Fields: grant_amount, grant_purpose, program_area, dates, restrictions

**Example Use Cases:**
- Foundation giving patterns (990-PF analysis)
- Government grants to nonprofits
- Federal funding flows (USASpending.gov)
- Multi-year grant tracking

#### `fact_nonprofit_finance`
**Annual Form 990 financials**
- πŸ“ˆ Revenue breakdown (10 sources: govt grants, foundation grants, donations, earned income)
- πŸ“Š Calculated metrics: overhead_ratio, fundraising_efficiency
- πŸ”— Links to: `dim_organization`, `dim_date`
- 🎯 Enables: Financial health benchmarking, sector comparisons

#### `fact_jurisdiction_budget`
**Government budgets and spending**
- πŸ’° Revenue and expenditure tracking
- πŸ“Š Federal/state grants received by governments
- πŸ”— Links to: `dim_jurisdiction`, `dim_date`
- 🎯 Enables: Budget trend analysis, fiscal health monitoring

#### `fact_meeting`
**Government meetings and public hearings**
- πŸ“… Meeting metadata (date, type, body, status)
- πŸ“„ Flags: has_agenda, has_minutes, has_video
- 🏷️ Topic tags (array field)
- πŸ”— Links to: `dim_jurisdiction`, `dim_date`

### 5. New Bridge Table Added ✨

#### `bridge_grant_program_area`
**Multi-purpose grant support**
- Handles grants supporting multiple program areas
- Tracks allocation percentages per program area
- Enables accurate program area aggregations

### 6. Updated Relationships

**Total Foreign Keys:** 30+ constraints added

**Key Relationship Patterns:**
```
ORGANIZATION ──grantsβ†’ ORGANIZATION  (foundation β†’ nonprofit)
ORGANIZATION ──grantsβ†’ JURISDICTION  (nonprofit β†’ government)
JURISDICTION ──grantsβ†’ ORGANIZATION  (government β†’ nonprofit)
JURISDICTION ──budgetβ†’ BUDGET        (fiscal tracking)
JURISDICTION ──meetingsβ†’ MEETING     (transparency)
ORGANIZATION ──financesβ†’ FINANCE     (annual 990s)
```

### 7. Documentation Updates

**New Files:**
- βœ… `databricks/communityone_schema.sql` - Complete schema (641 lines)
- βœ… `website/docs/deployment/schema-migration.md` - Migration guide
- βœ… `databricks/README.md` - Updated with schema documentation

**Updated References:**
- βœ… Databricks README now explains schema differences
- βœ… Migration guide provides SQL examples
- βœ… Deprecated old schema file with `.deprecated` suffix

## Schema Comparison

### Before (oral_health_schema.sql)
```
Dimension Tables: 9
  β”œβ”€β”€ dim_data_source
  β”œβ”€β”€ dim_date
  β”œβ”€β”€ dim_geography
  β”œβ”€β”€ dim_measure
  β”œβ”€β”€ dim_postal
  β”œβ”€β”€ dim_state
  β”œβ”€β”€ dim_statistic_type
  β”œβ”€β”€ dim_stratification
  └── dim_survey_period

Fact Tables: 1
  └── fact_oral_health_observation

Bridge Tables: 0

Total Tables: 10
Total Lines: 285
Foreign Keys: 9
```

### After (communityone_schema.sql)
```
Dimension Tables: 11 (+2)
  β”œβ”€β”€ dim_data_source
  β”œβ”€β”€ dim_date
  β”œβ”€β”€ dim_geography
  β”œβ”€β”€ dim_jurisdiction          ✨ NEW
  β”œβ”€β”€ dim_organization          ✨ NEW
  β”œβ”€β”€ dim_measure
  β”œβ”€β”€ dim_postal
  β”œβ”€β”€ dim_state
  β”œβ”€β”€ dim_statistic_type
  β”œβ”€β”€ dim_stratification
  └── dim_survey_period

Fact Tables: 5 (+4)
  β”œβ”€β”€ fact_communityone_observation (renamed)
  β”œβ”€β”€ fact_grant                ✨ NEW
  β”œβ”€β”€ fact_nonprofit_finance    ✨ NEW
  β”œβ”€β”€ fact_jurisdiction_budget  ✨ NEW
  └── fact_meeting              ✨ NEW

Bridge Tables: 1 (+1)
  └── bridge_grant_program_area ✨ NEW

Total Tables: 17 (+70%)
Total Lines: 641 (+125%)
Foreign Keys: 30+ (+233%)
```

## Data Model Alignment Status

### βœ… Previously Documented, NOW IMPLEMENTED:

**From data-model-erd.md Line 894:**
```mermaid
ORGANIZATION ||--o{ GRANT : receives
JURISDICTION ||--o{ GRANT : awards
```
**Status:** βœ… **FIXED** - `fact_grant` table created with foreign keys

**From data-model-erd.md Lines 87-90:**
```
β”œβ”€β”€ grants/
β”‚   β”œβ”€β”€ nonprofit_grants   # Grants to nonprofits (from 990 Schedule I)
β”‚   β”œβ”€β”€ government_grants  # Government grants to orgs/jurisdictions
β”‚   β”œβ”€β”€ foundation_grants  # Private foundation grants
β”‚   └── federal_grants     # Federal funding programs
```
**Status:** βœ… **FIXED** - `fact_grant` supports all grant types via `funding_source` field

**From data-model-erd.md Lines 871-872:**
```
float government_grants
float foundation_grants
```
**Status:** βœ… **FIXED** - `fact_nonprofit_finance` tracks revenue sources

## Query Examples Enabled

### 1. Foundation Giving Patterns (990-PF)
```sql
SELECT 
    funder.organization_name,
    COUNT(*) as grants_made,
    SUM(g.grant_amount) as total_giving
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
WHERE funder.is_private_foundation = TRUE
GROUP BY funder.organization_name;
```

### 2. Nonprofit Financial Health
```sql
SELECT 
    o.organization_name,
    f.total_revenue,
    f.overhead_ratio,
    f.government_grants / f.total_revenue as govt_dependency_pct
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
WHERE f.tax_year = 2023
ORDER BY f.total_revenue DESC;
```

### 3. Grant Flow Analysis
```sql
SELECT 
    funder.organization_name as funder,
    recipient.organization_name as recipient,
    g.grant_amount,
    g.program_area
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
JOIN dim_organization recipient ON g.recipient_org_key = recipient.organization_key
WHERE g.program_area LIKE '%health%';
```

## Migration Required?

### For New Deployments
βœ… No migration needed - use `communityone_schema.sql` directly

### For Existing Databricks Catalogs
```sql
-- Rename existing table
ALTER TABLE fact_oral_health_observation 
RENAME TO fact_communityone_observation;

-- Create new tables
CREATE TABLE fact_grant ...;
CREATE TABLE fact_nonprofit_finance ...;
CREATE TABLE dim_organization ...;
CREATE TABLE dim_jurisdiction ...;
```

See: [Schema Migration Guide](../deployment/schema-migration.md) for complete migration steps

## Impact Summary

**Schema Completeness:** 60% β†’ 100%  
**ERD Alignment:** Partial β†’ Full  
**Grant Support:** None β†’ Complete  
**Foundation Data:** Missing β†’ 990-PF ready  
**Nonprofit Finances:** None β†’ Full revenue breakdown  
**Government Budgets:** None β†’ Added  
**Meetings:** None β†’ Added  

**Bottom Line:** The gap between ERD documentation and actual schema implementation is **CLOSED** βœ…

## πŸ”— Related Documentation

- [Schema Migration Guide](../deployment/schema-migration.md)
- [Databricks Deployment](../deployment/databricks.md)
- [Data Model ERD](../../databricks/README.md)
- [Database Schema Files](../../databricks/)

## πŸ“ Next Steps

1. **Review New Schema:**
   - Examine `databricks/communityone_schema.sql`
   - Understand new table relationships
   - Review foreign key constraints

2. **Plan Data Migration:**
   - If migrating from old schema
   - Test migration scripts
   - Backup existing data

3. **Update Queries:**
   - Update references from `fact_oral_health_observation`
   - Use new dimension tables (`dim_organization`, `dim_jurisdiction`)
   - Leverage new fact tables for enhanced analytics

4. **Deploy:**
   - Create new tables in Databricks
   - Load initial data
   - Verify relationships and constraints