---
sidebar_position: 5
---

# Enterprise Tech Integration Guide

This guide documents the enterprise technology platforms and programs that support Open Navigator's data infrastructure.

## Implementation Status Legend

- ✅ **Active** - Fully implemented and in production use
- 🔄 **Recommended** - Implementation recommended for enhancement
- 📚 **Reference** - Used as inspiration for data modeling
- 🔍 **Evaluation** - Under consideration for future adoption

## 1. Cloud & Data Platforms

### ✅ Microsoft: Tech for Social Impact

**Status:** ACTIVE - Nonprofit CDM fully implemented

**What we use:**
- Nonprofit Common Data Model (CDM) for constituent management
- 8 core entities: CONSTITUENT, DONATION, CAMPAIGN, DESIGNATION, MEMBERSHIP, VOLUNTEER_ACTIVITY, PROGRAM_DELIVERY, PROGRAM_OUTCOME

**Files:**
- See [Nonprofit & Philanthropy](/docs/data-sources/citations#nonprofit--philanthropy) section
- ERD: [Data Model](/data-sources/data-model-erd)

**Resources:**
- GitHub: https://github.com/microsoft/Industry-Accelerator-Nonprofit
- License: MIT

---

### 🔄 Google: Data Commons

**Status:** RECOMMENDED - Implementation available, not yet deployed

**What we plan to use:**
- Knowledge Graph API for jurisdiction demographics
- 100+ variables per jurisdiction (income, education, health, housing)
- Simplifies Census Bureau data access

**Implementation:**
- Code: `discovery/google_data_commons.py`
- Install: `pip install datacommons datacommons-pandas`
- Documentation: https://docs.datacommons.org/api/

**Next Steps:**
1. Install dependencies: `pip install datacommons datacommons-pandas`
2. Update `discovery/census_ingestion.py` to use Data Commons client
3. Replace manual Census API calls with simplified DC API
4. Add time-series enrichment for historical trends

**Example Usage:**
```python
from discovery.google_data_commons import DataCommonsClient

client = DataCommonsClient()

# Enrich a single jurisdiction
data = client.enrich_jurisdiction("01073")  # Jefferson County, AL
print(data["Median_Income_Household"])  # e.g. 65000 (illustrative, not a verified figure)

# Bulk enrich multiple jurisdictions
fips_codes = ["01073", "01089", "01097"]
df = client.enrich_jurisdictions_bulk(fips_codes)

# Get time series
df_ts = client.get_time_series("01073", start_year=2015)
```
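
Data Commons identifies US places by DCIDs of the form `geoId/<FIPS>`, so a wrapper that accepts FIPS codes, as in the usage above, needs a conversion step somewhere. A minimal sketch of that step follows. `fips_to_dcid` is a hypothetical helper written for illustration; it is not known to exist in `discovery/google_data_commons.py`.

```python
def fips_to_dcid(fips: str) -> str:
    """Convert a 2-digit state or 5-digit county FIPS code to a Data Commons DCID.

    Hypothetical helper: the name and validation rules are assumptions.
    """
    code = fips.strip()
    # FIPS codes are ASCII digits; keep them as strings so leading zeros survive.
    if not (code.isascii() and code.isdigit()) or len(code) not in (2, 5):
        raise ValueError(f"expected a 2- or 5-digit FIPS code, got {fips!r}")
    return f"geoId/{code}"


print(fips_to_dcid("01073"))  # geoId/01073
```

Treating FIPS codes as strings throughout avoids silently dropping leading zeros, which would turn `01073` into a different, invalid identifier.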

**Benefits:**
- ✅ Simpler API than raw Census Bureau
- ✅ 100+ pre-integrated variables
- ✅ Automatic data quality validation
- ✅ Time series support
- ✅ No API key required (free tier)

---

### 🔄 AWS: Open Data for Good

**Status:** PLANNED - Best practices for dataset exports

**What we plan to adopt:**
- Parquet format best practices
- S3 storage patterns
- AWS Glue Data Catalog

**Recommendations for `/exports` folder:**
1. **Format:** Use Parquet with Snappy compression
2. **Partitioning:** Partition by `state/county/year`
3. **Versioning:** Enable S3 versioning for lineage
4. **Catalog:** Use AWS Glue for schema management
5. **Querying:** Athena for SQL without ETL
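
The format and partitioning recommendations above can be sketched as follows. This is an illustration under assumptions, not the project's export script: the function names, the `exports` root, and the column names `state`, `county`, and `year` are hypothetical. `write_partitioned` expects a pandas DataFrame and needs pandas with the `pyarrow` engine installed.

```python
from pathlib import PurePosixPath

# Assumed partition columns, matching the state/county/year recommendation.
PARTITION_COLS = ["state", "county", "year"]


def partition_prefix(state: str, county: str, year: int, root: str = "exports") -> str:
    """Return the Hive-style directory that holds one partition's files."""
    return str(PurePosixPath(root, f"state={state}", f"county={county}", f"year={year}"))


def write_partitioned(df, root: str = "exports") -> None:
    """Write a pandas DataFrame as Snappy-compressed Parquet, partitioned on disk.

    `df` must contain every column in PARTITION_COLS.
    """
    df.to_parquet(
        root,
        engine="pyarrow",
        compression="snappy",
        partition_cols=PARTITION_COLS,
        index=False,
    )


print(partition_prefix("AL", "01073", 2023))
# exports/state=AL/county=01073/year=2023
```

Readers may infer numeric types for partition values, so county codes with leading zeros are worth checking when the data is read back.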

**Next Steps:**
1. Review AWS Registry examples: https://registry.opendata.aws
2. Update export scripts to generate Parquet
3. Document partitioning strategy
4. Consider AWS Glue for metadata

---

## 2. Data Engineering Platforms

### ✅ Databricks: Databricks for Good

**Status:** ACTIVE - Full implementation

**What we use:**
- **Unity Catalog:** Model registry and data governance
- **Delta Lake:** Bronze/Silver/Gold lakehouse architecture
- **MLflow:** Agent deployment and experiment tracking
- **Model Serving:** Auto-scaling REST endpoints for agents
- **Agent Bricks:** Mosaic AI Agent Framework
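
The Bronze/Silver/Gold layering named above can be illustrated without Spark. The sketch below is conceptual only: the records, field names, and functions are invented for the example, and the real pipeline in `pipeline/delta_lake.py` presumably operates on Delta tables rather than Python lists.

```python
from collections import Counter

# Bronze: raw records as ingested, including duplicates and incomplete rows.
# All values here are made up for illustration.
bronze = [
    {"fips": "01073", "meeting_id": "m1", "title": " Budget Hearing "},
    {"fips": "01073", "meeting_id": "m1", "title": " Budget Hearing "},  # duplicate
    {"fips": None, "meeting_id": "m2", "title": "Zoning"},  # missing jurisdiction
    {"fips": "01089", "meeting_id": "m3", "title": "Council Session"},
]


def to_silver(rows):
    """Silver: drop rows without a jurisdiction, trim text, de-duplicate by meeting_id."""
    seen, cleaned = set(), []
    for row in rows:
        if row["fips"] is None or row["meeting_id"] in seen:
            continue
        seen.add(row["meeting_id"])
        cleaned.append({**row, "title": row["title"].strip()})
    return cleaned


def to_gold(rows):
    """Gold: aggregate to one consumer-facing count per jurisdiction."""
    return dict(Counter(row["fips"] for row in rows))


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'01073': 1, '01089': 1}
```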

**Files:**
- `pipeline/delta_lake.py` - Delta Lake pipeline
- `agents/mlflow_classifier.py` - Policy classifier agent
- `agents/mlflow_base.py` - Base MLflow agent class
- `databricks/deployment.py` - Unity Catalog deployment
- `databricks/evaluation.py` - Agent evaluation framework
- `databricks/notebooks/01_agent_bricks_quickstart.py` - Quickstart notebook

**Resources:**
- Documentation: https://docs.databricks.com/
- Unity Catalog: https://docs.databricks.com/en/data-governance/unity-catalog/
- Solution Accelerators: https://www.databricks.com/solutions/accelerators

**Delta Sharing for Public Exports:**
```python
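# Run in a Databricks notebook, where `spark` is predefined and the user
# holds the Unity Catalog privileges needed to create shares.
share_name = "one_civic_data"
gold_tables = ["gold.jurisdictions", "gold.meetings", "gold.nonprofits"]

# Create the share, then add each Gold layer table to it.
# Table names resolve against the current catalog; use catalog.schema.table if needed.
spark.sql(f"CREATE SHARE IF NOT EXISTS {share_name}")
for table in gold_tables:
    spark.sql(f"ALTER SHARE {share_name} ADD TABLE {table}")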
```

---

### 🔍 Snowflake: Snowflake for Good

**Status:** EVALUATION - Consider for enterprise data sharing

**What we are evaluating:**
- Data Marketplace for Census/ESG data
- Data sharing capabilities

**Evaluation Criteria:**
- Cost vs. Databricks
- Data Marketplace value-add
- Enterprise collaboration needs

---

### 📚 Oracle: NetSuite Social Impact

**Status:** REFERENCE - Inspiration for nonprofit accounting

**What we use:**
- Fund accounting model patterns
- Grant tracking workflows

**Resources:**
- https://netsuite.com/social-impact

---

### 📚 Salesforce: Nonprofit Success Pack (NPSP)

**Status:** REFERENCE - Inspiration for constituent management

**What we use:**
- Household accounts model
- Recurring donations pattern
- Program engagement tracking

**NPSP → ONE Mappings:**

| NPSP Object | Our Entity | Use Case |
|-------------|------------|----------|
| Contact | CONSTITUENT | Donor, volunteer, beneficiary |
| Opportunity | DONATION | Financial contributions |
| Campaign | CAMPAIGN | Fundraising campaigns |
| Engagement Plan | VOLUNTEER_ACTIVITY | Volunteer tracking |
| Program Cohort | PROGRAM_DELIVERY | Program participants |
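
The mapping table can also be expressed as a lookup that is checked against the eight core entities listed in the Microsoft CDM section. This is a sketch for illustration; `NPSP_TO_ENTITY` and `map_npsp_object` are hypothetical names, not identifiers from the codebase.

```python
# The eight core entities named in the Nonprofit CDM section.
CORE_ENTITIES = {
    "CONSTITUENT", "DONATION", "CAMPAIGN", "DESIGNATION",
    "MEMBERSHIP", "VOLUNTEER_ACTIVITY", "PROGRAM_DELIVERY", "PROGRAM_OUTCOME",
}

# NPSP object name -> our entity, copied from the table above.
NPSP_TO_ENTITY = {
    "Contact": "CONSTITUENT",
    "Opportunity": "DONATION",
    "Campaign": "CAMPAIGN",
    "Engagement Plan": "VOLUNTEER_ACTIVITY",
    "Program Cohort": "PROGRAM_DELIVERY",
}

# Every mapping target must be one of the core entities.
assert set(NPSP_TO_ENTITY.values()) <= CORE_ENTITIES


def map_npsp_object(npsp_object: str) -> str:
    """Return the entity an NPSP object maps to, or raise KeyError if unmapped."""
    try:
        return NPSP_TO_ENTITY[npsp_object]
    except KeyError:
        raise KeyError(f"no mapping defined for NPSP object {npsp_object!r}") from None


print(map_npsp_object("Opportunity"))  # DONATION

# Core entities that have no NPSP counterpart in the table.
unmapped = sorted(CORE_ENTITIES - set(NPSP_TO_ENTITY.values()))
print(unmapped)  # ['DESIGNATION', 'MEMBERSHIP', 'PROGRAM_OUTCOME']
```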

**Resources:**
- GitHub: https://github.com/SalesforceFoundation/NPSP
- License: BSD-3-Clause

---

## 3. Infrastructure & AI

### 📚 Cisco: Crisis Response

**Status:** REFERENCE - Inspiration for platform resilience

**Focus:**
- Network connectivity during emergencies
- System resilience patterns

**Resources:**
- https://cisco.com/crisis-response

---

### 📚 IBM: Science for Social Good

**Status:** REFERENCE - AI/ML use case patterns

**Focus:**
- Watson AI for civic applications
- Blockchain for transparency
- Quantum computing potential

**Resources:**
- https://ibm.com/social-good

---

### 🔍 Meta: Data for Good

**Status:** EVALUATION - Population mapping potential

**Datasets under evaluation:**
- High-Resolution Population Density Maps
- Social Connectedness Index

**Evaluation:**
- Integration with demographics
- Use for underserved area identification

**Resources:**
- https://dataforgood.facebook.com

---

## Summary: Current vs. Planned Integrations

| Platform | Status | Priority | Effort | Value |
|----------|--------|----------|--------|-------|
| Microsoft CDM | ✅ Active | - | - | HIGH |
| Databricks | ✅ Active | - | - | HIGH |
| Google Data Commons | 🔄 Recommended | HIGH | Low | HIGH |
| AWS Best Practices | 🔄 Planned | MEDIUM | Medium | MEDIUM |
| Snowflake | 🔍 Evaluation | LOW | Medium | MEDIUM |
| Meta Data for Good | 🔍 Evaluation | LOW | Medium | MEDIUM |
| Salesforce NPSP | 📚 Reference | - | - | - |
| Oracle NetSuite | 📚 Reference | - | - | - |
| Cisco | 📚 Reference | - | - | - |
| IBM | 📚 Reference | - | - | - |

## Recommended Implementation Order

1. **Google Data Commons** (Immediate - Low effort, High value)
   - Install dependencies
   - Update census ingestion
   - Test with sample jurisdictions
   - Deploy to production

2. **AWS Export Optimization** (Next sprint - Medium effort, Medium value)
   - Convert exports to Parquet
   - Implement partitioning
   - Document patterns

3. **Databricks Delta Sharing** (Future - Medium effort, Medium value)
   - Configure sharing
   - Create public share
   - Document access

4. **Snowflake/Meta Evaluation** (Backlog - TBD)
   - POC evaluation
   - Cost-benefit analysis
   - Decision by end of quarter

---

## How to Cite These Partnerships

All enterprise technology partnerships are properly cited in:

**[Citations & Data Sources - Enterprise Tech for Social Good](/docs/data-sources/citations#-enterprise-tech-for-social-good)**

Includes:
- Full program URLs
- Implementation status
- License information
- BibTeX citations (where applicable)
- Code examples