---
sidebar_position: 5
---

# Enterprise Tech Integration Guide

This guide documents the enterprise technology platforms and programs that support Open Navigator's data infrastructure.

## Implementation Status Legend

- ✅ **Active** - Fully implemented and in production use
- 🔄 **Recommended** - Implementation recommended for enhancement
- 📚 **Reference** - Used as inspiration for data modeling
- 🔍 **Evaluation** - Under consideration for future adoption

## 1. Cloud & Data Platforms

### ✅ Microsoft: Tech for Social Impact

**Status:** ACTIVE - Nonprofit CDM fully implemented

**What we use:**
- Nonprofit Common Data Model (CDM) for constituent management
- 8 core entities: CONSTITUENT, DONATION, CAMPAIGN, DESIGNATION, MEMBERSHIP, VOLUNTEER_ACTIVITY, PROGRAM_DELIVERY, PROGRAM_OUTCOME

**Files:**
- See [Nonprofit & Philanthropy](/docs/data-sources/citations#nonprofit--philanthropy) section
- ERD: [Data Model](/data-sources/data-model-erd)

**Resources:**
- GitHub: https://github.com/microsoft/Industry-Accelerator-Nonprofit
- License: MIT

---

### 🔄 Google: Data Commons

**Status:** RECOMMENDED - Implementation available, not yet deployed

**What it would provide:**
- Knowledge Graph API for jurisdiction demographics
- 100+ variables per jurisdiction (income, education, health, housing)
- Simplifies Census Bureau data access

**Implementation:**
- Code: `discovery/google_data_commons.py`
- Install: `pip install datacommons datacommons-pandas`
- Documentation: https://docs.datacommons.org/api/

**Next Steps:**
1. Install dependencies: `pip install datacommons datacommons-pandas`
2. Update `discovery/census_ingestion.py` to use Data Commons client
3. Replace manual Census API calls with simplified DC API
4. Add time-series enrichment for historical trends

**Example Usage:**
```python
from discovery.google_data_commons import DataCommonsClient

client = DataCommonsClient()

# Enrich a single jurisdiction
data = client.enrich_jurisdiction("01073")  # Jefferson County, AL
print(data["Median_Income_Household"])  # e.g. 65000 (illustrative value, not real output)

# Bulk enrich multiple jurisdictions
fips_codes = ["01073", "01089", "01097"]
df = client.enrich_jurisdictions_bulk(fips_codes)

# Get time series
df_ts = client.get_time_series("01073", start_year=2015)
```
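
As a rough illustration of what a wrapper like `DataCommonsClient` might do internally, the sketch below converts FIPS codes to Data Commons place identifiers (DCIDs of the form `geoId/<FIPS>`) and fetches one statistical variable with the `datacommons` package from the install step. The helper names `fips_to_dcid` and `fetch_median_income` are hypothetical and are not taken from `discovery/google_data_commons.py`; treat this as a sketch under those assumptions, not the project's implementation.

```python
def fips_to_dcid(fips: str) -> str:
    """Convert a 2-digit state or 5-digit county FIPS code to a Data Commons DCID."""
    code = fips.strip()
    if not code.isdigit() or len(code) not in (2, 5):
        raise ValueError(f"expected a 2- or 5-digit FIPS code, got {fips!r}")
    return f"geoId/{code}"


def fetch_median_income(fips: str) -> float:
    """Fetch the latest median household income for one jurisdiction.

    Makes a network call. The import is lazy so that fips_to_dcid can be
    used (and tested) without the datacommons package installed.
    """
    import datacommons as dc  # pip install datacommons

    return dc.get_stat_value(fips_to_dcid(fips), "Median_Income_Household")
```

Keeping the FIPS-to-DCID conversion in its own function makes it easy to validate input before any request is sent, which matters because FIPS codes lose their leading zeros when stored as integers.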

**Benefits:**
- ✅ Simpler API than raw Census Bureau
- ✅ 100+ pre-integrated variables
- ✅ Automatic data quality validation
- ✅ Time series support
- ✅ No API key required (free tier)

---

### 🔄 AWS: Open Data for Good

**Status:** RECOMMENDED - Planned; best practices for dataset exports

**What we plan to adopt:**
- Parquet format best practices
- S3 storage patterns
- AWS Glue Data Catalog

**Recommendations for `/exports` folder:**
1. **Format:** Use Parquet with Snappy compression
2. **Partitioning:** Partition by `state/county/year`
3. **Versioning:** Enable S3 versioning for lineage
4. **Catalog:** Use AWS Glue for schema management
5. **Querying:** Athena for SQL without ETL
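
The format and partitioning recommendations above could be sketched as follows. This assumes exports are held in a pandas DataFrame with `state`, `county`, and `year` columns and that `pyarrow` is installed; the function names are hypothetical. With `partition_cols`, pandas writes Hive-style `key=value` directories, which is the layout that Glue crawlers and Athena recognize as partitions.

```python
PARTITION_COLS = ["state", "county", "year"]


def partition_prefix(root: str, state: str, county: str, year: int) -> str:
    """Build the Hive-style prefix for one partition under an export root."""
    parts = [root.rstrip("/"), f"state={state}", f"county={county}", f"year={year}"]
    return "/".join(parts)


def write_partitioned(df, root: str) -> None:
    """Write a pandas DataFrame as Snappy-compressed, partitioned Parquet.

    Assumes df contains every column in PARTITION_COLS. Writing directly
    to an s3:// root additionally requires the s3fs package.
    """
    df.to_parquet(
        root,
        engine="pyarrow",
        compression="snappy",
        partition_cols=PARTITION_COLS,
        index=False,
    )
```

For example, `partition_prefix("exports", "AL", "01073", 2023)` gives the directory that holds the 2023 files for Jefferson County, AL, which is useful when documenting the layout or listing a single partition.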

**Next Steps:**
1. Review AWS Registry examples: https://registry.opendata.aws
2. Update export scripts to generate Parquet
3. Document partitioning strategy
4. Consider AWS Glue for metadata

---

## 2. Data Engineering Platforms

### ✅ Databricks: Databricks for Good

**Status:** ACTIVE - Full implementation

**What we use:**
- **Unity Catalog:** Model registry and data governance
- **Delta Lake:** Bronze/Silver/Gold lakehouse architecture
- **MLflow:** Agent deployment and experiment tracking
- **Model Serving:** Auto-scaling REST endpoints for agents
- **Agent Bricks:** Mosaic AI Agent Framework

**Files:**
- `pipeline/delta_lake.py` - Delta Lake pipeline
- `agents/mlflow_classifier.py` - Policy classifier agent
- `agents/mlflow_base.py` - Base MLflow agent class
- `databricks/deployment.py` - Unity Catalog deployment
- `databricks/evaluation.py` - Agent evaluation framework
- `databricks/notebooks/01_agent_bricks_quickstart.py` - Quickstart notebook

**Resources:**
- Documentation: https://docs.databricks.com/
- Unity Catalog: https://docs.databricks.com/en/data-governance/unity-catalog/
- Solution Accelerators: https://www.databricks.com/solutions/accelerators

**Delta Sharing for Public Exports:**
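```python
# Run in a Databricks notebook, where `spark` is predefined.
# Shares are created on the provider side with Unity Catalog SQL; the
# `delta_sharing` Python package is a recipient-side reader and cannot
# create shares.
SHARE_NAME = "one_civic_data"
GOLD_TABLES = ["gold.jurisdictions", "gold.meetings", "gold.nonprofits"]

spark.sql(f"CREATE SHARE IF NOT EXISTS {SHARE_NAME}")

# Share Gold layer tables. Two-level names rely on the session's current
# catalog; use catalog.schema.table names if unsure.
for table in GOLD_TABLES:
    spark.sql(f"ALTER SHARE {SHARE_NAME} ADD TABLE {table}")
```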
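
On the consuming side, recipients of a public share could read tables with the open-source `delta-sharing` Python connector. The sketch below assumes the recipient has been given a profile file (the name `config.share` in the usage note is hypothetical) and that the share exposes a `gold` schema; the helper names are illustrative only.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build a Delta Sharing table URL: <profile-file>#<share>.<schema>.<table>."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load one shared table into a pandas DataFrame (network call)."""
    import delta_sharing  # pip install delta-sharing

    return delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))
```

A recipient would call, for instance, `load_shared_table("config.share", "one_civic_data", "gold", "jurisdictions")` after downloading the profile file from the activation link the provider sends.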

---

### 🔍 Snowflake: Snowflake for Good

**Status:** EVALUATION - Consider for enterprise data sharing

**What we are evaluating:**
- Data Marketplace for Census/ESG data
- Data sharing capabilities

**Evaluation Criteria:**
- Cost vs. Databricks
- Data Marketplace value-add
- Enterprise collaboration needs

---

### 📚 Oracle: NetSuite Social Impact

**Status:** REFERENCE - Inspiration for nonprofit accounting

**What we use:**
- Fund accounting model patterns
- Grant tracking workflows

**Resources:**
- https://netsuite.com/social-impact

---

### 📚 Salesforce: Nonprofit Success Pack (NPSP)

**Status:** REFERENCE - Inspiration for constituent management

**What we use:**
- Household accounts model
- Recurring donations pattern
- Program engagement tracking

**NPSP → ONE Mappings:**

| NPSP Object | Our Entity | Use Case |
|-------------|------------|----------|
| Contact | CONSTITUENT | Donor, volunteer, beneficiary |
| Opportunity | DONATION | Financial contributions |
| Campaign | CAMPAIGN | Fundraising campaigns |
| Engagement Plan | VOLUNTEER_ACTIVITY | Volunteer tracking |
| Program Cohort | PROGRAM_DELIVERY | Program participants |
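
The mapping table above can be expressed as a simple lookup when importing NPSP-style records. This is a minimal sketch: the dictionary mirrors the table rows, while the names `NPSP_TO_ONE` and `to_one_entity` are hypothetical and not part of the existing codebase.

```python
# Mirrors the NPSP -> ONE table; keys are NPSP object names.
NPSP_TO_ONE = {
    "Contact": "CONSTITUENT",
    "Opportunity": "DONATION",
    "Campaign": "CAMPAIGN",
    "Engagement Plan": "VOLUNTEER_ACTIVITY",
    "Program Cohort": "PROGRAM_DELIVERY",
}


def to_one_entity(npsp_object: str) -> str:
    """Return the ONE entity for an NPSP object, failing loudly if unmapped."""
    try:
        return NPSP_TO_ONE[npsp_object]
    except KeyError:
        raise ValueError(f"no ONE entity mapped for NPSP object {npsp_object!r}") from None
```

Raising on unmapped objects, rather than returning a default, keeps unexpected NPSP objects from being silently loaded into the wrong entity.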

**Resources:**
- GitHub: https://github.com/SalesforceFoundation/NPSP
- License: BSD-3-Clause

---

## 3. Infrastructure & AI

### 📚 Cisco: Crisis Response

**Status:** REFERENCE - Inspiration for platform resilience

**Focus:**
- Network connectivity during emergencies
- System resilience patterns

**Resources:**
- https://cisco.com/crisis-response

---

### 📚 IBM: Science for Social Good

**Status:** REFERENCE - AI/ML use case patterns

**Focus:**
- Watson AI for civic applications
- Blockchain for transparency
- Quantum computing potential

**Resources:**
- https://ibm.com/social-good

---

### 🔍 Meta: Data for Good

**Status:** EVALUATION - Population mapping potential

**What we are evaluating:**
- High-Resolution Population Density Maps
- Social Connectedness Index

**Evaluation:**
- Integration with demographics
- Use for underserved area identification

**Resources:**
- https://dataforgood.facebook.com

---

## Summary: Current vs. Planned Integrations

| Platform | Status | Priority | Effort | Value |
|----------|--------|----------|--------|-------|
| Microsoft CDM | ✅ Active | - | - | HIGH |
| Databricks | ✅ Active | - | - | HIGH |
| Google Data Commons | 🔄 Recommended | HIGH | Low | HIGH |
| AWS Best Practices | 🔄 Recommended (planned) | MEDIUM | Medium | MEDIUM |
| Snowflake | 🔍 Evaluation | LOW | Medium | MEDIUM |
| Meta Data for Good | 🔍 Evaluation | LOW | Medium | MEDIUM |
| Salesforce NPSP | 📚 Reference | - | - | - |
| Oracle NetSuite | 📚 Reference | - | - | - |
| Cisco | 📚 Reference | - | - | - |
| IBM | 📚 Reference | - | - | - |

## Recommended Implementation Order

1. **Google Data Commons** (Immediate - Low effort, High value)
   - Install dependencies
   - Update census ingestion
   - Test with sample jurisdictions
   - Deploy to production

2. **AWS Export Optimization** (Next sprint - Medium effort, Medium value)
   - Convert exports to Parquet
   - Implement partitioning
   - Document patterns

3. **Databricks Delta Sharing** (Future - Medium effort, Medium value)
   - Configure sharing
   - Create public share
   - Document access

4. **Snowflake/Meta Evaluation** (Backlog - TBD)
   - POC evaluation
   - Cost-benefit analysis
   - Decision by end of quarter

---

## How to Cite These Partnerships

Citations for all of the enterprise technology programs in this guide are collected in:

**[Citations & Data Sources - Enterprise Tech for Social Good](/docs/data-sources/citations#-enterprise-tech-for-social-good)**

That section includes:
- Full program URLs
- Implementation status
- License information
- BibTeX citations (where applicable)
- Code examples