File size: 5,372 Bytes
61d29fc
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
---
sidebar_position: 8
---

# County Search and Aggregation - Status Summary

## Issue Identified

County search filtering is not working because:
1. ❌ The `county` field in `jurisdictions_search` table is NULL for most records
2. ❌ Cities don't have county data (Census gazetteer files don't include it)
3. ⚠️  Townships have county data but database update failed due to transaction errors

## What We Have Now

### ✅ Created

1. **ZIP Code to County Mapping** (`data/gold/reference/zip_county_mapping.parquet`)
   - 33,791 ZIP codes mapped to counties
   - Downloaded from Census Bureau 2020 ZCTA-to-County relationship file

2. **Scripts**:
   - `scripts/data/download_county_mappings.py` - Downloads Census relationship files
   - `scripts/data/update_jurisdiction_counties.py` - Updates database with county data

3. **Documentation**: `website/docs/guides/county-aggregation.md`
   - Complete guide on county-level aggregation
   - Examples of queries and API usage
   - Future enhancement roadmap

### ⚠️ Partially Working

- **Township County Mapping**: Code works (infers from GEOID) but needs database fix
- **Search API**: Already supports `county` parameter, just needs data

### ❌ Not Yet Available

- **City to County Mapping**: Census doesn't provide this in gazetteer files
  - Need geocoding API OR state-specific Census relationship files
  - Affects 32,333 cities

## How to Fix County Search

### Quick Fix: Update Townships (23,318 records)

The update script has a bug where database transaction errors cause rollback. Fix:

```python
# In update_township_with_counties(), add error handling per row:
for _, row in townships_df.iterrows():
    try:
        # Update code...
        conn.commit()  # Commit each row individually
    except Exception as e:
        conn.rollback()  # Rollback only this row
        logger.error(f"Error: {e}")
```

Then run:
```bash
python scripts/data/update_jurisdiction_counties.py
```

This will populate county data for **townships** (64% of non-county jurisdictions).

### Complete Fix: Add City-County Mapping

Three options:

**Option 1: Use Geocoding (Recommended for now)**
- Use city lat/lon coordinates (already in data)
- Call geocoding API (Nominatim, Google, etc.)
- Free tier available

**Option 2: Download Census Relationship Files**
- Download state-by-state from Census Bureau
- URL: `https://www2.census.gov/geo/docs/maps-data/data/rel2020/place/`
- Process each state file

**Option 3: Use OpenStreetMap**
- Query OSM Nominatim API for each city
- Extract county from administrative boundaries

## Current Database State

```
Type              Total    With County   Percent
-------------------------------------------------
city             32,333            0      0.0%
county            3,222            0      0.0%
school_district  13,326            0      0.0%
township         36,421            0      0.0%  (should be 64% after fix)
```

## API Already Works

The Search API in `api/routes/search_postgres.py` already has county filtering:

```python
# City filter
if city:
    where_clauses.append(f"LOWER(name) LIKE LOWER(${param_idx})")
    params.append(f"%{city}%")
```

Just missing: `county` filter (which would be trivial to add once data exists)

## Frontend Already Uses It

The Home.tsx component already passes county to the API:

```typescript
if (searchScope === 'county' || searchScope === 'city') {
  if (location.county) params.set('county', location.county)
}
```

## Next Steps (Priority Order)

### 1. Fix Township Update (10 minutes)
Edit `scripts/data/update_jurisdiction_counties.py` to commit per-row instead of in batch.
This will populate 23,318 township records with county data.

### 2. Add County Filter to Search API (5 minutes)
Add to `api/routes/search_postgres.py`:
```python
# County filter
if county:
    where_clauses.append(f"county = ${param_idx}")
    params.append(county)
    param_idx += 1
```

### 3. Add City-County Geocoding (1-2 hours)
Create script to geocode all 32,333 cities using Nominatim:
```python
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="open-navigator")
# Geocode each city, extract county, update database
# Add rate limiting (1 request/second for free tier)
```

### 4. Test County Search
Once townships have county data:
1. Search for "Boston" with scope="county"
2. Should return all jurisdictions in the county
3. Verify heatmap/stats aggregate correctly

## Files Created

1. `/home/developer/projects/open-navigator/scripts/data/download_county_mappings.py`
2. `/home/developer/projects/open-navigator/scripts/data/update_jurisdiction_counties.py`
3. `/home/developer/projects/open-navigator/website/docs/guides/county-aggregation.md`
4. `/home/developer/projects/open-navigator/data/gold/reference/zip_county_mapping.parquet`

## Summary

**Problem**: County field is empty in database → county filtering doesn't work

**Root Cause**: Census gazetteer files don't include county for cities/places

**Solution Created**:
1. ✅ Downloaded ZIP→County mapping (33,791 records)
2. ✅ Created scripts to update database
3. ⚠️  Township updates ready but transaction error needs fix
4. ❌ City→County needs geocoding or additional Census files

**Impact**: Once township fix is applied, 64% of jurisdictions will have county data. City geocoding will bring it to 100%.