---
sidebar_position: 8
---

# Census American Community Survey (ACS)

Add demographic, economic, housing, and social data from the U.S. Census Bureau's American Community Survey to enrich your civic engagement analysis.

## Overview

The **American Community Survey (ACS)** is the premier source for detailed population and housing information about America. It provides data for communities across the United States and Puerto Rico (the Island Areas are not covered).

### What's Included

- **Demographics**: Age, race, ethnicity, language, citizenship
- **Economics**: Income, poverty, employment, occupation
- **Housing**: Occupancy, value, rent, housing costs
- **Education**: School enrollment, educational attainment
- **Health**: Health insurance coverage by age and type
- **Social**: Disability status, veteran status, commuting

### ACS vs. Census of Governments

| Dataset | Purpose | What it Measures |
|---------|---------|------------------|
| **Census of Governments** | Jurisdiction discovery | Lists all government entities (cities, counties, districts) |
| **American Community Survey (ACS)** | Community demographics | Population characteristics, economics, housing |

**Use both together**: Census of Governments tells you *which* jurisdictions exist; ACS tells you *about the people* who live there.

## 🚀 Quick Start

### 1. Get a Census API Key (Recommended)

While optional, an API key is recommended: the Census API limits keyless requests to 500 queries per IP address per day, and a key lifts that cap.

1. Visit: https://api.census.gov/data/key_signup.html
2. Enter your email and organization
3. Check email for API key
4. Add to `.env` file:

```bash
CENSUS_API_KEY=your_key_here
```
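How `acs_ingestion.py` picks up the key is not shown in this guide. As a minimal sketch, assuming the `.env` value has already been loaded into the process environment, a helper like the hypothetical `census_key_params` below could attach the key as the `key` query parameter and fall back to keyless requests when it is missing:

```python
import os


def census_key_params(env=None):
    """Return query parameters carrying the Census API key, if one is set.

    `census_key_params` is a hypothetical helper, not part of acs_ingestion.py.
    """
    env = os.environ if env is None else env
    key = env.get("CENSUS_API_KEY", "").strip()
    # With no key configured, fall back to keyless (rate-limited) requests.
    return {"key": key} if key else {}


print(census_key_params({"CENSUS_API_KEY": "your_key_here"}))
print(census_key_params({}))
```

Passing the environment as an argument keeps the helper easy to test without touching real settings.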

### 2. Run the ACS Ingestion Script

```bash
# Activate virtual environment
source .venv/bin/activate

# Navigate to script directory
cd scripts/datasources/census

# Run the example (downloads sample data)
python acs_ingestion.py
```

This will:
- Download median household income for all U.S. counties
- Download health insurance data for California
- Cache data to `data/cache/acs/`

## 📊 Available Data Tables

### Demographics

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B01001 | Sex by Age | Identify communities with children (dental screening priority) |
| B02001 | Race | Analyze health equity across racial groups |
| B03002 | Hispanic or Latino Origin by Race | Understand demographic composition |
| B05001 | Nativity and Citizenship Status | Language access planning |
| B16001 | Language Spoken at Home | Multilingual outreach needs |

### Economics

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B19013 | Median Household Income | Target low-income communities for programs |
| B17001 | Poverty Status | Medicaid eligibility analysis |
| B23025 | Employment Status | Economic health assessment |
| C24010 | Sex by Occupation | Workforce composition |

### Health Insurance ⭐ **Critical for Oral Health Policy**

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B27001 | Health Insurance Coverage Status by Age | Overall insurance coverage rates |
| B27010 | Types of Health Insurance Coverage by Age | **Child health insurance coverage** (under-19 age group; the ACS does not ask about dental coverage) |
| C27007 | Medicaid/Means-Tested Public Coverage | Medicaid enrollment by community |

### Education

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B15003 | Educational Attainment | Community education levels |
| B14001 | School Enrollment by Age | Number of school-aged children |
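The table codes in this section identify whole tables. When querying the Census API directly, each cell is addressed by a variable name built from the table code, a three-digit line number, and a suffix (`E` for the estimate, `M` for its margin of error). The sketch below shows that convention; `acs_variable` is a hypothetical helper, and whether `acs_ingestion.py` expands table codes this way is an assumption.

```python
def acs_variable(table, line=1, kind="E"):
    """Build a Census API variable name such as 'B19013_001E'.

    kind: 'E' for the estimate, 'M' for its margin of error.
    """
    if kind not in ("E", "M"):
        raise ValueError("kind must be 'E' (estimate) or 'M' (margin of error)")
    return f"{table}_{line:03d}{kind}"


print(acs_variable("B19013"))            # median household income, estimate
print(acs_variable("B19013", kind="M"))  # the matching margin of error
```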

## 💻 Usage Examples

### Example 1: Download Data for All Counties

```python
import asyncio
from pathlib import Path
from scripts.datasources.census.acs_ingestion import ACSDataIngestion

async def download_county_data():
    # Initialize with default cache directory
    acs = ACSDataIngestion()
    
    # Download median household income for all U.S. counties
    income_df = await acs.download_acs_data_api(
        table="B19013",        # Median household income
        geography="county",    # County level
        state="*"             # All states
    )
    
    print(f"Downloaded {len(income_df)} counties")
    print(income_df.head())

asyncio.run(download_county_data())
```
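For context on what a call like this has to handle: the Census API responds with a JSON array of arrays, where the first row holds column names and every value arrives as a string. How `download_acs_data_api` parses that response is not shown in this guide, so the following is only a sketch with a hand-written sample payload and a hypothetical `rows_from_census_json` helper.

```python
import json


def rows_from_census_json(payload):
    """Turn a Census API response (header row + data rows) into dicts."""
    table = json.loads(payload)
    header, *records = table
    return [dict(zip(header, record)) for record in records]


# Hand-written sample in the API's shape; the income values are made up.
sample = json.dumps([
    ["NAME", "B19013_001E", "state", "county"],
    ["Example County A", "50000", "06", "001"],
    ["Example County B", "62000", "06", "003"],
])

rows = rows_from_census_json(sample)
# Values are strings, so numeric columns need an explicit conversion.
print(rows[0]["NAME"], int(rows[0]["B19013_001E"]))
```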

### Example 2: Child Health Insurance Coverage

**Critical for oral health policy analysis!**

```python
import asyncio
from scripts.datasources.census.acs_ingestion import ACSDataIngestion

async def analyze_child_insurance():
    acs = ACSDataIngestion()
    
    # Download health insurance for children under 19
    child_insurance_df = await acs.download_acs_data_api(
        table="B27010",        # Health insurance (Under 19)
        geography="county",
        state="*"
    )
    
    # This table includes:
    # - With health insurance
    # - With public coverage (Medicaid/CHIP)
    # - With private coverage
    # - No health insurance
    
    return child_insurance_df

df = asyncio.run(analyze_child_insurance())
```

### Example 3: Download Multiple Tables at Once

```python
import asyncio
from scripts.datasources.census.acs_ingestion import ACSDataIngestion

async def download_comprehensive_data():
    acs = ACSDataIngestion()
    
    # Download all key demographic tables for California
    ca_data = await acs.download_all_demographics(
        geography="county",
        state="06"  # California FIPS code
    )
    
    # Returns dictionary with multiple DataFrames
    for table_code, df in ca_data.items():
        print(f"{table_code}: {len(df)} counties")

asyncio.run(download_comprehensive_data())
```

### Example 4: Use Cached Data

```python
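import asyncio
from scripts.datasources.census.acs_ingestion import ACSDataIngestion

async def compare_cached():
    acs = ACSDataIngestion()

    # First call downloads from the API and writes the Parquet cache
    df1 = await acs.download_acs_data_api("B19013", "county", "*")

    # Later calls can read the cached Parquet file without a network request
    df2 = acs.get_cached_data("B19013", "county", "*")

    print(f"Same data: {df1.equals(df2)}")  # expected True if the cache preserves dtypes

asyncio.run(compare_cached())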
```

## 🗄️ Data Storage Options

### Option 1: Default Cache (Recommended for Development)

```python
# Uses data/cache/acs/ in project directory
acs = ACSDataIngestion()
```

**Location**: `/home/developer/projects/open-navigator/data/cache/acs/`

### Option 2: D Drive (Windows)

```python
from pathlib import Path

# Store all ACS data on D drive
acs = ACSDataIngestion(data_dir=Path("D:/open-navigator-data/acs"))
```

**Location**: `D:\open-navigator-data\acs\`

### Option 3: External Drive (Linux/Mac)

```python
from pathlib import Path

# Mount external drive first, then:
acs = ACSDataIngestion(data_dir=Path("/mnt/external/acs-data"))
```

**Location**: `/mnt/external/acs-data/`

### Option 4: Network Storage

```python
from pathlib import Path

# For shared team access
acs = ACSDataIngestion(data_dir=Path("//server/shared/acs"))
```

## 📁 Data File Format

Downloaded data is cached as **Parquet files** for fast loading:

```
data/cache/acs/
├── B19013_county_*_2022.parquet   # Median income, all counties
├── B27010_county_06_2022.parquet  # Child insurance, CA only
├── B01001_place_*_2022.parquet    # Age/sex, all cities
└── acs_2022_ALL/                  # Bulk download (if used)
```

**Parquet advantages**:
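- Usually much smaller than the equivalent CSV (compressed, binary encoding)
- Usually much faster to load than CSV (no text parsing, can read only the columns you need)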
- Preserves data types
- Columnar storage (efficient queries)
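The file names in the listing above appear to follow a `{table}_{geography}_{state}_{year}.parquet` pattern. A minimal sketch of a path builder for that pattern follows; `cache_path` is a hypothetical helper, and the pattern itself is inferred from the listing rather than taken from `acs_ingestion.py`.

```python
from pathlib import Path


def cache_path(table, geography, state, year=2022, data_dir=Path("data/cache/acs")):
    """Return the Parquet path for one table/geography/state/year combination."""
    return Path(data_dir) / f"{table}_{geography}_{state}_{year}.parquet"


print(cache_path("B27010", "county", "06").name)
print(cache_path("B19013", "county", "*").name)
```

One caveat if you build paths this way: `*` is not a valid character in Windows file names, so a cache stored on a Windows drive would need a substitute token (for example `all`) for the all-states case.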

## 🌍 Geography Levels

ACS data is available at multiple geographic levels:

| Level | Code | Example | Records (approx.) |
|-------|------|---------|-------------------|
| **National** | `us` | United States | 1 |
| **State** | `state` | California, Texas | 50 |
| **County** | `county` | Los Angeles County | 3,200 |
| **Place** | `place` | San Francisco city | 19,500 |
| **Tract** | `tract` | Neighborhood-level | 85,000 |
| **County Subdivision** | `cousub` | Townships | 36,000 |

**Choose based on your analysis needs**:
- **State-level**: Policy comparison across states
- **County-level**: Regional analysis
- **Place-level**: City-specific programs
- **Tract-level**: Neighborhood targeting (large datasets!)
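In the Census API itself, the geography level becomes a `for` clause (what to return) and, for sub-state levels, an `in` clause (which state to look inside). The sketch below builds a 5-year ACS request URL for a few of the levels above. `build_acs_url` is a hypothetical helper, it covers only `us`, `state`, `county`, and `place`, and how `acs_ingestion.py` maps its `geography` argument to these clauses is an assumption.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data/{year}/acs/acs5"


def build_acs_url(variables, geography, state="*", year=2022, key=None):
    """Build a Census API URL for the given variables and geography level."""
    params = {"get": ",".join(["NAME", *variables])}
    if geography == "us":
        params["for"] = "us:1"
    elif geography == "state":
        params["for"] = f"state:{state}"
    elif geography in ("county", "place"):
        params["for"] = f"{geography}:*"
        params["in"] = f"state:{state}"
    else:
        raise ValueError(f"geography not covered by this sketch: {geography}")
    if key:
        params["key"] = key
    # Keep ':', '*' and ',' readable instead of percent-encoding them.
    return BASE.format(year=year) + "?" + urlencode(params, safe=":*,")


print(build_acs_url(["B19013_001E"], "county", "06"))
```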

## 🔗 Integration with Open Navigator

### Enriching Jurisdiction Data

Combine ACS demographics with jurisdiction discovery:

```python
import asyncio

from discovery.census_ingestion import CensusGovernmentIngestion
from scripts.datasources.census.acs_ingestion import ACSDataIngestion

async def enrich_counties():
    # Step 1: Get list of all counties
    census = CensusGovernmentIngestion()
    counties_df = await census.download_census_data("counties")

    # Step 2: Add demographic data from ACS
    acs = ACSDataIngestion()
    demographics = await acs.download_acs_data_api("B19013", "county", "*")

    # Step 3: Join on FIPS code (assumes both frames carry a "fips" column)
    enriched = counties_df.merge(demographics, on="fips", how="left")

    # Now you have: county name, URL, population, AND median income!
    return enriched

enriched = asyncio.run(enrich_counties())
```
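The merge above joins on a `fips` column, but the Census API returns `state` and `county` as separate code columns. If the ingestion script does not already combine them (an assumption), a 5-digit county FIPS key can be derived as sketched below. `county_fips` is a hypothetical helper, and the income values are placeholders rather than real estimates.

```python
def county_fips(state, county):
    """Combine state and county codes into a 5-digit county FIPS string."""
    return f"{str(state).zfill(2)}{str(county).zfill(3)}"


# Jurisdiction names keyed by county FIPS (real codes for two California counties).
jurisdictions = {"06037": "Los Angeles County", "06075": "San Francisco County"}

# ACS-style rows; the income figures are placeholders, not real estimates.
acs_rows = [
    {"state": "06", "county": "037", "B19013_001E": "11111"},
    {"state": "06", "county": "075", "B19013_001E": "22222"},
]

enriched = [
    {
        "fips": county_fips(row["state"], row["county"]),
        "name": jurisdictions.get(county_fips(row["state"], row["county"])),
        "median_income": int(row["B19013_001E"]),
    }
    for row in acs_rows
]
print(enriched)
```

Keeping the codes as zero-padded strings matters: stored as integers, `06037` becomes `6037` and no longer matches keys on the other side of the join.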

### Targeting High-Need Communities

Identify counties for oral health program targeting:

```python
async def find_high_need_counties():
    acs = ACSDataIngestion()
    
    # Get poverty data
    poverty_df = await acs.download_acs_data_api("B17001", "county", "*")
    
    # Get child health insurance
    child_insurance_df = await acs.download_acs_data_api("B27010", "county", "*")
    
    # Combine datasets
    combined = poverty_df.merge(child_insurance_df, on=["state", "county"])
    
    # Filter for high poverty + low insurance coverage
    # (assumes "poverty_rate" and "uninsured_children" were derived from the raw table columns)
    high_need = combined[
        (combined["poverty_rate"] > 0.15) &  # > 15% poverty
        (combined["uninsured_children"] > 100)  # > 100 uninsured kids
    ]
    
    return high_need
```
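The filter above relies on a `poverty_rate` column, which the raw B17001 table does not contain; it has to be computed. In B17001, variable `B17001_001E` is the total population for whom poverty status is determined and `B17001_002E` is the count below the poverty level. A minimal sketch of the derivation, using a hypothetical `poverty_rate` function and made-up counts:

```python
def poverty_rate(row):
    """Share of the poverty-universe population below the poverty level."""
    total = int(row["B17001_001E"])
    below = int(row["B17001_002E"])
    # Guard against empty geographies to avoid division by zero.
    return below / total if total > 0 else None


# Made-up counts in the string form the Census API returns.
sample_row = {"B17001_001E": "1000", "B17001_002E": "180"}
rate = poverty_rate(sample_row)
print(rate is not None and rate > 0.15)
```

A similar derived count would be needed for `uninsured_children`, built from the under-19 "no health insurance coverage" cell of B27010.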

## ⚡ Performance Tips

### 1. Use State Filters

```python
# ❌ Slow: Downloads all 3,200 counties
all_counties = await acs.download_acs_data_api("B19013", "county", "*")

# ✅ Fast: Downloads only California's 58 counties
ca_counties = await acs.download_acs_data_api("B19013", "county", "06")
```

### 2. Leverage Caching

```python
# First run: Downloads from API (slow)
df1 = await acs.download_acs_data_api("B19013", "county", "*")

# Second run: Loads from Parquet cache (instant!)
df2 = acs.get_cached_data("B19013", "county", "*")
```

### 3. Download Multiple Tables in Parallel

```python
import asyncio

async def parallel_download():
    acs = ACSDataIngestion()
    
    # Download 3 tables simultaneously
    results = await asyncio.gather(
        acs.download_acs_data_api("B19013", "county", "*"),
        acs.download_acs_data_api("B27010", "county", "*"),
        acs.download_acs_data_api("B17001", "county", "*"),
    )
    
    income_df, insurance_df, poverty_df = results
```
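With more than a handful of tables, unbounded `asyncio.gather` can fire many requests at once. One common way to cap that is an `asyncio.Semaphore`. The sketch below uses a stand-in coroutine, `fake_download`, instead of the real API call, so the names and the limit of 2 are illustrative assumptions.

```python
import asyncio


async def fake_download(table):
    """Stand-in for a real download call; sleeps instead of hitting the API."""
    await asyncio.sleep(0.01)
    return f"{table}: done"


async def download_bounded(tables, max_concurrent=2):
    """Run downloads concurrently, but never more than max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(table):
        async with semaphore:
            return await fake_download(table)

    # gather returns results in the same order as the input tables.
    return await asyncio.gather(*(guarded(t) for t in tables))


results = asyncio.run(download_bounded(["B19013", "B27010", "B17001"]))
print(results)
```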

### 4. Avoid Bulk Downloads (Unless Necessary)

The Census Bureau offers bulk downloads of ALL ACS data:

```python
# ⚠️ WARNING: This downloads 15 GB!
await acs.download_bulk_files(state="ALL")
```

**Use bulk downloads only if**:
- You need 100+ tables
- You need tract-level data for entire U.S.
- You're doing large-scale research

**Otherwise**: Use targeted API downloads (much faster!)

## 📚 Resources

### Official Documentation

- **ACS Homepage**: https://www.census.gov/programs-surveys/acs
- **Table Shells**: https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html
- **API Documentation**: https://www.census.gov/data/developers/data-sets/acs-5year.html
- **Data Profiles**: https://www.census.gov/acs/www/data/data-tables-and-tools/data-profiles/

### Understanding ACS Data

- **ACS 101**: https://www.census.gov/programs-surveys/acs/about.html
- **When to Use ACS vs. Decennial Census**: https://www.census.gov/programs-surveys/acs/guidance.html
- **Margin of Error**: ACS is a sample survey, so every estimate comes with a margin of error (MOE)
- **5-Year vs. 1-Year Estimates**: Use 5-year for small areas (more reliable)
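To make the margin of error concrete: the Census Bureau publishes ACS MOEs at the 90% confidence level, so the standard error is the MOE divided by 1.645, and the estimate plus or minus the MOE gives the 90% confidence interval. The sketch below applies that to one estimate; `moe_summary` is a hypothetical helper and the input numbers are made up.

```python
Z_90 = 1.645  # ACS margins of error are published at the 90% confidence level


def moe_summary(estimate, moe):
    """Return the 90% interval, standard error, and coefficient of variation."""
    se = moe / Z_90
    return {
        "low": estimate - moe,
        "high": estimate + moe,
        "standard_error": se,
        "cv_percent": 100 * se / estimate if estimate else None,
    }


# Made-up example: an income estimate of 60,000 with an MOE of 3,290.
print(moe_summary(60000, 3290))
```

A high coefficient of variation signals an unreliable estimate, which is one reason to prefer 5-year data for small areas.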

### State FIPS Codes

Common state codes for API queries:

| State | FIPS | State | FIPS |
|-------|------|-------|------|
| Alabama | 01 | Montana | 30 |
| Alaska | 02 | Nebraska | 31 |
| Arizona | 04 | Nevada | 32 |
| Arkansas | 05 | New Hampshire | 33 |
| California | 06 | New Jersey | 34 |
| Colorado | 08 | New Mexico | 35 |
| Connecticut | 09 | New York | 36 |
| Delaware | 10 | North Carolina | 37 |
| Florida | 12 | Ohio | 39 |
| Georgia | 13 | Oklahoma | 40 |
| Hawaii | 15 | Oregon | 41 |
| Illinois | 17 | Pennsylvania | 42 |
| Indiana | 18 | Texas | 48 |
| Iowa | 19 | Utah | 49 |
| Kansas | 20 | Virginia | 51 |
| Louisiana | 22 | Washington | 53 |
| Massachusetts | 25 | Wisconsin | 55 |
| Michigan | 26 | | |

**Full list**: https://www.census.gov/library/reference/code-lists/ansi/ansi-codes-for-states.html
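Because the codes are two-character strings, a value such as `6` or `"6"` should be zero-padded before it is used in a query. The sketch below normalizes a few input forms; `normalize_state` and the four-entry `STATE_FIPS` lookup (taken from the table above) are illustrative only, and whether `ACSDataIngestion` accepts anything other than a padded code is not stated in this guide.

```python
# A few entries from the table above; extend as needed.
STATE_FIPS = {"Alabama": "01", "California": "06", "New York": "36", "Texas": "48"}


def normalize_state(value):
    """Accept a state name, a numeric code, or '*' and return an API-ready code."""
    text = str(value).strip()
    if text == "*":
        return text
    if text.isdigit():
        return text.zfill(2)
    try:
        return STATE_FIPS[text.title()]
    except KeyError:
        raise ValueError(f"unknown state: {value!r}") from None


print(normalize_state(6), normalize_state("texas"), normalize_state("*"))
```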

## 🆘 Troubleshooting

### "API request failed: 403"

**Cause**: Rate limit exceeded (500 requests/day without API key)

**Fix**: Get a Census API key (see Quick Start above)

### "Module 'config.settings' has no attribute 'CENSUS_API_KEY'"

**Cause**: API key not set in configuration

**Fix**: Add to `.env` file:
```bash
CENSUS_API_KEY=your_key_here
```

### "No data returned for this geography"

**Cause**: Not all tables are available at all geography levels

**Fix**: Check Census API documentation for table availability by geography

### Downloads are slow

**Solutions**:
1. Use state filters instead of `"*"`
2. Use cached data for repeated queries
3. Download during off-peak hours (late night/early morning EST)
4. Consider bulk downloads if you need many tables

## 🔮 Next Steps

1. **Explore Available Tables**: Run `acs.list_available_tables()`
2. **Download Sample Data**: Try the examples in this guide
3. **Join with Jurisdictions**: Combine ACS demographics with jurisdiction URLs
4. **Build Dashboards**: Create visualizations of demographic data
5. **Target Programs**: Use poverty/insurance data to prioritize outreach

## Related Documentation

- [Census of Governments](./census-governments.md) - Jurisdiction discovery
- [Data Sources Overview](./citations.md) - All data sources
- [D Drive Configuration](../deployment/d-drive-configuration.md) - External storage setup