---
sidebar_position: 8
---

# Census American Community Survey (ACS)

Add demographic, economic, housing, and social data from the U.S. Census Bureau's American Community Survey to enrich your civic engagement analysis.

## Overview

The **American Community Survey (ACS)** is the premier source for detailed population and housing information about America. It provides data for communities across the United States and Puerto Rico (where it is fielded as the Puerto Rico Community Survey).

### What's Included

- **Demographics**: Age, race, ethnicity, language, citizenship
- **Economics**: Income, poverty, employment, occupation
- **Housing**: Occupancy, value, rent, housing costs
- **Education**: School enrollment, educational attainment
- **Health**: Health insurance coverage by age and type
- **Social**: Disability status, veteran status, commuting

### ACS vs. Census of Governments

| Dataset | Purpose | What it Measures |
|---------|---------|------------------|
| **Census of Governments** | Jurisdiction discovery | Lists all government entities (cities, counties, districts) |
| **American Community Survey (ACS)** | Community demographics | Population characteristics, economics, housing |

**Use both together**: Census of Governments tells you *which* jurisdictions exist, ACS tells you *about the people* who live there.

## Quick Start

### 1. Get a Census API Key (Recommended)

While optional, an API key increases your rate limit from 500 to 5,000 requests per day.

1. Visit: https://api.census.gov/data/key_signup.html
2. Enter your email and organization
3. Check email for API key
4. Add to `.env` file:

```bash
CENSUS_API_KEY=your_key_here
```

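
Because the key is optional, client code usually has to handle both cases. The following is a minimal sketch of one way to do that, not the project's actual configuration code: `census_api_key` and `with_api_key` are hypothetical helpers, and the only thing taken from this guide is the `CENSUS_API_KEY` variable name.

```python
import os
from typing import Mapping, Optional


def census_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured key, or None when it is unset or blank."""
    key = env.get("CENSUS_API_KEY", "").strip()
    return key or None


def with_api_key(params: dict, env: Mapping[str, str] = os.environ) -> dict:
    """Copy the query parameters, adding `key` only when one is configured."""
    key = census_api_key(env)
    return {**params, "key": key} if key else dict(params)


# With a key configured, it is attached to the request parameters.
print(with_api_key({"get": "NAME"}, {"CENSUS_API_KEY": "your_key_here"}))
# Without one, the parameters pass through unchanged (unauthenticated request).
print(with_api_key({"get": "NAME"}, {}))
```

Passing the environment in as a mapping keeps the helper easy to test without touching the real process environment.
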
### 2. Run the ACS Ingestion Script

```bash
# Activate virtual environment
source .venv/bin/activate

# Navigate to script directory
cd scripts/datasources/census

# Run the example (downloads sample data)
python acs_ingestion.py
```

This will:

- Download median household income for all U.S. counties
- Download health insurance data for California
- Cache data to `data/cache/acs/`

## Available Data Tables

### Demographics

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B01001 | Sex by Age | Identify communities with children (dental screening priority) |
| B02001 | Race | Analyze health equity across racial groups |
| B03002 | Hispanic or Latino Origin by Race | Understand demographic composition |
| B05001 | Nativity and Citizenship Status | Language access planning |
| B16001 | Language Spoken at Home | Multilingual outreach needs |

### Economics

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B19013 | Median Household Income | Target low-income communities for programs |
| B17001 | Poverty Status | Medicaid eligibility analysis |
| B23025 | Employment Status | Economic health assessment |
| C24010 | Sex by Occupation | Workforce composition |

### Health Insurance ⭐ **Critical for Oral Health Policy**

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B27001 | Health Insurance Coverage Status by Age | Overall insurance coverage rates |
| B27010 | Types of Health Insurance Coverage by Age (includes an under-19 group) | **Child health insurance coverage** (the ACS does not ask about dental coverage specifically) |
| C27007 | Medicaid/Means-Tested Public Coverage | Medicaid enrollment by community |

### Education

| Table Code | Description | Use Case |
|------------|-------------|----------|
| B15003 | Educational Attainment | Community education levels |
| B14001 | School Enrollment by Level of School | Number of school-aged children |

## Usage Examples

### Example 1: Download Data for All Counties

```python
import asyncio

from scripts.datasources.census.acs_ingestion import ACSDataIngestion


async def download_county_data():
    # Initialize with the default cache directory
    acs = ACSDataIngestion()

    # Download median household income for all U.S. counties
    income_df = await acs.download_acs_data_api(
        table="B19013",      # Median household income
        geography="county",  # County level
        state="*",           # All states
    )

    print(f"Downloaded {len(income_df)} counties")
    print(income_df.head())


asyncio.run(download_county_data())
```

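
For orientation, the Census Data API returns JSON as a list of rows whose first row is the header, and it uses the value `-666666666` to flag estimates that could not be computed. The sketch below shows how such a payload can be turned into records. Whether `download_acs_data_api` does this internally is an assumption; `rows_to_records` and `parse_estimate` are hypothetical helpers, and the sample values are made up rather than real ACS estimates.

```python
from typing import Any, Optional

# Census annotation value meaning "estimate could not be computed".
MISSING_SENTINEL = -666666666


def rows_to_records(payload: list[list[str]]) -> list[dict[str, Any]]:
    """Turn a header-plus-rows payload into a list of dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def parse_estimate(raw: Optional[str]) -> Optional[int]:
    """Convert an estimate string to int; blanks and the sentinel become None."""
    if raw is None or raw == "":
        return None
    value = int(float(raw))
    return None if value == MISSING_SENTINEL else value


# Illustrative payload only: names and numbers are invented.
sample = [
    ["NAME", "B19013_001E", "state", "county"],
    ["Example County, California", "75000", "06", "001"],
    ["Sample County, California", "-666666666", "06", "003"],
]

records = rows_to_records(sample)
incomes = [parse_estimate(r["B19013_001E"]) for r in records]
print(incomes)  # [75000, None]
```

Mapping the sentinel to `None` early keeps it from skewing averages or rankings later on.
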
### Example 2: Child Health Insurance Coverage

**Critical for oral health policy analysis!**

```python
import asyncio

from scripts.datasources.census.acs_ingestion import ACSDataIngestion


async def analyze_child_insurance():
    acs = ACSDataIngestion()

    # Download health insurance coverage by age (includes children under 19)
    child_insurance_df = await acs.download_acs_data_api(
        table="B27010",  # Types of health insurance coverage by age
        geography="county",
        state="*",
    )

    # For each age group, this table breaks out:
    # - With one type of coverage (employer-based, direct-purchase,
    #   Medicare, Medicaid/means-tested public coverage, ...)
    # - With two or more types of coverage
    # - No health insurance coverage
    return child_insurance_df


df = asyncio.run(analyze_child_insurance())
```

### Example 3: Download Multiple Tables at Once

```python
import asyncio

from scripts.datasources.census.acs_ingestion import ACSDataIngestion


async def download_comprehensive_data():
    acs = ACSDataIngestion()

    # Download all key demographic tables for California
    ca_data = await acs.download_all_demographics(
        geography="county",
        state="06",  # California FIPS code
    )

    # Returns a dictionary of DataFrames keyed by table code
    for table_code, df in ca_data.items():
        print(f"{table_code}: {len(df)} counties")


asyncio.run(download_comprehensive_data())
```

### Example 4: Use Cached Data

```python
import asyncio

from scripts.datasources.census.acs_ingestion import ACSDataIngestion


async def main():
    acs = ACSDataIngestion()

    # First call downloads from the API
    df1 = await acs.download_acs_data_api("B19013", "county", "*")

    # Subsequent calls can read the cached Parquet file (no network request)
    df2 = acs.get_cached_data("B19013", "county", "*")
    print(f"Same data: {df1.equals(df2)}")  # True


asyncio.run(main())
```

## Data Storage Options

### Option 1: Default Cache (Recommended for Development)

```python
# Uses data/cache/acs/ in the project directory
acs = ACSDataIngestion()
```

**Location**: `data/cache/acs/` under the project root, for example `/home/developer/projects/open-navigator/data/cache/acs/`

### Option 2: D Drive (Windows)

```python
from pathlib import Path

# Store all ACS data on the D drive
acs = ACSDataIngestion(data_dir=Path("D:/open-navigator-data/acs"))
```

**Location**: `D:\open-navigator-data\acs\`

### Option 3: External Drive (Linux/Mac)

```python
from pathlib import Path

# Mount the external drive first, then:
acs = ACSDataIngestion(data_dir=Path("/mnt/external/acs-data"))
```

**Location**: `/mnt/external/acs-data/`

### Option 4: Network Storage

```python
from pathlib import Path

# For shared team access
acs = ACSDataIngestion(data_dir=Path("//server/shared/acs"))
```

## Data File Format

Downloaded data is cached as **Parquet files** for fast loading:

```
data/cache/acs/
├── B19013_county_*_2022.parquet     # Median income, all counties
├── B27010_county_06_2022.parquet    # Child insurance, CA only
├── B01001_place_*_2022.parquet      # Age/sex, all cities
└── acs_2022_ALL/                    # Bulk download (if used)
```

**Parquet advantages**:

- Much smaller than CSV (often around 10x, depending on the data)
- Much faster to load (up to around 100x when only a few columns are read)
- Preserves data types
- Columnar storage (efficient queries)

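
The cache file names above appear to follow a `<table>_<geography>_<state>_<year>.parquet` pattern. The sketch below reproduces that pattern under that assumption; `cache_path` and its `portable` flag are hypothetical and not part of `ACSDataIngestion`. The flag is there because `*` is not a legal file name character on Windows, which matters if the cache lives on a Windows drive.

```python
from pathlib import Path


def cache_path(data_dir: Path, table: str, geography: str, state: str,
               year: int = 2022, portable: bool = False) -> Path:
    """Build a cache path named <table>_<geography>_<state>_<year>.parquet."""
    # Optionally swap the "*" wildcard for "ALL" so the name is valid on Windows.
    state_part = "ALL" if (portable and state == "*") else state
    return Path(data_dir) / f"{table}_{geography}_{state_part}_{year}.parquet"


default_name = cache_path(Path("data/cache/acs"), "B19013", "county", "*").name
portable_name = cache_path(Path("data/cache/acs"), "B19013", "county", "*",
                           portable=True).name
print(default_name)   # B19013_county_*_2022.parquet
print(portable_name)  # B19013_county_ALL_2022.parquet
```
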
## Geography Levels

ACS data is available at multiple geographic levels:

| Level | Code | Example | Records (approx.) |
|-------|------|---------|-------------------|
| **National** | `us` | United States | 1 |
| **State** | `state` | California, Texas | 52 (50 states, DC, Puerto Rico) |
| **County** | `county` | Los Angeles County | 3,200 |
| **Place** | `place` | San Francisco city | 19,500 |
| **Tract** | `tract` | Neighborhood-level | 85,000 |
| **County Subdivision** | `cousub` | Townships | 36,000 |

**Choose based on your analysis needs**:

- **State-level**: Policy comparison across states
- **County-level**: Regional analysis
- **Place-level**: City-specific programs
- **Tract-level**: Neighborhood targeting (large datasets!)

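
To show how a geography code might translate into a Census Data API request, here is a minimal sketch. `build_acs_url` and the `GEOGRAPHY_NAMES` mapping are hypothetical, and how `ACSDataIngestion` builds its requests is not confirmed by this guide. The endpoint layout (`/data/<year>/acs/acs5`) and the `get`, `for`, `in`, and `key` parameters are the public API's. Some levels, such as tracts, may need a specific state rather than `*`.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.census.gov/data"

# Short codes from the table above mapped to the names used in API queries.
GEOGRAPHY_NAMES = {
    "us": "us",
    "state": "state",
    "county": "county",
    "place": "place",
    "tract": "tract",
    "cousub": "county subdivision",
}


def build_acs_url(table: str, geography: str, state: str = "*",
                  year: int = 2022, api_key: Optional[str] = None) -> str:
    """Build an ACS 5-year request URL for every variable in one table."""
    if geography not in GEOGRAPHY_NAMES:
        raise ValueError(f"Unknown geography code: {geography!r}")

    params = {"get": f"group({table})"}
    if geography == "us":
        params["for"] = "us:1"
    elif geography == "state":
        params["for"] = f"state:{state}"
    else:
        # Sub-state levels are requested within a state (or "*" where allowed).
        params["for"] = f"{GEOGRAPHY_NAMES[geography]}:*"
        params["in"] = f"state:{state}"
    if api_key:
        params["key"] = api_key

    # Keep (, ), :, * and , readable instead of percent-encoding them.
    query = urlencode(params, safe="():*,", quote_via=quote)
    return f"{BASE_URL}/{year}/acs/acs5?{query}"


print(build_acs_url("B19013", "county", "06"))
# https://api.census.gov/data/2022/acs/acs5?get=group(B19013)&for=county:*&in=state:06
```
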
## Integration with Open Navigator

### Enriching Jurisdiction Data

Combine ACS demographics with jurisdiction discovery:

```python
import asyncio

from discovery.census_ingestion import CensusGovernmentIngestion
from scripts.datasources.census.acs_ingestion import ACSDataIngestion


async def enrich_counties():
    # Step 1: Get the list of all counties
    census = CensusGovernmentIngestion()
    counties_df = await census.download_census_data("counties")

    # Step 2: Add demographic data from ACS
    acs = ACSDataIngestion()
    demographics = await acs.download_acs_data_api("B19013", "county", "*")

    # Step 3: Join on FIPS code (both frames need a "fips" column)
    # Result: county name, URL, population, and median income
    return counties_df.merge(demographics, on="fips", how="left")


enriched = asyncio.run(enrich_counties())
```

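
The join above needs a shared `fips` key. The Census API reports `state` and `county` as separate code columns, so the key may have to be built first. Here is a minimal sketch, assuming the ACS rows carry `state` and `county` fields; `county_fips` and `add_fips` are hypothetical helpers. A county FIPS code is the 2-digit state code followed by the 3-digit county code.

```python
def county_fips(state: str, county: str) -> str:
    """Combine state and county codes into a 5-digit county FIPS code."""
    # Zero-pad in case leading zeros were lost (for example, read as integers).
    return f"{str(state).zfill(2)}{str(county).zfill(3)}"


def add_fips(records: list[dict]) -> list[dict]:
    """Return copies of the records with a `fips` join key added."""
    return [{**r, "fips": county_fips(r["state"], r["county"])} for r in records]


rows = add_fips([{"state": "06", "county": "037"}, {"state": 6, "county": 1}])
print([r["fips"] for r in rows])  # ['06037', '06001']
```

With pandas, the equivalent would be along the lines of `df["state"].str.zfill(2) + df["county"].str.zfill(3)`, again assuming those column names and string dtypes.
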
### Targeting High-Need Communities

Identify counties for oral health program targeting:

```python
import asyncio

from scripts.datasources.census.acs_ingestion import ACSDataIngestion


async def find_high_need_counties():
    acs = ACSDataIngestion()

    # Get poverty data
    poverty_df = await acs.download_acs_data_api("B17001", "county", "*")

    # Get child health insurance
    child_insurance_df = await acs.download_acs_data_api("B27010", "county", "*")

    # Combine datasets
    combined = poverty_df.merge(child_insurance_df, on=["state", "county"])

    # Filter for high poverty + low insurance coverage.
    # "poverty_rate" and "uninsured_children" are derived columns: compute
    # them from the raw ACS estimate columns before filtering.
    high_need = combined[
        (combined["poverty_rate"] > 0.15)         # > 15% poverty
        & (combined["uninsured_children"] > 100)  # > 100 uninsured kids
    ]
    return high_need


high_need = asyncio.run(find_high_need_counties())
```

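
Raw ACS tables report counts, not rates, so a column like `poverty_rate` has to be derived. The sketch below assumes the standard B17001 variables: `B17001_001E` is the population for whom poverty status is determined and `B17001_002E` is the number below the poverty level. The `uninsured_children` field is taken as already computed, and the rows are invented for illustration.

```python
from typing import Optional


def poverty_rate(below_poverty: Optional[int], universe: Optional[int]) -> Optional[float]:
    """Share of the poverty universe below the poverty level, or None if unknown."""
    if below_poverty is None or not universe:
        return None  # Missing estimate or empty universe: no meaningful rate.
    return below_poverty / universe


def is_high_need(row: dict, min_rate: float = 0.15, min_uninsured: int = 100) -> bool:
    """Apply the same thresholds as the filter above to one county row."""
    rate = poverty_rate(row.get("B17001_002E"), row.get("B17001_001E"))
    uninsured = row.get("uninsured_children")
    if rate is None or uninsured is None:
        return False
    return rate > min_rate and uninsured > min_uninsured


# Made-up rows for illustration only.
rows = [
    {"name": "County A", "B17001_001E": 10000, "B17001_002E": 2000, "uninsured_children": 250},
    {"name": "County B", "B17001_001E": 50000, "B17001_002E": 4000, "uninsured_children": 900},
    {"name": "County C", "B17001_001E": 0, "B17001_002E": 0, "uninsured_children": 0},
]

high_need_names = [r["name"] for r in rows if is_high_need(r)]
print(high_need_names)  # ['County A']
```
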
## Performance Tips

### 1. Use State Filters

```python
# Run inside an async function.
# Slow: downloads all ~3,200 counties
all_counties = await acs.download_acs_data_api("B19013", "county", "*")

# Fast: downloads only California's 58 counties
ca_counties = await acs.download_acs_data_api("B19013", "county", "06")
```

### 2. Leverage Caching

```python
# First run: downloads from the API (slow)
df1 = await acs.download_acs_data_api("B19013", "county", "*")

# Second run: loads from the Parquet cache (no network request)
df2 = acs.get_cached_data("B19013", "county", "*")
```

### 3. Download Multiple Tables in Parallel

```python
import asyncio

from scripts.datasources.census.acs_ingestion import ACSDataIngestion


async def parallel_download():
    acs = ACSDataIngestion()

    # Download 3 tables concurrently
    results = await asyncio.gather(
        acs.download_acs_data_api("B19013", "county", "*"),
        acs.download_acs_data_api("B27010", "county", "*"),
        acs.download_acs_data_api("B17001", "county", "*"),
    )
    income_df, insurance_df, poverty_df = results
    return income_df, insurance_df, poverty_df
```

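
When the list of tables grows, unbounded `asyncio.gather` can fire many requests at once and burn through the daily quota quickly. One common pattern is to cap concurrency with a semaphore. The sketch below uses a stand-in coroutine, `fake_download`, instead of real API calls; `gather_limited` is a hypothetical helper, not part of `ACSDataIngestion`.

```python
import asyncio


async def gather_limited(coros, limit: int = 3):
    """Await all coroutines, running at most `limit` at a time; order is preserved."""
    semaphore = asyncio.Semaphore(limit)

    async def run(coro):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))


async def fake_download(table: str) -> str:
    """Stand-in for a real download call; sleeps briefly instead of using the network."""
    await asyncio.sleep(0.01)
    return f"{table}: ok"


async def main():
    tables = ["B19013", "B27010", "B17001", "B01001"]
    return await gather_limited([fake_download(t) for t in tables], limit=2)


results = asyncio.run(main())
print(results)  # ['B19013: ok', 'B27010: ok', 'B17001: ok', 'B01001: ok']
```
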
### 4. Avoid Bulk Downloads (Unless Necessary)

The Census Bureau offers bulk downloads of ALL ACS data:

```python
# ⚠️ WARNING: This downloads 15 GB!
await acs.download_bulk_files(state="ALL")
```

**Use bulk downloads only if**:

- You need 100+ tables
- You need tract-level data for entire U.S.
- You're doing large-scale research

**Otherwise**: Use targeted API downloads (much faster!)

## Resources

### Official Documentation

- **ACS Homepage**: https://www.census.gov/programs-surveys/acs
- **Table Shells**: https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html
- **API Documentation**: https://www.census.gov/data/developers/data-sets/acs-5year.html
- **Data Profiles**: https://www.census.gov/acs/www/data/data-tables-and-tools/data-profiles/

### Understanding ACS Data

- **ACS 101**: https://www.census.gov/programs-surveys/acs/about.html
- **When to Use ACS vs. Decennial Census**: https://www.census.gov/programs-surveys/acs/guidance.html
- **Margin of Error**: ACS is a sample survey, so every estimate carries a margin of error (MOE)
- **5-Year vs. 1-Year Estimates**: Use 5-year estimates for small areas (larger sample, more reliable)

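
To make the margin-of-error point concrete: ACS margins of error are published at the 90% confidence level, so the standard error is the MOE divided by 1.645, and the MOE of a sum of estimates is approximated by the square root of the sum of squared MOEs. The helpers below are a small sketch of those formulas with made-up numbers, not part of the ingestion script.

```python
import math
from typing import Optional

Z_90 = 1.645  # z-score for the 90% confidence level used by published ACS MOEs


def standard_error(moe: float) -> float:
    """Convert a published 90% margin of error to a standard error."""
    return moe / Z_90


def confidence_interval(estimate: float, moe: float) -> tuple[float, float]:
    """90% confidence interval: estimate plus or minus the margin of error."""
    return (estimate - moe, estimate + moe)


def coefficient_of_variation(estimate: float, moe: float) -> Optional[float]:
    """Relative reliability of an estimate (lower is better); None if estimate is 0."""
    if estimate == 0:
        return None
    return standard_error(moe) / estimate


def moe_of_sum(moes: list[float]) -> float:
    """Approximate MOE for a sum of estimates (root sum of squares)."""
    return math.sqrt(sum(m * m for m in moes))


# Made-up example: a median income estimate of 60,000 with an MOE of 3,000.
print(confidence_interval(60000, 3000))                  # (57000, 63000)
print(round(coefficient_of_variation(60000, 3000), 4))   # 0.0304
print(moe_of_sum([3, 4]))                                # 5.0
```
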
### State FIPS Codes

Common state codes for API queries:

| State | FIPS | State | FIPS |
|-------|------|-------|------|
| Alabama | 01 | Montana | 30 |
| Alaska | 02 | Nebraska | 31 |
| Arizona | 04 | Nevada | 32 |
| Arkansas | 05 | New Hampshire | 33 |
| California | 06 | New Jersey | 34 |
| Colorado | 08 | New Mexico | 35 |
| Connecticut | 09 | New York | 36 |
| Delaware | 10 | North Carolina | 37 |
| Florida | 12 | Ohio | 39 |
| Georgia | 13 | Oklahoma | 40 |
| Hawaii | 15 | Oregon | 41 |
| Illinois | 17 | Pennsylvania | 42 |
| Indiana | 18 | Texas | 48 |
| Iowa | 19 | Utah | 49 |
| Kansas | 20 | Virginia | 51 |
| Louisiana | 22 | Washington | 53 |
| Massachusetts | 25 | Wisconsin | 55 |
| Michigan | 26 | | |

**Full list**: https://www.census.gov/library/reference/code-lists/ansi/ansi-codes-for-states.html

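
State codes are two-character strings, so `6` or `"6"` will not match `"06"`. A small normalizer can guard against that before a query is built. This is a sketch only: `normalize_state` is a hypothetical helper, and `STATE_FIPS` holds just a few rows of the table above.

```python
# A few entries from the table above; extend as needed.
STATE_FIPS = {
    "alabama": "01",
    "california": "06",
    "new york": "36",
    "texas": "48",
}


def normalize_state(value) -> str:
    """Return a 2-digit FIPS string for a code or state name; "*" passes through."""
    text = str(value).strip()
    if text == "*":
        return "*"
    if text.isdigit():
        if len(text) > 2:
            raise ValueError(f"State FIPS codes have two digits: {value!r}")
        return text.zfill(2)  # Restore a dropped leading zero.
    try:
        return STATE_FIPS[text.lower()]
    except KeyError:
        raise ValueError(f"Unknown state: {value!r}") from None


print(normalize_state(6))             # 06
print(normalize_state("California"))  # 06
print(normalize_state("*"))           # *
```
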
## Troubleshooting

### "API request failed: 403"

**Cause**: Rate limit exceeded (500 requests/day without API key)

**Fix**: Get a Census API key (see Quick Start above)

### "Module 'config.settings' has no attribute 'CENSUS_API_KEY'"

**Cause**: API key not set in configuration

**Fix**: Add to `.env` file:

```bash
CENSUS_API_KEY=your_key_here
```

### "No data returned for this geography"

**Cause**: Not all tables are available at all geography levels

**Fix**: Check the Census API documentation for table availability by geography

### Downloads are slow

**Solutions**:

1. Use state filters instead of `"*"`
2. Use cached data for repeated queries
3. Download during off-peak hours (late night or early morning, U.S. Eastern Time)
4. Consider bulk downloads if you need many tables

## Next Steps

1. **Explore Available Tables**: Run `acs.list_available_tables()`
2. **Download Sample Data**: Try the examples in this guide
3. **Join with Jurisdictions**: Combine ACS demographics with jurisdiction URLs
4. **Build Dashboards**: Create visualizations of demographic data
5. **Target Programs**: Use poverty/insurance data to prioritize outreach

## Related Documentation

- [Census of Governments](./census-governments.md) - Jurisdiction discovery
- [Data Sources Overview](./citations.md) - All data sources
- [D Drive Configuration](../deployment/d-drive-configuration.md) - External storage setup