jcbowyer's picture
Deploy: Consolidated gold tables, fixed nginx docs routing
bf3e681 verified

FEC Campaign Finance Scripts

Scripts for working with Federal Election Commission campaign finance data.

Data Source

Scripts

bulk_download_fec.py

Download all FEC bulk data files and organize them by year and type.

Features:

  • Downloads all bulk data from 1980-present
  • Organizes files matching FEC website structure
  • Resume interrupted downloads
  • Progress tracking and logging
  • Filter by year or file type

Usage:

# Download everything to D:/fec_data/
python bulk_download_fec.py

# Download to custom directory
python bulk_download_fec.py --base-dir /mnt/d/fec_data

# Download specific years only
python bulk_download_fec.py --years 2020,2022,2024

# Download specific file types only
python bulk_download_fec.py --types indiv,cn,cm

# Resume interrupted download
python bulk_download_fec.py --resume

# Dry run (show what would be downloaded)
python bulk_download_fec.py --dry-run

File Types:

  • cm - Committee Master files
  • cn - Candidate Master files
  • ccl - Candidate-Committee Linkages
  • indiv - Individual Contributions
  • pas2 - PAC Summary files
  • oth - Other Transactions
  • oppexp - Operating Expenditures
  • weball - All Candidates
  • webk / webl - Current House/Senate Campaigns

Output Structure:

/mnt/d/fec_data/
└── bulk-downloads/
    β”œβ”€β”€ candidate-master/          (Candidate master files)
    β”‚   β”œβ”€β”€ 1980/cn80.zip
    β”‚   β”œβ”€β”€ 2024/cn24.zip
    β”‚   └── ...
    β”œβ”€β”€ all-candidates/            (All candidates files)
    β”‚   β”œβ”€β”€ 1980/weball80.zip
    β”‚   └── 2024/weball24.zip
    β”œβ”€β”€ house-senate-campaigns/    (Current campaigns)
    β”‚   └── 2024/
    β”‚       β”œβ”€β”€ webk24.zip
    β”‚       └── webl24.zip
    β”œβ”€β”€ committee-master/          (Committee master files)
    β”‚   └── 2024/cm24.zip
    β”œβ”€β”€ pac-summary/               (PAC summary files)
    β”‚   └── 2024/pas224.zip
    β”œβ”€β”€ contributions-by-individuals/ (Individual contributions)
    β”‚   └── 2024/indiv24.zip
    β”œβ”€β”€ candidate-committee-linkages/ (Linkages)
    β”‚   └── 2024/ccl24.zip
    β”œβ”€β”€ committee-to-committee/    (Committee transactions)
    β”‚   └── 2024/oth24.zip
    β”œβ”€β”€ operating-expenditures/    (Operating expenses)
    β”‚   └── 2024/oppexp24.zip
    β”œβ”€β”€ summary-reports/           (Summary CSVs)
    β”‚   └── 2024/
    β”‚       β”œβ”€β”€ candidate_summary_2024.csv
    β”‚       β”œβ”€β”€ independent_expenditure_2024.csv
    β”‚       └── ...
    β”œβ”€β”€ headers/                   (Data dictionaries)
    β”‚   β”œβ”€β”€ cm_header_file.csv
    β”‚   └── ...
    └── special-files/             (Lobbyist data, etc.)
        β”œβ”€β”€ lobbyist.csv
        └── ...

unzip_fec_data.py (High-Performance Edition)

Unzip all FEC bulk data files with parallel processing and 7-Zip support for maximum speed.

Performance Modes:

  • Parallel Processing: 4-8x faster with --workers 8
  • 7-Zip Extraction: 2-3x faster than Python zipfile
  • Combined: 10-15x faster with --method 7z --workers 8

Features:

  • Multiple extraction methods (Python zipfile, 7-Zip, auto-detect)
  • Parallel processing with configurable worker count
  • Maintains same folder hierarchy as source
  • Resume support (skip already unzipped files)
  • Progress tracking and logging
  • Optional: Remove ZIP files after extraction
  • Filter by category or year

Usage:

# RECOMMENDED: Unzip latest 2 years only with 8 workers (FAST & QUICK)
python unzip_fec_data.py --latest 2 --workers 8 --base-dir /mnt/d/fec_data

# FASTEST: Use 7-Zip with 8 parallel workers (10-15x faster, all years)
python unzip_fec_data.py --method 7z --workers 8 --base-dir /mnt/d/fec_data

# Fast: Use parallel workers only (4-8x faster)
python unzip_fec_data.py --workers 8 --base-dir /mnt/d/fec_data

# Moderate: Use 7-Zip single-threaded (2-3x faster)
python unzip_fec_data.py --method 7z --base-dir /mnt/d/fec_data

# Default: Python zipfile single-threaded (portable but slow)
python unzip_fec_data.py --base-dir /mnt/d/fec_data

# Auto-detect best method and optimal workers
python unzip_fec_data.py --method auto --workers 0 --base-dir /mnt/d/fec_data

# Unzip specific category with parallel workers
python unzip_fec_data.py --category candidate-master --workers 4

# Unzip specific years with parallel workers
python unzip_fec_data.py --years 2020,2022,2024 --workers 4
    
# Unzip latest 5 years only (auto-detects 2020-2024)
python unzip_fec_data.py --latest 5 --workers 8

# Resume interrupted extraction
python unzip_fec_data.py --resume --workers 8

# Dry run (show what would be unzipped)
python unzip_fec_data.py --dry-run

# Remove ZIP files after successful extraction (saves 50% disk space)
python unzip_fec_data.py --remove-zips --workers 8

Installation for 7-Zip (optional but recommended):

# Ubuntu/Debian
sudo apt-get install p7zip-full

# macOS
brew install p7zip

# Verify installation
7z --help

Output Structure:

/mnt/d/fec_data/
β”œβ”€β”€ bulk-downloads/          # Original ZIP files (source)
β”‚   β”œβ”€β”€ candidate-master/
β”‚   β”‚   β”œβ”€β”€ 1980/cn80.zip
β”‚   β”‚   └── 2024/cn24.zip
β”‚   └── ...
└── unzipped/                # Unzipped CSV/TXT files (destination)
    β”œβ”€β”€ candidate-master/
    β”‚   β”œβ”€β”€ 1980/
    β”‚   β”‚   β”œβ”€β”€ cn80/
    β”‚   β”‚   β”‚   β”œβ”€β”€ cn.txt
    β”‚   β”‚   β”‚   β”œβ”€β”€ cn_header_file.csv
    β”‚   β”‚   β”‚   └── ...
    β”‚   └── 2024/
    β”‚       └── cn24/
    β”‚           β”œβ”€β”€ cn.txt
    β”‚           └── ...
    β”œβ”€β”€ contributions-by-individuals/
    β”‚   └── 2024/
    β”‚       └── indiv24/
    β”‚           β”œβ”€β”€ indiv.txt
    β”‚           β”œβ”€β”€ indiv_header_file.csv
    β”‚           └── ...
    └── ...

Workflow:

  1. Download FEC bulk data: python bulk_download_fec.py --base-dir /mnt/d/fec_data
  2. QUICK START - Unzip latest 2 years only: python unzip_fec_data.py --latest 2 --workers 8 --base-dir /mnt/d/fec_data
    • OR FULL - Unzip all files (FAST): python unzip_fec_data.py --method 7z --workers 8 --base-dir /mnt/d/fec_data
  3. (Optional) Remove ZIPs to save space: Add --remove-zips flag to step 2

Performance Comparison:

Method Workers Speed Time (100 files)
Python zipfile 1 1x ~100 min
Python zipfile 8 4-6x ~15-20 min
7-Zip 1 2-3x ~30-40 min
7-Zip 8 10-15x ~7-10 min ⚑

Recommended Settings:

  • Maximum speed: --method 7z --workers 8 (requires 7z installed)
  • Good balance: --workers 4 (no additional software needed)
  • Portable: Default (works everywhere, no setup)

fec_integration.py

Integrate FEC API data for real-time queries.

Key Datasets

Candidates

  • Federal candidates (President, Senate, House)
  • Party affiliation
  • Election years

Committees

  • Campaign committees
  • PACs, Super PACs
  • Party committees

Financial Data

  • Individual contributions
  • Committee expenditures
  • Independent expenditures
  • Disbursements

Usage Examples

# Download candidate data
python fec_integration.py --data-type candidates --state MA

# Download committee data
python fec_integration.py --data-type committees --cycle 2024

API Key

Requires FEC API key. Set environment variable:

export FEC_API_KEY=your_key_here

Register at: https://api.data.gov/signup/

Data License

FEC data is public domain.