# Congress Public Records Slice Methodology
## What This Release Is
This bundle is a neutral, review-oriented slice of public-record linkages built from a fixed House-wide run of the dataset.
It is designed for exploration and bounded verification, not for assigning guilt, wrongdoing, intent, or causality.
## What This Shows
Public records can be normalized into a reproducible graph of House trades, committees, bills, votes, campaign finance, lobbying visibility, and community project funding.
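As a rough illustration of what "normalized into a reproducible graph" can mean, the sketch below turns raw link tuples into a de-duplicated, sorted edge list, so the same inputs always give the same output regardless of input order. The `Edge` type, the `build_graph` helper, and the identifiers such as `member:M001` are hypothetical and are not taken from this release.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Edge:
    """One typed link between two public-record entities (hypothetical schema)."""
    source: str    # e.g. "member:M001"
    relation: str  # e.g. "sits_on"
    target: str    # e.g. "committee:C01"


def build_graph(records):
    """Return a de-duplicated edge list sorted by (source, relation, target)."""
    return sorted({Edge(*record) for record in records})


# Made-up example rows; the duplicate trade collapses into a single edge.
records = [
    ("member:M001", "traded", "asset:XYZ"),
    ("member:M001", "sits_on", "committee:C01"),
    ("member:M001", "traded", "asset:XYZ"),
    ("committee:C01", "referred", "bill:B100"),
]
edges = build_graph(records)
```

Sorting a set of frozen dataclasses is one simple way to make the output independent of input order; the actual pipeline may achieve reproducibility differently.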
## What This Does Not Prove
This sample does not prove illegality, corruption, intent, or causality. It only shows deterministic overlap, timing, and linkage strength from official public records.
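To make "timing" concrete, here is a minimal sketch of a deterministic date-gap check between a disclosed trade and a later public event such as a vote. The function names and the 30-day default window are assumptions chosen for illustration, not the thresholds used in this release, and a small gap is only a timing fact about two records.

```python
from datetime import date


def day_gap(trade_date, event_date):
    """Signed days from the trade to the event (positive means the event came later)."""
    return (event_date - trade_date).days


def within_window(trade_date, event_date, window_days=30):
    """True when the two dates fall within window_days of each other (hypothetical window)."""
    return abs(day_gap(trade_date, event_date)) <= window_days


# Made-up dates for illustration only.
gap = day_gap(date(2024, 1, 10), date(2024, 1, 25))
```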
## Source Groups
- House Clerk financial disclosures and Periodic Transaction Reports (PTRs)
- House Clerk member directory and committee list
- GovInfo BILLSTATUS bulk data
- House Clerk roll-call vote XML
- Federal Election Commission (FEC) public bulk downloads
- Lobbying Disclosure Act (LDA) public search pages
- House member community project funding disclosure pages
## Public Release Notes
- This release is a slice of public-record data, not a complete accounting of all potentially relevant data.
- Future releases may update or expand this slice as source recovery, parsing, and evidence linkage improve.
- This release does not assign guilt, wrongdoing, intent, or causality to any person or organization.
- The release shows public-record overlaps, timing, and linkage strength, not proof of illegality or corruption.
- Some rows remain review-tier or include unresolved official source references and should be read with those labels in mind.
- The public package includes verification summaries and SHA-backed artifact indexes, but it does not include the full internal raw corpus, so external verification is bounded by what is published here.
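
A reader who wants to check published artifacts against a hash index could do something like the following. This is a sketch under stated assumptions: it assumes the digests are SHA-256 and that the index can be loaded as a mapping from relative path to hex digest. The release does not specify either detail here, so adapt the function to the actual index format.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_index(index, root):
    """Compare files under root against an index of {relative_path: hex_digest}.

    Returns a list of (relative_path, problem) tuples; an empty list means
    every listed file exists and matches. The index shape is an assumption.
    """
    problems = []
    for rel_path, expected in sorted(index.items()):
        target = Path(root) / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(target) != expected.lower():
            problems.append((rel_path, "mismatch"))
    return problems
```

A check like this confirms only that the published files match the published index. It cannot confirm anything about the internal raw corpus that is not part of the package.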