# Schema Notes
## Top-Level Public Files
- `members.csv`: public member roster with basic identity and source-anchor fields.
- `scored_events.csv`: public event table with human-readable member names, score labels, reasons, counts, and source URL/SHA summaries.
- `graph_links.csv`: public relationship table, with human-readable member names, that feeds the network graph and relationship drilldown.
- `recipient_link_quality_report.json`, `source_quality_report.json`, `provenance_coverage_report.json`: public-safe report copies from the source slice.
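
A quick way to get oriented in the top-level files is to list the columns of each CSV and the top-level keys of each JSON report. The following is a minimal sketch using only the Python standard library. The helper names `csv_header` and `inspect_top_level` are hypothetical, and the sketch assumes UTF-8 files with a header row in each CSV; it reads whatever columns are present rather than assuming any particular column names.

```python
import csv
import json
from pathlib import Path

# File names taken from the list above.
TOP_LEVEL_CSVS = ["members.csv", "scored_events.csv", "graph_links.csv"]
TOP_LEVEL_REPORTS = [
    "recipient_link_quality_report.json",
    "source_quality_report.json",
    "provenance_coverage_report.json",
]


def csv_header(path):
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def inspect_top_level(bundle_dir):
    """Map each top-level file that exists to its columns (CSV) or sorted keys (JSON)."""
    root = Path(bundle_dir)
    summary = {}
    for name in TOP_LEVEL_CSVS:
        path = root / name
        if path.exists():
            summary[name] = csv_header(path)
    for name in TOP_LEVEL_REPORTS:
        path = root / name
        if path.exists():
            with open(path, encoding="utf-8") as handle:
                data = json.load(handle)
            # Assumption: reports are JSON objects; fall back to the type name otherwise.
            summary[name] = sorted(data) if isinstance(data, dict) else type(data).__name__
    return summary
```

Calling `inspect_top_level("dataset_bundle")` returns a dictionary keyed by file name, and files that are missing from the directory are simply skipped.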
## Network Graph Files
- `network_graph/nodes.csv`: member, recipient, and sector nodes only.
- `network_graph/edges.csv`: aggregated member-to-recipient and member-to-sector edges with member names, status labels, and support counts.
- `network_graph/graph_config.json`: counts and default filter config for the Space.
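
The node and edge files can be combined into an adjacency map, and it is worth checking that every edge endpoint has a matching node. The sketch below is illustrative only: the notes above do not list the column names, so `id_col`, `source_col`, and `target_col` are caller-supplied parameters, and the helper names are hypothetical.

```python
import csv
from collections import defaultdict


def read_rows(path):
    """Read a CSV file into a list of dictionaries keyed by its header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def build_adjacency(edge_rows, source_col, target_col):
    """Group edge targets by source node; column names are supplied by the caller."""
    adjacency = defaultdict(set)
    for row in edge_rows:
        adjacency[row[source_col]].add(row[target_col])
    return dict(adjacency)


def dangling_endpoints(node_rows, edge_rows, id_col, source_col, target_col):
    """Return edge endpoints that have no matching row in the node table, sorted."""
    known = {row[id_col] for row in node_rows}
    referenced = set()
    for row in edge_rows:
        referenced.add(row[source_col])
        referenced.add(row[target_col])
    return sorted(referenced - known)
```

After inspecting the real headers, pass the actual column names, for example `dangling_endpoints(read_rows("network_graph/nodes.csv"), read_rows("network_graph/edges.csv"), id_col, source_col, target_col)`. An empty result means every edge refers to a known node.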
## Audit Files
- `evidence_audit/source_artifact_index.csv`: source URLs, SHA-256 values, trust buckets, and provenance kinds.
- `evidence_audit/scored_event_index.csv`: one public summary row per scored event.
- `evidence_audit/scored_event_provenance.jsonl`: source URLs and SHA-backed artifacts for each event.
- `evidence_audit/claim_supporting_index.csv`: summary rows for claim-supporting records outside the final event table.
- `evidence_audit/claim_supporting_provenance.jsonl`: source URLs and SHA-backed artifacts for those claim-supporting rows.
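
Because the audit files pair source URLs with SHA-256 values, a reader can re-hash a retrieved artifact and compare it against the recorded digest. The sketch below shows one way to do that and to iterate over the `.jsonl` files. It assumes the recorded digests are lowercase or uppercase hex strings computed over the raw artifact bytes, which the notes above do not confirm; the helper names are hypothetical, and no record field names are assumed.

```python
import hashlib
import json


def sha256_hex(content):
    """Return the SHA-256 digest of a bytes object as a lowercase hex string."""
    return hashlib.sha256(content).hexdigest()


def matches_recorded_hash(content, recorded_sha256):
    """Compare freshly hashed bytes with a recorded digest, ignoring case and whitespace."""
    return sha256_hex(content) == recorded_sha256.strip().lower()


def iter_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

For example, `list(iter_jsonl("evidence_audit/scored_event_provenance.jsonl"))` loads every provenance record into memory. For large files, iterate over the generator directly instead.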

The public package omits internal paths, local raw-corpus references, database metadata, and control-plane operational artifacts.