title: Federal FOIA Intelligence Search
emoji: 🏛️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: true
tags:
  - foia
  - government-transparency
  - public-records
  - journalism
  - open-data
  - legal-tech
  - oversight
license: mit
short_description: 'FOIA INTELLIGENCE SEARCH '

Federal FOIA Intelligence Search

Public Electronic Reading Rooms Only

This Hugging Face Space provides a federated search and analysis interface for publicly released U.S. Government records made available under the Freedom of Information Act (FOIA).

It is designed for journalists, researchers, legal professionals, oversight bodies, and the public.


🔍 What This Space Does

  • Searches only publicly accessible FOIA Electronic Reading Rooms
  • Uses official agency search endpoints or public landing pages
  • Enforces robots.txt, rate limiting, and safe defaults
  • Provides semantic search, clustering, and visualization
  • Generates court-ready citations and FOIA request packets

πŸ›οΈ Live Public Sources

This Space currently supports live querying or indexed access to the following public FOIA repositories:

  • CIA FOIA Electronic Reading Room
  • FBI Vault
  • NSA FOIA Library
  • Department of Defense FOIA Reading Room
  • National Reconnaissance Office (NRO) Declassified Releases
  • Department of Justice (DOJ) FOIA Library
  • Department of Homeland Security (DHS) FOIA Library
  • U.S. Department of State FOIA Reading Room

All access is:

  • Public
  • Unauthenticated
  • Non-privileged
  • Read-only

📂 Hosted Public Collections (Clearly Labeled)

Some historically named programs or collections (e.g. AATIP, Special Access Programs, Special Activities) do not operate independent FOIA portals.

When these appear in the interface, they are:

  • Explicitly labeled as Hosted Public Releases
  • Linked to the original publishing agency
  • Limited to already-declassified, publicly released documents
  • Included for historical research and transparency

No restricted or classified systems are accessed.


📊 Analytics & Visualization Features

  • Real-time agency coverage heatmap (documents per agency)
  • Latency & health indicators per source
  • Interactive semantic cluster graph (Plotly)
  • Timeline views for document release dates
  • Result deduplication and clustering
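
Result deduplication can be illustrated with a content fingerprint. The sketch below assumes each result is a dict with a `text` field and treats two results as duplicates when their text matches after whitespace and case normalization. Both the field name and the matching rule are assumptions; the Space may use a different criterion.

```python
import hashlib
import re


def fingerprint(text):
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(results):
    """Keep the first result for each distinct fingerprint, preserving order."""
    seen = set()
    unique = []
    for result in results:
        key = fingerprint(result["text"])  # "text" is a hypothetical field name
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique
```

Exact-match hashing only catches copies that differ in spacing or capitalization. Near-duplicates, such as two scans of the same memo, would need a fuzzier comparison.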

🧠 Semantic Search

  • Uses sentence-transformers + FAISS
  • Supports:
    • Semantic clustering
    • Search-within-results
    • Topic grouping
  • No model training on classified or private data
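
The ranking step behind semantic search and search-within-results can be shown without either library. The sketch below is a pure-Python stand-in: the vectors are placeholders supplied by the caller, whereas in the Space they would come from a sentence-transformers model and the lookup would go through a FAISS index. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, doc_vecs, top_k=3):
    """Return (doc_id, score) pairs sorted by descending cosine similarity."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def search_within(query_vec, doc_vecs, previous_ids, top_k=3):
    """Re-rank only the documents returned by an earlier search."""
    subset = {doc_id: doc_vecs[doc_id] for doc_id in previous_ids}
    return rank(query_vec, subset, top_k)
```

Search-within-results is then just a second ranking pass restricted to the IDs from the first pass, which is why it needs no additional index.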

🧾 Legal & Journalistic Tools

  • Court-ready Bluebook citation PDF export
  • Journalist ZIP export (documents + index)
  • FOIA request packet generator (PDF / text)
  • FOIA exemption (b-code) labeling (heuristic, informational only)
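
Heuristic b-code labeling could be as simple as a pattern match on exemption markings in the released text. The sketch below is an assumption about how such a heuristic might look, not the Space's actual logic. The descriptions are informal paraphrases of the nine exemptions in 5 U.S.C. § 552(b), and the function name is hypothetical.

```python
import re

# Informal one-line paraphrases of the exemptions in 5 U.S.C. § 552(b).
EXEMPTIONS = {
    "1": "classified national defense or foreign policy information",
    "2": "internal personnel rules and practices",
    "3": "information exempted by another statute",
    "4": "trade secrets and confidential commercial information",
    "5": "privileged inter- or intra-agency communications",
    "6": "personal privacy",
    "7": "law enforcement records",
    "8": "financial institution supervision",
    "9": "geological information on wells",
}

# Matches markings such as "(b)(1)", "(b) (6)" or "(b)(7)(C)".
B_CODE = re.compile(r"\(b\)\s*\(([1-9])\)(?:\s*\(([A-Fa-f])\))?")


def label_b_codes(text):
    """Return sorted, de-duplicated (code, description) pairs found in text."""
    found = {}
    for match in B_CODE.finditer(text):
        number, sub = match.group(1), match.group(2)
        code = f"(b)({number})" + (f"({sub.upper()})" if sub else "")
        found[code] = EXEMPTIONS[number]
    return sorted(found.items())
```

A pattern match only reports markings that are legible in the extracted text. It says nothing about whether an exemption was properly applied, which is consistent with the "informational only" caveat above.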

⚠️ This Space only generates FOIA request packets.
It does not submit requests on behalf of users; filing is left to the user.
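
To make the generate-only behavior concrete, here is a minimal sketch of a plain-text request builder. The template wording, field names, and function name are all hypothetical, and a real request would normally carry more detail than this. The point is that the output is just text handed back to the user, with no network call involved.

```python
from string import Template

# Hypothetical template; the wording is illustrative only.
REQUEST_TEMPLATE = Template(
    "To: $agency FOIA Office\n"
    "From: $requester\n"
    "\n"
    "Under the Freedom of Information Act, 5 U.S.C. § 552, I request copies of "
    "the following records:\n"
    "\n"
    "$description\n"
)


def build_request_text(agency, requester, description):
    """Render a plain-text request for the user to review and file themselves."""
    return REQUEST_TEMPLATE.substitute(
        agency=agency, requester=requester, description=description.strip()
    )
```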


⚖️ What This Space Does NOT Do

❌ No classified access
❌ No scraping behind authentication
❌ No bypassing agency safeguards
❌ No intelligence analysis or inference
❌ No automated FOIA submissions
❌ No user tracking beyond standard HF analytics


📜 Legal Basis

All content is accessed under or through:

  • 5 U.S.C. § 552 (Freedom of Information Act)
  • Agency-maintained Electronic Reading Rooms
  • Public U.S. Government websites

This Space operates entirely within:

  • U.S. law
  • Agency publication rules
  • Hugging Face platform policies

🛡️ Safety & Governance

  • Robots.txt enforcement per adapter
  • Per-agency rate limits
  • Per-agency kill switches
  • Health monitoring & auto-disable
  • Adapter compliance tests (CI)
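
The kill-switch and auto-disable items above might be modeled as a small per-adapter state object. This is a sketch under stated assumptions: the `AdapterHealth` class, the threshold of three consecutive failures, and the rule that a reset does not undo a manual kill are all illustrative choices, not documented behavior of the Space.

```python
class AdapterHealth:
    """Tracks one agency adapter and disables it after repeated failures."""

    def __init__(self, name, max_consecutive_failures=3):
        self.name = name
        self.max_consecutive_failures = max_consecutive_failures  # assumed threshold
        self.consecutive_failures = 0
        self.killed = False         # manual kill switch
        self.auto_disabled = False  # set by health monitoring

    @property
    def enabled(self):
        """An adapter may be queried only if neither flag is set."""
        return not (self.killed or self.auto_disabled)

    def record_success(self):
        self.consecutive_failures = 0

    def record_failure(self):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_consecutive_failures:
            self.auto_disabled = True

    def kill(self):
        self.killed = True

    def reset(self):
        """Operator action: clear the auto-disable flag, but not a manual kill."""
        self.consecutive_failures = 0
        self.auto_disabled = False
```

Keeping the manual and automatic flags separate means an operator's decision to switch a source off cannot be silently reversed by a health check that later passes.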

🎯 Intended Use

This project is intended to support:

  • Investigative journalism
  • Academic and historical research
  • Legal review and litigation support
  • Government oversight
  • Public transparency initiatives

📬 Disclaimer

This tool aggregates and analyzes public information only.
It does not provide legal advice, intelligence assessments, or official interpretations.

Users are responsible for verifying primary sources.


📦 Open Source & Transparency

All adapters, logic, and safety mechanisms are visible in the source code. No hidden data sources or privileged access exist.


Federal FOIA Intelligence Search
Making public records easier to find, understand, and cite.