---
title: EpsteinWithAnomScore
emoji: 👁
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit
---

# Epstein Corpus Explorer (Space + Dataset split)

This Space is a read-only browser for a large SQLite corpus plus optional signal cards.

- Space: `cjc0013/EpsteinWithAnomScore`
- Dataset: `cjc0013/EpsteinWithAnomScore`

## Links

- Space: https://huggingface.co/spaces/cjc0013/EpsteinWithAnomScore
- Dataset file (DB): https://huggingface.co/datasets/cjc0013/EpsteinWithAnomScore/blob/main/corpus.sqlite

## What this app does

- Opens `corpus.sqlite` in read-only mode
- FTS keyword search (`chunks_fts`)
- Cluster browsing across runs (`cluster_summary`)
- Open any `uid` and view local context window (`order_index +/- k`)
- Optional Signals tab for method-sanitized signal cards (JSONL/CSV), then open linked chunks

## Core principle

Raw data is not modified here. This app is for indexing, browsing, and narrowing search space. Signal/anomaly values are triage hints, not proof.

## How DB loading works

Priority order:

1. `CORPUS_SQLITE_PATH` (if set)
2. Local paths like `./data/corpus.sqlite`
3. Download from dataset repo using:
   - `DATASET_REPO_ID`
   - `DATASET_FILENAME` (default: `corpus.sqlite`)

Recommended Space variables:

- `DATASET_REPO_ID = cjc0013/EpsteinWithAnomScore`
- `DATASET_FILENAME = corpus.sqlite`
- `DB_LOCAL_DIR = ./data` (optional)

## Optional Signals file loading

If you publish a signals file in the dataset, the app can load it automatically.

Supported names:

- `public_method_sanitized_topN.jsonl`
- `public_top_signals.jsonl`
- CSV variants of the same names

Priority order:

1. `METHOD_SIGNALS_PATH` (if set)
2. Common local paths (`./data`, `./dataset`, `/data`)
3. Download from dataset repo with:
   - `METHOD_SIGNALS_DATASET_REPO_ID`
   - `METHOD_SIGNALS_FILENAME`

Recommended variables (if signals are in same dataset repo):

- `METHOD_SIGNALS_DATASET_REPO_ID = cjc0013/EpsteinWithAnomScore`
- `METHOD_SIGNALS_FILENAME = public_method_sanitized_topN.jsonl`

```txt
gradio>=4.0.0
huggingface_hub>=0.20.0
```
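
The DB loading priority order described under "How DB loading works" could be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function names `resolve_corpus_path` and `_download_from_dataset` are hypothetical, and the choice to fall through when `CORPUS_SQLITE_PATH` points at a missing file is an assumption. The download step uses `hf_hub_download` from `huggingface_hub` with `repo_type="dataset"`.

```python
import os
from pathlib import Path


def _download_from_dataset(repo_id: str, filename: str, local_dir: str) -> str:
    """Fetch one file from a Hugging Face dataset repo and return its local path."""
    # Imported lazily so the local-path branches work without network access.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        repo_type="dataset",
        local_dir=local_dir,
    )


def resolve_corpus_path(
    env=None,
    local_candidates=("./data/corpus.sqlite",),
    download=_download_from_dataset,
) -> Path:
    """Hypothetical helper: resolve the corpus DB using the documented priority order."""
    env = os.environ if env is None else env

    # 1. Explicit override via CORPUS_SQLITE_PATH.
    #    Assumption: a path that does not exist falls through to the next step.
    explicit = env.get("CORPUS_SQLITE_PATH")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # 2. Local paths such as ./data/corpus.sqlite.
    for candidate in local_candidates:
        if Path(candidate).is_file():
            return Path(candidate)

    # 3. Download from the dataset repo named by DATASET_REPO_ID.
    repo_id = env.get("DATASET_REPO_ID")
    if not repo_id:
        raise FileNotFoundError("No local corpus found and DATASET_REPO_ID is not set")
    filename = env.get("DATASET_FILENAME", "corpus.sqlite")
    local_dir = env.get("DB_LOCAL_DIR", "./data")
    return Path(download(repo_id, filename, local_dir))
```

Passing `env` and `download` as parameters is only a convenience for testing the order of the three steps without touching the network.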
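
The read-only open, FTS keyword search, and `order_index +/- k` context window listed under "What this app does" might look something like the sketch below. The schema is an assumption: the README names `chunks_fts`, `uid`, and `order_index`, but the `chunks` table, its `text` column, and a `uid` column on `chunks_fts` are hypothetical, as are the function names. The real corpus may also scope `order_index` per document, which this sketch ignores.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open the SQLite file with a URI so that writes are rejected (mode=ro)."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def search_chunks(conn, query, limit=20):
    """Return uids of chunks matching an FTS query.

    Assumption: chunks_fts exposes a uid column alongside the indexed text.
    """
    rows = conn.execute(
        "SELECT uid FROM chunks_fts WHERE chunks_fts MATCH ? LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]


def context_window(conn, uid, k=2):
    """Return the chunk with this uid plus up to k neighbours on each side.

    Assumption: a hypothetical chunks(uid, order_index, text) table with a
    single global order_index sequence.
    """
    row = conn.execute(
        "SELECT order_index FROM chunks WHERE uid = ?", (uid,)
    ).fetchone()
    if row is None:
        return []
    center = row[0]
    return conn.execute(
        "SELECT uid, order_index, text FROM chunks "
        "WHERE order_index BETWEEN ? AND ? ORDER BY order_index",
        (center - k, center + k),
    ).fetchall()
```

Opening with `mode=ro` means an accidental `INSERT` or `UPDATE` raises `sqlite3.OperationalError` instead of altering the corpus, which fits the principle that raw data is not modified here.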
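
For the optional signals file, reading either supported format (JSONL or CSV) can be sketched as below. The function name `load_signal_cards` is hypothetical and the card fields are not specified in this README, so the sketch returns each row as a plain dict without interpreting it. Note that CSV rows come back with string values, while JSONL rows keep their JSON types.

```python
import csv
import json
from pathlib import Path


def load_signal_cards(path):
    """Hypothetical helper: load signal cards from a .jsonl or .csv file as dicts."""
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".jsonl":
        # One JSON object per line; blank lines are skipped.
        with path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]

    if suffix == ".csv":
        # The header row supplies the dict keys; all values are strings.
        with path.open(newline="", encoding="utf-8") as fh:
            return list(csv.DictReader(fh))

    raise ValueError(f"Unsupported signals file type: {path.name}")
```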