---
title: "HyperView: Jaguar Embedding Geometry Comparison"
emoji: 🐆
colorFrom: green
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
---
# HyperView Jaguar Core Claims Demo
This Space compares the top core-claims-set families in three geometric panels:
1. Euclidean view: `triplet:T0:msv3` (seed 43)
2. Hyperspherical view: `arcface:O0:msv3` (seed 44)
3. Hyperbolic (Poincaré) view: `lorentz:O1:msv3` (seed 44)
The app loads train + validation-tagged samples from a resized Hugging Face dataset and injects precomputed embedding assets generated offline on GPU.
## Contracts
Runtime environment variables:
- `HF_DATASET_REPO` (default: `hyper3labs/jaguar-hyperview-demo`)
- `HF_DATASET_CONFIG` (default: `default`)
- `HF_DATASET_SPLIT` (default: `train`)
- `EMBEDDING_ASSET_DIR` (default: `./assets`)
- `EMBEDDING_ASSET_MANIFEST` (default: `${EMBEDDING_ASSET_DIR}/manifest.json`)
- `HYPERVIEW_STARTUP_MODE` (default: `serve_fast`; choices: `serve_fast|blocking`)
- `HYPERVIEW_WARMUP_STATUS_PATH` (default: `/tmp/hyperview_warmup_status.json`)
- `HYPERVIEW_WARMUP_FAILURE_POLICY` (default: `exit`; choices: `exit|warn`)
- `HYPERVIEW_BATCH_INSERT_SIZE` (default: `500`; controls sample-batch insertion chunk size)
- `HYPERVIEW_DEFAULT_PANEL` (default: `spherical3d`; enables Sphere 3D as initial scatter panel)
- `HYPERVIEW_LAYOUT_CACHE_VERSION` (default: `v6`; bumps dock layout localStorage key to invalidate stale cached panel state)
- `HYPERVIEW_BIND_HOST` (preferred bind host; optional)
- `SPACE_HOST` (compatibility input only; used as the bind host only when its value is a local address: `0.0.0.0`, `127.0.0.1`, `localhost`, `::`, or `::1`)
- `SPACE_PORT` (primary port source)
- `PORT` (fallback port source when `SPACE_PORT` is unset)
Port precedence: `SPACE_PORT` > `PORT` > `7860`.
On Hugging Face Spaces, `SPACE_HOST` may be injected as `<space-subdomain>.hf.space`. That domain must not be used as a local bind socket, so the runtime falls back to `0.0.0.0` unless `HYPERVIEW_BIND_HOST` is explicitly set.
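The host and port rules above can be sketched as a small resolver. This is a minimal illustration, not the Space's actual code: the function name `resolve_bind` is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```python
from typing import Mapping, Tuple

# Local addresses that SPACE_HOST may contribute as a bind host (from the list above).
LOCAL_BIND_HOSTS = {"0.0.0.0", "127.0.0.1", "localhost", "::", "::1"}
DEFAULT_PORT = 7860


def resolve_bind(env: Mapping[str, str]) -> Tuple[str, int]:
    """Hypothetical helper: derive (host, port) from environment variables."""
    # An explicit HYPERVIEW_BIND_HOST always wins.
    host = env.get("HYPERVIEW_BIND_HOST", "").strip()
    if not host:
        # SPACE_HOST is honored only when it is a local address; a public
        # "<space-subdomain>.hf.space" value falls back to 0.0.0.0.
        space_host = env.get("SPACE_HOST", "").strip()
        host = space_host if space_host in LOCAL_BIND_HOSTS else "0.0.0.0"
    # Port precedence: SPACE_PORT > PORT > 7860.
    # Assumption: an empty string is treated like an unset variable.
    raw_port = env.get("SPACE_PORT") or env.get("PORT") or str(DEFAULT_PORT)
    return host, int(raw_port)
```

Calling `resolve_bind(os.environ)` at startup would apply the same rules to the real environment; passing a plain dict keeps the logic easy to test.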
The runtime also patches HyperView's dock-layout cache key from the legacy `hyperview:dockview-layout:v5` to `hyperview:dockview-layout:${HYPERVIEW_LAYOUT_CACHE_VERSION}`, which forces browsers to discard stale panel layouts after UI or layout changes. For future migrations, increment `HYPERVIEW_LAYOUT_CACHE_VERSION` (for example, to `v7`); no code change is needed.
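How the runtime applies this patch is not described here. As a rough sketch, assuming it amounts to a plain string substitution over the frontend bundle text, the key derivation could look like the following; `layout_cache_key` and `patch_layout_key` are hypothetical names.

```python
from typing import Mapping

LEGACY_LAYOUT_KEY = "hyperview:dockview-layout:v5"
DEFAULT_LAYOUT_CACHE_VERSION = "v6"


def layout_cache_key(env: Mapping[str, str]) -> str:
    """Build the localStorage key for the configured cache version."""
    version = env.get("HYPERVIEW_LAYOUT_CACHE_VERSION") or DEFAULT_LAYOUT_CACHE_VERSION
    return f"hyperview:dockview-layout:{version}"


def patch_layout_key(bundle_text: str, env: Mapping[str, str]) -> str:
    """Assumed mechanism: replace every occurrence of the legacy key in the text."""
    return bundle_text.replace(LEGACY_LAYOUT_KEY, layout_cache_key(env))
```

Because the browser looks up a key it has never written, any layout cached under the old key is ignored.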
## Startup and Warmup Semantics
- `HYPERVIEW_STARTUP_MODE=serve_fast` (default):
  - Starts the HyperView server immediately.
  - Runs dataset warmup asynchronously in a background thread.
  - Warmup phases are persisted as JSON: `ingest -> spaces -> layouts -> ready`.
- `HYPERVIEW_STARTUP_MODE=blocking`:
  - Performs warmup synchronously before serving traffic.
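The difference between the two modes comes down to the order in which warmup and serving start. The sketch below shows that ordering only; `start`, `warmup`, and `serve` are hypothetical names standing in for whatever the runtime actually calls, and in a real deployment `serve` would block for the lifetime of the process.

```python
import threading
from typing import Callable, Optional


def start(mode: str, warmup: Callable[[], None], serve: Callable[[], None]) -> Optional[threading.Thread]:
    """Hypothetical dispatcher for HYPERVIEW_STARTUP_MODE."""
    if mode == "blocking":
        # Warm up first, then begin serving traffic.
        warmup()
        serve()
        return None
    if mode == "serve_fast":
        # Kick off warmup in a background thread, then serve right away.
        worker = threading.Thread(target=warmup, name="hyperview-warmup", daemon=True)
        worker.start()
        serve()
        return worker
    raise ValueError(f"unknown startup mode: {mode!r} (expected serve_fast or blocking)")
```

With `serve_fast`, requests can arrive before the data is loaded, which is why the warmup status file exists as a separate readiness signal.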
Warmup status JSON fields include:
- `status` (`starting|running|ready|failed`)
- `phase` (`boot|ingest|spaces|layouts|ready|failed`)
- `counts` (sample/space/layout counters and ingestion stats)
- `error` (exception payload when warmup fails)
- `timestamps` (`started_at`, `updated_at`, plus terminal timestamps)
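A writer for this status file might look like the sketch below. It is illustrative only: the function name `write_warmup_status` is hypothetical, timestamps are assumed to be epoch seconds, and the write-to-temp-then-rename step is a design choice made here so that readers never observe a half-written file.

```python
import json
import os
import tempfile
import time
from typing import Any, Dict, Optional

VALID_STATUSES = {"starting", "running", "ready", "failed"}
VALID_PHASES = {"boot", "ingest", "spaces", "layouts", "ready", "failed"}


def write_warmup_status(
    path: str,
    status: str,
    phase: str,
    counts: Optional[Dict[str, Any]] = None,
    error: Optional[Dict[str, Any]] = None,
    started_at: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: persist one warmup status snapshot as JSON."""
    if status not in VALID_STATUSES or phase not in VALID_PHASES:
        raise ValueError(f"invalid status/phase: {status!r}/{phase!r}")
    now = time.time()  # assumption: timestamps are stored as epoch seconds
    payload = {
        "status": status,
        "phase": phase,
        "counts": counts or {},
        "error": error,
        "timestamps": {
            "started_at": started_at if started_at is not None else now,
            "updated_at": now,
        },
    }
    # Write to a temp file in the same directory, then rename atomically.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, path)
    return payload
```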
Failure policy behavior:
- `HYPERVIEW_WARMUP_FAILURE_POLICY=exit` (default): process exits on warmup failure.
- `HYPERVIEW_WARMUP_FAILURE_POLICY=warn`: process stays up and records failure in warmup status JSON.
Healthcheck semantics:
- Container health (`/__hyperview__/health`) indicates server liveness only.
- Data readiness (dataset/spaces/layouts completed) is indicated by warmup status JSON (`status=ready`).
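Since liveness and readiness are separate signals, a client or probe that needs loaded data has to read the status file itself. A minimal sketch, assuming the file lives at the default `HYPERVIEW_WARMUP_STATUS_PATH`; the helper name `is_data_ready` is hypothetical:

```python
import json

DEFAULT_STATUS_PATH = "/tmp/hyperview_warmup_status.json"


def is_data_ready(status_path: str = DEFAULT_STATUS_PATH) -> bool:
    """Return True only when the warmup status JSON reports status=ready."""
    try:
        with open(status_path, encoding="utf-8") as handle:
            payload = json.load(handle)
    except (OSError, json.JSONDecodeError):
        # A missing or unreadable file means warmup has not reported yet.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ready"
```

A `failed` status also returns `False` here; callers that need to tell "still warming up" apart from "failed" should inspect the `status` and `error` fields directly.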
## Important Note
HyperView similarity search currently uses cosine distance in storage backends. The Lorentz panel in this Space is intended for embedding-space visualization and geometry-aware comparison rather than canonical Lorentz-distance retrieval scoring.
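To make the distinction concrete, the sketch below contrasts cosine distance with the geodesic distance on the Lorentz (hyperboloid) model. It assumes unit negative curvature and a simple lift of spatial coordinates onto the hyperboloid; the curvature and coordinate conventions of the `lorentz:O1:msv3` model, and which coordinates the storage backend compares, are not stated here. All function names are illustrative.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; depends only on the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def lift_to_hyperboloid(v: Sequence[float]) -> List[float]:
    """Map spatial coordinates v to (x0, v) with -x0^2 + |v|^2 = -1 (curvature -1)."""
    return [math.sqrt(1.0 + sum(x * x for x in v)), *v]


def lorentz_inner(a: Sequence[float], b: Sequence[float]) -> float:
    """Lorentzian inner product: -a0*b0 + sum_i ai*bi."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def lorentz_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Geodesic distance on the unit hyperboloid: arccosh(-<a, b>_L)."""
    # Clamp to 1.0 to guard against rounding pushing the argument below 1.
    return math.acosh(max(1.0, -lorentz_inner(a, b)))
```

For two points on the same ray from the origin, such as `[1.0, 0.0]` and `[3.0, 0.0]`, the cosine distance is zero while the Lorentz distance between their lifted counterparts is positive. Cosine-based retrieval therefore ignores the radial separation that hyperbolic geometry treats as meaningful.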
## Reproducibility Commands
Run from this folder (`HyperViewDemoHuggingFaceSpace/`).
### 1) Build embedding assets (GPU required)
```bash
source .venv/bin/activate
python3 scripts/build_hyperview_demo_assets.py \
  --model_manifest config/model_manifest.json \
  --dataset_root ../kaggle_jaguar_dataset_v2 \
  --coreset_csv ../data/validation_coreset.csv \
  --output_dir ./assets \
  --device cuda \
  --batch_size 64 \
  --num_workers 4
```
### 2) Publish resized demo dataset
```bash
source .venv/bin/activate
python3 scripts/publish_hyperview_demo_dataset.py \
  --dataset_root ../kaggle_jaguar_dataset_v2 \
  --coreset_csv ../data/validation_coreset.csv \
  --output_dir ./dataset_build \
  --repo_id hyper3labs/jaguar-hyperview-demo \
  --config_name default
```
Use `--no_push` for local dry-runs.
### 3) Local Docker smoke run
```bash
docker build -t jaguar-hyperview .
docker run --rm -p 7860:7860 \
  -e HF_DATASET_REPO=hyper3labs/jaguar-hyperview-demo \
  -e EMBEDDING_ASSET_DIR=/home/user/app/assets \
  jaguar-hyperview
```
Open `http://127.0.0.1:7860`.
### 4) Optional H100 batch export on HPI
```bash
sbatch remote_setup/build_hyperview_demo_assets_h100.slurm
```
Override defaults at submit time if needed:
```bash
MODEL_MANIFEST=config/model_manifest.json \
OUTPUT_DIR=./assets \
sbatch remote_setup/build_hyperview_demo_assets_h100.slurm
```
## Provenance
Model manifest: `config/model_manifest.json`
Ranking and source-of-truth anchors:
- `reports/summaries_of_findings/core_claims_axis12_paper_facing_tables_2026_03_16_102311/axis1_primary_ranking.csv`
- `paper_draft/second_draft/sources_of_truth.md`