# Backend — Django REST API

Stateless REST API serving the PhD research models. There is no database: PyTorch checkpoints are downloaded or verified at startup, loaded into memory lazily on the first inference request, and used to answer each request independently. Three endpoint groups expose [COINs](../../docs/glossary.md#coins) link prediction and query answering, [MultiProxAn](../../docs/glossary.md#multiproxan) graph generation, and KG anomaly correction. A single global `threading.Lock` serializes inference because the deployed container has no GPU and shares 2 vCPU.

For deeper material:

- [`docs/reference/api.md`](../../docs/reference/api.md) — every endpoint, request and response shape.
- [`docs/reference/sse-protocol.md`](../../docs/reference/sse-protocol.md) — wire format for streaming inference.
- [`docs/reference/backend-services.md`](../../docs/reference/backend-services.md) — module-by-module reference.
- [`docs/explanation/architecture.md`](../../docs/explanation/architecture.md) — how the backend, the SPA and HF Hub fit together.
- [`docs/explanation/inference-lifecycle.md`](../../docs/explanation/inference-lifecycle.md) — boot sequence, lazy weight loading, the inference lock.
- [`docs/glossary.md`](../../docs/glossary.md) — domain vocabulary.

This README covers the practical surface: running the backend, where things live, env vars, the endpoint table, and the streaming protocol summary.

## Prerequisites

1. **Mamba environment** mirroring the deployment image. The repo-root `environment.yml`
   captures the conda half (Python 3.9, `rdkit=2023.03.2`, `boost=1.78`, cairo, etc.):
   ```bash
   mamba env create -n website_c -f ../../environment.yml
   mamba activate website_c
   ```

2. **Pip dependencies** (GPU torch, Django, DRF, …):
   ```bash
   pip install --extra-index-url https://download.pytorch.org/whl/cu118 -r requirements.txt
   ```

3. **Model checkpoints** — downloaded automatically from the Hugging Face Hub model repo
   `Bani57/checkpoints` on first boot. The remote layout mirrors the on-disk one, so
   `huggingface_hub.snapshot_download(local_dir=CHECKPOINTS_ROOT)` drops files directly
   into the expected paths:
   - `src/research/COINs-KGGeneration/graph_completion/checkpoints/` (COINs: `{dataset}_{algorithm}.tar`)
   - `src/research/COINs-KGGeneration/graph_completion/results/{dataset}/` (KBGAT TransE init: `transe_model.tar`)
   - `src/research/COINs-KGGeneration/graph_generation/checkpoints/` (KG anomaly: `{dataset}.ckpt`, `{dataset}_correct.ckpt`)
   - `src/research/MultiProxAn/checkpoints/` (graph generation: `{dataset}.ckpt`, `{dataset}_c.ckpt`)

   To (re-)publish the checkpoints to the Hub from a local copy:
   ```bash
   huggingface-cli login   # one-time
   python ../../scripts/upload_checkpoints.py --create
   ```

4. **Dataset files** — the raw KG data files must be present under `src/research/COINs-KGGeneration/data/` (FB15k-237, WN18RR, NELL-995).

## Running

From `src/backend/`:

```bash
# Development server
python manage.py runserver 8000

# With custom settings
DJANGO_DEBUG=True DJANGO_SECRET_KEY=my-secret python manage.py runserver
```

The API is served at `http://localhost:8000/api/v1/`.

## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `DJANGO_SECRET_KEY` | `dev-insecure-key-change-in-production` | Django secret key. **Set in production.** |
| `DJANGO_DEBUG` | `True` | Enable debug mode. Set to `False` in production. |
| `DJANGO_ALLOWED_HOSTS` | `localhost,127.0.0.1` | Comma-separated allowed hosts. |
| `CORS_ALLOWED_ORIGINS` | `https://bani57-website.hf.space` | Comma-separated allowed CORS origins. |
| `TORCH_DEVICE` | Auto (`cuda:0` if available, else `cpu`) | PyTorch device for model inference. |
| `RESEARCH_ROOT` | `<repo>/src/research` (dev), `/app/research` (image) | Where the research-code modules live. |
| `CHECKPOINTS_ROOT` | Same as `RESEARCH_ROOT` | Where `huggingface_hub` deposits weights. Override to e.g. `/data/checkpoints` on a paid HF Space with persistent storage. |
| `HF_CHECKPOINTS_REPO` | `Bani57/checkpoints` | HF Hub model repo holding all weights. |
| `HF_TOKEN` | unset | Recommended. Read-scope token lifts anonymous rate limits and roughly triples cold-start download throughput. Required if the repo is private. Empty values are unset by `entrypoint.sh` to avoid a malformed `Bearer ` header. |
| `HF_HUB_ENABLE_HF_TRANSFER` | `1` (image), unset (dev) | Enables the Rust-accelerated `hf_transfer` backend for `snapshot_download`. |
| `SPA_DIST_DIR` | `<backend>/dist` | Folder containing `index.html` from `npm run build`. WhiteNoise serves assets from here. |
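
As a rough illustration of how the boolean and comma-separated variables above could be read, here is a minimal sketch. The helpers `env_bool` and `env_list` are hypothetical and are not taken from `settings.py`; only the variable names and defaults come from the table.

```python
import os


def env_bool(name, default):
    """Interpret common truthy strings; fall back to default when unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_list(name, default):
    """Split a comma-separated variable, dropping blanks and whitespace."""
    raw = os.environ.get(name, default)
    return [item.strip() for item in raw.split(",") if item.strip()]


# Defaults mirror the table; the parsing rules themselves are an assumption.
DEBUG = env_bool("DJANGO_DEBUG", True)
ALLOWED_HOSTS = env_list("DJANGO_ALLOWED_HOSTS", "localhost,127.0.0.1")
CORS_ALLOWED_ORIGINS = env_list(
    "CORS_ALLOWED_ORIGINS", "https://bani57-website.hf.space"
)
```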

## Startup Sequence

In the deployment container the entrypoint script pre-warms the checkpoint download
from the Hugging Face Hub *before* gunicorn starts, so workers never block on the
network. Then on Django boot (`ApiConfig.ready()`), the `ModelRegistry` initializes:

1. **Verify / download checkpoints** from `Bani57/checkpoints` on HF Hub if any expected
   subdir is missing. Idempotent — a no-op when the entrypoint already populated the tree
   or when running locally with weights on disk.
2. **Scan checkpoint directories** to detect which models are available for each method.
3. **Load lightweight COINs Loaders**, one per dataset (freebase, wordnet, nell), each loading graph data, name maps, and train/val/test splits. Heavy arrays (node neighbours at ~275 MB each, community neighbours, adjacency dicts) are freed after initialization to keep memory low.
4. **Generate sample subgraphs** for KG anomaly using the COINs Loaders.

All model weights (COINs inference, graph generation, KG anomaly) are loaded lazily at first inference request.
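
Lazy loading of this kind is commonly implemented as a guarded cache. The sketch below is illustrative only: `get_model`, `_models`, and the `loader` callable are hypothetical names, and the real `ModelRegistry` may be organised differently.

```python
import threading

_models = {}                      # cache: key -> loaded model object
_models_guard = threading.Lock()  # protects the cache, not inference itself


def get_model(key, loader):
    """Return the cached model for key, calling loader() only on first use."""
    with _models_guard:
        if key not in _models:
            # First request for this key pays the checkpoint load cost.
            _models[key] = loader()
        return _models[key]
```

Under this scheme a call such as `get_model(("coins", "freebase"), load_fn)` would be slow once and cheap afterwards, which matches the cold first request described above.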

## Deployment

The site is packaged as a single Docker image and deployed to a Hugging Face Space
(`Bani57/website` -> <https://bani57-website.hf.space>). The image:

- builds the Vue SPA with `npm run build` in a Node 20 stage,
- assembles a `mambaorg/micromamba` runtime mirroring the local `website_c` env from
  `environment.yml` + `requirements.txt` (GPU torch wheels, `cu118`),
- copies the SPA `dist/` next to Django so WhiteNoise serves it on the same origin as
  `/api/v1/`,
- runs `entrypoint.sh`, which `snapshot_download`s checkpoints from
  `Bani57/checkpoints` on HF Hub into `/app/checkpoints` and execs `gunicorn` on `0.0.0.0:7860`.

Local reproduction:
```bash
docker compose up --build
# -> http://localhost:7860
```

Push to the Space (the `git remote add` step is one-time setup):
```bash
git remote add hf https://huggingface.co/spaces/Bani57/website
git push hf master:main
```

## API Endpoints

All endpoints are prefixed with `/api/v1/`.

### Health & Discovery

| Method | Path | Description |
|---|---|---|
| `GET` | `/health` | Service health + model availability + inference lock status |
| `GET` | `/methods` | List the 3 research methods |
| `POST` | `/debug/force-unlock` | Release stuck inference lock (debug mode only) |

### COINs — KG Reasoning

| Method | Path | Description |
|---|---|---|
| `GET` | `/coins/datasets` | List datasets with entity/relation counts |
| `GET` | `/coins/datasets/{id}/entities` | Paginated entity search (`?q=&page=&page_size=`) |
| `GET` | `/coins/datasets/{id}/relations` | Paginated relation search (`?q=&page=&page_size=`) |
| `GET` | `/coins/datasets/{id}/sample-triples` | Random training triples (`?count=10&seed=...`); optional `seed` makes sampling deterministic (same `seed+count` ⇒ same triples, e.g. seed by ISO date for a day-stable widget). Head/relation/tail each carry a dataset-cleaned `label` alongside `id`, `name` |
| `GET` | `/coins/datasets/{id}/sample-query` | Sample a structurally valid KG query (`?query_structure=2i&count=1&seed=...`). Walks the training graph to produce real paths/intersections. Returns `{anchors, relations, target}` keyed by node/edge IDs from `/coins/query-structures`. Preferred over `sample-triples` for multi-hop/intersection prefills |
| `GET` | `/coins/models` | Available algorithms + supported query structures |
| `GET` | `/coins/query-structures` | Query graph templates for frontend rendering |
| `POST` | `/coins/predict` | Run link prediction / query answering |
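
The deterministic `seed` behaviour of `sample-triples` can be illustrated with a seeded `random.Random`. This is a sketch of the idea rather than the backend's sampler; `sample_triples` and its arguments are hypothetical.

```python
import random


def sample_triples(triples, count=10, seed=None):
    """Pick up to count triples; a fixed seed yields a fixed selection."""
    # A private generator keeps sampling independent of global random state.
    # seed=None falls back to system entropy, so results then vary per call.
    rng = random.Random(seed)
    return rng.sample(triples, min(count, len(triples)))
```

Seeding with an ISO date string such as `"2024-01-01"` would give the same selection for every call made with that date and `count`, which is the day-stable widget use case mentioned in the table.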

### Graph Generation — MultiProxAn

| Method | Path | Description |
|---|---|---|
| `GET` | `/graph-generation/datasets` | List graph types with node/edge types |
| `GET` | `/graph-generation/sampling-modes` | Sampling strategies with parameter specs |
| `POST` | `/graph-generation/generate` | **Streaming SSE.** Generate a graph (standard denoising or MultiProx Gibbs init) |
| `POST` | `/graph-generation/continue` | **Streaming SSE.** Advance a MultiProx Gibbs session by one step |

### KG Anomaly Correction

| Method | Path | Description |
|---|---|---|
| `GET` | `/kg-anomaly/datasets` | List datasets with correction models |
| `GET` | `/kg-anomaly/datasets/{id}/sample-subgraphs` | Pre-computed example subgraphs (`?count=5&noise_level=0.4&task=correct&seed=42`); noise is task-aware |
| `POST` | `/kg-anomaly/correct` | **Streaming SSE.** Correct/regenerate a KG subgraph (standard denoising or MultiProx Gibbs init) |
| `POST` | `/kg-anomaly/continue` | **Streaming SSE.** Advance a MultiProx correction session by one step |

## Streaming Inference Protocol (SSE)

The streaming endpoints (`/graph-generation/generate`, `/graph-generation/continue`, `/kg-anomaly/correct`, `/kg-anomaly/continue`) return **Server-Sent Events** (`text/event-stream`). Three event types are emitted:

**`event: progress`** — phase/step metadata (no images):
```
event: progress
data: {"type":"progress","phase":"denoise","step":42,"total_steps":500,"elapsed_ms":2100}
```

KG-anomaly progress events may additionally carry `kg_log_likelihood` (float) and
`kg_log_likelihood_step` (int) on frame boundaries. The value is the mean
log-sigmoid score that the frozen KG embedder and link ranker assign to the edges
currently present in the argmax reconstruction. Higher values indicate a cleaner graph.
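
To make the aggregate concrete, the sketch below computes a mean log-sigmoid over a list of per-edge scores in plain Python. How the backend actually obtains those scores is not covered here; `log_sigmoid` and `mean_log_sigmoid` are hypothetical helper names.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        # log(1 / (1 + e^-x)) = -log(1 + e^-x)
        return -math.log1p(math.exp(-x))
    # log(e^x / (1 + e^x)) = x - log(1 + e^x), avoids overflow for very negative x
    return x - math.log1p(math.exp(x))


def mean_log_sigmoid(scores):
    """Average log-sigmoid over edge scores; None when there are no edges."""
    if not scores:
        return None
    return sum(log_sigmoid(s) for s in scores) / len(scores)
```

The result is never positive, and it approaches 0 as every edge score grows, which is why a higher value reads as a cleaner graph.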

**`event: preview`** — base64 PNG of the graph's current state, emitted at key frames:
```
event: preview
data: data:image/png;base64,...
```

Preview frequency: `denoise` emits at `chain_frames` intervals (~30 over 500 steps), `gibbs` emits every inner step, `refine` emits every ~10% of steps.

**`event: result`** — final payload with image, chain GIF, and timing:
```
event: result
data: {"type":"result","dataset_id":"qm9","model_type":"discrete","sampling_mode":"standard","image":"data:image/png;base64,...","chain_gif":"data:image/gif;base64,...","inference_time_ms":25000}
```

Phases: `denoise` (standard generation loop), `noise_init` (multiprox init noise sampling), `gibbs` (multiprox inner Gibbs steps), `refine` (multiprox refinement denoising).
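
A client can consume the three event types with a small line-based parser. The sketch below assumes the stream has already been decoded into text lines (for example by an HTTP client's line iterator); `parse_sse` and `_decode` are hypothetical names, and treating `preview` as a bare data URI follows the example shown above.

```python
import json


def _decode(event, text):
    # preview carries a bare data URI; progress and result carry JSON.
    return text if event == "preview" else json.loads(text)


def parse_sse(lines):
    """Yield (event, payload) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                      # blank line terminates one event
            if data:
                yield event, _decode(event, "\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):          # comment or keep-alive, ignore
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, drop one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
    if data:                                # stream ended without a blank line
        yield event, _decode(event, "\n".join(data))
```

A consumer would typically update a progress bar on `progress`, swap the displayed image on `preview`, and stop reading once `result` arrives.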

## Project Structure

```
src/backend/
  manage.py
  requirements.txt
  research_api/          # Django project settings
    settings.py
    urls.py
    wsgi.py
  api/                   # Django app
    apps.py              # Triggers ModelRegistry.initialize() on startup
    urls.py              # Route definitions
    pagination.py        # Shared pagination helper
    exceptions.py        # Custom error envelope
    services/
      constants.py       # Dataset metadata, model configs, query structures
      registry.py        # ModelRegistry — checkpoint download, scanning, Loader init
    views/
      health.py          # /health, /methods
      coins.py           # /coins/* endpoints
      graph_generation.py  # /graph-generation/* endpoints
      kg_anomaly.py      # /kg-anomaly/* endpoints
```

## Testing with Postman

Import the collection and environment from `docs/postman/` to test all discovery endpoints.