Andrej Janchevski

SSE protocol

The two diffusion endpoints, /api/v1/graph-generation/{generate, continue} and /api/v1/kg-anomaly/{correct, continue}, return Server-Sent Events with Content-Type: text/event-stream. This document specifies the exact event shapes the backend emits and the order in which they appear.

Why SSE

Diffusion sampling takes seconds to minutes. Streaming previews to the browser keeps the UI responsive and lets the user watch the generation evolve. SSE was chosen over WebSockets because the channel is one-way (server → client) and survives plain HTTP infrastructure.

Envelope

Each event is a standard SSE frame:

event: <name>
data: <JSON or data URI>
  • A frame ends with a blank line.
  • The event: field is one of progress, preview, result.
  • The data: payload is JSON for progress and result; for preview it is a data:image/...;base64,... URI containing the in-progress render.
  • Frames are emitted in real time as the diffusion loop advances; the client should not buffer to end-of-stream.
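
To make the envelope concrete, here is a minimal parsing sketch. It is not the project's client code; the function names `parse_sse_frames` and `decode_payload` are hypothetical, and the sketch assumes the stream has already been decoded into text lines. It handles only the `event:` and `data:` fields described above.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse_frames(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from decoded SSE lines.

    A frame is emitted as soon as its terminating blank line is seen,
    so nothing is buffered until end-of-stream.
    """
    event, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: end of frame. Frames without data are skipped.
            if data_parts:
                yield event, "\n".join(data_parts)
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data_parts.append(value)


def decode_payload(event: str, data: str):
    """JSON for progress and result; a preview stays a data: URI string."""
    if event in ("progress", "result"):
        return json.loads(data)
    return data


# Usage with two hand-written frames.
stream = [
    "event: progress\n",
    'data: {"type": "progress", "phase": "denoise", "step": 42, "total_steps": 500}\n',
    "\n",
    "event: preview\n",
    "data: data:image/png;base64,iVBORw0KGgo=\n",
    "\n",
]
frames = [(e, decode_payload(e, d)) for e, d in parse_sse_frames(stream)]
```

Splitting on the first colon only matters here: a preview payload itself starts with `data:`, so splitting on every colon would corrupt the URI.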

Event types

progress

Lightweight metadata. Emitted multiple times per run (every step or every batch of steps).

{
  "type": "progress",
  "phase": "denoise",
  "step": 42,
  "total_steps": 500,
  "elapsed_ms": 2100
}

phase is one of:

| Phase | Meaning |
| --- | --- |
| denoise | Standard reverse diffusion loop. |
| noise_init | Initial noise sampling for the multiprox outer loop. |
| gibbs | One round of multiprox Gibbs refinement. |
| refine | Final denoising after the Gibbs loop completes. |

For the kg-anomaly correct task, progress events on chain-frame boundaries additionally carry:

{
  "kg_log_likelihood": -1.42,
  "kg_log_likelihood_step": 240
}

kg_log_likelihood is the mean log-sigmoid score from the frozen COINs link ranker over edges currently present in the argmax reconstruction. Higher is better; the frontend plots this trace as a quality indicator.
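
As an illustration of what a "mean log-sigmoid score" looks like numerically, the sketch below averages log σ(s) over a list of per-edge scores. It assumes the ranker returns one real-valued score per edge; the function names and the handling of an empty edge set are assumptions, not taken from the backend.

```python
import math
from typing import Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(1 / (1 + exp(-x)))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def mean_log_sigmoid(scores: Sequence[float]) -> float:
    """Mean log-sigmoid over per-edge scores; the result is always <= 0."""
    if not scores:
        # Assumption: an empty reconstruction has no defined score.
        raise ValueError("no edges present in the reconstruction")
    return sum(log_sigmoid(s) for s in scores) / len(scores)


# Confident edges (large positive scores) push the mean towards 0.
good = mean_log_sigmoid([5.0, 4.0, 6.0])
bad = mean_log_sigmoid([-5.0, -4.0, -6.0])
```

Because log σ(s) is at most 0, the trace is bounded above by 0, which is why a higher (less negative) value reads as better.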

preview

A base64-encoded PNG of the current state of the graph being generated, packaged as a data: URI ready to be fed straight into an <img> element.

event: preview
data: data:image/png;base64,iVBORw0KGgoAAAANS...

Cadence varies by phase:

  • denoise: at chain_frames intervals (typical: 30 frames over 500 steps).
  • gibbs: every inner Gibbs step.
  • refine: every ~10% of total steps.

The frontend's PreviewReel component buffers these and replays them as a loop while waiting for the result.
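
A browser can assign the preview payload to an `<img>` directly, but a non-browser client has to unpack it. The following sketch splits a base64 `data:` URI into its media type and raw bytes. The helper name `decode_data_uri` is hypothetical, and the example URI is built locally rather than taken from a real run.

```python
import base64
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data: URI into (media_type, raw_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data: URI")
    media_type = header[: -len(";base64")]
    return media_type, base64.b64decode(payload, validate=True)


# Usage: a stand-in URI whose payload is just the PNG signature.
uri = "data:image/png;base64," + base64.b64encode(PNG_SIGNATURE).decode("ascii")
media_type, raw = decode_data_uri(uri)
is_png = raw.startswith(PNG_SIGNATURE)
```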

result

Final payload. One per call. Always the last event.

For graph generation:

{
  "type": "result",
  "dataset_id": "qm9",
  "model_type": "discrete",
  "sampling_mode": "standard",
  "image": "data:image/png;base64,...",
  "chain_gif": "data:image/gif;base64,...",
  "inference_time_ms": 25000
}

For graph generation in multiprox mode the result is the partial state after the initial denoise to step t_prime:

{
  "type": "result",
  "step": 0,
  "round_complete": false,
  "done": false,
  "state": "<base64 continuation blob>",
  "image": "...",
  "chain_gif": "..."
}

The client posts state to /continue to advance one Gibbs round. Each /continue response is itself a stream that yields its own progress / preview / result events; the new result carries the next state blob, and so on, until the user stops or done becomes true.
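
The round-by-round loop described above can be sketched as follows. The transport is injected as a callable so the example stays self-contained; `post_stream`, `run_multiprox`, and the `{"state": ...}` request body for /continue are assumptions made for illustration, not the documented request schema.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Frame = Tuple[str, object]  # (event name, decoded payload)
PostStream = Callable[[str, Dict], Iterable[Frame]]  # hypothetical transport


def run_multiprox(post_stream: PostStream, first_path: str,
                  continue_path: str, body: Dict, max_rounds: int) -> Dict:
    """Run the initial call plus at most max_rounds /continue calls."""
    path, payload, result = first_path, body, None
    for _ in range(max_rounds + 1):
        result = None
        for event, data in post_stream(path, payload):
            if event == "result":
                result = data  # by contract, the last event of the stream
        if result is None:
            raise RuntimeError("stream ended without a result event")
        if result.get("done") or "state" not in result:
            break
        # Echo the opaque blob back unchanged to advance one Gibbs round.
        path, payload = continue_path, {"state": result["state"]}
    return result


# Usage with a fake transport that reports done on its third call.
calls: List[Tuple[str, Dict]] = []


def fake_post_stream(path: str, payload: Dict) -> Iterable[Frame]:
    calls.append((path, payload))
    n = len(calls)
    yield "progress", {"type": "progress", "phase": "gibbs", "step": n}
    yield "result", {"type": "result", "done": n >= 3, "state": f"blob-{n}"}


final = run_multiprox(
    fake_post_stream,
    "/api/v1/graph-generation/generate",
    "/api/v1/graph-generation/continue",
    {"dataset_id": "qm9"},
    max_rounds=10,
)
```

In the real UI the user decides when to request the next round; `max_rounds` stands in for that decision here.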

For /kg-anomaly/correct the analogous result includes the corrected subgraph node and edge tensors so the frontend can rebuild the visualization.

State blob

The continuation blob is a base64-encoded JSON object containing all the per-run state needed to resume the multiprox Gibbs loop: the current partial sample, the current step index, the multiprox parameters (n, m, t, t_prime, gibbs_chain_freq), and dataset_id / model_type (or task). It is opaque to the frontend; the contract is "give me back exactly what you got from the last result".

InvalidRequestError is raised on /continue if the blob is malformed; this maps to HTTP 400 / INVALID_REQUEST.
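
A rough sketch of the encode/decode round trip and the malformed-blob check is shown below. The `InvalidRequestError` class here is a local stand-in for the backend's error type, and the key names `sample` and `step` are hypothetical; only the multiprox parameter names and `dataset_id` / `model_type` come from the description above.

```python
import base64
import binascii
import json


class InvalidRequestError(ValueError):
    """Local stand-in for the backend error that maps to HTTP 400."""


def encode_state(state: dict) -> str:
    """Serialize the per-run state as base64-encoded JSON."""
    raw = json.dumps(state, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_state(blob: str) -> dict:
    """Parse a continuation blob, rejecting anything malformed."""
    try:
        raw = base64.b64decode(blob, validate=True)
        state = json.loads(raw.decode("utf-8"))
    except (binascii.Error, ValueError, TypeError) as exc:
        raise InvalidRequestError("malformed continuation blob") from exc
    if not isinstance(state, dict):
        raise InvalidRequestError("continuation blob must be a JSON object")
    return state


# Usage: round trip, then a blob that is not valid base64.
original = {
    "sample": [[0, 1], [1, 0]],  # hypothetical key for the partial sample
    "step": 3,                   # hypothetical key for the step index
    "n": 4, "m": 2, "t": 500, "t_prime": 250, "gibbs_chain_freq": 10,
    "dataset_id": "qm9", "model_type": "discrete",
}
blob = encode_state(original)
restored = decode_state(blob)

try:
    decode_state("not base64!!")
    rejected = False
except InvalidRequestError:
    rejected = True
```

Base64-encoded JSON is not tamper-proof, so a real decoder would probably also validate field types and ranges before resuming sampling.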

Client handling pattern

The reference implementation lives in src/frontend/src/composables/useSseStream.js. In summary:

  1. POST the request body, expect Content-Type: text/event-stream.
  2. Read the response body as a stream and parse SSE frames.
  3. For each frame, dispatch on the event: field:
    • progress → update the progress bar and, when kg_log_likelihood is present, push it onto the metric trace.
    • preview → swap the <img> src.
    • result → finalize the UI; if a state blob is present, enable the "next round" button that POSTs to /continue.
  4. Clean up on stream close. The server releases the inference lock in a finally block.
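
The dispatch step can be sketched in Python as below. This is not a port of useSseStream.js: `StreamView` and `dispatch` are hypothetical names, frames are assumed to be already parsed into (event, payload) pairs, and the progress fraction assumes `step` counts up towards `total_steps`.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class StreamView:
    """Hypothetical UI state driven by the stream."""
    progress: float = 0.0
    img_src: Optional[str] = None
    likelihood_trace: List[Tuple[int, float]] = field(default_factory=list)
    result: Optional[dict] = None
    can_continue: bool = False


def dispatch(view: StreamView, frames: Iterable[Tuple[str, object]]) -> StreamView:
    for event, data in frames:
        if event == "progress":
            if data.get("total_steps"):
                # Assumption: step counts up from 0 to total_steps.
                view.progress = data["step"] / data["total_steps"]
            if "kg_log_likelihood" in data:
                view.likelihood_trace.append(
                    (data["kg_log_likelihood_step"], data["kg_log_likelihood"])
                )
        elif event == "preview":
            view.img_src = data  # the data: URI is used as-is
        elif event == "result":
            view.result = data
            # Offer "next round" only if a blob came back and the run is not done.
            view.can_continue = "state" in data and not data.get("done", False)
    return view


# Usage with one frame of each type.
view = dispatch(StreamView(), [
    ("progress", {"type": "progress", "phase": "denoise", "step": 42,
                  "total_steps": 500, "kg_log_likelihood": -1.42,
                  "kg_log_likelihood_step": 240}),
    ("preview", "data:image/png;base64,iVBORw0KGgo="),
    ("result", {"type": "result", "done": False, "state": "blob"}),
])
```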

Client disconnects

If the client closes the connection mid-stream, gunicorn cancels the generator. The generator's finally block then releases the inference lock in most cases. Some proxies don't propagate the close cleanly, in which case the lock can stay held. With DEBUG=True, POST /api/v1/debug/force-unlock releases it; in production a container restart is the only recourse. See explanation/inference-lifecycle.md.
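
The release-in-finally behaviour can be demonstrated with a plain Python generator. The sketch below is not the backend's code: `inference_lock`, `sse_frame`, and `stream_events` are illustrative names, and a `threading.Lock` stands in for whatever lock the backend actually uses. Closing the generator early raises `GeneratorExit` at the suspended `yield`, which runs the `finally` block.

```python
import json
import threading

inference_lock = threading.Lock()  # stand-in for the backend's inference lock


def sse_frame(event: str, data: str) -> str:
    """Format one SSE frame, terminated by a blank line."""
    return f"event: {event}\ndata: {data}\n\n"


def stream_events(total_steps: int):
    """Yield progress frames and a final result while holding the lock."""
    if not inference_lock.acquire(blocking=False):
        raise RuntimeError("inference already in progress")
    try:
        for step in range(1, total_steps + 1):
            payload = {"type": "progress", "phase": "denoise",
                       "step": step, "total_steps": total_steps}
            yield sse_frame("progress", json.dumps(payload))
        yield sse_frame("result", json.dumps({"type": "result"}))
    finally:
        # Runs on normal completion and when the generator is closed early.
        inference_lock.release()


# Usage: simulate a client that disconnects after the first frame.
gen = stream_events(500)
first = next(gen)
held_during_stream = inference_lock.locked()
gen.close()  # what a server does when it cancels the response generator
held_after_close = inference_lock.locked()
```

The stuck-lock case described above corresponds to `close()` never being called: if the disconnect is not propagated, the generator stays suspended at a `yield` and the `finally` block does not run.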