Andrej Janchevski

docs: add technical documentation set

175b650 11 days ago

	# REST API reference

	All endpoints are mounted at `/api/v1/`. Request and response bodies are JSON unless noted otherwise. Streaming endpoints emit `text/event-stream` per [reference/sse-protocol.md](sse-protocol.md). The OpenAPI 3.0.3 source of truth is [`docs/api.yaml`](../api.yaml); this document is the human-readable summary.

	For end-to-end examples in a runnable form, import the Postman collection at [`docs/postman/`](../postman/).

	## Conventions

	- Pagination: list endpoints accept `?page=` (1-indexed) and `?page_size=` (default 50). Responses include `total`.
	- Errors: every error has the shape `{ "error": { "code": "...", "message": "...", "details": {} } }`. Codes are listed below.
	- Inference serialization: only one inference runs at a time across the whole process. Concurrent requests return HTTP 429 / `INFERENCE_BUSY` (see [Inference lock](../glossary.md#inference-lock)).
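As a rough illustration of the pagination convention, the sketch below builds a list URL and works out how many pages a given `total` implies. The helper names `list_url` and `page_count` are hypothetical and not part of the API; only the `/api/v1` prefix, the `page` and `page_size` parameters, and the default page size of 50 come from this document.

```python
import math
from urllib.parse import urlencode

API_PREFIX = "/api/v1"  # mount point stated in this document


def list_url(path: str, page: int = 1, page_size: int = 50, **filters: str) -> str:
    """Build a paginated list URL (hypothetical helper); pages are 1-indexed."""
    if page < 1:
        raise ValueError("page is 1-indexed and must be >= 1")
    if page_size < 1:
        raise ValueError("page_size must be >= 1")
    # Filters such as q= come first, then the pagination parameters.
    query = {**filters, "page": page, "page_size": page_size}
    return f"{API_PREFIX}{path}?{urlencode(query)}"


def page_count(total: int, page_size: int = 50) -> int:
    """Number of pages needed to cover `total` items."""
    return math.ceil(total / page_size)
```

For instance, `list_url("/coins/datasets/freebase/entities", page=2, q="paris")` yields a URL for the second page of an entity search, and `page_count` applied to the response's `total` tells a client when to stop paging.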

	### Error codes

	| HTTP | `code` | When |
	|---|---|---|
	| 400 | `INVALID_REQUEST` | Missing or malformed parameters, unsupported `query_structure` for the algorithm, etc. |
	| 404 | `NOT_FOUND` | Unknown dataset id, missing entity, etc. |
	| 422 | `INFERENCE_ERROR` | Inference ran but produced an unrecoverable error. |
	| 429 | `INFERENCE_BUSY` | Another inference is in progress. |
	| 503 | `MODEL_UNAVAILABLE` | Checkpoint file missing for the requested `(dataset, algorithm/task/model_type)` combination. |
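A client can turn the error envelope into a typed exception and decide whether a retry makes sense. The sketch below is one possible shape, not the project's own client: `ApiError`, `parse_error`, and `is_retryable` are hypothetical names, and treating only `INFERENCE_BUSY` as retryable is an assumption based on the table above.

```python
from typing import Any, Mapping

# Assumption: only lock contention (HTTP 429) is worth retrying automatically.
RETRYABLE_CODES = {"INFERENCE_BUSY"}


class ApiError(Exception):
    """Typed view of the `{ "error": { ... } }` envelope (hypothetical class)."""

    def __init__(self, status: int, code: str, message: str, details: dict):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.details = details


def parse_error(status: int, body: Mapping[str, Any]) -> ApiError:
    """Convert a decoded JSON error body into an ApiError."""
    err = body.get("error", {})
    return ApiError(
        status,
        err.get("code", "UNKNOWN"),  # "UNKNOWN" is a local fallback, not an API code
        err.get("message", ""),
        err.get("details") or {},
    )


def is_retryable(error: ApiError) -> bool:
    """True when the same request may succeed later without changes."""
    return error.code in RETRYABLE_CODES
```

With this split, a caller could back off and retry on `INFERENCE_BUSY` while surfacing `INVALID_REQUEST` or `MODEL_UNAVAILABLE` to the user straight away.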

	## Health and discovery

	| Method | Path | Purpose |
	|---|---|---|
	| `GET` | `/` | API root with absolute URLs to every section. |
	| `GET` | `/health` | Service status, which model groups are loaded, current inference-lock holder. |
	| `GET` | `/methods` | The three research methods with their thesis sections. |
	| `POST` | `/debug/force-unlock` | Release a stuck inference lock. Returns 403 unless `DEBUG=True`. |

	## COINs — KG reasoning

	| Method | Path | Purpose |
	|---|---|---|
	| `GET` | `/coins/datasets` | List datasets with entity / relation counts. |
	| `GET` | `/coins/datasets/{id}/entities` | Paginated entity search (`?q=`, `?page=`, `?page_size=`). |
	| `GET` | `/coins/datasets/{id}/relations` | Paginated relation search. |
	| `GET` | `/coins/datasets/{id}/sample-triples` | Random training triples. `?count=10`, optional `?seed=` for determinism. |
	| `GET` | `/coins/datasets/{id}/sample-query` | Sample a structurally-valid [query](../glossary.md#query-structure). `?query_structure=` is required (`1p`, `2p`, `3p`, `2i`, `3i`, `ip`, `pi`); `?count`, `?seed` optional. Returns `{anchors, relations, target}` keyed by frontend slot ids (`a`/`a1`/`a2`, `r1`/`r2`/`r3`, `v1`/`v2`). |
	| `GET` | `/coins/models` | Available algorithms per dataset, plus the query structures each supports. |
	| `GET` | `/coins/query-structures` | Frontend rendering templates for query graphs (anchor/variable/relation slots, edge connectivity). |
	| `POST` | `/coins/predict` | Run link prediction or query answering (synchronous JSON response). |

	`POST /coins/predict` body:

	```json
	{
	  "dataset_id": "freebase",
	  "algorithm": "transe",
	  "query_structure": "1p",
	  "anchors": { "a": 42 },
	  "variables": {},
	  "relations": { "r1": 7 },
	  "top_k": 10
	}
	```
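The body above can be assembled with a small client-side check on `query_structure` before anything is sent. This is a minimal sketch: `build_predict_body` is a hypothetical helper, and the set of structure names is copied from the `sample-query` row. Whether a given algorithm supports a given structure is still decided by the server (see `/coins/models`), so a body that passes this check may still come back as `INVALID_REQUEST`.

```python
from typing import Dict, Optional

# Structure names as listed for /coins/datasets/{id}/sample-query.
QUERY_STRUCTURES = {"1p", "2p", "3p", "2i", "3i", "ip", "pi"}


def build_predict_body(
    dataset_id: str,
    algorithm: str,
    query_structure: str,
    anchors: Dict[str, int],
    relations: Dict[str, int],
    variables: Optional[Dict[str, int]] = None,
    top_k: int = 10,
) -> dict:
    """Assemble a POST /coins/predict body (hypothetical helper)."""
    if query_structure not in QUERY_STRUCTURES:
        raise ValueError(f"unknown query_structure: {query_structure!r}")
    if top_k < 1:
        raise ValueError("top_k must be >= 1")
    return {
        "dataset_id": dataset_id,
        "algorithm": algorithm,
        "query_structure": query_structure,
        "anchors": dict(anchors),
        "variables": dict(variables or {}),
        "relations": dict(relations),
        "top_k": top_k,
    }
```

Calling `build_predict_body("freebase", "transe", "1p", {"a": 42}, {"r1": 7})` reproduces the example body shown above.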

	Response (truncated):

	```json
	{
	  "predictions": [
	    { "entity_id": 99, "name": "...", "label": "...", "score": 12.4 }
	  ],
	  "community_rank": { "ranked_community_id": 5, "rank": 2, "total": 1092 },
	  "timing_ms": { "total": 320, "embedder": 80, "ranker": 240 }
	}
	```

	The `community_rank` block reports where the chosen target community sits in the global ranking — useful for showing the COINs locality benefit.
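To make the locality point concrete, a client might reduce the response to a few summary numbers. The sketch below is illustrative only: `summarize_prediction` is a hypothetical helper, and it assumes that `rank` and `total` in `community_rank` are directly comparable (so `rank / total` is a meaningful fraction) and that `timing_ms.total` covers the other timing entries.

```python
from typing import Any, Dict, Optional


def summarize_prediction(response: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Reduce a /coins/predict response to a few headline numbers."""
    predictions = response["predictions"]
    rank = response["community_rank"]
    timing = response["timing_ms"]
    return {
        # First listed prediction, if any (the list order is taken as returned).
        "first_entity_id": predictions[0]["entity_id"] if predictions else None,
        # Fraction of the ranking at which the target community appears.
        "community_rank_fraction": rank["rank"] / rank["total"],
        # Share of the total time spent in the ranker.
        "ranker_share": timing["ranker"] / timing["total"],
    }
```

For the truncated response above, the ranker accounts for 240 of 320 ms, and the target community sits at position 2 of 1092.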

	## Graph generation — MultiProxAn

	| Method | Path | Purpose |
	|---|---|---|
	| `GET` | `/graph-generation/datasets` | List graph types with node and edge type counts. |
	| `GET` | `/graph-generation/sampling-modes` | The two [sampling modes](../glossary.md#sampling-mode) with their parameter specs. |
	| `POST` | `/graph-generation/generate` | Streaming SSE. Generate a graph (`standard` or `multiprox` Gibbs init). |
	| `POST` | `/graph-generation/continue` | Streaming SSE. Advance a `multiprox` Gibbs session by one step using the returned [state blob](../glossary.md#continuation-token--state-blob). |

	`POST /graph-generation/generate` body (standard mode):

	```json
	{
	  "dataset_id": "qm9",
	  "model_type": "discrete",
	  "sampling_mode": "standard",
	  "num_nodes": 19,
	  "diffusion_steps": 500,
	  "chain_frames": 30
	}
	```

	`multiprox` mode adds a `multiprox_params` object to the same body:

	```json
	{
	  "multiprox_params": {
	    "n": 4, "m": 8, "t": 500, "t_prime": 250, "gibbs_chain_freq": 1
	  }
	}
	```
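The two request shapes differ only in `sampling_mode` and the optional `multiprox_params` object, so one builder can cover both. This is a sketch under assumptions: `build_generate_body` is a hypothetical helper, and it assumes that a `multiprox` request sets `"sampling_mode": "multiprox"` alongside the extra object. The authoritative parameter specs come from `/graph-generation/sampling-modes`.

```python
from typing import Mapping, Optional


def build_generate_body(
    dataset_id: str,
    model_type: str,
    num_nodes: int,
    *,
    diffusion_steps: int,
    chain_frames: int,
    multiprox_params: Optional[Mapping[str, int]] = None,
) -> dict:
    """Assemble a POST /graph-generation/generate body (hypothetical helper)."""
    body = {
        "dataset_id": dataset_id,
        "model_type": model_type,
        "sampling_mode": "standard",
        "num_nodes": num_nodes,
        "diffusion_steps": diffusion_steps,
        "chain_frames": chain_frames,
    }
    if multiprox_params is not None:
        # Assumption: multiprox requests switch the mode and attach the params.
        body["sampling_mode"] = "multiprox"
        body["multiprox_params"] = dict(multiprox_params)
    return body
```

Passing `multiprox_params={"n": 4, "m": 8, "t": 500, "t_prime": 250, "gibbs_chain_freq": 1}` produces the standard body plus the extra object; omitting it produces the standard-mode example unchanged.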

	`POST /graph-generation/continue` body:

	```json
	{ "state": "<base64 state blob from a previous /generate or /continue result>" }
	```
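Because the state blob is opaque, about the only thing a client can check before sending it back is that it still looks like base64. The sketch below does that with the standard library; `build_continue_body` is a hypothetical helper, and it assumes the blob uses the standard base64 alphabet (if the server emits the URL-safe variant, the check would need `base64.urlsafe_b64decode` instead).

```python
import base64
import binascii


def build_continue_body(state: str) -> dict:
    """Wrap a state blob for POST /graph-generation/continue (hypothetical helper)."""
    if not state:
        raise ValueError("state must not be empty")
    try:
        # validate=True rejects characters outside the standard base64 alphabet.
        base64.b64decode(state, validate=True)
    except binascii.Error as exc:
        raise ValueError("state is not valid base64") from exc
    # The blob is sent back unchanged; the client never interprets its contents.
    return {"state": state}
```

A blob that was truncated or mangled in transit fails here with a local `ValueError` rather than a round trip to the server.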

	## KG anomaly correction

	| Method | Path | Purpose |
	|---|---|---|
	| `GET` | `/kg-anomaly/datasets` | List datasets with their `correct` and `generate` checkpoints. |
	| `GET` | `/kg-anomaly/datasets/{id}/sample-subgraphs` | Pre-computed [subgraphs](../glossary.md#subgraph). `?count=5`, `?noise_level=0.4`, `?task=correct\|generate`, `?seed=42`. |
	| `POST` | `/kg-anomaly/correct` | Streaming SSE. Correct or regenerate a KG subgraph. |
	| `POST` | `/kg-anomaly/continue` | Streaming SSE. Advance a multiprox correction session by one step. |

	`POST /kg-anomaly/correct` body:

	```json
	{
	  "dataset_id": "freebase",
	  "task": "correct",
	  "sampling_mode": "standard",
	  "subgraph": {
	    "nodes": [{ "entity_id": 42, "type_id": 0 }, ...],
	    "edges": [{ "source_idx": 0, "target_idx": 1, "relation_id": 5 }, ...],
	    "is_bip": false,
	    "row_size": 6
	  },
	  "diffusion_steps": 500,
	  "chain_frames": 30
	}
	```
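The field names `source_idx` and `target_idx` suggest that edges refer to positions in the `nodes` array rather than to entity ids. Assuming that reading is right, a client can catch dangling edges before sending the request. `build_subgraph` below is a hypothetical helper; `is_bip` and `row_size` are passed through untouched because this document does not define them.

```python
from typing import Any, Dict, List


def build_subgraph(
    nodes: List[Dict[str, int]],
    edges: List[Dict[str, int]],
    is_bip: bool,
    row_size: int,
) -> Dict[str, Any]:
    """Assemble the `subgraph` object, checking that edges point at real nodes."""
    n = len(nodes)
    for i, edge in enumerate(edges):
        for key in ("source_idx", "target_idx"):
            # Assumption: indices are 0-based positions into `nodes`.
            if not 0 <= edge[key] < n:
                raise ValueError(f"edge {i}: {key}={edge[key]} is outside 0..{n - 1}")
    return {
        "nodes": list(nodes),
        "edges": list(edges),
        "is_bip": is_bip,
        "row_size": row_size,
    }
```

The returned object slots into the `subgraph` field of the request body above.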

	The `correct`-task SSE stream additionally carries `kg_log_likelihood` on each `progress` event — see [reference/sse-protocol.md](sse-protocol.md).
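As a sketch of how a client might collect those values, the parser below follows the generic `text/event-stream` framing (`event:` and `data:` fields, a blank line ending each event, lines starting with `:` as comments). It assumes that each `progress` event's data is a JSON object with `kg_log_likelihood` as a top-level key; the linked protocol document is the authority on the actual payload. `iter_sse_events` and `log_likelihoods` are hypothetical names.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from the lines of an SSE stream."""
    event, data = "message", []  # "message" is the default SSE event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if any data was collected.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)


def log_likelihoods(lines: Iterable[str]) -> List[float]:
    """Collect kg_log_likelihood from every progress event that carries it."""
    values = []
    for event, data in iter_sse_events(lines):
        if event == "progress":
            payload = json.loads(data)  # assumption: data is a JSON object
            if "kg_log_likelihood" in payload:
                values.append(payload["kg_log_likelihood"])
    return values
```

Any iterable of text lines works as input, for example the line iterator of a streaming HTTP response.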

	## See also

	- [reference/sse-protocol.md](sse-protocol.md) — exact wire format of the streaming events.
	- [reference/backend-services.md](backend-services.md) — what each Python module behind these endpoints does.
	- [explanation/research-methods.md](../explanation/research-methods.md) — what these endpoints actually compute, scientifically.