# REST API reference

All endpoints are mounted at `/api/v1/`. Request and response bodies are JSON unless noted otherwise. Streaming endpoints emit `text/event-stream` per [reference/sse-protocol.md](sse-protocol.md). The OpenAPI 3.0.3 source of truth is [`docs/api.yaml`](../api.yaml); this document is the human-readable summary.

For runnable end-to-end examples, import the Postman collection at [`docs/postman/`](../postman/).

## Conventions

- Pagination: list endpoints accept `?page=` (1-indexed) and `?page_size=` (default 50). Responses include `total`.
- Errors: every error has the shape `{ "error": { "code": "...", "message": "...", "details": {} } }`. Codes are listed below.
- Inference serialization: only one inference runs at a time across the whole process. Concurrent requests return HTTP 429 / `INFERENCE_BUSY` (see [Inference lock](../glossary.md#inference-lock)).
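
As a rough illustration of the pagination convention, the sketch below builds a list URL and derives the number of pages from `total`. The helper names `page_url` and `page_count` are hypothetical and not part of the API; only `page`, `page_size`, the default of 50, and `total` come from the conventions above.

```python
import math
from urllib.parse import urlencode

API_PREFIX = "/api/v1"  # mount point stated at the top of this document


def page_url(path: str, page: int = 1, page_size: int = 50, **filters: str) -> str:
    """Build a list-endpoint URL. Pages are 1-indexed, page_size defaults to 50."""
    if page < 1:
        raise ValueError("page is 1-indexed")
    query = {**filters, "page": page, "page_size": page_size}
    return f"{API_PREFIX}{path}?{urlencode(query)}"


def page_count(total: int, page_size: int = 50) -> int:
    """Number of pages needed to cover the `total` reported by a list response."""
    return math.ceil(total / page_size)
```

A client would typically request page 1, read `total`, and then walk pages `2..page_count(total)`.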

### Error codes

| HTTP | `code` | When |
|---|---|---|
| 400 | `INVALID_REQUEST` | Missing or malformed parameters, unsupported `query_structure` for the algorithm, etc. |
| 404 | `NOT_FOUND` | Unknown dataset id, missing entity, etc. |
| 422 | `INFERENCE_ERROR` | Inference ran but produced an unrecoverable error. |
| 429 | `INFERENCE_BUSY` | Another inference is in progress. |
| 503 | `MODEL_UNAVAILABLE` | Checkpoint file missing for the requested `(dataset, algorithm/task/model_type)` combination. |
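
One possible way for a client to consume the error envelope is sketched below. `ApiError`, `parse_error`, and `should_retry` are hypothetical names, and treating only `INFERENCE_BUSY` as worth retrying is a client-side assumption rather than documented server behavior.

```python
import json


class ApiError(Exception):
    """Client-side view of the documented {"error": {...}} envelope."""

    def __init__(self, status: int, code: str, message: str, details: dict):
        super().__init__(f"HTTP {status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.details = details


def parse_error(status: int, body: str) -> ApiError:
    """Turn an error response body into an ApiError."""
    err = json.loads(body)["error"]
    return ApiError(status, err["code"], err["message"], err.get("details", {}))


def should_retry(err: ApiError) -> bool:
    # Assumption: a busy inference lock is transient, every other code is not.
    return err.code == "INFERENCE_BUSY"
```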

## Health and discovery

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/` | API root with absolute URLs to every section. |
| `GET` | `/health` | Service status, which model groups are loaded, current inference-lock holder. |
| `GET` | `/methods` | The three research methods with their thesis sections. |
| `POST` | `/debug/force-unlock` | Release a stuck inference lock. **Returns 403 unless `DEBUG=True`.** |

## COINs: KG reasoning

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/coins/datasets` | List datasets with entity / relation counts. |
| `GET` | `/coins/datasets/{id}/entities` | Paginated entity search (`?q=`, `?page=`, `?page_size=`). |
| `GET` | `/coins/datasets/{id}/relations` | Paginated relation search. |
| `GET` | `/coins/datasets/{id}/sample-triples` | Random training triples. `?count=10`, optional `?seed=` for determinism. |
| `GET` | `/coins/datasets/{id}/sample-query` | Sample a structurally valid [query](../glossary.md#query-structure). `?query_structure=` is required (`1p`, `2p`, `3p`, `2i`, `3i`, `ip`, `pi`); `?count=` and `?seed=` are optional. Returns `{anchors, relations, target}` keyed by frontend slot ids (`a`/`a1`/`a2`, `r1`/`r2`/`r3`, `v1`/`v2`). |
| `GET` | `/coins/models` | Available algorithms per dataset, plus the query structures each supports. |
| `GET` | `/coins/query-structures` | Frontend rendering templates for query graphs (anchor/variable/relation slots, edge connectivity). |
| `POST` | `/coins/predict` | Run link prediction or query answering (synchronous JSON response). |

`POST /coins/predict` body:

```json
{
  "dataset_id": "freebase",
  "algorithm": "transe",
  "query_structure": "1p",
  "anchors": { "a": 42 },
  "variables": {},
  "relations": { "r1": 7 },
  "top_k": 10
}
```
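
A request body like the one above could be assembled with a small helper such as the following sketch. `predict_body` is a hypothetical name. The set of structure names is taken from the `sample-query` row, and the helper deliberately does not guess which slot ids each structure needs, since that mapping is served by `/coins/query-structures`.

```python
# The seven structure names listed for /coins/datasets/{id}/sample-query.
QUERY_STRUCTURES = {"1p", "2p", "3p", "2i", "3i", "ip", "pi"}


def predict_body(dataset_id, algorithm, query_structure, anchors, relations,
                 variables=None, top_k=10):
    """Assemble a POST /coins/predict body, rejecting unknown structure names early."""
    if query_structure not in QUERY_STRUCTURES:
        raise ValueError(f"unknown query_structure: {query_structure!r}")
    return {
        "dataset_id": dataset_id,
        "algorithm": algorithm,
        "query_structure": query_structure,
        "anchors": dict(anchors),
        "variables": dict(variables or {}),
        "relations": dict(relations),
        "top_k": top_k,
    }
```

A name that passes this check can still be rejected by the server with `INVALID_REQUEST` when the chosen algorithm does not support that structure.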

Response (truncated):

```json
{
  "predictions": [
    { "entity_id": 99, "name": "...", "label": "...", "score": 12.4 }
  ],
  "community_rank": { "ranked_community_id": 5, "rank": 2, "total": 1092 },
  "timing_ms": { "total": 320, "embedder": 80, "ranker": 240 }
}
```

The `community_rank` block reports where the chosen target community sits in the global ranking, which is useful for showing the COINs locality benefit.
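
To show how a client might read the `timing_ms` and `community_rank` blocks, here is a minimal sketch. Both function names are hypothetical, and the code assumes that every key in `timing_ms` other than `total` is a pipeline stage, as in the example response.

```python
def timing_shares(timing_ms: dict) -> dict:
    """Fraction of total time spent in each stage (every key except "total")."""
    total = timing_ms["total"]
    return {stage: ms / total for stage, ms in timing_ms.items() if stage != "total"}


def community_rank_label(block: dict) -> str:
    """Human-readable summary of the community_rank block."""
    return (f"community {block['ranked_community_id']}: "
            f"rank {block['rank']} of {block['total']}")
```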

## Graph generation: MultiProxAn

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/graph-generation/datasets` | List graph types with node and edge type counts. |
| `GET` | `/graph-generation/sampling-modes` | The two [sampling modes](../glossary.md#sampling-mode) with their parameter specs. |
| `POST` | `/graph-generation/generate` | **Streaming SSE.** Generate a graph (`standard` or `multiprox` Gibbs init). |
| `POST` | `/graph-generation/continue` | **Streaming SSE.** Advance a `multiprox` Gibbs session by one step using the returned [state blob](../glossary.md#continuation-token--state-blob). |

`POST /graph-generation/generate` body (standard mode):

```json
{
  "dataset_id": "qm9",
  "model_type": "discrete",
  "sampling_mode": "standard",
  "num_nodes": 19,
  "diffusion_steps": 500,
  "chain_frames": 30
}
```

`multiprox` adds:

```json
{
  "multiprox_params": {
    "n": 4, "m": 8, "t": 500, "t_prime": 250, "gibbs_chain_freq": 1
  }
}
```
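
The two bodies above can be produced from one helper, sketched below. `generate_body` is a hypothetical name. The sketch assumes that `multiprox_params` sits at the top level next to the standard fields and that the standard fields are still sent in `multiprox` mode; check `docs/api.yaml` for the authoritative schema.

```python
def generate_body(dataset_id, model_type, num_nodes, diffusion_steps, chain_frames,
                  multiprox_params=None):
    """Assemble a POST /graph-generation/generate body for either sampling mode."""
    body = {
        "dataset_id": dataset_id,
        "model_type": model_type,
        "sampling_mode": "standard",
        "num_nodes": num_nodes,
        "diffusion_steps": diffusion_steps,
        "chain_frames": chain_frames,
    }
    if multiprox_params is not None:
        # Assumption: multiprox mode only switches the mode and adds this block.
        body["sampling_mode"] = "multiprox"
        body["multiprox_params"] = dict(multiprox_params)
    return body
```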

`POST /graph-generation/continue` body:

```json
{ "state": "<base64 state blob from a previous /generate or /continue result>" }
```
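
Because each `/continue` call advances the session by one step, a client ends up looping over state blobs. The sketch below shows one way to structure that loop. `gibbs_steps` and `continue_once` are hypothetical names, and the sketch assumes that the final result of each stream exposes the blob under a top-level `state` key; the actual field name is defined in [reference/sse-protocol.md](sse-protocol.md).

```python
from typing import Callable, Iterator


def gibbs_steps(first_result: dict,
                continue_once: Callable[[dict], dict],
                max_steps: int) -> Iterator[dict]:
    """Yield the result of each /continue step until no state blob is returned.

    `continue_once` is a caller-supplied function that POSTs the given body to
    /graph-generation/continue, consumes the SSE stream, and returns its final result.
    """
    result = first_result
    for _ in range(max_steps):
        state = result.get("state")  # assumed field name, see lead-in
        if state is None:
            break
        result = continue_once({"state": state})
        yield result
```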

## KG anomaly correction

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/kg-anomaly/datasets` | List datasets with their `correct` and `generate` checkpoints. |
| `GET` | `/kg-anomaly/datasets/{id}/sample-subgraphs` | Pre-computed [subgraphs](../glossary.md#subgraph). `?count=5`, `?noise_level=0.4`, `?task=correct\|generate`, `?seed=42`. |
| `POST` | `/kg-anomaly/correct` | **Streaming SSE.** Correct or regenerate a KG subgraph. |
| `POST` | `/kg-anomaly/continue` | **Streaming SSE.** Advance a multiprox correction session by one step. |

`POST /kg-anomaly/correct` body:

```json
{
  "dataset_id": "freebase",
  "task": "correct",
  "sampling_mode": "standard",
  "subgraph": {
    "nodes": [{ "entity_id": 42, "type_id": 0 }, ...],
    "edges": [{ "source_idx": 0, "target_idx": 1, "relation_id": 5 }, ...],
    "is_bip": false,
    "row_size": 6
  },
  "diffusion_steps": 500,
  "chain_frames": 30
}
```
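
Before sending a subgraph, a client may want to sanity-check it. The following sketch assumes that `source_idx` and `target_idx` are 0-based positions in the `nodes` array, which the example suggests but does not state; `check_subgraph` is a hypothetical helper.

```python
def check_subgraph(subgraph: dict) -> list[str]:
    """Return a list of problems found in the edge indices (empty if none)."""
    problems = []
    node_count = len(subgraph["nodes"])
    for i, edge in enumerate(subgraph["edges"]):
        for key in ("source_idx", "target_idx"):
            # Assumption: indices refer to positions in subgraph["nodes"].
            if not 0 <= edge[key] < node_count:
                problems.append(
                    f"edge {i}: {key}={edge[key]} is out of range for {node_count} nodes"
                )
    return problems
```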

The `correct`-task SSE stream additionally carries `kg_log_likelihood` on each `progress` event; see [reference/sse-protocol.md](sse-protocol.md).
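
As an illustration, the sketch below parses a generic `text/event-stream` line sequence and collects `kg_log_likelihood` from `progress` events. The parsing follows the standard server-sent events framing (`event:` and `data:` fields, blank line as terminator). It assumes that each `data` payload is a JSON object with `kg_log_likelihood` at the top level, which is not confirmed here; the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has any data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)


def log_likelihoods(lines: Iterable[str]) -> list:
    """Collect kg_log_likelihood from every progress event that carries it."""
    values = []
    for event, data in iter_sse(lines):
        if event != "progress":
            continue
        payload = json.loads(data)  # assumed to be a JSON object
        if "kg_log_likelihood" in payload:
            values.append(payload["kg_log_likelihood"])
    return values
```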

## See also

- [reference/sse-protocol.md](sse-protocol.md): exact wire format of the streaming events.
- [reference/backend-services.md](backend-services.md): what each Python module behind these endpoints does.
- [explanation/research-methods.md](../explanation/research-methods.md): what these endpoints actually compute, scientifically.