---
title: HTTP API Reference
---
# HTTP API Reference
Koharu exposes a local HTTP API under:
```text
http://127.0.0.1:<PORT>/api/v1
```
This is the same API used by the desktop UI and the headless Web UI.
## Runtime model
Important current behavior:
- the API is served by the same process as the GUI or headless runtime
- the server binds to `127.0.0.1` by default; use `--host` to bind elsewhere
- the API and MCP server share the same loaded project, models, and pipeline state
- when no `--port` is provided, Koharu chooses a random local port
- everything except `/api/v1/downloads`, `/api/v1/operations`, and `/api/v1/events` returns `503 Service Unavailable` until the app finishes bootstrapping
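Because most routes answer `503` until bootstrapping finishes, a client may want to poll before issuing real requests. The following is a minimal sketch using only the Python standard library. `api_base` and `wait_until_ready` are hypothetical helper names, and probing `/meta` is an assumption; any route outside the three exempt ones should behave the same way.

```python
import time
import urllib.error
import urllib.request


def api_base(port: int, host: str = "127.0.0.1") -> str:
    """Build the API root for a given port (hypothetical helper)."""
    return f"http://{host}:{port}/api/v1"


def wait_until_ready(base: str, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll GET /meta until it stops answering 503 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base}/meta", timeout=5):
                return True  # a 2xx response means bootstrapping has finished
        except urllib.error.HTTPError as err:
            if err.code != 503:
                raise  # any other HTTP error is not a "still starting" signal
        except OSError:
            pass  # connection refused or timed out: server not listening yet
        time.sleep(interval)
    return False
```

Since Koharu picks a random port when `--port` is omitted, a script like this is easiest to use when the port is fixed on the command line.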
## Resource model
The API is project-centric. A single project is open at a time and contains:
- a list of `Pages` indexed by `PageId`
- per-page `Nodes` (image layers, masks, text blocks) referenced by `NodeId`
- a content-addressed `Blob` store that holds raw image bytes by Blake3 hash
- a `Scene` snapshot built from those pieces, advanced by an `epoch` counter
- a history of `Op` mutations that can be undone or redone
Mutations always go through the history layer (`POST /history/apply`) so the scene, autosave, and event subscribers stay in sync.
## Common response shapes
Frequently used response types include:
- `MetaInfo`: app version and ML device label
- `EngineCatalog`: installable engine ids per pipeline stage
- `ProjectSummary`: id, name, path, page count, last opened
- `SceneSnapshot`: `{ epoch, scene }`
- `LlmState`: current LLM load state (status, target, error)
- `LlmCatalog`: local + provider models grouped by family
- `JobSummary`: `{ id, kind, status, error }`
- `DownloadProgress`: package id, byte counts, status
## Endpoints
### Meta
| Method | Path | Purpose |
| ------ | ----------- | ------------------------------------------ |
| `GET` | `/meta` | get app version and active ML backend |
| `GET` | `/engines` | list registered pipeline engines per stage |
### Fonts
| Method | Path | Purpose |
| ------ | ------------------------------------- | ---------------------------------------------------- |
| `GET` | `/fonts` | combined system + Google Fonts catalog for rendering |
| `GET` | `/google-fonts` | Google Fonts catalog as a standalone list |
| `POST` | `/google-fonts/{family}/fetch` | download and cache one Google Fonts family |
| `GET` | `/google-fonts/{family}/{file}` | serve the cached TTF/WOFF file |
### Projects
Every project lives under the managed `{data.path}/projects/` directory; clients never supply filesystem paths.
| Method | Path | Purpose |
| -------- | --------------------------------- | --------------------------------------------------------- |
| `GET` | `/projects` | list managed projects |
| `POST` | `/projects` | create a new project (body `{ name }`) |
| `POST` | `/projects/import` | extract a `.khr` archive into a fresh dir and open it |
| `PUT` | `/projects/current` | open a managed project by `id` |
| `DELETE` | `/projects/current` | close the current session |
| `POST` | `/projects/current/export` | export the current project; returns binary bytes |
`POST /projects/current/export` accepts `{ format, pages? }` where `format` is one of `khr`, `psd`, `rendered`, `inpainted`. When the format produces multiple files, the response is `application/zip`.
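As a rough illustration of the export contract, the sketch below builds the `{ format, pages? }` body and checks whether a response is a ZIP archive. The helper names `export_body` and `is_zip_response` are hypothetical, and page ids are passed through as given because their JSON type is not specified here.

```python
import json
from typing import Optional

# Formats listed for POST /projects/current/export.
EXPORT_FORMATS = ("khr", "psd", "rendered", "inpainted")


def export_body(fmt: str, pages: Optional[list] = None) -> bytes:
    """Encode the `{ format, pages? }` request body as JSON bytes."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    body = {"format": fmt}
    if pages is not None:
        body["pages"] = list(pages)  # page ids are passed through untouched
    return json.dumps(body).encode("utf-8")


def is_zip_response(content_type: str) -> bool:
    """True when the Content-Type header marks a multi-file (zipped) export."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/zip"
```

Checking the `Content-Type` before writing the bytes to disk lets a client choose between saving a single file and unpacking an archive.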
### Pages
| Method | Path | Purpose |
| ------ | --------------------------------------- | ---------------------------------------------------- |
| `POST` | `/pages` | create pages from N uploaded image files (multipart) |
| `POST` | `/pages/from-paths` | Tauri-only fast path that imports by absolute path |
| `POST` | `/pages/{id}/image-layers` | add a Custom image node from an uploaded file |
| `PUT` | `/pages/{id}/masks/{role}` | upsert a mask node from raw PNG bytes |
| `GET` | `/pages/{id}/thumbnail` | get the page thumbnail (cached as WebP) |
`role` is `segment` or `brushInpaint`. `POST /pages` accepts an optional `replace=true` field; imported files are sorted by filename in natural order.
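Natural order means that numeric runs inside a filename compare as numbers, so `page10.png` lands after `page2.png`. The sketch below shows one common way to get that ordering; it is an illustration of the idea, not necessarily the exact comparison Koharu uses, and `natural_key` is a hypothetical name.

```python
import re


def natural_key(name: str) -> list:
    """Sort key that compares digit runs numerically and text case-insensitively."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd positions always hold the digit runs.
    return [int(part) if index % 2 else part.lower() for index, part in enumerate(parts)]


def natural_sorted(names: list) -> list:
    """Return filenames in natural order."""
    return sorted(names, key=natural_key)
```

Clients that need a specific page order can pre-sort their uploads the same way, or name files with zero-padded numbers so that every ordering agrees.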
### Scene and blobs
| Method | Path | Purpose |
| ------ | ------------------- | ------------------------------------------------------------- |
| `GET` | `/scene.json` | full scene snapshot for web/UI clients |
| `GET` | `/scene.bin` | postcard-encoded `Snapshot { epoch, scene }` for Tauri client |
| `GET` | `/blobs/{hash}` | raw blob bytes by Blake3 hash |
`/scene.bin` includes the current epoch in the `x-koharu-epoch` response header.
### History (mutations)
All scene mutations go through here. Each response returns `{ epoch }`.
| Method | Path | Purpose |
| ------ | ------------------- | ---------------------------------------- |
| `POST` | `/history/apply` | apply an `Op` (including `Op::Batch`) |
| `POST` | `/history/undo` | revert the last applied op |
| `POST` | `/history/redo` | re-apply the last undone op |
`Op` is the discriminated union that covers add/remove/update node, add/remove page, batch, and other scene transitions. The body is the JSON-tagged variant.
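The sketch below shows how a client might build the three history requests and read the returned epoch. The JSON shape of each `Op` variant is not documented in this section, so the `op` argument is treated as an opaque dictionary; `history_request` and `read_epoch` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Optional


def history_request(base: str, action: str, op: Optional[dict] = None) -> urllib.request.Request:
    """Build a POST for /history/apply, /history/undo, or /history/redo."""
    if action not in ("apply", "undo", "redo"):
        raise ValueError(f"unknown history action: {action!r}")
    if action == "apply" and op is None:
        raise ValueError("apply needs an Op payload")
    # The Op is serialized as-is; its variant tags are defined by the server.
    data = json.dumps(op).encode("utf-8") if op is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        f"{base}/history/{action}", data=data, headers=headers, method="POST"
    )


def read_epoch(response_body: bytes) -> int:
    """Extract the epoch from a `{ epoch }` response body."""
    return int(json.loads(response_body)["epoch"])
```

A client can compare the returned epoch with the one from its last scene snapshot to decide whether its local copy is stale.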
### Pipelines
| Method | Path | Purpose |
| ------ | ------------- | -------------------------------------- |
| `POST` | `/pipelines` | start a pipeline run as an operation |
Body fields:
- `steps`: engine ids to run in order (validated against the registry)
- `pages`: optional subset of `PageId`s; omit to process the whole project
- `region`: optional bounding box for the inpainter (repair-brush flow)
- `targetLanguage`, `systemPrompt`, `defaultFont`: optional per-run overrides
The response carries an `operationId`. Progress and completion arrive on `/events` as `JobStarted`, `JobProgress`, `JobWarning`, and `JobFinished`.
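To make the body fields concrete, here is a small sketch that assembles a `POST /pipelines` payload. `pipeline_body` is a hypothetical helper; the step ids in any call are placeholders (real ids come from `GET /engines`), and `region` is passed through unchanged because its exact shape is not specified in this section.

```python
# Per-run overrides named in the body field list.
OVERRIDE_FIELDS = {"targetLanguage", "systemPrompt", "defaultFont"}


def pipeline_body(steps, pages=None, region=None, **overrides) -> dict:
    """Assemble the JSON body for POST /pipelines, omitting unset fields."""
    if not steps:
        raise ValueError("at least one engine id is required in steps")
    unknown = set(overrides) - OVERRIDE_FIELDS
    if unknown:
        raise ValueError(f"unknown override fields: {sorted(unknown)}")
    body = {"steps": list(steps)}
    if pages is not None:
        body["pages"] = list(pages)  # omit entirely to process the whole project
    if region is not None:
        body["region"] = region  # opaque here; shape is defined by the server
    body.update(overrides)
    return body
```

Leaving optional keys out of the payload, rather than sending `null`, matches the wording above that `pages` is omitted to process the whole project.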
### Operations
`/operations` is the unified registry for in-flight and recently completed jobs (pipelines and downloads).
| Method | Path | Purpose |
| -------- | --------------------- | ---------------------------------------------------------- |
| `GET` | `/operations` | snapshot of every in-flight or recent operation |
| `DELETE` | `/operations/{id}` | cancel a pipeline run; best-effort eviction for downloads |
### Downloads
| Method | Path | Purpose |
| ------ | ------------------- | ------------------------------------------ |
| `GET` | `/downloads` | snapshot of every active or recent download |
| `POST` | `/downloads` | start a model-package download (`{ modelId }`) |
`modelId` is a package id declared via `declare_hf_model_package!` (e.g. `"model:comic-text-detector:yolo-v5"`). The response is `{ operationId }`, where the operation id reuses the package id.
### LLM control
The loaded model is a singleton resource at `/llm/current`.
| Method | Path | Purpose |
| -------- | ---------------- | --------------------------------------------- |
| `GET` | `/llm/current` | current state (status, target, error) |
| `PUT` | `/llm/current` | load the given target (local or provider) |
| `DELETE` | `/llm/current` | unload / release the model |
| `GET` | `/llm/catalog` | list available local + provider-backed models |
`PUT /llm/current` accepts an `LlmLoadRequest`:
- provider targets: `{ kind: "provider", providerId, modelId }`
- local targets: `{ kind: "local", modelId }`
- optional `options { temperature, maxTokens, customSystemPrompt }`
`PUT /llm/current` returns `204` once the load task is queued. The actual ready state is published as `LlmLoaded` on `/events`.
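The two target shapes and the optional `options` object can be sketched as a single builder. `llm_load_request` is a hypothetical helper, and the model ids used with it are placeholders; real ids come from `GET /llm/catalog`.

```python
from typing import Optional

# Option keys named in the LlmLoadRequest description.
LLM_OPTION_FIELDS = {"temperature", "maxTokens", "customSystemPrompt"}


def llm_load_request(model_id: str, provider_id: Optional[str] = None, **options) -> dict:
    """Build an LlmLoadRequest body for PUT /llm/current."""
    unknown = set(options) - LLM_OPTION_FIELDS
    if unknown:
        raise ValueError(f"unknown option fields: {sorted(unknown)}")
    if provider_id is None:
        body = {"kind": "local", "modelId": model_id}
    else:
        body = {"kind": "provider", "providerId": provider_id, "modelId": model_id}
    if options:
        body["options"] = options  # only sent when at least one option is set
    return body
```

Because the route answers `204` as soon as the load is queued, a caller should treat the request as accepted rather than finished and wait for `LlmLoaded` before starting translation.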
### Config
| Method | Path | Purpose |
| -------- | --------------------------------------- | ----------------------------------------------- |
| `GET` | `/config` | read the current `AppConfig` |
| `PATCH` | `/config` | apply a `ConfigPatch`; persists and broadcasts |
| `PUT` | `/config/providers/{id}/secret` | save (or overwrite) a provider's API key |
| `DELETE` | `/config/providers/{id}/secret` | clear a provider's stored API key |
`AppConfig` exposes top-level `data`, `http`, `pipeline`, and `providers`:
- `data.path`: local data directory used for runtime, model cache, and projects
- `http { connectTimeout, readTimeout, maxRetries }`: shared HTTP client used by downloads and provider-backed requests
- `pipeline { detector, fontDetector, segmenter, bubbleSegmenter, ocr, translator, inpainter, renderer }`: engine id selected for each stage
- `providers[] { id, baseUrl?, apiKey? }`: saved API keys round-trip as the redacted placeholder `"[REDACTED]"`, never as the raw secret
Built-in provider ids:
- `openai`
- `gemini`
- `claude`
- `deepseek`
- `deepl`
- `google-translate`
- `caiyun`
- `openai-compatible`
API keys are stored in the platform credential store, not in `config.toml`. PATCHing `apiKey: ""` clears the saved key; PATCHing `"[REDACTED]"` leaves it unchanged. The dedicated `/config/providers/{id}/secret` routes are the explicit, non-PATCH way to manage one provider's secret.
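The `apiKey` rules are easy to get wrong when a client echoes a fetched config back in a PATCH, so the sketch below models them as a pure function. It is a client-side illustration, not server code: `api_key_after_patch` is a hypothetical name, and treating an omitted field as "unchanged" and any other string as "save this key" are assumptions beyond the two cases stated above.

```python
from typing import Optional

REDACTED = "[REDACTED]"


def api_key_after_patch(stored: Optional[str], patched: Optional[str]) -> Optional[str]:
    """Return the key that would be stored after PATCHing `apiKey`.

    stored  -- key currently in the credential store, or None
    patched -- value sent in the patch, or None when the field is omitted
    """
    if patched is None or patched == REDACTED:
        return stored  # placeholder echoed back (or field omitted): unchanged
    if patched == "":
        return None  # empty string clears the saved key
    return patched  # assumed: any other value replaces the key
```

In practice this means a client can send back the `providers` entry it received without wiping secrets, as long as it leaves the placeholder untouched.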
## Events stream
Koharu exposes a Server-Sent Events stream at:
```text
GET /events
```
Behavior:
- a fresh connection (no `Last-Event-ID` header) starts with a `Snapshot` event holding the current jobs and downloads registries
- on reconnect, the server replays buffered events with `seq > Last-Event-ID` in order; if the requested id has scrolled out of the ring, the server re-sends a `Snapshot`
- each live event is emitted with its `seq` as the SSE `id:` field
- the server sends a keep-alive every 15 seconds so idle connections stay open
Event variants currently include:
- `Snapshot`: full state seed for fresh and lag-recovery clients
- `JobStarted`, `JobProgress`, `JobWarning`, `JobFinished`: pipeline job lifecycle
- `DownloadProgress`: package download progress ticks
- `ConfigChanged`: config was applied via `PATCH /config` or a secret route
- `LlmLoaded`, `LlmUnloaded`: LLM lifecycle transitions
- `SceneAdvanced`: emitted when a scene mutation advances the epoch
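The reconnect behavior relies on the client remembering the last `id:` it saw and sending it back as `Last-Event-ID`. Below is a minimal sketch of the client side: a line-based SSE parser plus a header builder. `parse_sse` and `resume_headers` are hypothetical names, the `data:` payload is left as raw text because the event JSON layout is not described here, and the assumption that keep-alives arrive as SSE comment lines (starting with `:`) is not confirmed by this page.

```python
from typing import Iterable, Iterator, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_id, data) pairs from decoded SSE lines."""
    event_id: Optional[str] = None  # the last id persists across events
    data: list = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # a blank line dispatches the buffered event
                yield event_id, "\n".join(data)
            data = []
            continue
        if line.startswith(":"):
            continue  # comment line, assumed to be how keep-alives arrive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "id":
            event_id = value
        elif field == "data":
            data.append(value)


def resume_headers(last_event_id: Optional[str]) -> dict:
    """Headers for GET /events; include Last-Event-ID only when resuming."""
    headers = {"Accept": "text/event-stream"}
    if last_event_id is not None:
        headers["Last-Event-ID"] = last_event_id
    return headers
```

A client that stores the most recent `event_id` from `parse_sse` and passes it to `resume_headers` on reconnect receives either the missed events or a fresh `Snapshot`, so it should handle both.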
## Typical workflow
A typical call sequence for a new project:
1. `POST /projects`: create the project
2. `POST /pages` (or `/pages/from-paths` from Tauri): import images
3. `PUT /llm/current`: load a translation model (local or provider)
4. `POST /pipelines`: kick off `detect → ocr → translate → inpaint → render`
5. tail `GET /events` until `JobFinished`
6. `POST /projects/current/export` with `format = "rendered"` or `"psd"`
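The sequence above can be sketched as one function that drives injected callables, which keeps the example independent of any particular HTTP library. Everything named here is hypothetical: `send(method, path, body)` stands for a transport that returns the parsed response, `upload_pages()` performs the multipart `POST /pages`, and `wait_for_job(operation_id)` blocks until a matching `JobFinished` is seen on `/events` (how events are matched to an operation is an assumption).

```python
def run_translation(send, upload_pages, wait_for_job, name, llm_target, steps,
                    export_format="rendered"):
    """Run the documented call order and return the export response."""
    send("POST", "/projects", {"name": name})           # 1. create the project
    upload_pages()                                      # 2. import images
    send("PUT", "/llm/current", llm_target)             # 3. queue the model load
    run = send("POST", "/pipelines", {"steps": steps})  # 4. start the pipeline
    wait_for_job(run["operationId"])                    # 5. wait for JobFinished
    return send("POST", "/projects/current/export",     # 6. export the result
                {"format": export_format})
```

Injecting the transport also makes the flow easy to exercise against a fake server in tests before pointing it at a running instance.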
For finer control, send explicit `Op` payloads to `POST /history/apply` instead of running a full pipeline.
If you want agent-oriented access instead of HTTP endpoint orchestration, see [MCP Tools Reference](mcp-tools.md).