Commit 4e0f10e by Filip Makraduli
Parent(s): 0df0841
Switch to transformers5 SIE image; LightOnOCR as default recognition
- Base image: latest-cpu-default → latest-cpu-transformers5 (where the
LightOnOCR adapter actually lives; fixes the ImportError on Space boot)
- Recognition default: Florence-2-base-ft → LightOnOCR-2-1B (Florence-2
isn't loadable on transformers5 due to tokenizer API change; tracked in
sie-internal#828)
- Drop Florence-2 entries from the dropdown so users don't click into an
error
- Expand alternates: 3 Donut variants, 4 NER models (GLiNER multi/large/
PII + NuNER Zero)
- Swap sample images: replace dense table/handwritten/multi-column with
event-poster/slide/letter to better suit the new recognition model
- Server: /api/health now reports cuda availability; UI auto-disables
gpuRequired models on the CPU image
- Dockerfile +1 -1
- README.md +23 -16
- data/samples/event-poster.png +0 -0
- data/samples/handwritten.png +0 -0
- data/samples/index.json +29 -26
- data/samples/letter.png +0 -0
- data/samples/multi-column.png +0 -0
- data/samples/slide.png +0 -0
- data/samples/table.png +0 -0
- hf-entrypoint.sh +1 -1
- src/config.ts +17 -14
- web/public/app.js +9 -4
- web/public/index.html +1 -1
- web/server.ts +13 -6
Dockerfile
CHANGED
@@ -14,7 +14,7 @@
 FROM node:22-bookworm-slim AS node
 
 # --- stage 2: final image ---
-FROM ghcr.io/superlinked/sie-server:latest-cpu-default
+FROM ghcr.io/superlinked/sie-server:latest-cpu-transformers5
 
 USER root
README.md
CHANGED
@@ -20,25 +20,25 @@ SIE hot-swap them with one identifier change.
 
 A single Docker container with two processes:
 
-- `sie-server` (the SIE inference engine) on `127.0.0.1:8080`,
-
+- `sie-server` (the SIE inference engine) on `127.0.0.1:8080`, no preload
+  (lazy-loads models on first click to fit free-tier memory).
 - A small Node web server on `0.0.0.0:7860` that serves the UI and
   proxies requests to SIE via SSE.
 
-Both are baked into one image extending `ghcr.io/superlinked/sie-server:latest-cpu-default`.
+Both are baked into one image extending `ghcr.io/superlinked/sie-server:latest-cpu-transformers5`.
 HF Spaces' persistent `/data` directory is used as the HuggingFace cache so
 model weights survive Space restarts.
 
 ## Model lineup
 
-| Stage | Default
+| Stage | Default | Alternates (lazy-load on click) |
 |---|---|---|
-| Recognition | `
-| Structured | `naver-clova-ix/donut-base-finetuned-cord-v2` | Donut-DocVQA |
-| NER | `urchade/gliner_multi-v2.1` | GLiNER-large |
+| Recognition | `lightonai/LightOnOCR-2-1B` (2.1B, Markdown output) | PaddleOCR-VL, GLM-OCR (GPU-only; disabled on the CPU image) |
+| Structured | `naver-clova-ix/donut-base-finetuned-cord-v2` | Donut-DocVQA, Donut-RVLCDIP |
+| NER | `urchade/gliner_multi-v2.1` | GLiNER-large, GLiNER-PII, NuNER-Zero |
 
-The default trio is ~
-
+The default trio is ~5 GB total (LightOnOCR is the big one at ~4 GB).
+Alternates lazy-load on first click.
 
 ## What SIE provides here
 
@@ -48,9 +48,9 @@ Three different model architectures, one API:
 client.extract(model_id, { images: [bytes] })
 ```
 
-The model ID alone decides whether you get VLM Markdown (
-
-
+The model ID alone decides whether you get VLM Markdown (LightOnOCR),
+structured JSON (Donut), or typed entities (GLiNER / NuNER). No separate
+auth, no separate rate limit, no separate deployment story.
 
 ## Source
 
@@ -61,9 +61,16 @@ SIE image; this Space packages everything into one container for HF.
 
 ## Performance note
 
-This Space runs on HF's free CPU tier (2 vCPU, 16 GB RAM).
-
-
-
+This Space runs on HF's free CPU tier (2 vCPU, 16 GB RAM). The first click
+for each model is a cold load (60-180 s) while weights download and the
+adapter spins up. Subsequent clicks reuse the cached weights and run in
+20-30 s. On a GPU Space (paid), recognition drops to a few seconds and the
+heavier models like GLM-OCR become tractable.
+
+The SIE image this Space runs on is `latest-cpu-transformers5`, where the
+LightOnOCR adapter lives. Florence-2 ships in the sibling `default`
+bundle (which pins `transformers<5`) and is not available on this image;
+see [sie-internal#828](https://github.com/superlinked/sie-internal/issues/828)
+for the bundle-composition story.
 
 Built on [SIE](https://github.com/superlinked/sie) (Apache 2.0).
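The README's one-call contract can be sketched with a small stub. This is an illustrative assumption about routing, not the real SIE client: the helper name and return kinds below are mine, chosen only to show that the model ID alone selects the output shape.

```javascript
// Illustrative stub (NOT the SIE client): map a model ID from the lineup
// table to the output shape the README describes for that stage.
function outputKindFor(modelId) {
  if (modelId.includes("LightOnOCR")) return "markdown"; // VLM-OCR stage
  if (modelId.includes("donut")) return "json";          // structured stage
  return "entities";                                     // GLiNER / NuNER stage
}

const defaults = [
  "lightonai/LightOnOCR-2-1B",
  "naver-clova-ix/donut-base-finetuned-cord-v2",
  "urchade/gliner_multi-v2.1",
];

for (const id of defaults) {
  console.log(id, "→", outputKindFor(id));
}
```

The point of the sketch is the dispatch itself: one `extract(model_id, ...)` entry point, three output families, no per-model client code.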
data/samples/event-poster.png
ADDED

data/samples/handwritten.png
DELETED
Binary file (21.9 kB)
data/samples/index.json
CHANGED
@@ -3,7 +3,7 @@
     "id": "receipt",
     "filename": "receipt.png",
     "label": "Grocery receipt",
-    "description": "Printed receipt with line items, subtotal, tax, total.
+    "description": "Printed receipt with line items, subtotal, tax, total. Donut on CORD reads this end-to-end.",
     "labels": [
       "merchant",
       "date",
@@ -18,7 +18,7 @@
     "id": "invoice",
     "filename": "invoice.png",
     "label": "Vendor invoice",
-    "description": "Multi-column invoice with billing party, line items, subtotal, tax, total.
+    "description": "Multi-column invoice with billing party, line items, subtotal, tax, total.",
     "labels": [
       "vendor",
       "invoice_number",
@@ -33,7 +33,7 @@
     "id": "business-card",
     "filename": "business-card.png",
     "label": "Business card",
-    "description": "Tight layout, mixed text sizes, multiple contact fields.",
+    "description": "Tight layout, mixed text sizes, multiple contact fields. Good NER showcase.",
     "labels": [
       "company",
       "person",
@@ -45,40 +45,43 @@
     ]
   },
   {
-    "id": "
-    "filename": "
-    "label": "
-    "description": "
+    "id": "event-poster",
+    "filename": "event-poster.png",
+    "label": "Event poster",
+    "description": "Large-text poster with title, date, artists, ticket info. Florence-2 OCR's home turf.",
     "labels": [
-      "
-      "
-      "
-      "
+      "event",
+      "date",
+      "venue",
+      "artist",
+      "price",
+      "organization"
     ]
   },
   {
-    "id": "
-    "filename": "
-    "label": "
-    "description": "
+    "id": "slide",
+    "filename": "slide.png",
+    "label": "Presentation slide",
+    "description": "Roadmap slide with title and three numbered items. Clean printed text on a single background.",
     "labels": [
-      "
+      "initiative",
       "person",
-      "
-      "
+      "date",
+      "quarter"
     ]
   },
   {
-    "id": "
-    "filename": "
-    "label": "
-    "description": "
+    "id": "letter",
+    "filename": "letter.png",
+    "label": "Business letter",
+    "description": "Short printed business letter with sender, date, recipient, body, and signature.",
     "labels": [
-      "
+      "company",
       "person",
-      "
-      "
-      "
+      "address",
+      "date",
+      "amount",
+      "phone"
     ]
   }
 ]
data/samples/letter.png
ADDED

data/samples/multi-column.png
DELETED
Binary file (43.5 kB)

data/samples/slide.png
ADDED

data/samples/table.png
DELETED
Binary file (28.1 kB)
hf-entrypoint.sh
CHANGED
@@ -10,7 +10,7 @@ set -euo pipefail
 #
 # Override via the PRELOAD env var in the Space's Settings if you upgrade to
 # CPU-Upgrade (32 GB) or a GPU tier:
-#   PRELOAD="microsoft/Florence-2-base,naver-clova-ix/donut-base-finetuned-cord-v2,urchade/gliner_multi-v2.1"
+#   PRELOAD="microsoft/Florence-2-base-ft,naver-clova-ix/donut-base-finetuned-cord-v2,urchade/gliner_multi-v2.1"
 PRELOAD="${PRELOAD:-}"
 
 SIE_ARGS=(serve --host 127.0.0.1 --port 8080)
src/config.ts
CHANGED
@@ -13,29 +13,17 @@ export const RECOGNITION_MODELS: ModelOption[] = [
     label: "LightOnOCR-2-1B (default)",
     description: "Pixtral encoder + Qwen3 decoder, 2.1B. Strong Markdown output across dense layouts. ~4 GB to download on first call.",
   },
-  {
-    id: "microsoft/Florence-2-base",
-    label: "Florence-2-base (small, fast)",
-    description: "Microsoft DaViT + decoder, 270M. Fast on CPU but terse on dense layouts; better on multi-column text.",
-    options: { task: "<OCR>" },
-  },
-  {
-    id: "microsoft/Florence-2-large",
-    label: "Florence-2-large",
-    description: "Larger Florence-2 variant, 770M. Better than Florence-2-base but still leans terse on receipts.",
-    options: { task: "<OCR>" },
-  },
   {
     id: "PaddlePaddle/PaddleOCR-VL-1.5",
     label: "PaddleOCR-VL-1.5 (GPU image)",
-    description: "Paddle's VLM-OCR, 1.5B. Six task modes. Available on the CUDA image.",
+    description: "Paddle's VLM-OCR, 1.5B. Six task modes. Available on the CUDA image (compose.gpu.yml).",
     options: { task: "ocr" },
     gpuRequired: true,
   },
   {
     id: "zai-org/GLM-OCR",
     label: "GLM-OCR (GPU only)",
-    description: "CogViT + GLM-0.5B decoder, 9B in bfloat16. Premium quality, needs ~18 GB VRAM.",
+    description: "CogViT + GLM-0.5B decoder, 9B in bfloat16. Premium quality, needs ~18 GB VRAM (compose.gpu.yml).",
     gpuRequired: true,
   },
 ];
@@ -51,6 +39,11 @@ export const STRUCTURED_MODELS: ModelOption[] = [
     label: "Donut on DocVQA",
     description: "Same Donut architecture, fine-tuned for visual question answering. Returns text answers.",
   },
+  {
+    id: "naver-clova-ix/donut-base-finetuned-rvlcdip",
+    label: "Donut on RVL-CDIP (doc classification)",
+    description: "Same Donut architecture, fine-tuned for document-type classification across 16 classes (invoice, receipt, form, ...).",
+  },
 ];
 
 export const NER_MODELS: ModelOption[] = [
@@ -64,6 +57,16 @@ export const NER_MODELS: ModelOption[] = [
     label: "GLiNER large (English)",
     description: "440M, English-focused, higher quality on English text.",
   },
+  {
+    id: "urchade/gliner_multi_pii-v1",
+    label: "GLiNER multi PII",
+    description: "GLiNER fine-tuned for PII extraction. Good for redaction-style pipelines on documents.",
+  },
+  {
+    id: "numind/NuNER_Zero",
+    label: "NuNER Zero",
+    description: "NuMind's zero-shot NER. Different architecture from GLiNER; useful for comparing zero-shot NER families on the same input text.",
+  },
 ];
 
 export const config = {
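The catalog arrays in src/config.ts pair naturally with the PRELOAD override in hf-entrypoint.sh. A minimal sketch of deriving a PRELOAD string from them, skipping GPU-only entries on the CPU image; the array shapes are assumed from the diff and the `preloadString` helper is mine, not part of the repo:

```javascript
// Sketch: build a PRELOAD value from the model catalogs, dropping entries
// that need a GPU when CUDA is unavailable. Shapes assumed from src/config.ts.
const RECOGNITION_MODELS = [
  { id: "lightonai/LightOnOCR-2-1B" },
  { id: "PaddlePaddle/PaddleOCR-VL-1.5", gpuRequired: true },
  { id: "zai-org/GLM-OCR", gpuRequired: true },
];
const STRUCTURED_MODELS = [{ id: "naver-clova-ix/donut-base-finetuned-cord-v2" }];
const NER_MODELS = [{ id: "urchade/gliner_multi-v2.1" }];

function preloadString(cudaAvailable) {
  return [...RECOGNITION_MODELS, ...STRUCTURED_MODELS, ...NER_MODELS]
    .filter((m) => !m.gpuRequired || cudaAvailable)
    .map((m) => m.id)
    .join(",");
}

console.log(preloadString(false));
// lightonai/LightOnOCR-2-1B,naver-clova-ix/donut-base-finetuned-cord-v2,urchade/gliner_multi-v2.1
```

On a GPU tier the same helper would also emit the PaddleOCR-VL and GLM-OCR IDs.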
web/public/app.js
CHANGED
@@ -25,6 +25,7 @@ let donutBuf = { entities: [], data: null };
 let glinerBuf = [];
 let modelConfig = null;
 let registeredSet = new Set();
+let cudaAvailable = false;
 
 function setBadge(text, cls) {
   els.badge.textContent = text;
@@ -98,12 +99,15 @@ function populateDropdown(selectEl, options, defaultId) {
   for (const opt of options) {
     const node = document.createElement("option");
     node.value = opt.id;
-    const
-
+    const inCatalog = registeredSet.size === 0 || registeredSet.has(opt.id);
+    const blockedByCuda = opt.gpuRequired && !cudaAvailable;
+    const available = inCatalog && !blockedByCuda;
     const labelSuffix = !available
-      ?
+      ? blockedByCuda
         ? " (GPU image needed)"
-      :
+        : opt.gpuRequired
+          ? " (GPU image needed)"
+          : " (not registered)"
       : "";
     node.textContent = opt.label + labelSuffix;
     if (!available) node.disabled = true;
@@ -264,6 +268,7 @@ async function init() {
   } else {
     els.sieState.textContent = `SIE healthy · ${j.registeredModels} models registered`;
     registered = j.registered ?? [];
+    cudaAvailable = !!j.cuda;
   }
 } catch {
   els.sieState.textContent = "could not reach the local server";
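The dropdown gating in populateDropdown reduces to a pure decision on the option, the registered catalog, and the CUDA flag. A small sketch of the same rule; the `optionState` helper name is illustrative and not part of the app:

```javascript
// Mirrors the rule from the app.js diff: a model is selectable when it is in
// the registered catalog (or the catalog is empty/unknown) and, if it is
// GPU-only, the server reported CUDA. Helper name is illustrative.
function optionState(opt, registeredSet, cudaAvailable) {
  const inCatalog = registeredSet.size === 0 || registeredSet.has(opt.id);
  const blockedByCuda = Boolean(opt.gpuRequired) && !cudaAvailable;
  const available = inCatalog && !blockedByCuda;
  const labelSuffix = !available
    ? blockedByCuda || opt.gpuRequired
      ? " (GPU image needed)"
      : " (not registered)"
    : "";
  return { available, labelSuffix };
}

const reg = new Set(["lightonai/LightOnOCR-2-1B"]);
console.log(optionState({ id: "zai-org/GLM-OCR", gpuRequired: true }, reg, false));
// → not available, suffix " (GPU image needed)"
```

Keeping the rule pure like this makes it easy to unit-test the three states (available, GPU-blocked, not registered) without touching the DOM.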
web/public/index.html
CHANGED
@@ -46,7 +46,7 @@
   <span>↓</span><span>↓</span><span>↓</span>
 </div>
 <div class="diagram-models">
-  <div class="diagram-box diagram-recognition">VLM-OCR<br><span>(
+  <div class="diagram-box diagram-recognition">VLM-OCR<br><span>(LightOnOCR-2-1B, PaddleOCR-VL, GLM-OCR)</span></div>
   <div class="diagram-box diagram-structured">Donut<br><span>(end-to-end JSON)</span></div>
   <div class="diagram-box diagram-ner">GLiNER<br><span>(zero-shot NER)</span></div>
 </div>
web/server.ts
CHANGED
@@ -49,14 +49,20 @@ function setupSse(res: http.ServerResponse) {
   };
 }
 
-async function fetchModels(): Promise<{ ok: boolean; names: string[] }> {
+async function fetchModels(): Promise<{ ok: boolean; names: string[]; cuda: boolean }> {
   try {
     const r = await fetch(`${config.sieUrl}/v1/models`, { signal: AbortSignal.timeout(3000) });
-    if (!r.ok) return { ok: false, names: [] };
-    const json = (await r.json()) as {
-
+    if (!r.ok) return { ok: false, names: [], cuda: false };
+    const json = (await r.json()) as {
+      models?: { name: string; device?: string; state?: string }[];
+    };
+    const models = json.models ?? [];
+    // GPU compose preloads GPU-only models. If any catalog entry is currently
+    // loaded on a non-cpu device, treat this server as GPU-capable.
+    const cuda = models.some((m) => (m.device ?? "").toLowerCase().includes("cuda"));
+    return { ok: true, names: models.map((m) => m.name), cuda };
   } catch {
-    return { ok: false, names: [] };
+    return { ok: false, names: [], cuda: false };
   }
 }
 
@@ -129,7 +135,7 @@ const server = http.createServer(async (req, res) => {
   if (p.startsWith("/samples/")) return serveFile(res, path.join(SAMPLE_DIR, p.slice("/samples/".length)));
 
   if (p === "/api/health") {
-    const { ok, names } = await fetchModels();
+    const { ok, names, cuda } = await fetchModels();
     return send(
       res,
       200,
@@ -138,6 +144,7 @@ const server = http.createServer(async (req, res) => {
         sieUrl: config.sieUrl,
         registeredModels: names.length,
         registered: names,
+        cuda,
       }),
       "application/json",
     );
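The cuda heuristic in fetchModels can be exercised against a canned /v1/models payload. A sketch; the payload shape here is assumed from the type cast in the diff, and `detectCuda` is an extracted helper, not a function in the repo:

```javascript
// Same heuristic as the web/server.ts diff: report CUDA capability when any
// registered model is currently loaded on a cuda device. Payload shape is
// assumed from the `models?: { name; device?; state? }[]` cast in the diff.
function detectCuda(payload) {
  const models = payload.models ?? [];
  return models.some((m) => (m.device ?? "").toLowerCase().includes("cuda"));
}

const cpuPayload = { models: [{ name: "lightonai/LightOnOCR-2-1B", device: "cpu" }] };
const gpuPayload = { models: [{ name: "zai-org/GLM-OCR", device: "cuda:0" }] };

console.log(detectCuda(cpuPayload)); // false
console.log(detectCuda(gpuPayload)); // true
```

Note the heuristic is indirect: it infers GPU capability from where models happen to be loaded, so on a GPU server with nothing loaded yet it would report false until the first GPU model comes up.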
|