esandorfi committed on
Commit
68f48a7
·
1 Parent(s): ea9122e

Domain features first reorganisation

Files changed (37)
  1. AGENTS.md +6 -1
  2. Dockerfile +1 -1
  3. Makefile +1 -1
  4. README.md +36 -20
  5. STORY.md +1 -1
  6. app.py +7 -1
  7. src/api/{main.py → app.py} +43 -181
  8. src/api/app_factory.py +1 -1
  9. src/api/classify/__init__.py +1 -0
  10. src/api/{banks.py → classify/banks.py} +1 -3
  11. src/api/{results.py → classify/results.py} +0 -0
  12. src/api/classify/router.py +64 -0
  13. src/api/classify/schemas.py +29 -0
  14. src/api/{clip_service.py → classify/service.py} +3 -3
  15. src/api/common/__init__.py +1 -0
  16. src/api/{deps.py → common/deps.py} +16 -2
  17. src/api/{image_io.py → common/image_io.py} +0 -0
  18. src/api/{logging_utils.py → common/logging.py} +0 -0
  19. src/api/{middleware.py → common/middleware.py} +0 -0
  20. src/api/{settings.py → common/settings.py} +0 -0
  21. src/api/label_sets/__init__.py +1 -0
  22. src/api/{label_hash.py → label_sets/hash.py} +0 -0
  23. src/api/{registry.py → label_sets/registry.py} +1 -1
  24. src/api/label_sets/router.py +81 -0
  25. src/api/{schemas.py → label_sets/schemas.py} +2 -27
  26. src/api/model/__init__.py +1 -0
  27. src/api/{clip_store.py → model/clip_store.py} +4 -4
  28. src/api/ui/__init__.py +1 -0
  29. src/api/{page-banner.html → ui/page-banner.html} +0 -0
  30. src/api/{page.html → ui/page.html} +0 -0
  31. src/api/{splash.html → ui/splash.html} +3 -2
  32. tests/__pycache__/conftest.cpython-312-pytest-8.3.2.pyc +0 -0
  33. tests/__pycache__/fakes.cpython-312.pyc +0 -0
  34. tests/__pycache__/test_integration_real_clip.cpython-312-pytest-8.3.2.pyc +0 -0
  35. tests/conftest.py +1 -1
  36. tests/fakes.py +4 -4
  37. tests/test_integration_real_clip.py +3 -3
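Note: most of the entries above are renames, so any code outside this commit that still imports the old module paths needs the same rewrite. A minimal sketch of that rewrite follows, assuming the old-to-new mapping read off the file list. `RENAMES` and `rewrite_import` are hypothetical names used for illustration and are not part of the repository. `api.schemas` is mapped to `api.label_sets.schemas` here, although the classify models moved to `api.classify.schemas`, so imports of `ClassifyRequest`, `ClassifyResponse`, or `Hit` would still need a manual fix.

```python
import re

# Old module path -> new module path, taken from the rename list in this commit.
RENAMES = {
    "api.main": "api.app",
    "api.banks": "api.classify.banks",
    "api.results": "api.classify.results",
    "api.clip_service": "api.classify.service",
    "api.deps": "api.common.deps",
    "api.image_io": "api.common.image_io",
    "api.logging_utils": "api.common.logging",
    "api.middleware": "api.common.middleware",
    "api.settings": "api.common.settings",
    "api.label_hash": "api.label_sets.hash",
    "api.registry": "api.label_sets.registry",
    "api.schemas": "api.label_sets.schemas",  # classify models need a manual fix
    "api.clip_store": "api.model.clip_store",
}

# Matches the first two dotted components of `from api.x ...` or `import api.x ...`.
_IMPORT = re.compile(r"^(from|import)\s+(api\.\w+)\b")


def rewrite_import(line: str) -> str:
    """Rewrite one import line if it names a module that was moved."""
    match = _IMPORT.match(line)
    if match and match.group(2) in RENAMES:
        new_name = RENAMES[match.group(2)]
        return line[: match.start(2)] + new_name + line[match.end(2):]
    return line


print(rewrite_import("from api.clip_store import ClipStore"))
```

Lines that already use a new path, such as `from api.classify.banks import LabelSetBank`, pass through unchanged because `api.classify` is not a key in the mapping.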
AGENTS.md CHANGED
@@ -5,7 +5,12 @@
5
  - Keep responsibilities separated by layer: API, use-case, model, data.
6
  - Prefer small, typed functions with explicit inputs/outputs.
7
  - Keep side effects at the edges (IO, network, logging).
8
- - Use stable naming and paths (`src/app`, `tests`, `scripts`, `label-dataset`).
9
 
10
  ## FastAPI (proven patterns)
11
 
 
5
  - Keep responsibilities separated by layer: API, use-case, model, data.
6
  - Prefer small, typed functions with explicit inputs/outputs.
7
  - Keep side effects at the edges (IO, network, logging).
8
+ - Use stable naming and paths (`src/api`, `src/eval`, `tests`, `label-dataset`).
9
+
10
+ ## Structure
11
+
12
+ - Use domain-first folders under `src/api` (`label_sets/`, `classify/`, `model/`, `common/`, `ui/`).
13
+ - Keep schemas with their domain; only shared schemas go in `common/`.
14
 
15
  ## FastAPI (proven patterns)
16
 
Dockerfile CHANGED
@@ -25,4 +25,4 @@ RUN uv sync --no-dev
25
 
26
  ENV PATH="$HOME/app/.venv/bin:$PATH"
27
 
28
- CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "7860", "--workers", "1"]
 
25
 
26
  ENV PATH="$HOME/app/.venv/bin:$PATH"
27
 
28
+ CMD ["uvicorn", "api.app:app", "--host", "0.0.0.0", "--port", "7860", "--workers", "1"]
Makefile CHANGED
@@ -39,7 +39,7 @@ local-install:
39
  uv sync --extra dev --python 3.12
40
 
41
  local-run:
42
- uv run uvicorn api.main:app --host 0.0.0.0 --port 7860 --reload
43
 
44
  local-test:
45
  uv run pytest -q
 
39
  uv sync --extra dev --python 3.12
40
 
41
  local-run:
42
+ uv run uvicorn api.app:app --host 0.0.0.0 --port 7860 --reload
43
 
44
  local-test:
45
  uv run pytest -q
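Note on the entry-point rename: both the Dockerfile and the Makefile now pass `api.app:app` to uvicorn instead of `api.main:app`. The string follows the `module:attribute` convention. The sketch below shows one way such a string can be resolved with the standard library only. `resolve_target` is a hypothetical helper for illustration, not uvicorn's actual loader, and `json:dumps` stands in for `api.app:app` so the snippet runs outside this repository.

```python
import importlib


def resolve_target(target: str):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, sep, attr_name = target.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {target!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# 'json:dumps' is a stand-in for 'api.app:app'.
dumps = resolve_target("json:dumps")
print(dumps({"ok": True}))
```

A stale `api.main:app` string would fail at the import step under this scheme, which is why the Dockerfile, the Makefile, and the README quickstart all change together in this commit.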
README.md CHANGED
@@ -43,7 +43,7 @@ Use these as starting points or create your own taxonomy.
43
 
44
  ## API quickstart
45
 
46
- 1) Start the service (Docker or `uvicorn app:app`).
47
  2) Upload a label set.
48
  3) Optionally activate a label set.
49
  4) Classify images.
@@ -86,11 +86,13 @@ Guard policy example:
86
 
87
  ## Architecture
88
 
89
- - API layer: FastAPI endpoints and request/response schemas (`src/api/main.py`, `src/api/schemas.py`).
90
- - Use-case layer: two-stage classification (domain -> labels) (`src/api/clip_service.py`).
91
- - Model layer: CLIP model + processor + embedding banks (`src/api/clip_store.py`, `src/api/banks.py`).
92
- - Runtime support: registry, settings, logging, middleware (`src/api/registry.py`, `src/api/settings.py`,
93
- `src/api/logging_utils.py`, `src/api/middleware.py`).
 
 
94
 
95
  ## Coding rules (deeper)
96
 
@@ -223,25 +225,40 @@ uv run photo-eval prep --normalize-only --in-dir /path/to/images --out data_eval
223
 
224
  ## Project layout
225
 
 
 
226
  ```
227
  .
228
  ├── Dockerfile
 
229
  ├── requirements.txt
230
  └── src
231
  ├── api
232
- │ ├── banks.py
233
- │ ├── clip_service.py
234
- │ ├── clip_store.py
235
- │ ├── deps.py
236
- │ ├── image_io.py
237
- │ ├── label_hash.py
238
- │ ├── logging_utils.py
239
- │ ├── main.py
240
- │ ├── middleware.py
241
- │ ├── registry.py
242
- │ ├── results.py
243
- │ ├── schemas.py
244
- │ └── settings.py
245
  └── eval
246
  ├── README.md
247
  ├── cli.py
@@ -256,4 +273,3 @@ uv run photo-eval prep --normalize-only --in-dir /path/to/images --out data_eval
256
  Emmanuel Sandorfi / Knowledge at Lighton
257
 
258
  01.2026
259
-
 
43
 
44
  ## API quickstart
45
 
46
+ 1) Start the service (Docker or `uvicorn api.app:app`).
47
  2) Upload a label set.
48
  3) Optionally activate a label set.
49
  4) Classify images.
 
86
 
87
  ## Architecture
88
 
89
+ Domain-first layout (URL-centric):
90
+
91
+ - `label_sets/`: label set API + schemas + registry + hash.
92
+ - `classify/`: classify API + schemas + two-stage classifier + results + banks.
93
+ - `model/`: CLIP store and embedding encoding.
94
+ - `common/`: settings, logging, deps, image IO, middleware.
95
+ - `ui/`: splash + page templates.
96
 
97
  ## Coding rules (deeper)
98
 
 
225
 
226
  ## Project layout
227
 
228
+ Note: root `app.py` is a lightweight HF Spaces placeholder that imports `api.app:app`.
229
+
230
  ```
231
  .
232
  ├── Dockerfile
233
+ ├── app.py
234
  ├── requirements.txt
235
  └── src
236
  ├── api
237
+ │ ├── app.py
238
+ │ ├── app_factory.py
239
+ │ ├── common
240
+ │ │ ├── deps.py
241
+ │ │ ├── image_io.py
242
+ │ │ ├── logging.py
243
+ │ │ ├── middleware.py
244
+ │ │ └── settings.py
245
+ │ ├── classify
246
+ │ │ ├── banks.py
247
+ │ │ ├── results.py
248
+ │ │ ├── router.py
249
+ │ │ ├── schemas.py
250
+ │ │ └── service.py
251
+ │ ├── label_sets
252
+ │ │ ├── hash.py
253
+ │ │ ├── registry.py
254
+ │ │ ├── router.py
255
+ │ │ └── schemas.py
256
+ │ ├── model
257
+ │ │ └── clip_store.py
258
+ │ └── ui
259
+ │ ├── page-banner.html
260
+ │ ├── page.html
261
+ │ └── splash.html
262
  └── eval
263
  ├── README.md
264
  ├── cli.py
 
273
  Emmanuel Sandorfi / Knowledge at Lighton
274
 
275
  01.2026
 
STORY.md CHANGED
@@ -74,4 +74,4 @@ Label sets aren’t just files; they’re the vocabulary of the system. Changing
74
 
75
  ### Deployment without drama
76
 
77
- HF Spaces drove a pragmatic discipline: CPU‑first defaults, deterministic builds, and a root `app.py` entrypoint to keep deployment friction low. The end result is a system that boots cleanly and behaves predictably in constrained environments.
 
74
 
75
  ### Deployment without drama
76
 
77
+ HF Spaces drove a pragmatic discipline: CPU‑first defaults, deterministic builds, and a root `app.py` shim (required by Spaces) that forwards to `api.app:app`. The end result is a system that boots cleanly and behaves predictably in constrained environments.
app.py CHANGED
@@ -1,3 +1,9 @@
1
- from api.main import app
2
 
3
  __all__ = ["app"]
 
1
+ """HF Spaces placeholder entrypoint.
2
+
3
+ HF Docker Spaces require an `app.py` file to exist. This shim is intentionally
4
+ tiny and forwards the actual ASGI app from `src/api/app.py`.
5
+ """
6
+
7
+ from api.app import app
8
 
9
  __all__ = ["app"]
src/api/{main.py → app.py} RENAMED
@@ -2,31 +2,20 @@ from __future__ import annotations
2
 
3
  from contextlib import asynccontextmanager
4
  from dataclasses import dataclass
5
- from typing import Optional
6
 
7
- from fastapi import Depends, FastAPI, HTTPException, Query, Request, Response
8
  from fastapi.responses import HTMLResponse, JSONResponse
9
- from pathlib import Path
10
 
11
  import markdown
12
 
13
- from api.clip_store import ClipStore
14
- from api.clip_service import TwoStageClassifier
15
- from api.deps import get_request_id, resolve_bank
16
- from api.image_io import load_image_from_base64
17
- from api.logging_utils import setup_logging, log_json
18
- from api.middleware import RequestIdMiddleware
19
- from api.registry import LabelSetRegistry
20
- from api.schemas import (
21
- ActivateResponse,
22
- ClassifyRequest,
23
- ClassifyResponse,
24
- Hit,
25
- LabelSet,
26
- LabelSetCreateResponse,
27
- LabelSetInfo,
28
- )
29
- from api.settings import settings
30
 
31
 
32
  logger = setup_logging()
@@ -72,20 +61,36 @@ async def lifespan(app: FastAPI):
72
  await _maybe_aclose(store)
73
 
74
 
75
- def get_resources(request: Request) -> Resources:
76
- return request.app.state.resources
77
-
78
-
79
- def get_store(res: Resources = Depends(get_resources)) -> ClipStore:
80
- return res.store
81
-
82
-
83
- def get_classifier(res: Resources = Depends(get_resources)) -> TwoStageClassifier:
84
- return res.classifier
85
-
86
 
87
- def get_registry(res: Resources = Depends(get_resources)) -> LabelSetRegistry:
88
- return res.registry
89
 
90
 
91
  def create_app(*, resources: Resources | None = None) -> FastAPI:
@@ -129,48 +134,17 @@ def create_app(*, resources: Resources | None = None) -> FastAPI:
129
 
130
  @app.get("/", include_in_schema=False)
131
  def home() -> HTMLResponse:
132
- splash_path = Path(__file__).with_name("splash.html")
133
  try:
134
  html = splash_path.read_text(encoding="utf-8")
135
  except Exception:
136
- html = f"<h1>Photo Class</h1><p>Missing {splash_path}</p>"
137
- return HTMLResponse(content=html)
138
-
139
- def render_page(md_path: Path, *, title: str) -> HTMLResponse:
140
- header_path = Path(__file__).with_name("page-banner.html")
141
- page_template_path = Path(__file__).with_name("page.html")
142
- try:
143
- header_html = header_path.read_text(encoding="utf-8")
144
- template_html = page_template_path.read_text(encoding="utf-8")
145
- except Exception:
146
- return HTMLResponse(content="internal server error", status_code=500)
147
- try:
148
- md_text = md_path.read_text(encoding="utf-8")
149
- except Exception:
150
- return HTMLResponse(content="internal server error", status_code=500)
151
-
152
- if md_text.lstrip().startswith("---"):
153
- parts = md_text.split("---", 2)
154
- if len(parts) == 3:
155
- md_text = parts[2].lstrip()
156
-
157
- content_html = markdown.markdown(
158
- md_text,
159
- extensions=["fenced_code", "tables"],
160
- output_format="html5",
161
- )
162
-
163
- html = (
164
- template_html.replace("{{HEADER}}", header_html)
165
- .replace("{{CONTENT}}", content_html)
166
- .replace("{{TITLE}}", title)
167
- )
168
  return HTMLResponse(content=html)
169
 
170
  @app.get("/readme", include_in_schema=False)
171
  def readme() -> HTMLResponse:
172
  readme_path = Path(__file__).resolve().parents[2] / "README.md"
173
- return render_page(readme_path, title="Readme")
174
 
175
  @app.get("/story", include_in_schema=False)
176
  def story() -> HTMLResponse:
@@ -183,120 +157,8 @@ def create_app(*, resources: Resources | None = None) -> FastAPI:
183
  log_json(logger, event="error.unhandled", request_id=rid, error=str(exc), path=str(request.url.path))
184
  return JSONResponse(status_code=500, content={"detail": "internal server error"})
185
 
186
- @app.post("/api/v1/label-sets", response_model=LabelSetCreateResponse)
187
- def create_label_set(
188
- payload: LabelSet,
189
- request_id: str = Depends(get_request_id),
190
- store: ClipStore = Depends(get_store),
191
- registry: LabelSetRegistry = Depends(get_registry),
192
- ) -> LabelSetCreateResponse:
193
- bank = store.build_bank(payload)
194
- registry.upsert(bank)
195
-
196
- label_count = sum(len(b.ids) for b in bank.labels_by_domain.values())
197
- is_default = (registry.default_hash == bank.label_set_hash)
198
-
199
- log_json(
200
- logger,
201
- event="label_sets.upsert",
202
- request_id=request_id,
203
- label_set_hash=bank.label_set_hash,
204
- name=bank.name,
205
- domain_count=len(bank.domains.ids),
206
- label_count=label_count,
207
- is_default=is_default,
208
- )
209
-
210
- return LabelSetCreateResponse(
211
- label_set_hash=bank.label_set_hash,
212
- name=bank.name,
213
- domain_count=len(bank.domains.ids),
214
- label_count=label_count,
215
- is_default=is_default,
216
- )
217
-
218
- @app.get("/api/v1/label-sets", response_model=list[LabelSetInfo])
219
- def list_label_sets(
220
- request_id: str = Depends(get_request_id),
221
- registry: LabelSetRegistry = Depends(get_registry),
222
- ) -> list[LabelSetInfo]:
223
- out: list[LabelSetInfo] = []
224
- for bank in registry.banks.values():
225
- label_count = sum(len(b.ids) for b in bank.labels_by_domain.values())
226
- out.append(
227
- LabelSetInfo(
228
- label_set_hash=bank.label_set_hash,
229
- name=bank.name,
230
- domain_count=len(bank.domains.ids),
231
- label_count=label_count,
232
- is_default=(registry.default_hash == bank.label_set_hash),
233
- )
234
- )
235
-
236
- log_json(logger, event="label_sets.list", request_id=request_id, count=len(out))
237
- return out
238
-
239
- @app.post("/api/v1/label-sets/{label_set_hash}/activate", response_model=ActivateResponse)
240
- def activate_label_set(
241
- label_set_hash: str,
242
- request_id: str = Depends(get_request_id),
243
- registry: LabelSetRegistry = Depends(get_registry),
244
- ) -> ActivateResponse:
245
- try:
246
- registry.activate(label_set_hash)
247
- except KeyError:
248
- raise HTTPException(status_code=404, detail="Unknown label_set_hash")
249
-
250
- log_json(logger, event="label_sets.activate", request_id=request_id, default_label_set_hash=label_set_hash)
251
- return ActivateResponse(default_label_set_hash=label_set_hash)
252
-
253
- @app.post("/api/v1/classify", response_model=ClassifyResponse)
254
- def classify(
255
- payload: ClassifyRequest,
256
- request: Request,
257
- request_id: str = Depends(get_request_id),
258
- label_set_hash: Optional[str] = Query(default=None, description="If omitted, uses the default label set."),
259
- classifier: TwoStageClassifier = Depends(get_classifier),
260
- registry: LabelSetRegistry = Depends(get_registry),
261
- ) -> ClassifyResponse:
262
- bank = resolve_bank(registry, label_set_hash)
263
-
264
- image = load_image_from_base64(
265
- payload.image_base64,
266
- max_bytes=settings.max_image_mb * 1024 * 1024,
267
- )
268
-
269
- res = classifier.classify(
270
- bank=bank,
271
- image=image,
272
- domain_top_n=payload.domain_top_n or settings.default_domain_top_n,
273
- top_k=payload.top_k or settings.default_top_k,
274
- )
275
-
276
- log_json(
277
- logger,
278
- event="classify",
279
- request_id=request_id,
280
- label_set_hash=bank.label_set_hash,
281
- model_id=settings.clip_model_id,
282
- domain_top_n=payload.domain_top_n,
283
- top_k=payload.top_k,
284
- chosen_domains=res.chosen_domains,
285
- elapsed_ms=res.timings.total_ms,
286
- elapsed_domain_ms=res.timings.domain_ms,
287
- elapsed_labels_ms=res.timings.labels_ms,
288
- )
289
-
290
- return ClassifyResponse(
291
- label_set_hash=bank.label_set_hash,
292
- model_id=settings.clip_model_id,
293
- domain_hits=[Hit(id=i, score=s) for i, s in res.domain_hits],
294
- chosen_domains=res.chosen_domains,
295
- label_hits=[Hit(id=i, score=s) for i, s in res.label_hits],
296
- elapsed_ms=res.timings.total_ms,
297
- elapsed_domain_ms=res.timings.domain_ms,
298
- elapsed_labels_ms=res.timings.labels_ms,
299
- )
300
 
301
  return app
302
 
 
2
 
3
  from contextlib import asynccontextmanager
4
  from dataclasses import dataclass
5
+ from pathlib import Path
6
 
7
+ from fastapi import FastAPI, Request, Response
8
  from fastapi.responses import HTMLResponse, JSONResponse
 
9
 
10
  import markdown
11
 
12
+ from api.classify.router import router as classify_router
13
+ from api.common.logging import log_json, setup_logging
14
+ from api.common.middleware import RequestIdMiddleware
15
+ from api.label_sets.router import router as label_sets_router
16
+ from api.label_sets.registry import LabelSetRegistry
17
+ from api.model.clip_store import ClipStore
18
+ from api.classify.service import TwoStageClassifier
19
 
20
 
21
  logger = setup_logging()
 
61
  await _maybe_aclose(store)
62
 
63
 
64
+ def render_page(md_path: Path, *, title: str) -> HTMLResponse:
65
+ header_path = Path(__file__).resolve().parent / "ui" / "page-banner.html"
66
+ page_template_path = Path(__file__).resolve().parent / "ui" / "page.html"
67
+ try:
68
+ header_html = header_path.read_text(encoding="utf-8")
69
+ template_html = page_template_path.read_text(encoding="utf-8")
70
+ except Exception:
71
+ return HTMLResponse(content="internal server error", status_code=500)
72
+ try:
73
+ md_text = md_path.read_text(encoding="utf-8")
74
+ except Exception:
75
+ return HTMLResponse(content="internal server error", status_code=500)
76
+
77
+ if md_text.lstrip().startswith("---"):
78
+ parts = md_text.split("---", 2)
79
+ if len(parts) == 3:
80
+ md_text = parts[2].lstrip()
81
+
82
+ content_html = markdown.markdown(
83
+ md_text,
84
+ extensions=["fenced_code", "tables"],
85
+ output_format="html5",
86
+ )
87
 
88
+ html = (
89
+ template_html.replace("{{HEADER}}", header_html)
90
+ .replace("{{CONTENT}}", content_html)
91
+ .replace("{{TITLE}}", title)
92
+ )
93
+ return HTMLResponse(content=html)
94
 
95
 
96
  def create_app(*, resources: Resources | None = None) -> FastAPI:
 
134
 
135
  @app.get("/", include_in_schema=False)
136
  def home() -> HTMLResponse:
137
+ splash_path = Path(__file__).resolve().parent / "ui" / "splash.html"
138
  try:
139
  html = splash_path.read_text(encoding="utf-8")
140
  except Exception:
141
+ html = f"<h1>Photo Classification</h1><p>Missing {splash_path}</p>"
142
  return HTMLResponse(content=html)
143
 
144
  @app.get("/readme", include_in_schema=False)
145
  def readme() -> HTMLResponse:
146
  readme_path = Path(__file__).resolve().parents[2] / "README.md"
147
+ return render_page(readme_path, title="README")
148
 
149
  @app.get("/story", include_in_schema=False)
150
  def story() -> HTMLResponse:
 
157
  log_json(logger, event="error.unhandled", request_id=rid, error=str(exc), path=str(request.url.path))
158
  return JSONResponse(status_code=500, content={"detail": "internal server error"})
159
 
160
+ app.include_router(label_sets_router)
161
+ app.include_router(classify_router)
162
 
163
  return app
164
 
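Note on `render_page`, which this commit moves to module level: it strips an optional leading `---` block from the Markdown source and then fills three placeholders in the page template. The two text steps can be sketched without FastAPI or the `markdown` package. `strip_front_matter` and `fill_template` are hypothetical names for illustration, and the template string is an assumption, since the contents of `ui/page.html` are not shown in this diff.

```python
def strip_front_matter(md_text: str) -> str:
    """Drop a leading '---' delimited block, mirroring the check in render_page."""
    if md_text.lstrip().startswith("---"):
        parts = md_text.split("---", 2)
        if len(parts) == 3:
            return parts[2].lstrip()
    return md_text


def fill_template(template_html: str, *, header: str, content: str, title: str) -> str:
    """Substitute the three placeholders in the same order as render_page."""
    return (
        template_html.replace("{{HEADER}}", header)
        .replace("{{CONTENT}}", content)
        .replace("{{TITLE}}", title)
    )


doc = "---\ntitle: Demo\n---\n# Hello\n"
body = strip_front_matter(doc)
page = fill_template(
    "<title>{{TITLE}}</title>{{HEADER}}<main>{{CONTENT}}</main>",  # assumed template
    header="<nav></nav>",
    content=body,
    title="README",
)
print(page)
```

Because the replacements run in sequence, a literal `{{TITLE}}` inside the rendered content would also be substituted by the last step. That is harmless for the README and STORY pages but worth keeping in mind if other documents are served this way.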
src/api/app_factory.py CHANGED
@@ -1,7 +1,7 @@
1
  from __future__ import annotations
2
 
3
  from fastapi import FastAPI
4
- from api.main import build_app # we'll define build_app in main.py
5
 
6
  def create_app() -> FastAPI:
7
  return build_app()
 
1
  from __future__ import annotations
2
 
3
  from fastapi import FastAPI
4
+ from api.app import build_app # we'll define build_app in app.py
5
 
6
  def create_app() -> FastAPI:
7
  return build_app()
src/api/classify/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Classification domain."""
src/api/{banks.py → classify/banks.py} RENAMED
@@ -3,13 +3,11 @@ from __future__ import annotations
3
  from dataclasses import dataclass
4
  import torch
5
 
6
- Tensor = torch.Tensor
7
-
8
 
9
  @dataclass(frozen=True, slots=True)
10
  class EmbeddingBank:
11
  ids: tuple[str, ...]
12
- feats: Tensor # (N, D) normalized
13
 
14
 
15
  @dataclass(frozen=True, slots=True)
 
3
  from dataclasses import dataclass
4
  import torch
5
 
 
 
6
 
7
  @dataclass(frozen=True, slots=True)
8
  class EmbeddingBank:
9
  ids: tuple[str, ...]
10
+ feats: torch.Tensor # (N, D) normalized
11
 
12
 
13
  @dataclass(frozen=True, slots=True)
src/api/{results.py → classify/results.py} RENAMED
File without changes
src/api/classify/router.py ADDED
@@ -0,0 +1,64 @@
1
+ from __future__ import annotations
2
+
3
+ from typing import Optional
4
+
5
+ from fastapi import APIRouter, Depends, Query, Request
6
+
7
+ from api.classify.schemas import ClassifyRequest, ClassifyResponse, Hit
8
+ from api.common.deps import get_classifier, get_registry, get_request_id, resolve_bank
9
+ from api.common.logging import log_json, setup_logging
10
+ from api.common.image_io import load_image_from_base64
11
+ from api.common.settings import settings
12
+
13
+
14
+ logger = setup_logging()
15
+ router = APIRouter(prefix="/api/v1", tags=["classify"])
16
+
17
+
18
+ @router.post("/classify", response_model=ClassifyResponse)
19
+ def classify(
20
+ payload: ClassifyRequest,
21
+ request: Request,
22
+ request_id: str = Depends(get_request_id),
23
+ label_set_hash: Optional[str] = Query(default=None, description="If omitted, uses the default label set."),
24
+ classifier=Depends(get_classifier),
25
+ registry=Depends(get_registry),
26
+ ) -> ClassifyResponse:
27
+ bank = resolve_bank(registry, label_set_hash)
28
+
29
+ image = load_image_from_base64(
30
+ payload.image_base64,
31
+ max_bytes=settings.max_image_mb * 1024 * 1024,
32
+ )
33
+
34
+ res = classifier.classify(
35
+ bank=bank,
36
+ image=image,
37
+ domain_top_n=payload.domain_top_n or settings.default_domain_top_n,
38
+ top_k=payload.top_k or settings.default_top_k,
39
+ )
40
+
41
+ log_json(
42
+ logger,
43
+ event="classify",
44
+ request_id=request_id,
45
+ label_set_hash=bank.label_set_hash,
46
+ model_id=settings.clip_model_id,
47
+ domain_top_n=payload.domain_top_n,
48
+ top_k=payload.top_k,
49
+ chosen_domains=res.chosen_domains,
50
+ elapsed_ms=res.timings.total_ms,
51
+ elapsed_domain_ms=res.timings.domain_ms,
52
+ elapsed_labels_ms=res.timings.labels_ms,
53
+ )
54
+
55
+ return ClassifyResponse(
56
+ label_set_hash=bank.label_set_hash,
57
+ model_id=settings.clip_model_id,
58
+ domain_hits=[Hit(id=i, score=s) for i, s in res.domain_hits],
59
+ chosen_domains=res.chosen_domains,
60
+ label_hits=[Hit(id=i, score=s) for i, s in res.label_hits],
61
+ elapsed_ms=res.timings.total_ms,
62
+ elapsed_domain_ms=res.timings.domain_ms,
63
+ elapsed_labels_ms=res.timings.labels_ms,
64
+ )
src/api/classify/schemas.py ADDED
@@ -0,0 +1,29 @@
1
+ from __future__ import annotations
2
+
3
+ from typing import List
4
+ from pydantic import BaseModel, Field, ConfigDict
5
+
6
+
7
+ class ClassifyRequest(BaseModel):
8
+ model_config = ConfigDict(extra="forbid")
9
+ image_base64: str = Field(..., description="Base64-encoded image bytes (jpg/png/webp).")
10
+ domain_top_n: int = Field(default=2, ge=1, le=3)
11
+ top_k: int = Field(default=5, ge=1, le=20)
12
+
13
+
14
+ class Hit(BaseModel):
15
+ model_config = ConfigDict(extra="forbid")
16
+ id: str
17
+ score: float
18
+
19
+
20
+ class ClassifyResponse(BaseModel):
21
+ model_config = ConfigDict(extra="forbid", protected_namespaces=())
22
+ label_set_hash: str
23
+ model_id: str
24
+ domain_hits: List[Hit]
25
+ chosen_domains: List[str]
26
+ label_hits: List[Hit]
27
+ elapsed_ms: int
28
+ elapsed_domain_ms: int
29
+ elapsed_labels_ms: int
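Note on the defaults in `ClassifyRequest`: the router computes `payload.domain_top_n or settings.default_domain_top_n` (and likewise for `top_k`), but the schema declares both fields as plain `int` with a default and `ge=1`. As far as this diff shows, the payload value is therefore always a non-zero integer, and the settings fallback would not be reached. The sketch below isolates the `or` semantics with the standard library only. `effective` is a hypothetical helper, and the settings value `4` is an assumption, since `common/settings.py` is not shown here.

```python
def effective(value, default):
    """Mirror the `value or default` fallback used in the classify router."""
    return value or default


schema_default = 2    # from Field(default=2, ge=1, le=3)
settings_default = 4  # hypothetical; the real setting is not in this diff

# A request that omits domain_top_n still carries the schema default,
# so the settings value is ignored.
print(effective(schema_default, settings_default))

# The fallback only applies to falsy inputs, which the schema rejects or never produces.
print(effective(None, settings_default))
```

If the settings defaults are meant to win when the client omits a field, one option would be to declare the fields as optional with `None` as the default. That is a design suggestion, not something this commit does.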
src/api/{clip_service.py → classify/service.py} RENAMED
@@ -5,9 +5,9 @@ from dataclasses import dataclass
5
 
6
  import torch
7
 
8
- from api.banks import EmbeddingBank, LabelSetBank
9
- from api.clip_store import ClipStore
10
- from api.results import ClassificationResult, StageTimings
11
 
12
 
13
  @dataclass(slots=True)
 
5
 
6
  import torch
7
 
8
+ from api.classify.banks import EmbeddingBank, LabelSetBank
9
+ from api.model.clip_store import ClipStore
10
+ from api.classify.results import ClassificationResult, StageTimings
11
 
12
 
13
  @dataclass(slots=True)
src/api/common/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Shared utilities for the API."""
src/api/{deps.py → common/deps.py} RENAMED
@@ -4,8 +4,10 @@ from typing import Optional
4
 
5
  from fastapi import Depends, HTTPException, Query, Request
6
 
7
- from api.banks import LabelSetBank
8
- from api.registry import LabelSetRegistry
 
 
9
 
10
 
11
  def get_request_id(request: Request) -> str:
@@ -36,3 +38,15 @@ def get_bank(
36
  label_set_hash: Optional[str] = Depends(get_label_set_hash),
37
  ) -> LabelSetBank:
38
  return resolve_bank(registry, label_set_hash)
 
4
 
5
  from fastapi import Depends, HTTPException, Query, Request
6
 
7
+ from api.classify.banks import LabelSetBank
8
+ from api.label_sets.registry import LabelSetRegistry
9
+ from api.model.clip_store import ClipStore
10
+ from api.classify.service import TwoStageClassifier
11
 
12
 
13
  def get_request_id(request: Request) -> str:
 
38
  label_set_hash: Optional[str] = Depends(get_label_set_hash),
39
  ) -> LabelSetBank:
40
  return resolve_bank(registry, label_set_hash)
41
+
42
+
43
+ def get_store(request: Request) -> ClipStore:
44
+ return request.app.state.resources.store
45
+
46
+
47
+ def get_classifier(request: Request) -> TwoStageClassifier:
48
+ return request.app.state.resources.classifier
49
+
50
+
51
+ def get_registry(request: Request) -> LabelSetRegistry:
52
+ return request.app.state.resources.registry
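Note on the relocated providers: `get_store`, `get_classifier`, and `get_registry` now read straight from `request.app.state.resources` instead of chaining through a `get_resources` dependency. One consequence is that each provider is a plain function of the request object and can be exercised without starting an app. The sketch below assumes nothing beyond that attribute chain. The `SimpleNamespace` stand-ins are hypothetical test doubles, and type hints are omitted so the snippet stays standard-library only.

```python
from types import SimpleNamespace


def get_registry(request):
    # Same body as the provider added in common/deps.py.
    return request.app.state.resources.registry


# Hypothetical stand-ins: any objects exposing the same attribute chain will do.
fake_registry = object()
fake_request = SimpleNamespace(
    app=SimpleNamespace(
        state=SimpleNamespace(
            resources=SimpleNamespace(registry=fake_registry)
        )
    )
)

# The provider returns exactly the object stored on app.state.
assert get_registry(fake_request) is fake_registry
print("provider returned the stored registry")
```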
src/api/{image_io.py → common/image_io.py} RENAMED
File without changes
src/api/{logging_utils.py → common/logging.py} RENAMED
File without changes
src/api/{middleware.py → common/middleware.py} RENAMED
File without changes
src/api/{settings.py → common/settings.py} RENAMED
File without changes
src/api/label_sets/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Label set domain."""
src/api/{label_hash.py → label_sets/hash.py} RENAMED
File without changes
src/api/{registry.py → label_sets/registry.py} RENAMED
@@ -2,7 +2,7 @@ from __future__ import annotations
2
 
3
  from dataclasses import dataclass
4
  from typing import Optional
5
- from api.banks import LabelSetBank
6
 
7
 
8
  @dataclass(slots=True)
 
2
 
3
  from dataclasses import dataclass
4
  from typing import Optional
5
+ from api.classify.banks import LabelSetBank
6
 
7
 
8
  @dataclass(slots=True)
src/api/label_sets/router.py ADDED
@@ -0,0 +1,81 @@
1
+ from __future__ import annotations
2
+
3
+ from fastapi import APIRouter, Depends, HTTPException
4
+
5
+ from api.common.logging import log_json, setup_logging
6
+ from api.common.deps import get_request_id, get_registry, get_store
7
+ from api.label_sets.schemas import ActivateResponse, LabelSet, LabelSetCreateResponse, LabelSetInfo
8
+
9
+
10
+ logger = setup_logging()
11
+ router = APIRouter(prefix="/api/v1/label-sets", tags=["label-sets"])
12
+
13
+
14
+ @router.post("", response_model=LabelSetCreateResponse)
15
+ def create_label_set(
16
+ payload: LabelSet,
17
+ request_id: str = Depends(get_request_id),
18
+ store=Depends(get_store),
19
+ registry=Depends(get_registry),
20
+ ) -> LabelSetCreateResponse:
21
+ bank = store.build_bank(payload)
22
+ registry.upsert(bank)
23
+
24
+ label_count = sum(len(b.ids) for b in bank.labels_by_domain.values())
25
+ is_default = registry.default_hash == bank.label_set_hash
26
+
27
+ log_json(
28
+ logger,
29
+ event="label_sets.upsert",
30
+ request_id=request_id,
31
+ label_set_hash=bank.label_set_hash,
32
+ name=bank.name,
33
+ domain_count=len(bank.domains.ids),
34
+ label_count=label_count,
35
+ is_default=is_default,
36
+ )
37
+
38
+ return LabelSetCreateResponse(
39
+ label_set_hash=bank.label_set_hash,
40
+ name=bank.name,
41
+ domain_count=len(bank.domains.ids),
42
+ label_count=label_count,
43
+ is_default=is_default,
44
+ )
45
+
46
+
47
+ @router.get("", response_model=list[LabelSetInfo])
48
+ def list_label_sets(
49
+ request_id: str = Depends(get_request_id),
50
+ registry=Depends(get_registry),
51
+ ) -> list[LabelSetInfo]:
52
+ out: list[LabelSetInfo] = []
53
+ for bank in registry.banks.values():
54
+ label_count = sum(len(b.ids) for b in bank.labels_by_domain.values())
55
+ out.append(
56
+ LabelSetInfo(
57
+ label_set_hash=bank.label_set_hash,
58
+ name=bank.name,
59
+ domain_count=len(bank.domains.ids),
60
+ label_count=label_count,
61
+ is_default=(registry.default_hash == bank.label_set_hash),
62
+ )
63
+ )
64
+
65
+ log_json(logger, event="label_sets.list", request_id=request_id, count=len(out))
66
+ return out
67
+
68
+
69
+ @router.post("/{label_set_hash}/activate", response_model=ActivateResponse)
70
+ def activate_label_set(
71
+ label_set_hash: str,
72
+ request_id: str = Depends(get_request_id),
73
+ registry=Depends(get_registry),
74
+ ) -> ActivateResponse:
75
+ try:
76
+ registry.activate(label_set_hash)
77
+ except KeyError:
78
+ raise HTTPException(status_code=404, detail="Unknown label_set_hash")
79
+
80
+ log_json(logger, event="label_sets.activate", request_id=request_id, default_label_set_hash=label_set_hash)
81
+ return ActivateResponse(default_label_set_hash=label_set_hash)
src/api/{schemas.py → label_sets/schemas.py} RENAMED
@@ -1,7 +1,7 @@
1
  from __future__ import annotations
2
 
3
- from typing import Dict, List, Optional
4
- from pydantic import BaseModel, Field, ConfigDict
5
 
6
 
7
  class LabelItem(BaseModel):
@@ -44,28 +44,3 @@ class LabelSetCreateResponse(BaseModel):
44
  class ActivateResponse(BaseModel):
45
  model_config = ConfigDict(extra="forbid")
46
  default_label_set_hash: str
47
-
48
-
49
- class ClassifyRequest(BaseModel):
50
- model_config = ConfigDict(extra="forbid")
51
- image_base64: str = Field(..., description="Base64-encoded image bytes (jpg/png/webp).")
52
- domain_top_n: int = Field(default=2, ge=1, le=3)
53
- top_k: int = Field(default=5, ge=1, le=20)
54
-
55
-
56
- class Hit(BaseModel):
57
- model_config = ConfigDict(extra="forbid")
58
- id: str
59
- score: float
60
-
61
-
62
- class ClassifyResponse(BaseModel):
63
- model_config = ConfigDict(extra="forbid", protected_namespaces=())
64
- label_set_hash: str
65
- model_id: str
66
- domain_hits: List[Hit]
67
- chosen_domains: List[str]
68
- label_hits: List[Hit]
69
- elapsed_ms: int
70
- elapsed_domain_ms: int
71
- elapsed_labels_ms: int
 
1
  from __future__ import annotations
2
 
3
+ from typing import Dict, List
4
+ from pydantic import BaseModel, ConfigDict
5
 
6
 
7
  class LabelItem(BaseModel):
 
44
  class ActivateResponse(BaseModel):
45
  model_config = ConfigDict(extra="forbid")
46
  default_label_set_hash: str
src/api/model/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Model layer."""
src/api/{clip_store.py → model/clip_store.py} RENAMED
@@ -5,10 +5,10 @@ import warnings
5
  import torch
6
  from transformers import CLIPModel, CLIPProcessor
7
 
8
- from api.banks import EmbeddingBank, LabelSetBank
9
- from api.label_hash import stable_hash
10
- from api.schemas import LabelSet
11
- from api.settings import settings
12
 
13
 
14
  class ClipStore:
 
5
  import torch
6
  from transformers import CLIPModel, CLIPProcessor
7
 
8
+ from api.classify.banks import EmbeddingBank, LabelSetBank
9
+ from api.label_sets.hash import stable_hash
10
+ from api.label_sets.schemas import LabelSet
11
+ from api.common.settings import settings
12
 
13
 
14
  class ClipStore:
src/api/ui/__init__.py ADDED
@@ -0,0 +1 @@
+"""UI templates."""
src/api/{page-banner.html → ui/page-banner.html} RENAMED
File without changes
src/api/{page.html → ui/page.html} RENAMED
File without changes
src/api/{splash.html → ui/splash.html} RENAMED
@@ -3,7 +3,7 @@
 <head>
 <meta charset="utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1" />
-<title>Photo Class</title>
+<title>Photo Classification</title>
 <style>
 :root {
 --bg: #f8fafc;
@@ -69,7 +69,8 @@
 <h1>Photo Classification</h1>
 <p>A small, prompt-driven photo classification API built on CLIP. Upload a label set, classify images, and inspect timings.</p>
 <p>This project grew through an intensive dialog with Codex — a steady build from core API to eval tooling and HF Spaces deployment.</p>
-<p>01.2026 by Emmanuel Sandorfi / Knowledge @ Lighton</p>
+<p>Emmanuel Sandorfi / Knowledge @ Lighton.ai</p>
+<p>2026, January</p>
 <div class="actions">
 <a class="button primary" href="/docs">API Docs</a>
 <a class="button" href="/story">Read the Story</a>
tests/__pycache__/conftest.cpython-312-pytest-8.3.2.pyc CHANGED
Binary files a/tests/__pycache__/conftest.cpython-312-pytest-8.3.2.pyc and b/tests/__pycache__/conftest.cpython-312-pytest-8.3.2.pyc differ
 
tests/__pycache__/fakes.cpython-312.pyc CHANGED
Binary files a/tests/__pycache__/fakes.cpython-312.pyc and b/tests/__pycache__/fakes.cpython-312.pyc differ
 
tests/__pycache__/test_integration_real_clip.cpython-312-pytest-8.3.2.pyc CHANGED
Binary files a/tests/__pycache__/test_integration_real_clip.cpython-312-pytest-8.3.2.pyc and b/tests/__pycache__/test_integration_real_clip.cpython-312-pytest-8.3.2.pyc differ
 
tests/conftest.py CHANGED
@@ -6,7 +6,7 @@ import pytest
 from PIL import Image
 from fastapi.testclient import TestClient
 
-from api.main import build_app
+from api.app import build_app
 from tests.fakes import FakeClipStore, FakeTwoStageClassifier
 
 
tests/fakes.py CHANGED
@@ -2,10 +2,10 @@ from __future__ import annotations
 
 from dataclasses import dataclass
 
-from api.banks import EmbeddingBank, LabelSetBank
-from api.label_hash import stable_hash
-from api.results import ClassificationResult, StageTimings
-from api.schemas import LabelSet
+from api.classify.banks import EmbeddingBank, LabelSetBank
+from api.label_sets.hash import stable_hash
+from api.classify.results import ClassificationResult, StageTimings
+from api.label_sets.schemas import LabelSet
 
 
 class FakeClipStore:
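
Review note: the import changes in this commit all follow the same old-to-new module mapping, so any code outside this repository that still imports the old paths could be migrated mechanically. The sketch below is one possible way to do that, not part of the commit. The `MOVED_MODULES` table is read off the renames listed in this diff, while `rewrite_import` and its regex are hypothetical helpers introduced here for illustration.

```python
import re

# Old -> new module paths, read off the renames in this commit.
# Caveat: this commit also removes ClassifyRequest, Hit and ClassifyResponse
# from the old api/schemas.py, so imports of those names need a different
# target; this sketch does not handle that case.
MOVED_MODULES = {
    "api.main": "api.app",
    "api.banks": "api.classify.banks",
    "api.results": "api.classify.results",
    "api.clip_service": "api.classify.service",
    "api.label_hash": "api.label_sets.hash",
    "api.schemas": "api.label_sets.schemas",
    "api.settings": "api.common.settings",
    "api.clip_store": "api.model.clip_store",
}

# Captures the "from " / "import " prefix and the dotted module name after it.
_IMPORT_RE = re.compile(r"^(\s*(?:from|import)\s+)([\w.]+)")


def rewrite_import(line: str) -> str:
    """Return `line` with a moved module path replaced by its new location.

    Hypothetical helper: lines that are not imports, or that import a module
    not listed in MOVED_MODULES, are returned unchanged.
    """
    match = _IMPORT_RE.match(line)
    if match is None:
        return line
    prefix, module = match.groups()
    new_module = MOVED_MODULES.get(module, module)
    # Keep everything after the module name (" import X, Y", aliases, comments).
    return prefix + new_module + line[match.end():]


if __name__ == "__main__":
    print(rewrite_import("from api.banks import EmbeddingBank, LabelSetBank"))
    print(rewrite_import("from api.main import build_app"))
```

Only exact module names are matched, so unrelated imports such as `import torch` pass through untouched.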
tests/test_integration_real_clip.py CHANGED
@@ -6,9 +6,9 @@ import pytest
 from PIL import Image
 from fastapi.testclient import TestClient
 
-from api.main import build_app
-from api.clip_store import ClipStore
-from api.clip_service import TwoStageClassifier
+from api.app import build_app
+from api.model.clip_store import ClipStore
+from api.classify.service import TwoStageClassifier
 
 
 @pytest.mark.integration