# Architecture

## Overview

![Docling Studio architecture](images/global.png){ width="700" }

Two services communicate via REST. The frontend is a Vue 3 SPA served by Nginx in production. The backend is a FastAPI app that wraps Docling's document conversion engine.

### Zooming into the backend

The schema above shows the macro view. Inside the backend, the code follows a **Clean Architecture** with strict layer boundaries:

```
┌──────────────────────────────────────────────────────┐
│                       Backend                        │
│                                                      │
│  ┌──────────┐                                        │
│  │  api/    │  ← HTTP (FastAPI routes, Pydantic)     │
│  └────┬─────┘                                        │
│       │ calls                                        │
│  ┌────▼─────┐                                        │
│  │services/ │  ← Use case orchestration              │
│  └──┬─────┬─┘                                        │
│     │     │                                          │
│ ┌───▼──┐ ┌▼────────────┐                             │
│ │domain│ │persistence/ │                             │
│ │      │ │             │                             │
│ │bbox  │ │ SQLite CRUD │  ← Storage (your blue box)  │
│ │parse │ │ file store  │                             │
│ └──────┘ └─────────────┘                             │
│  ↑ pure Python, no deps   ↑ aiosqlite                │
└──────────────────────────────────────────────────────┘
```

Dependencies flow **inward**: `api → services → domain`. The domain layer has zero knowledge of HTTP or the database.

## Backend: Clean Architecture

The backend follows a strict layered architecture with one package per layer. As in the diagram above, dependencies flow inward (API → Services → Domain), and the domain layer knows nothing about HTTP or the database.

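The inward dependency rule can be illustrated with a minimal sketch. The names `DocumentConverter` and `ConversionResult` appear in the domain layer of this project, but the method signature, the fields, and the `FakeConverter` and `AnalysisService` classes below are assumptions made for illustration (the real service is described as asynchronous, which this sketch leaves out).

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ConversionResult:
    """Domain value object. The fields are illustrative assumptions."""
    markdown: str
    page_count: int


class DocumentConverter(Protocol):
    """Domain port: describes what a converter does, not how it does it."""

    def convert(self, path: str) -> ConversionResult:
        ...


class FakeConverter:
    """Hypothetical adapter standing in for an infra/ implementation."""

    def convert(self, path: str) -> ConversionResult:
        return ConversionResult(markdown=f"# {path}", page_count=1)


class AnalysisService:
    """Service layer: depends on the port only, never on a concrete adapter."""

    def __init__(self, converter: DocumentConverter) -> None:
        self._converter = converter

    def analyze(self, path: str) -> ConversionResult:
        return self._converter.convert(path)


service = AnalysisService(FakeConverter())
result = service.analyze("report.pdf")
print(result.markdown)  # "# report.pdf"
```

Because the service only sees the port, swapping `FakeConverter` for a real adapter requires no change to the service or the domain code.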
```
document-parser/
├── main.py                    # FastAPI app, CORS, lifespan, health endpoint
│
├── domain/                    # Pure domain: no HTTP, no DB
│   ├── models.py              # Document, AnalysisJob dataclasses
│   ├── ports.py               # Abstract protocols (DocumentConverter, DocumentChunker)
│   ├── value_objects.py       # ConversionResult, ChunkingOptions, ChunkResult
│   └── bbox.py                # Bounding box coordinate normalization
│
├── api/                       # HTTP layer (FastAPI routers)
│   ├── schemas.py             # Pydantic DTOs (camelCase serialization)
│   ├── documents.py           # /api/documents endpoints
│   └── analyses.py            # /api/analyses endpoints (create, rechunk, delete)
│
├── persistence/               # Data layer (SQLite via aiosqlite)
│   ├── database.py            # Connection management, schema init
│   ├── document_repo.py       # Document CRUD
│   └── analysis_repo.py       # AnalysisJob CRUD
│
├── infra/                     # Infrastructure adapters
│   ├── settings.py            # Environment-based configuration
│   ├── local_converter.py     # In-process Docling converter (local mode)
│   ├── serve_converter.py     # HTTP client for Docling Serve (remote mode)
│   ├── local_chunker.py       # In-process chunking (HierarchicalChunker, HybridChunker)
│   ├── rate_limiter.py        # Sliding-window rate limiting middleware
│   └── bbox.py                # Bbox coordinate normalization helpers
│
├── services/                  # Use case orchestration
│   ├── document_service.py    # Upload, delete, preview
│   └── analysis_service.py    # Async Docling processing + chunking
│
└── tests/                     # pytest (199 tests)
```

### Layer responsibilities

| Layer | Role | Depends on |
|-------|------|------------|
| **domain** | Dataclasses, value objects, abstract ports | Nothing (pure Python) |
| **persistence** | SQLite CRUD, aiosqlite | domain (models) |
| **infra** | Adapters: converters, chunker, rate limiter, settings | domain (ports, value objects) |
| **services** | Orchestrate use cases, call converters/chunkers | domain + persistence + infra |
| **api** | HTTP endpoints, Pydantic DTOs, error handling | services |

### API contract

The API uses **camelCase** serialization (via Pydantic
`alias_generator`), while the backend uses **snake_case** internally. The `pages_json` field contains raw `dataclasses.asdict()` output, so page data uses **snake_case** (`page_number`, not `pageNumber`).

## Frontend: Feature-Based

The frontend is organized by feature, each with its own store, API client, and UI components.

```
frontend/src/
├── app/                       # App shell, router, global styles
├── pages/                     # Route-level pages
│   ├── HomePage.vue
│   ├── StudioPage.vue         # PDF viewer + config + results
│   ├── DocumentsPage.vue
│   ├── HistoryPage.vue
│   └── SettingsPage.vue
│
├── features/                  # Feature modules
│   ├── analysis/              # Analysis store, API, bbox scaling, UI
│   │   ├── store.ts
│   │   ├── api.ts
│   │   ├── bboxScaling.ts     # Pure math: page coords → pixel coords
│   │   └── ui/
│   │       ├── BboxOverlay.vue
│   │       ├── AnalysisPanel.vue
│   │       ├── StructureViewer.vue
│   │       └── ...
│   ├── chunking/              # Chunk panel UI + rechunk action
│   ├── document/              # Document store, API, upload
│   ├── feature-flags/         # Feature flag store (reads /api/health)
│   ├── history/               # History store, navigation
│   └── settings/              # Theme, locale, API URL
│
└── shared/                    # Cross-feature utilities
    ├── types.ts               # All shared TypeScript interfaces
    ├── i18n.ts                # FR/EN translations
    ├── format.ts              # Date/size formatters
    └── api/http.ts            # HTTP client (fetch wrapper)
```

### Data flow

```
User action → Pinia store action → API client (fetch) → Backend REST endpoint
                                                              │
Backend response → Pinia store state → Vue reactivity → UI update
```

### Key design decisions

- **Pinia stores** per feature, not global. Each feature owns its state.
- **TypeScript strict mode** with shared interfaces in `shared/types.ts`.
- **No component library**: custom CSS with CSS variables for theming.
- **vue-tsc** in CI to catch type errors before merge.

## Feature Flags

The frontend adapts its UI based on the backend's capabilities. On startup, the feature flag store fetches `/api/health` and reads the `engine` and `deploymentMode` fields.

| Flag | Condition | Effect |
|------|-----------|--------|
| `chunking` | `engine === 'local'` | Shows chunking options in the analysis panel |
| `disclaimer` | `deploymentMode === 'huggingface'` | Shows a disclaimer banner at the top of the app |

This allows the same frontend build to work with both local and remote backends without conditional compilation.

## Rate Limiting

The backend applies a sliding-window rate limiter as middleware:

- **60 requests** per **60 seconds** per client IP
- The `/api/health` endpoint is excluded
- When the limit is exceeded, the API returns `429 Too Many Requests` with a `Retry-After` header

## Analysis Lifecycle

An analysis job follows this state machine:

```
PENDING → RUNNING → COMPLETED
                  → FAILED
```

| Status | Description |
|--------|-------------|
| `PENDING` | Job created, waiting for a processing slot |
| `RUNNING` | Docling conversion in progress |
| `COMPLETED` | Conversion finished; results available (markdown, HTML, pages, chunks) |
| `FAILED` | Conversion error; `error_message` contains details |

The backend limits parallel jobs via `MAX_CONCURRENT_ANALYSES` (default: 3) to avoid overloading the CPU during Docling processing.

## Local vs Remote Mode

The backend supports two conversion engines, selected via the `CONVERSION_ENGINE` environment variable:

| | Local | Remote |
|---|---|---|
| **Engine** | In-process Docling (PyTorch) | HTTP client to [Docling Serve](https://github.com/DS4SD/docling-serve) |
| **Chunking** | Available (in-process) | Not available |
| **Docker image** | `latest-local` (~1.9 GB) | `latest-remote` (~270 MB) |
| **ML models** | Downloaded on first run (~400 MB) | Managed by Docling Serve |
| **CPU/RAM** | 4+ CPUs, 6+ GB RAM | 2 CPUs, 2 GB RAM |

The converter is selected at startup in `main.py` via `_build_converter()`. The chunker (`_build_chunker()`) is only instantiated in local mode. In remote mode, the chunking feature flag is disabled and the UI hides the chunking panel.

## Health Endpoint

`GET /api/health` returns the backend status:

```json
{
  "status": "ok",
  "engine": "local",
  "version": "0.3.0",
  "deploymentMode": "self-hosted"
}
```

The frontend uses this response to:

1. Verify the backend is reachable
2. Evaluate feature flags (chunking, disclaimer)
3. Display the app version
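The flag evaluation in step 2 reduces to two comparisons on the health payload. The sketch below mirrors that logic in Python for illustration only. The real implementation lives in the frontend's feature flag store, and the `evaluate_flags` function name is hypothetical.

```python
from typing import Dict


def evaluate_flags(health: Dict[str, str]) -> Dict[str, bool]:
    """Derive UI feature flags from a /api/health response body."""
    return {
        # Chunking is offered only when the backend converts in-process.
        "chunking": health.get("engine") == "local",
        # The disclaimer banner is shown only on the Hugging Face deployment.
        "disclaimer": health.get("deploymentMode") == "huggingface",
    }


health = {
    "status": "ok",
    "engine": "local",
    "version": "0.3.0",
    "deploymentMode": "self-hosted",
}
print(evaluate_flags(health))  # {'chunking': True, 'disclaimer': False}
```

A missing `engine` or `deploymentMode` field leaves both flags off in this sketch, so an older or unreachable backend hides the optional UI rather than showing it.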