# Architecture
## Overview
![Docling Studio architecture](images/global.png){ width="700" }
Docling Studio consists of two services that communicate over REST. The frontend is a Vue 3 SPA, served by Nginx in production. The backend is a FastAPI app that wraps Docling's document conversion engine.
### Zooming into the backend
The diagram above shows the macro view. Inside the backend, the code follows a **Clean Architecture** with strict layer boundaries:
```
┌──────────────────────────────────────────────────────┐
│                       Backend                        │
│                                                      │
│   ┌──────────┐                                       │
│   │  api/    │  ← HTTP (FastAPI routes, Pydantic)    │
│   └────┬─────┘                                       │
│        │ calls                                       │
│   ┌────▼─────┐                                       │
│   │services/ │  ← Use case orchestration             │
│   └──┬─────┬─┘                                       │
│      │     │                                         │
│  ┌───▼──┐ ┌▼────────────┐                            │
│  │domain│ │persistence/ │                            │
│  │      │ │             │                            │
│  │bbox  │ │ SQLite CRUD │  ← Storage (your blue box) │
│  │parse │ │ file store  │                            │
│  └──────┘ └─────────────┘                            │
│  ↑ pure Python, no deps   ↑ aiosqlite                │
└──────────────────────────────────────────────────────┘
```
Dependencies flow **inward**: `api → services → domain`. The domain layer has no knowledge of HTTP or the database.
## Backend: Clean Architecture
The source tree maps directly onto these layers, with the same inward dependency rule (API → Services → Domain):
```
document-parser/
├── main.py                  # FastAPI app, CORS, lifespan, health endpoint
│
├── domain/                  # Pure domain: no HTTP, no DB
│   ├── models.py            # Document, AnalysisJob dataclasses
│   ├── ports.py             # Abstract protocols (DocumentConverter, DocumentChunker)
│   ├── value_objects.py     # ConversionResult, ChunkingOptions, ChunkResult
│   └── bbox.py              # Bounding box coordinate normalization
│
├── api/                     # HTTP layer (FastAPI routers)
│   ├── schemas.py           # Pydantic DTOs (camelCase serialization)
│   ├── documents.py         # /api/documents endpoints
│   └── analyses.py          # /api/analyses endpoints (create, rechunk, delete)
│
├── persistence/             # Data layer (SQLite via aiosqlite)
│   ├── database.py          # Connection management, schema init
│   ├── document_repo.py     # Document CRUD
│   └── analysis_repo.py     # AnalysisJob CRUD
│
├── infra/                   # Infrastructure adapters
│   ├── settings.py          # Environment-based configuration
│   ├── local_converter.py   # In-process Docling converter (local mode)
│   ├── serve_converter.py   # HTTP client for Docling Serve (remote mode)
│   ├── local_chunker.py     # In-process chunking (HierarchicalChunker, HybridChunker)
│   ├── rate_limiter.py      # Sliding-window rate limiting middleware
│   └── bbox.py              # Bbox coordinate normalization helpers
│
├── services/                # Use case orchestration
│   ├── document_service.py  # Upload, delete, preview
│   └── analysis_service.py  # Async Docling processing + chunking
│
└── tests/                   # pytest (199 tests)
```
### Layer responsibilities
| Layer | Role | Depends on |
|-------|------|------------|
| **domain** | Dataclasses, value objects, abstract ports | Nothing (pure Python) |
| **persistence** | SQLite CRUD, aiosqlite | domain (models) |
| **infra** | Adapters: converters, chunker, rate limiter, settings | domain (ports, value objects) |
| **services** | Orchestrate use cases, call converters/chunkers | domain + persistence + infra |
| **api** | HTTP endpoints, Pydantic DTOs, error handling | services |
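
The dependency rule in this table can be illustrated with a minimal sketch. The names `DocumentConverter` and `ConversionResult` come from the tree above; the `convert` signature, the `markdown` field, `FakeConverter`, and `analyze` are assumptions made for illustration and may not match the project's actual code (which is likely asynchronous).

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ConversionResult:
    """Value object returned by a converter (the field is assumed)."""
    markdown: str


class DocumentConverter(Protocol):
    """Port: the domain declares what it needs, not how it is done."""

    def convert(self, path: str) -> ConversionResult: ...


class FakeConverter:
    """Hypothetical adapter standing in for an infra/ implementation."""

    def convert(self, path: str) -> ConversionResult:
        return ConversionResult(markdown=f"# Converted {path}")


def analyze(converter: DocumentConverter, path: str) -> str:
    """Service-level use case: depends only on the port, never on an adapter."""
    return converter.convert(path).markdown
```

Because `analyze` only sees the protocol, the local and remote converters can be swapped without touching the service or domain code.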
### API contract
The API uses **camelCase** serialization (via Pydantic `alias_generator`), while the backend uses **snake_case** internally. The `pages_json` field contains raw `dataclasses.asdict()` output, so page data uses **snake_case** (`page_number`, not `pageNumber`).
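The mixed casing can be surprising, so here is a standard-library sketch that mimics the effect. It does not use Pydantic; `to_camel`, `Page`, `serialize_job`, and the `job_id` field are hypothetical, and exposing `pages_json` as `pagesJson` is an assumption based on the camelCase rule.

```python
from dataclasses import asdict, dataclass


def to_camel(name: str) -> str:
    """Convert snake_case to camelCase, roughly as an alias generator would."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


@dataclass
class Page:
    """Hypothetical page dataclass; only page_number is named in the text."""
    page_number: int


def serialize_job(job_id: str, pages: list) -> dict:
    """Alias the top-level keys, but pass the page dicts through untouched."""
    internal = {"job_id": job_id, "pages_json": [asdict(p) for p in pages]}
    return {to_camel(key): value for key, value in internal.items()}
```

Only the keys handled by the alias step are renamed; the nested dictionaries produced by `dataclasses.asdict()` keep their snake_case keys, which is why clients read `page_number` inside page data.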
## Frontend: Feature-Based
The frontend is organized by feature, each with its own store, API client, and UI components.
```
frontend/src/
├── app/                      # App shell, router, global styles
├── pages/                    # Route-level pages
│   ├── HomePage.vue
│   ├── StudioPage.vue        # PDF viewer + config + results
│   ├── DocumentsPage.vue
│   ├── HistoryPage.vue
│   └── SettingsPage.vue
│
├── features/                 # Feature modules
│   ├── analysis/             # Analysis store, API, bbox scaling, UI
│   │   ├── store.ts
│   │   ├── api.ts
│   │   ├── bboxScaling.ts    # Pure math: page coords → pixel coords
│   │   └── ui/
│   │       ├── BboxOverlay.vue
│   │       ├── AnalysisPanel.vue
│   │       ├── StructureViewer.vue
│   │       └── ...
│   ├── chunking/             # Chunk panel UI + rechunk action
│   ├── document/             # Document store, API, upload
│   ├── feature-flags/        # Feature flag store (reads /api/health)
│   ├── history/              # History store, navigation
│   └── settings/             # Theme, locale, API URL
│
└── shared/                   # Cross-feature utilities
    ├── types.ts              # All shared TypeScript interfaces
    ├── i18n.ts               # FR/EN translations
    ├── format.ts             # Date/size formatters
    └── api/http.ts           # HTTP client (fetch wrapper)
```
### Data flow
```
User action → Pinia store action → API client (fetch) → Backend REST endpoint
                                                                │
Backend response → Pinia store state → Vue reactivity → UI update
```
### Key design decisions
- **Pinia stores** per feature, not global. Each feature owns its state.
- **TypeScript strict mode** with shared interfaces in `shared/types.ts`.
- **No component library**: custom CSS with CSS variables for theming.
- **vue-tsc** in CI to catch type errors before merge.
## Feature Flags
The frontend adapts its UI based on the backend's capabilities. On startup, the feature flag store fetches `/api/health` and reads the `engine` and `deploymentMode` fields.
| Flag | Condition | Effect |
|------|-----------|--------|
| `chunking` | `engine === 'local'` | Shows chunking options in the analysis panel |
| `disclaimer` | `deploymentMode === 'huggingface'` | Shows a disclaimer banner at the top of the app |
This allows the same frontend build to work with both local and remote backends without conditional compilation.
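
The flag table above boils down to two comparisons. The real logic lives in the frontend's TypeScript store; the sketch below mirrors it in Python for consistency with the backend examples, and `evaluate_flags` is a hypothetical name.

```python
def evaluate_flags(health: dict) -> dict:
    """Derive UI feature flags from a parsed /api/health payload."""
    return {
        # Chunking options are shown only when the engine runs in-process.
        "chunking": health.get("engine") == "local",
        # The disclaimer banner is shown only on the Hugging Face deployment.
        "disclaimer": health.get("deploymentMode") == "huggingface",
    }
```

For a self-hosted local backend this yields `chunking` enabled and `disclaimer` disabled; missing fields simply leave both flags off.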
## Rate Limiting
The backend applies a sliding-window rate limiter as middleware:
- **60 requests** per **60 seconds** per client IP
- The `/api/health` endpoint is excluded
- When the limit is exceeded, the API returns `429 Too Many Requests` with a `Retry-After` header
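The core of a sliding-window limiter fits in a few lines. This is a sketch of the general technique under the limits listed above, not the contents of `rate_limiter.py`; the class name, the `check` method, and its return convention are assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Track request timestamps per client and reject bursts over the limit."""

    def __init__(self, limit: int = 60, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # client IP -> request timestamps

    def check(self, client_ip: str, now: Optional[float] = None) -> float:
        """Return 0.0 if the request is allowed, else seconds until retry."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            # The oldest hit leaves the window after this many seconds.
            return self.window - (now - hits[0])
        hits.append(now)
        return 0.0
```

A middleware built on this would skip `/api/health` before calling `check`, and turn a non-zero result into a `429` response whose `Retry-After` header carries the rounded-up wait.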
## Analysis Lifecycle
An analysis job follows this state machine:
```
PENDING → RUNNING → COMPLETED
                  → FAILED
```
| Status | Description |
|--------|-------------|
| `PENDING` | Job created, waiting for a processing slot |
| `RUNNING` | Docling conversion in progress |
| `COMPLETED` | Conversion finished; results available (markdown, HTML, pages, chunks) |
| `FAILED` | Conversion error; `error_message` contains details |
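
The state machine can be written down as a transition table. This sketch assumes `FAILED` is only reached from `RUNNING` (the status descriptions suggest this, but the diagram does not rule out other paths); `JobStatus`, `TRANSITIONS`, and `can_transition` are hypothetical names.

```python
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Allowed transitions; COMPLETED and FAILED are terminal states.
TRANSITIONS = {
    JobStatus.PENDING: {JobStatus.RUNNING},
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """Return True if a job may move from `current` to `target`."""
    return target in TRANSITIONS[current]
```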
The backend limits parallel jobs via `MAX_CONCURRENT_ANALYSES` (default: 3) to avoid overloading the CPU during Docling processing.
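One common way to enforce such a cap in an asyncio application is a semaphore. The sketch below assumes that approach; the source does not say how the backend implements the limit, and `run_all` and its fake job are purely illustrative.

```python
import asyncio

MAX_CONCURRENT_ANALYSES = 3  # default value stated above


async def run_all(n_jobs: int, limit: int = MAX_CONCURRENT_ANALYSES) -> int:
    """Run n_jobs fake conversions; return the peak number running at once."""
    slots = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def job() -> None:
        nonlocal running, peak
        async with slots:  # the job stays PENDING until a slot frees up
            running += 1  # RUNNING
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for the Docling conversion
            running -= 1  # COMPLETED

    await asyncio.gather(*(job() for _ in range(n_jobs)))
    return peak
```

Submitting ten jobs with the default limit never lets more than three run at the same time; the rest wait in `PENDING`.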
## Local vs Remote Mode
The backend supports two conversion engines, selected via the `CONVERSION_ENGINE` environment variable:
| | Local | Remote |
|---|---|---|
| **Engine** | In-process Docling (PyTorch) | HTTP client to [Docling Serve](https://github.com/DS4SD/docling-serve) |
| **Chunking** | Available (in-process) | Not available |
| **Docker image** | `latest-local` (~1.9 GB) | `latest-remote` (~270 MB) |
| **ML models** | Downloaded on first run (~400 MB) | Managed by Docling Serve |
| **CPU/RAM** | 4+ CPUs, 6+ GB RAM | 2 CPUs, 2 GB RAM |
The converter is selected at startup in `main.py` via `_build_converter()`. The chunker (`_build_chunker()`) is only instantiated in local mode; in remote mode, the chunking feature flag is disabled and the UI hides the chunking panel.
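A rough sketch of this startup wiring follows. The three classes are hypothetical stand-ins for the adapters in `infra/`, the accepted values `"local"` and `"remote"` and the `"local"` default are assumptions, and the real `_build_converter()` and `_build_chunker()` may take different arguments.

```python
import os


class LocalConverter:
    """Hypothetical stand-in for the in-process Docling adapter."""


class ServeConverter:
    """Hypothetical stand-in for the Docling Serve HTTP adapter."""


class LocalChunker:
    """Hypothetical stand-in for the in-process chunker."""


def build_converter(engine: str):
    """Pick the converter adapter for the configured engine."""
    if engine == "local":
        return LocalConverter()
    if engine == "remote":
        return ServeConverter()
    raise ValueError(f"Unknown CONVERSION_ENGINE: {engine!r}")


def build_chunker(engine: str):
    """Chunking is only available in local mode; otherwise return None."""
    return LocalChunker() if engine == "local" else None


def wire(env: dict) -> tuple:
    """Build (converter, chunker) from an environment mapping."""
    engine = env.get("CONVERSION_ENGINE", "local")  # default is an assumption
    return build_converter(engine), build_chunker(engine)
```

At startup this would be called as `converter, chunker = wire(dict(os.environ))`; a `None` chunker corresponds to the disabled chunking flag in remote mode.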
## Health Endpoint
`GET /api/health` returns the backend status:
```json
{
"status": "ok",
"engine": "local",
"version": "0.3.0",
"deploymentMode": "self-hosted"
}
```
The frontend uses this response to:
1. Verify the backend is reachable
2. Evaluate feature flags (chunking, disclaimer)
3. Display the app version