Docling Studio
A visual document analysis studio powered by Docling.
Upload a PDF, configure the extraction pipeline, and visualize the results (text, tables, images, formulas, bounding boxes), all from your browser.
Features
- PDF viewer with page navigation, bounding box overlay, and resizable results panel
- Configurable Docling pipeline: OCR, table extraction, code/formula enrichment, picture classification & description, image generation
- Bounding box visualization: color-coded element overlay directly on the PDF
- Chunking: split extracted content into semantic chunks (hierarchical, hybrid, or page-based) with configurable token limits
- Markdown & HTML export of extracted content
- Document management: upload, list, delete
- Analysis history: re-visit and open past analyses
- Feature flags: capabilities adapt to the conversion engine (local vs remote)
- Rate limiting: 60 requests per minute per IP to protect the backend
- Deployment modes: self-hosted (default) or HuggingFace Spaces (with disclaimer banner)
- Health endpoint: `/api/health` reports engine type, deployment mode, and database status
- Dark / Light theme and FR / EN localization
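To make the rate-limiting rule concrete, here is a minimal sketch of a sliding-window limiter that allows 60 requests per minute per client IP. It is an illustration only, not Docling Studio's actual implementation: the class name `SlidingWindowLimiter` and its `allow` method are hypothetical, and the real backend may use a different algorithm or library.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Hypothetical sketch: allow at most `limit` requests per `window` seconds per client."""

    def __init__(self, limit: int = 60, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        # Maps a client IP to the timestamps of its recently accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        """Return True and record the request if the client is under its limit."""
        if now is None:
            now = time.monotonic()
        hits = self._hits[client_ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit; a web backend would typically answer HTTP 429
        hits.append(now)
        return True
```

Each IP gets its own window, so one busy client does not consume another client's budget. Rejected requests are not recorded, which means a client regains capacity as soon as its oldest accepted request leaves the window.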
Tech Stack
| Layer | Stack |
|---|---|
| Frontend | Vue 3, TypeScript, Vite, Pinia |
| Backend | FastAPI, Docling 2.x, SQLite (aiosqlite) |
| CI | GitHub Actions (lint, type-check, test, build) |
| Infra | Docker Compose + Nginx |
Quick Start
# Docker (fastest)
docker run -p 3000:3000 ghcr.io/scub-france/docling-studio:latest-local
Open http://localhost:3000 and upload a PDF.
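To confirm the container is up before uploading, you can query the health endpoint from a script. The sketch below rests on two assumptions not stated above: that `/api/health` is reachable on the same port as the UI (3000), and that it returns a JSON body. The helper names `build_health_url` and `fetch_health` are hypothetical.

```python
import json
import urllib.request


def build_health_url(base_url: str = "http://localhost:3000") -> str:
    """Join the base URL with the /api/health path (same-port routing is an assumption)."""
    return base_url.rstrip("/") + "/api/health"


def fetch_health(base_url: str = "http://localhost:3000", timeout: float = 5.0) -> dict:
    """Fetch the health report, assuming the endpoint responds with JSON."""
    with urllib.request.urlopen(build_health_url(base_url), timeout=timeout) as resp:
        return json.load(resp)


# Example call (requires the container from the command above to be running):
# print(json.dumps(fetch_health(), indent=2))
```

If the call raises a connection error, the container is probably still starting; retry after a few seconds.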
!!! note
    The first analysis takes longer as Docling downloads its ML models (~400 MB). Subsequent runs are fast.
See Getting Started for local development setup.
License
MIT - Pier-Jean Malandrino

