
Docling Studio

A visual document analysis studio powered by Docling.

Upload a PDF, configure the extraction pipeline, and visualize the results (text, tables, images, formulas, bounding boxes), all from your browser.

[Figure: Docling Studio architecture]

[Screenshot: Docling Studio execution result]

Features

  • PDF viewer with page navigation, bounding box overlay, and resizable results panel
  • Configurable Docling pipeline: OCR, table extraction, code/formula enrichment, picture classification & description, image generation
  • Bounding box visualization: color-coded element overlay directly on the PDF
  • Chunking: split extracted content into semantic chunks (hierarchical, hybrid, or page-based) with configurable token limits
  • Markdown & HTML export of extracted content
  • Document management: upload, list, delete
  • Analysis history: revisit and reopen past analyses
  • Feature flags: capabilities adapt to the conversion engine (local vs. remote)
  • Rate limiting: 60 requests per minute per IP to protect the backend
  • Deployment modes: self-hosted (default) or HuggingFace Spaces (with disclaimer banner)
  • Health endpoint: /api/health reports engine type, deployment mode, and database status
  • Dark / Light theme and FR / EN localization
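
Scripts that call the API in bulk may want to stay under the documented limit of 60 requests per minute per IP. The following is a minimal client-side sketch of a sliding-window throttle. It is an illustration only: the class name `SlidingWindowThrottle` is hypothetical, and nothing here describes how the Docling Studio backend itself enforces the limit.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side throttle (hypothetical helper, not part of Docling Studio).

    Allows at most `max_requests` calls within any `window_seconds` span.
    """

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._sent = deque()         # timestamps of requests still inside the window

    def try_acquire(self) -> bool:
        """Return True and record the request if it fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            self._sent.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each request and sleep briefly when it returns `False`.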

Tech Stack

| Layer    | Stack                                          |
|----------|------------------------------------------------|
| Frontend | Vue 3, TypeScript, Vite, Pinia                 |
| Backend  | FastAPI, Docling 2.x, SQLite (aiosqlite)       |
| CI       | GitHub Actions (lint, type-check, test, build) |
| Infra    | Docker Compose + Nginx                         |

Quick Start

# Docker (fastest)
docker run -p 3000:3000 ghcr.io/scub-france/docling-studio:latest-local

Open http://localhost:3000 and upload a PDF.

Note: The first analysis takes longer because Docling downloads its ML models (~400 MB). Subsequent runs are fast.
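
To confirm from a script that the container is up, one option is to query the /api/health endpoint listed under Features. The sketch below assumes the endpoint is served on the same host and port as the UI (http://localhost:3000) and that it returns a JSON body. The response field names are not documented on this page, so the code returns the body as-is rather than reading specific keys. The function names are illustrative.

```python
import json
import urllib.request

# Assumption: the API is reachable on the same port as the Quick Start UI.
BASE_URL = "http://localhost:3000"


def health_url(base_url: str) -> str:
    """Build the health endpoint URL from a base URL."""
    return base_url.rstrip("/") + "/api/health"


def fetch_health(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """Return the parsed body of /api/health (assumed to be JSON)."""
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the container running, `print(fetch_health())` shows whatever the endpoint reports. A connection error usually means the container has not finished starting or the port mapping differs.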

See Getting Started for local development setup.

License

MIT (Pier-Jean Malandrino)