# Docling Studio
A visual document analysis studio powered by [Docling](https://github.com/DS4SD/docling).
Upload a PDF, configure the extraction pipeline, and visualize the results (text, tables, images, formulas, bounding boxes), all from your browser.
![Docling Studio architecture](images/global.png){ width="600" }
![Docling Studio: Execution Result](screenshots/DS-execution-result.png)
## Features
- **PDF viewer** with page navigation, bounding box overlay, and resizable results panel
- **Configurable Docling pipeline**: OCR, table extraction, code/formula enrichment, picture classification & description, image generation
- **Bounding box visualization**: color-coded element overlay directly on the PDF
- **Chunking**: split extracted content into semantic chunks (hierarchical, hybrid, or page-based) with configurable token limits
- **Markdown & HTML export** of extracted content
- **Document management**: upload, list, delete
- **Analysis history**: revisit and reopen past analyses
- **Feature flags**: capabilities adapt to the conversion engine (local vs remote)
- **Rate limiting**: 60 requests per minute per IP to protect the backend
- **Deployment modes**: self-hosted (default) or HuggingFace Spaces (with disclaimer banner)
- **Health endpoint**: `/api/health` reports engine type, deployment mode, and database status
- **Dark / Light theme** and **FR / EN** localization
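
To make the rate-limiting feature concrete, here is a minimal sketch of a per-IP sliding-window limiter using the 60-requests-per-minute figure from the list above. It illustrates the general technique only: the class name `SlidingWindowLimiter`, its parameters, and the in-memory storage are assumptions made for this example and may differ from what Docling Studio actually does.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `limit` requests per `window` seconds per client."""

    def __init__(
        self,
        limit: int = 60,          # requests allowed per window (figure from the feature list)
        window: float = 60.0,     # window length in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        # One queue of request timestamps per client IP.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        """Record the request and return True, or return False if the client is over the limit."""
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A server would typically call `allow(ip)` once per incoming request and answer with HTTP 429 when it returns `False`. The injectable `clock` is there only to make the behavior easy to test.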
## Tech Stack
| Layer | Stack |
|-------|-------|
| **Frontend** | Vue 3, TypeScript, Vite, Pinia |
| **Backend** | FastAPI, Docling 2.x, SQLite (aiosqlite) |
| **CI** | GitHub Actions (lint, type-check, test, build) |
| **Infra** | Docker Compose + Nginx |
## Quick Start
```bash
# Docker (fastest)
docker run -p 3000:3000 ghcr.io/scub-france/docling-studio:latest-local
```
Open [http://localhost:3000](http://localhost:3000) and upload a PDF.
!!! note
    The first analysis takes longer as Docling downloads its ML models (~400 MB). Subsequent runs are fast.
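
If you script the Quick Start, it can help to wait until the backend answers before uploading anything. The sketch below polls the `/api/health` endpoint mentioned under Features. It assumes the endpoint is reachable through the same port as the UI (`http://localhost:3000/api/health`) and that it returns JSON; both are assumptions, and the helper names `fetch_json` and `wait_for_health` are hypothetical.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

# Assumption: the API is exposed on the same port as the UI in the Quick Start container.
HEALTH_URL = "http://localhost:3000/api/health"


def fetch_json(url: str, timeout: float = 2.0) -> Dict[str, Any]:
    """GET `url` and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_health(
    url: str = HEALTH_URL,
    attempts: int = 30,
    delay: float = 1.0,
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Dict[str, Any]]:
    """Poll until the endpoint answers; return its payload, or None after `attempts` failures."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except (OSError, ValueError):
            # urllib's URLError/HTTPError are OSError subclasses;
            # ValueError covers a body that is not valid JSON.
            if attempt < attempts - 1:
                sleep(delay)
    return None
```

Calling `wait_for_health()` after `docker run` returns whatever the endpoint reports once the service is up. The exact field names in that payload are not documented here, so inspect the result rather than relying on specific keys.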
See [Getting Started](getting-started.md) for local development setup.
## License
[MIT](https://github.com/scub-france/Docling-Studio/blob/main/LICENSE) - Pier-Jean Malandrino