---
title: ISP Handbook OCR Engine
emoji: 📄
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# ISP Handbook OCR Engine
Extracts structured content from uploaded handbook PDFs using a hybrid
text + OCR pipeline. Supports table detection, real-time editing, and
multi-format export (PDF, DOCX, HTML, JSON).
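
A "hybrid text + OCR" pipeline usually means preferring a page's embedded text layer and falling back to OCR only when that layer is missing or too sparse. The sketch below illustrates that per-page decision under assumptions: `extract_page_text`, `run_ocr`, and the `MIN_TEXT_CHARS` threshold are hypothetical names and values, not taken from this project's code.

```python
from typing import Callable, Tuple

# Hypothetical threshold: pages with fewer embedded characters than this
# are treated as scanned images and sent to OCR.
MIN_TEXT_CHARS = 20


def extract_page_text(
    embedded_text: str,
    run_ocr: Callable[[], str],
    min_chars: int = MIN_TEXT_CHARS,
) -> Tuple[str, str]:
    """Return (text, source), where source is "text" or "ocr".

    `embedded_text` is whatever the PDF text layer yielded for the page.
    `run_ocr` is a caller-supplied function that OCRs the same page; it is
    only invoked when the text layer looks empty or too short.
    """
    cleaned = embedded_text.strip()
    if len(cleaned) >= min_chars:
        return cleaned, "text"
    return run_ocr().strip(), "ocr"
```

Passing OCR as a callable keeps the expensive step lazy, so pages with a usable text layer never trigger it.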
## Endpoints
| Method | Path                  | Description                       |
|--------|-----------------------|-----------------------------------|
| GET    | `/`                   | Health probe                      |
| GET    | `/health`             | Detailed health check             |
| GET    | `/docs`               | Swagger UI                        |
| POST   | `/extract`            | Plain text extraction             |
| POST   | `/extract-structured` | Structured extraction with tables |
| POST   | `/export`             | Export edited content to file     |
| POST   | `/save`               | Persist to platform database      |
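
As a rough illustration of calling the extraction endpoints, the sketch below builds a multipart upload request using only the Python standard library. Several details are assumptions, not documented behavior: the form field name `file`, the `localhost` host (only port 7860 comes from the front matter), and the helper names `build_multipart` and `build_extract_request`. Check `/docs` for the actual request schema.

```python
import urllib.request
import uuid

# Assumption: the service runs locally; 7860 matches app_port above.
BASE_URL = "http://localhost:7860"


def build_multipart(field_name, filename, data, content_type="application/pdf"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_extract_request(pdf_bytes, filename="handbook.pdf",
                          structured=True, base_url=BASE_URL):
    """Build a POST request for /extract-structured or /extract."""
    path = "/extract-structured" if structured else "/extract"
    # Assumption: the upload field is named "file".
    body, ctype = build_multipart("file", filename, pdf_bytes)
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": ctype},
        method="POST",
    )


if __name__ == "__main__":
    # Requires a running instance and a local handbook.pdf.
    with open("handbook.pdf", "rb") as fh:
        request = build_extract_request(fh.read())
    with urllib.request.urlopen(request) as response:
        print(response.status, response.read()[:200])
```

Separating request construction from sending makes it possible to inspect the URL, headers, and body without a live server.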