Achim Rabus
---
title: Polyscriptor HTR Demo
emoji: 📝
colorFrom: blue
colorTo: gray
sdk: docker
pinned: false
license: apache-2.0
---

Polyscriptor HTR Demo

Polyscriptor is a browser-based demo for handwritten text recognition (HTR) on historical Slavic manuscript material. This Hugging Face Space runs a constrained public version of the Polyscriptor web interface, which is built on FastAPI.

The hosted demo is intended for quick inspection and teaching. It is not the full local research environment used for training, batch processing, GPU inference, or private manuscript collections.

Source Code

The public Polyscriptor source code is available on GitHub:

https://github.com/achimrabus/polyscriptor

This Hugging Face Space contains the curated hosted demo deployment. The GitHub repository contains the broader Polyscriptor codebase, including the web UI, engine plugins, segmentation code, training utilities, and local workflows.

What This Demo Supports

  • CRNN-CTC / PyLaia-inspired HTR presets for selected public model repositories.
  • User-supplied API keys for OpenAI, Gemini, Claude, and OpenWebUI-compatible endpoints.
  • Public model download from the Hugging Face Hub, primarily under achimrabus/*.
  • CPU-only inference.
  • Kraken Classical line segmentation, with HPP (horizontal projection profile) segmentation as a lightweight fallback.
  • Temporary image uploads during the active session.
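
To illustrate the lightweight segmentation fallback in the list above: assuming HPP refers to a horizontal projection profile (its usual meaning in line segmentation), the idea can be sketched as follows. The function name `segment_lines_hpp` and the `min_ink` and `min_height` parameters are illustrative choices, not Polyscriptor's actual implementation, and real pages would also need binarization and noise handling first.

```python
def segment_lines_hpp(binary, min_ink=1, min_height=2):
    """Return (top, bottom) row ranges of text lines in a binarized page.

    `binary` is a sequence of rows; ink pixels are 1, background is 0.
    `bottom` is exclusive, so rows top..bottom-1 belong to the line.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in binary]

    lines = []
    start = None  # first row of the line currently being tracked
    for row_index, ink in enumerate(profile):
        has_ink = ink >= min_ink
        if has_ink and start is None:
            start = row_index  # a new text line begins
        elif not has_ink and start is not None:
            # A gap closes the current line; drop very thin bands as noise.
            if row_index - start >= min_height:
                lines.append((start, row_index))
            start = None

    # Close a line that runs to the bottom edge of the page.
    if start is not None and len(profile) - start >= min_height:
        lines.append((start, len(profile)))
    return lines


# Toy page: 12 rows x 20 columns with two horizontal bands of ink.
page = [[0] * 20 for _ in range(12)]
for r in range(1, 4):
    page[r][2:18] = [1] * 16
for r in range(7, 10):
    page[r][3:15] = [1] * 12

lines = segment_lines_hpp(page)
```

A profile-based split like this is cheap and needs no model, which is why it suits a CPU-only fallback, but it assumes roughly horizontal, non-overlapping lines and will struggle with skewed or densely written pages.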

Limitations

  • No private models are bundled with this Space.
  • API-based engines require users to paste their own API key in the browser form. The Space does not ship with shared provider credentials.
  • Uploaded files are treated as temporary runtime data and are not part of the repository.
  • Large local GPU/VLM engines from the full Polyscriptor workflow are not enabled here.
  • Accuracy depends strongly on script, language, writing style, image quality, and segmentation quality.

Model Notes

The demo uses publicly available model presets. For best results, choose a model that matches the manuscript tradition as closely as possible. The current public Polyscriptor model cards are available at:

https://huggingface.co/achimrabus
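
For local experiments, fetching one of these public model repositories could look roughly like the sketch below. It uses `snapshot_download` from the `huggingface_hub` package; the helper names `parse_repo_id` and `fetch_public_model` are hypothetical, and this is not necessarily how the demo itself loads its presets.

```python
from typing import Tuple


def parse_repo_id(repo_id: str) -> Tuple[str, str]:
    """Split 'namespace/name' and reject anything else."""
    parts = repo_id.strip().split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'namespace/name', got {repo_id!r}")
    return parts[0], parts[1]


def fetch_public_model(repo_id: str) -> str:
    """Download a public model repository and return its local folder."""
    parse_repo_id(repo_id)  # fail early on malformed input

    # Imported lazily so parse_repo_id works without the dependency.
    from huggingface_hub import snapshot_download

    # No token is passed explicitly; public repositories do not need one.
    return snapshot_download(repo_id=repo_id)
```

Calling `fetch_public_model` with a repository id taken from the model cards page above downloads the files into the local Hugging Face cache and returns that folder's path.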

Project Context

Polyscriptor is developed for historical HTR workflows, with a focus on Slavic manuscripts and reproducible comparison of OCR/HTR engines. The full development repository contains additional tooling for local use, training, evaluation, and batch processing; this Space contains only the hosted demo configuration.

Privacy

Do not upload sensitive or unpublished manuscript images unless you are comfortable processing them in a hosted public demo environment. The application uses temporary server-side files during processing, but this Space should be treated as a public demonstration service rather than a secure private workflow.
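
The temporary-file handling described above can be sketched with Python's standard `tempfile` module. This is a minimal illustration under assumed names (`process_upload`, and a `recognize` callback standing in for whichever engine is selected), not the application's actual code.

```python
import tempfile
from pathlib import Path


def process_upload(data: bytes, filename: str, recognize) -> str:
    """Write an upload to a scratch directory, run recognition, clean up."""
    # Keep only the extension; never reuse a client-supplied path as-is.
    suffix = Path(filename).suffix

    # The directory and everything in it is deleted when the block exits,
    # including when `recognize` raises an exception.
    with tempfile.TemporaryDirectory(prefix="htr_upload_") as tmp:
        image_path = Path(tmp) / f"upload{suffix}"
        image_path.write_bytes(data)
        return recognize(image_path)
```

Cleanup of this kind limits how long an image stays on disk, but it does not make a hosted demo a private environment: the image is still transmitted to and processed on a shared server.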

For API-based engines, provider keys are entered by the user at runtime. Do not commit keys to this repository or add them to the Space configuration unless you intend to provide a shared project credential.
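
One way to keep keys out of the repository while still allowing an optional shared credential is to resolve the key at request time. The sketch below is hypothetical: the function name `resolve_api_key` and the variable name `SHARED_PROVIDER_KEY` are assumptions for illustration. Hugging Face Spaces expose configured secrets to the running container as environment variables, which is what the fallback relies on.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    user_key: Optional[str],
    env: Mapping[str, str] = os.environ,
    env_var: str = "SHARED_PROVIDER_KEY",  # hypothetical secret name
) -> str:
    """Prefer the key typed by the user; fall back to a shared secret."""
    key = (user_key or "").strip()
    if key:
        return key

    # Only present if a maintainer deliberately configured a shared secret.
    shared = env.get(env_var, "").strip()
    if shared:
        return shared

    raise ValueError("No API key supplied and no shared credential configured.")
```

With no shared secret configured, which matches the setup this README describes, the function simply requires the user's own key and raises an error otherwise.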