Verus-OCR

Verus-OCR is an OCR model project by 8F-ai, trained from scratch.

Status: the training code is staged, but the first Hugging Face Jobs launch on H200 hardware was blocked because the active account did not have enough prepaid Jobs credits.

Requested first run:

  • Hardware: Hugging Face Jobs h200
  • Duration: 1-hour job timeout, 52-minute training budget
  • Dataset: Teklia/IAM-line, a real handwritten OCR dataset
  • Architecture: VerusOCRForCasualLM
  • Target size: about 500M parameters
  • max_position_embeddings: 8190

Run after adding Jobs credits:

hf jobs uv run \
  --flavor h200 \
  --timeout 1h \
  --secrets HF_TOKEN \
  --label project=verus-ocr \
  --label model=Verus-OCR \
  --detach \
  train_verus_ocr.py \
  --repo-id SonYiHF/Verus-OCR \
  --run-minutes 52 \
  --max-steps 1600 \
  --dataset-name Teklia/IAM-line \
  --max-position-embeddings 8190

Use --repo-id 8F-ai/Verus-OCR once the active token has write access to the 8F-ai namespace.
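
The command pairs a 1-hour job timeout with --run-minutes 52 and --max-steps 1600, which leaves about 8 minutes of headroom for setup, checkpointing, and upload. The sketch below illustrates one way such a budget could be enforced, assuming training stops at whichever limit is reached first. The internals of train_verus_ocr.py are not shown in this card, so the function name train_with_budget and its parameters are hypothetical.

```python
import time
from typing import Callable


def train_with_budget(
    step_fn: Callable[[int], None],
    max_steps: int,
    run_minutes: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run step_fn until max_steps or the wall-clock budget is exhausted.

    Hypothetical helper: an assumption about how --run-minutes and
    --max-steps might interact, not the project's actual training loop.
    Returns the number of steps completed.
    """
    deadline = clock() + run_minutes * 60.0
    steps_done = 0
    # The deadline is checked before each step, so a step that has already
    # started is allowed to finish even if it runs slightly past the budget.
    while steps_done < max_steps and clock() < deadline:
        step_fn(steps_done)
        steps_done += 1
    return steps_done


# Average wall-clock time per step needed to finish all 1600 steps
# within the 52-minute budget: 52 * 60 / 1600 = 1.95 seconds.
seconds_per_step = 52 * 60 / 1600
```

Under this assumption, a run averaging more than about 1.95 seconds per step would end on the time budget before reaching step 1600, while a faster run would stop at the step limit.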
