title: Arabic Audio Reader Worker
colorFrom: green
colorTo: green
sdk: docker
app_port: 7860

Arabic Audio Reader Worker

This is the Docker worker bundle for the Arabic PDF Reader.

Hugging Face Space Settings

  • SDK: Docker
  • Hardware: free CPU is acceptable for demos, but cold starts and long books can be slow
  • Free CPU Basic currently provides 2 vCPU, 16 GB RAM, and 50 GB non-persistent disk by default; treat generated audio as short-lived unless you add persistent/object storage
  • Port: 7860
  • Default build: installs SILMA, PaddleOCR Arabic, Tesseract Arabic, and eSpeak NG
  • Optional fast CPU voice: set Docker build arg INSTALL_SUPERTONIC=1 to add Supertonic 3 Arabic-capable local TTS
  • Stronger OCR build: set Docker build arg INSTALL_TAWKEED_OCR=1, INSTALL_KATIB_OCR=1, INSTALL_ARABIC_QWEN_OCR=1, INSTALL_ARABIC_GLM_OCR=1, or INSTALL_BASEER_OCR=1 for Arabic-trained models, or INSTALL_QARI_OCR=1 for the heavier Arabic-book model

Set these Space secrets:

ACCESS_CODE=1234
SECRET_KEY=<copied from outputs\deployment-handoff.md>
CORS_ORIGINS=https://your-vercel-app.vercel.app
COOKIE_SAMESITE=none
COOKIE_SECURE=1
OCR_ENGINE=tesseract
OCR_RENDER_ZOOM=2
TESSERACT_PSM=4
DEFAULT_VOICE_ID=silma-local
OUTPUT_RETENTION_DAYS=7
OUTPUT_MAX_FILES=25
AUDIO_FORMAT=mp3
MP3_BITRATE=96k
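As a rough illustration of how these values fit together, the sketch below parses the non-secret settings from an environment mapping. It is not the worker's actual code: the `WorkerSettings` class, the `load_settings` function, the fallback defaults, and the comma-separated format for CORS_ORIGINS are all assumptions made for this example. The one check it enforces is a browser rule: a cookie with SameSite=None is rejected unless it is also Secure, which is why COOKIE_SAMESITE=none is paired with COOKIE_SECURE=1 above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerSettings:
    """Hypothetical container for the non-secret settings listed above."""
    cors_origins: list
    cookie_samesite: str
    cookie_secure: bool
    ocr_engine: str
    ocr_render_zoom: float
    tesseract_psm: int
    output_retention_days: int
    output_max_files: int
    audio_format: str
    mp3_bitrate: str


def load_settings(env=None):
    """Parse settings from a mapping; falls back to os.environ when env is None."""
    env = os.environ if env is None else env
    samesite = env.get("COOKIE_SAMESITE", "none").lower()
    secure = env.get("COOKIE_SECURE", "1") == "1"
    # Browsers reject SameSite=None cookies that are not also marked Secure.
    if samesite == "none" and not secure:
        raise ValueError("COOKIE_SAMESITE=none requires COOKIE_SECURE=1")
    # Assumption: multiple origins, if any, are separated by commas.
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    return WorkerSettings(
        cors_origins=origins,
        cookie_samesite=samesite,
        cookie_secure=secure,
        ocr_engine=env.get("OCR_ENGINE", "tesseract"),
        ocr_render_zoom=float(env.get("OCR_RENDER_ZOOM", "2")),
        tesseract_psm=int(env.get("TESSERACT_PSM", "4")),
        output_retention_days=int(env.get("OUTPUT_RETENTION_DAYS", "7")),
        output_max_files=int(env.get("OUTPUT_MAX_FILES", "25")),
        audio_format=env.get("AUDIO_FORMAT", "mp3"),
        mp3_bitrate=env.get("MP3_BITRATE", "96k"),
    )
```

Passing a plain dict instead of relying on os.environ keeps the parsing easy to test locally before the values are entered as Space secrets.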

Generate the deployment handoff from the main repo to get the exact SECRET_KEY, worker secrets, Vercel environment variables, and final proof command:

python scripts\deployment_handoff.py https://your-space.hf.space --origin https://your-vercel-app.vercel.app --code 1234

Keep outputs\deployment-handoff.md private because it contains deployment secrets.

The compact process recommendation is in docs/recommended-free-stack.md. The machine-readable deployment decision card is at docs/recommended-decision-card.json, with a readable companion at docs/recommended-decision-card.md. The current practical default is: PyMuPDF embedded text first; OCR_ENGINE=tesseract, OCR_RENDER_ZOOM=2, and TESSERACT_PSM=4 for the most readable tested OCR of scanned Arabic; SILMA TTS for the first clean voice; and downloadable worker audio.
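The "embedded text first, OCR second" default can be sketched roughly as follows. This is an illustration under stated assumptions, not the worker's implementation: the helper names (`needs_ocr`, `tesseract_config`, `extract_page_text`), the 20-character threshold, and the use of the pytesseract and Pillow packages are choices made for this example. The PyMuPDF calls (`page.get_text()`, `page.get_pixmap(matrix=...)`) and the Tesseract language code `ara` are real.

```python
def needs_ocr(embedded_text, min_chars=20):
    """Treat a page as scanned when it carries almost no embedded text.

    The 20-character threshold is an arbitrary assumption for this sketch.
    """
    return len(embedded_text.strip()) < min_chars


def tesseract_config(psm=4):
    """Build the Tesseract option string for a page segmentation mode."""
    return f"--psm {psm}"


def extract_page_text(page, zoom=2.0, psm=4):
    """Return embedded text when present, otherwise OCR a rendered page image.

    `page` is expected to be a PyMuPDF page object.
    """
    text = page.get_text()
    if not needs_ocr(text):
        return text

    # Heavy imports are deferred so text-only PDFs never need the OCR stack.
    import fitz  # PyMuPDF
    import pytesseract
    from PIL import Image

    # Render at `zoom` times the base resolution (mirrors OCR_RENDER_ZOOM=2).
    pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
    image = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
    return pytesseract.image_to_string(image, lang="ara", config=tesseract_config(psm))
```

A caller could open a document with `fitz.open(path)` and join the results of `extract_page_text(page)` over its pages. Skipping OCR for pages that already have a text layer avoids both the rendering cost and the recognition errors OCR would introduce.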

Optional stronger-worker build args:

INSTALL_QARI_OCR=1
INSTALL_TAWKEED_OCR=1
INSTALL_KATIB_OCR=1
INSTALL_ARABIC_QWEN_OCR=1
INSTALL_ARABIC_GLM_OCR=1
INSTALL_BASEER_OCR=1
INSTALL_PADDLEOCR_VL=1
INSTALL_SUPERTONIC=1

Use INSTALL_TAWKEED_OCR=1, INSTALL_KATIB_OCR=1, INSTALL_ARABIC_QWEN_OCR=1, INSTALL_ARABIC_GLM_OCR=1, or INSTALL_BASEER_OCR=1 first when you want an Arabic-trained OCR model. Use INSTALL_QARI_OCR=1 when you want the strongest Arabic-book OCR and the worker has enough memory/GPU. Leave heavy options at 0 on free CPU Spaces unless a short benchmark proves the stronger model is worth the cold start, build time, memory, and runtime.
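For a short local benchmark before enabling a heavy option on the Space, a helper along these lines can assemble the `docker build` command. It is a sketch only: the function name, the image tag `arabic-audio-worker`, and the choice to pass only the enabled options are assumptions for this example. `--build-arg NAME=VALUE` is the standard Docker flag for setting a build argument.

```python
import shlex

# Build args listed in this README; anything else is rejected as a likely typo.
KNOWN_BUILD_ARGS = (
    "INSTALL_QARI_OCR",
    "INSTALL_TAWKEED_OCR",
    "INSTALL_KATIB_OCR",
    "INSTALL_ARABIC_QWEN_OCR",
    "INSTALL_ARABIC_GLM_OCR",
    "INSTALL_BASEER_OCR",
    "INSTALL_PADDLEOCR_VL",
    "INSTALL_SUPERTONIC",
)


def docker_build_command(enabled, tag="arabic-audio-worker", context="."):
    """Return the argv list for a local build with the given options set to 1."""
    unknown = sorted(set(enabled) - set(KNOWN_BUILD_ARGS))
    if unknown:
        raise ValueError(f"unknown build args: {', '.join(unknown)}")
    cmd = ["docker", "build", "-t", tag]
    # Keep a stable order so repeated runs produce the same command line.
    for name in KNOWN_BUILD_ARGS:
        if name in enabled:
            cmd += ["--build-arg", f"{name}=1"]
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command for one Arabic-trained OCR model.
    print(shlex.join(docker_build_command(["INSTALL_TAWKEED_OCR"])))
```

Building one option at a time makes it easier to attribute any change in image size, build time, or memory use to a specific model.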

After the Space builds, verify it from your main repo:

python scripts\verify_worker.py https://your-space.hf.space --code 1234 --origin https://your-vercel-app.vercel.app --require-cors --smoke-upload --smoke-scanned --smoke-ocr-engine arabic
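If the CORS check fails, it can help to inspect the response headers by hand. The sketch below is not the logic of verify_worker.py, which is not shown here; the function names are hypothetical and the URL to probe is left to the caller. It encodes the general browser rule for credentialed cross-origin requests: the response must echo the exact requesting origin in Access-Control-Allow-Origin (a wildcard is not accepted) and must set Access-Control-Allow-Credentials to `true`.

```python
import urllib.request


def cors_allows_credentials(headers, origin):
    """Check response headers against the rule for credentialed CORS requests."""
    normalized = {key.lower(): value.strip() for key, value in headers.items()}
    return (
        normalized.get("access-control-allow-origin") == origin
        and normalized.get("access-control-allow-credentials") == "true"
    )


def fetch_cors_headers(url, origin, timeout=10.0):
    """Send a GET with an Origin header and return the response headers as a dict."""
    request = urllib.request.Request(url, headers={"Origin": origin})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

For example, `cors_allows_credentials(fetch_cors_headers(space_url, app_origin), app_origin)` should be true when `app_origin` matches the value configured in CORS_ORIGINS.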