---
title: Arabic Audio Reader Worker
colorFrom: green
colorTo: green
sdk: docker
app_port: 7860
---
# Arabic Audio Reader Worker
This is the Docker worker bundle for the Arabic PDF Reader.
## Hugging Face Space Settings
- SDK: Docker
- Hardware: free CPU is acceptable for demos, but cold starts and long books can be slow
- Free CPU Basic currently provides 2 vCPU, 16 GB RAM, and 50 GB non-persistent disk by default; treat generated audio as short-lived unless you add persistent/object storage
- Port: 7860
- Default build: installs SILMA, PaddleOCR Arabic, Tesseract Arabic, and eSpeak NG
- Optional fast CPU voice: set Docker build arg `INSTALL_SUPERTONIC=1` to add Supertonic 3 Arabic-capable local TTS
- Stronger OCR build: set Docker build arg `INSTALL_TAWKEED_OCR=1`, `INSTALL_KATIB_OCR=1`, `INSTALL_ARABIC_QWEN_OCR=1`, `INSTALL_ARABIC_GLM_OCR=1`, or `INSTALL_BASEER_OCR=1` for Arabic-trained models, or `INSTALL_QARI_OCR=1` for the heavier Arabic-book model
Set these Space secrets:
```text
ACCESS_CODE=1234
SECRET_KEY=<copy from outputs\deployment-handoff.md>
CORS_ORIGINS=https://your-vercel-app.vercel.app
COOKIE_SAMESITE=none
COOKIE_SECURE=1
OCR_ENGINE=tesseract
OCR_RENDER_ZOOM=2
TESSERACT_PSM=4
DEFAULT_VOICE_ID=silma-local
OUTPUT_RETENTION_DAYS=7
OUTPUT_MAX_FILES=25
AUDIO_FORMAT=mp3
MP3_BITRATE=96k
```
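For orientation, here is a minimal sketch of how a worker could read the secrets above from its environment. The `WorkerSettings` class and `load_settings` function are hypothetical names for illustration, not the worker's actual code, and the fallback defaults simply mirror the values listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerSettings:
    """Hypothetical container for the settings listed above."""
    ocr_engine: str
    ocr_render_zoom: float
    tesseract_psm: int
    default_voice_id: str
    output_retention_days: int
    output_max_files: int
    audio_format: str
    mp3_bitrate: str
    cookie_secure: bool


def load_settings(env=None):
    """Parse settings from a mapping; uses os.environ when none is given.

    The defaults are assumptions that mirror the example values above.
    """
    env = os.environ if env is None else env
    return WorkerSettings(
        ocr_engine=env.get("OCR_ENGINE", "tesseract"),
        ocr_render_zoom=float(env.get("OCR_RENDER_ZOOM", "2")),
        tesseract_psm=int(env.get("TESSERACT_PSM", "4")),
        default_voice_id=env.get("DEFAULT_VOICE_ID", "silma-local"),
        output_retention_days=int(env.get("OUTPUT_RETENTION_DAYS", "7")),
        output_max_files=int(env.get("OUTPUT_MAX_FILES", "25")),
        audio_format=env.get("AUDIO_FORMAT", "mp3"),
        mp3_bitrate=env.get("MP3_BITRATE", "96k"),
        # Treat only the literal "1" as enabled, matching COOKIE_SECURE=1 above.
        cookie_secure=env.get("COOKIE_SECURE", "1") == "1",
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the parsing easy to check locally before the values are set as Space secrets.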
Generate the deployment handoff from the main repo to get the exact `SECRET_KEY`, worker secrets, Vercel environment variables, and final verification command:
```powershell
python scripts\deployment_handoff.py https://your-space.hf.space --origin https://your-vercel-app.vercel.app --code 1234
```
Keep `outputs\deployment-handoff.md` private because it contains deployment secrets.
A compact stack recommendation is in `docs/recommended-free-stack.md`. The machine-readable deployment decision card is at `docs/recommended-decision-card.json`, with a human-readable companion at `docs/recommended-decision-card.md`. The current practical default is: PyMuPDF embedded text first; `OCR_ENGINE=tesseract`, `OCR_RENDER_ZOOM=2`, and `TESSERACT_PSM=4` as the most readable tested configuration for scanned Arabic pages; SILMA TTS for the first clean voice; and downloadable worker audio.
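The "embedded text first" default can be sketched as a per-page decision. This is only an illustration under stated assumptions: `page_text`, `run_ocr`, and the `min_chars` threshold are hypothetical, and the worker's real logic may differ. In practice `embedded_text` might come from PyMuPDF's `page.get_text()`, and `run_ocr` might render the page at zoom 2 and run Tesseract with page segmentation mode 4.

```python
from typing import Callable, Tuple


def page_text(
    embedded_text: str,
    run_ocr: Callable[[], str],
    min_chars: int = 20,
) -> Tuple[str, str]:
    """Return (text, source) for one page.

    Prefer the PDF's embedded text layer; call OCR only when that layer
    is missing or too short to be useful. min_chars is an assumed cutoff.
    """
    cleaned = embedded_text.strip()
    if len(cleaned) >= min_chars:
        return cleaned, "embedded"
    # The OCR callable is invoked lazily, so text-based pages never pay for it.
    return run_ocr().strip(), "ocr"
```

Taking the OCR step as a callable means pages with a usable text layer skip rendering and recognition entirely, which matters on a free CPU Space.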
Optional stronger-worker build args:
```text
INSTALL_QARI_OCR=1
INSTALL_TAWKEED_OCR=1
INSTALL_KATIB_OCR=1
INSTALL_ARABIC_QWEN_OCR=1
INSTALL_ARABIC_GLM_OCR=1
INSTALL_BASEER_OCR=1
INSTALL_PADDLEOCR_VL=1
INSTALL_SUPERTONIC=1
```
Use `INSTALL_TAWKEED_OCR=1`, `INSTALL_KATIB_OCR=1`, `INSTALL_ARABIC_QWEN_OCR=1`, `INSTALL_ARABIC_GLM_OCR=1`, or `INSTALL_BASEER_OCR=1` first when you want an Arabic-trained OCR model. Use `INSTALL_QARI_OCR=1` when you want the strongest Arabic-book OCR and the worker has enough memory/GPU. Leave heavy options at `0` on free CPU Spaces unless a short benchmark proves the stronger model is worth the cold start, build time, memory, and runtime.
After the Space builds, verify it from your main repo:
```powershell
python scripts\verify_worker.py https://your-space.hf.space --code 1234 --origin https://your-vercel-app.vercel.app --require-cors --smoke-upload --smoke-scanned --smoke-ocr-engine arabic
```
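What `--require-cors` checks internally is not shown here. As a rough sketch of the kind of check involved, the hypothetical helper below tests whether a response's headers would let a browser send cookies from the front-end origin, which is what `COOKIE_SAMESITE=none` with a cross-site origin depends on.

```python
from typing import Mapping


def cors_allows_credentials(headers: Mapping[str, str], origin: str) -> bool:
    """Check response headers for credentialed cross-origin access.

    Browsers require Access-Control-Allow-Origin to equal the exact
    requesting origin (a "*" wildcard is rejected) and
    Access-Control-Allow-Credentials to be "true".
    """
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    return (
        lowered.get("access-control-allow-origin") == origin
        and lowered.get("access-control-allow-credentials") == "true"
    )
```

A wildcard `Access-Control-Allow-Origin: *` fails this check, which is why `CORS_ORIGINS` needs to list the exact front-end origin.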