| ---
|
| title: Arabic Audio Reader Worker
|
| colorFrom: green
|
| colorTo: green
|
| sdk: docker
|
| app_port: 7860
|
| ---
|
|
|
| # Arabic Audio Reader Worker
|
|
|
| This is the Docker worker bundle for the Arabic PDF Reader.
|
|
|
| ## Hugging Face Space Settings
|
|
|
| - SDK: Docker
|
| - Hardware: free CPU is acceptable for demos, but cold starts and long books can be slow
|
| - Free CPU Basic currently provides 2 vCPU, 16 GB RAM, and 50 GB non-persistent disk by default; treat generated audio as short-lived unless you add persistent/object storage
|
| - Port: 7860
|
| - Default build: installs SILMA, PaddleOCR Arabic, Tesseract Arabic, and eSpeak NG
|
| - Optional fast CPU voice: set Docker build arg `INSTALL_SUPERTONIC=1` to add Supertonic 3 Arabic-capable local TTS
|
| - Stronger OCR build: set one of the Docker build args `INSTALL_TAWKEED_OCR=1`, `INSTALL_KATIB_OCR=1`, `INSTALL_ARABIC_QWEN_OCR=1`, `INSTALL_ARABIC_GLM_OCR=1`, or `INSTALL_BASEER_OCR=1` for an Arabic-trained model, or `INSTALL_QARI_OCR=1` for the heavier Arabic-book model
|
|
|
| Set these Space secrets:
|
|
|
| ```text
|
| ACCESS_CODE=1234
|
| SECRET_KEY=<copy from outputs\deployment-handoff.md>
|
| CORS_ORIGINS=https://your-vercel-app.vercel.app
|
| COOKIE_SAMESITE=none
|
| COOKIE_SECURE=1
|
| OCR_ENGINE=tesseract
|
| OCR_RENDER_ZOOM=2
|
| TESSERACT_PSM=4
|
| DEFAULT_VOICE_ID=silma-local
|
| OUTPUT_RETENTION_DAYS=7
|
| OUTPUT_MAX_FILES=25
|
| AUDIO_FORMAT=mp3
|
| MP3_BITRATE=96k
|
| ```
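The two retention settings above (`OUTPUT_RETENTION_DAYS=7`, `OUTPUT_MAX_FILES=25`) can be read as an age limit plus a count limit on generated audio. The sketch below shows one plausible way such a policy could be applied. It is an assumption for illustration, not the worker's actual cleanup code: the function name `select_files_to_delete` and the order of the two rules are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def select_files_to_delete(files, now, retention_days=7, max_files=25):
    """Return the names of output files a cleanup pass would remove.

    `files` is a list of (name, modified_at) tuples with timezone-aware
    datetimes. Assumed policy (not confirmed by the worker source):
    drop anything older than `retention_days`, then keep only the
    newest `max_files` of what remains.
    """
    cutoff = now - timedelta(days=retention_days)

    # Rule 1: files past the age limit are always removed.
    expired = [name for name, modified_at in files if modified_at < cutoff]

    # Rule 2: among the remaining files, keep the newest `max_files`.
    fresh = sorted(
        (item for item in files if item[1] >= cutoff),
        key=lambda item: item[1],
        reverse=True,
    )
    overflow = [name for name, _ in fresh[max_files:]]

    return expired + overflow


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    sample = [
        ("old.mp3", now - timedelta(days=8)),
        ("a.mp3", now - timedelta(days=1)),
        ("b.mp3", now - timedelta(days=2)),
    ]
    print(select_files_to_delete(sample, now))
```

Because the free Space disk is non-persistent, a policy like this only bounds disk use while the Space is running; it does not make the audio durable.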
|
|
|
| Generate the deployment handoff from the main repo to get the exact `SECRET_KEY`, worker secrets, Vercel environment variables, and final proof command:
|
|
|
| ```powershell
|
| python scripts\deployment_handoff.py https://your-space.hf.space --origin https://your-vercel-app.vercel.app --code 1234
|
| ```
|
|
|
| Keep `outputs\deployment-handoff.md` private because it contains deployment secrets.
|
|
|
| The compact process recommendation is at `docs/recommended-free-stack.md`. The machine-readable deployment decision card is at `docs/recommended-decision-card.json`, with a readable companion at `docs/recommended-decision-card.md`. The current practical default is: PyMuPDF embedded-text extraction first; `OCR_ENGINE=tesseract`, `OCR_RENDER_ZOOM=2`, and `TESSERACT_PSM=4` for scanned pages (the most readable Arabic OCR configuration tested); SILMA TTS as the first clean voice; and downloadable worker audio.
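The "embedded text first, OCR only when needed" default can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the worker's implementation: `extract_page_text`, `tesseract_command`, and the `min_chars` threshold are hypothetical names and values. The Tesseract flags used (`-l ara`, `--psm`) are standard CLI options; rendering the page image at `OCR_RENDER_ZOOM` is left out.

```python
def extract_page_text(embedded_text, run_ocr, min_chars=20):
    """Prefer the PDF's embedded text; fall back to OCR for scanned pages.

    `embedded_text` is whatever text the PDF library returned for the page.
    `run_ocr` is a zero-argument callable that is only invoked when needed.
    `min_chars` is an assumed threshold for "this page has real text".
    Returns a (text, source) tuple where source is "embedded" or "ocr".
    """
    text = embedded_text.strip()
    if len(text) >= min_chars:
        return text, "embedded"
    return run_ocr().strip(), "ocr"


def tesseract_command(image_path, psm=4):
    """Build a Tesseract CLI call for Arabic, matching TESSERACT_PSM=4."""
    return ["tesseract", image_path, "stdout", "-l", "ara", "--psm", str(psm)]


if __name__ == "__main__":
    # A page with an embedded text layer never triggers the OCR callable.
    print(extract_page_text("x" * 40, lambda: "unused"))
    # A scanned page (no text layer) falls through to OCR.
    print(extract_page_text("", lambda: "recognized text"))
    print(tesseract_command("page-001.png"))
```

Passing OCR in as a callable keeps the expensive step lazy, so pages with a usable text layer skip rendering and recognition entirely.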
|
|
|
| Optional stronger-worker build args:
|
|
|
| ```text
|
| INSTALL_QARI_OCR=1
|
| INSTALL_TAWKEED_OCR=1
|
| INSTALL_KATIB_OCR=1
|
| INSTALL_ARABIC_QWEN_OCR=1
|
| INSTALL_ARABIC_GLM_OCR=1
|
| INSTALL_BASEER_OCR=1
|
| INSTALL_PADDLEOCR_VL=1
|
| INSTALL_SUPERTONIC=1
|
| ```
|
|
|
| Use `INSTALL_TAWKEED_OCR=1`, `INSTALL_KATIB_OCR=1`, `INSTALL_ARABIC_QWEN_OCR=1`, `INSTALL_ARABIC_GLM_OCR=1`, or `INSTALL_BASEER_OCR=1` first when you want an Arabic-trained OCR model. Use `INSTALL_QARI_OCR=1` when you want the strongest Arabic-book OCR and the worker has enough memory/GPU. Leave heavy options at `0` on free CPU Spaces unless a short benchmark proves the stronger model is worth the cold start, build time, memory, and runtime.
|
|
|
| After the Space builds, verify it from your main repo:
|
|
|
| ```powershell
|
| python scripts\verify_worker.py https://your-space.hf.space --code 1234 --origin https://your-vercel-app.vercel.app --require-cors --smoke-upload --smoke-scanned --smoke-ocr-engine arabic
|
| ```
|
|
|