	---
	title: Arabic Audio Reader Worker
	colorFrom: green
	colorTo: green
	sdk: docker
	app_port: 7860
	---

	# Arabic Audio Reader Worker

	This is the Docker worker bundle for the Arabic PDF Reader.

	## Hugging Face Space Settings

	- SDK: Docker
	- Hardware: the free CPU tier is acceptable for demos, but expect slow cold starts and slow processing of long books
	- Storage: Free CPU Basic currently provides 2 vCPU, 16 GB RAM, and 50 GB of non-persistent disk by default, so treat generated audio as short-lived unless you add persistent or object storage
	- Port: 7860
	- Default build: installs SILMA, PaddleOCR Arabic, Tesseract Arabic, and eSpeak NG
	- Optional fast CPU voice: set Docker build arg `INSTALL_SUPERTONIC=1` to add Supertonic 3 Arabic-capable local TTS
	- Stronger OCR build: set one of the Docker build args `INSTALL_TAWKEED_OCR=1`, `INSTALL_KATIB_OCR=1`, `INSTALL_ARABIC_QWEN_OCR=1`, `INSTALL_ARABIC_GLM_OCR=1`, or `INSTALL_BASEER_OCR=1` for an Arabic-trained model, or `INSTALL_QARI_OCR=1` for the heavier Arabic-book model
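
Because the free tier's disk is non-persistent, it can help to reason about which generated files a retention policy would drop. The sketch below is a minimal illustration, not the worker's actual cleanup code: `files_to_prune` is a hypothetical helper, and it assumes `OUTPUT_RETENTION_DAYS` is an age limit and `OUTPUT_MAX_FILES` is a cap on the newest files kept.

```python
from datetime import datetime, timedelta


def files_to_prune(
    files: list[tuple[str, datetime]],
    now: datetime,
    retention_days: int = 7,   # assumed meaning of OUTPUT_RETENTION_DAYS
    max_files: int = 25,       # assumed meaning of OUTPUT_MAX_FILES
) -> list[str]:
    """Return names of files to delete, newest first (hypothetical helper).

    `files` holds (name, modified_time) pairs with unique names.
    """
    cutoff = now - timedelta(days=retention_days)
    newest_first = sorted(files, key=lambda item: item[1], reverse=True)
    # Keep only files younger than the cutoff, and at most `max_files` of them.
    kept = {name for name, mtime in newest_first if mtime >= cutoff}
    kept_capped = [name for name, _ in newest_first if name in kept][:max_files]
    keep_set = set(kept_capped)
    return [name for name, _ in newest_first if name not in keep_set]
```

With these defaults, a file older than seven days is pruned regardless of how few files exist, and the 26th-newest file is pruned even when it is fresh.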

	Set these Space secrets, replacing placeholder values such as the access code and the Vercel origin with your own:

	```text
	ACCESS_CODE=1234
	SECRET_KEY=<copy from outputs\deployment-handoff.md>
	CORS_ORIGINS=https://your-vercel-app.vercel.app
	COOKIE_SAMESITE=none
	COOKIE_SECURE=1
	OCR_ENGINE=tesseract
	OCR_RENDER_ZOOM=2
	TESSERACT_PSM=4
	DEFAULT_VOICE_ID=silma-local
	OUTPUT_RETENTION_DAYS=7
	OUTPUT_MAX_FILES=25
	AUDIO_FORMAT=mp3
	MP3_BITRATE=96k
	```
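
The values above interact: browsers reject cookies marked `SameSite=None` unless they are also `Secure`, so `COOKIE_SAMESITE=none` only works together with `COOKIE_SECURE=1`. The following sketch shows one way such settings could be parsed and checked at startup. `WorkerSettings` and `load_settings` are hypothetical names, the fallback defaults are assumptions, and the comma-separated handling of `CORS_ORIGINS` is a guess at the format rather than the worker's documented behavior.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class WorkerSettings:
    cors_origins: tuple[str, ...]
    cookie_samesite: str
    cookie_secure: bool
    ocr_engine: str
    ocr_render_zoom: float
    tesseract_psm: int


def load_settings(env: Mapping[str, str]) -> WorkerSettings:
    """Parse settings from an environment-like mapping (hypothetical helper)."""
    origins = env.get("CORS_ORIGINS", "")
    settings = WorkerSettings(
        # Assumption: multiple origins are separated by commas.
        cors_origins=tuple(o.strip() for o in origins.split(",") if o.strip()),
        cookie_samesite=env.get("COOKIE_SAMESITE", "lax").lower(),
        cookie_secure=env.get("COOKIE_SECURE", "0") == "1",
        ocr_engine=env.get("OCR_ENGINE", "tesseract"),
        ocr_render_zoom=float(env.get("OCR_RENDER_ZOOM", "2")),
        tesseract_psm=int(env.get("TESSERACT_PSM", "4")),
    )
    # SameSite=None cookies without the Secure attribute are dropped by browsers.
    if settings.cookie_samesite == "none" and not settings.cookie_secure:
        raise ValueError("COOKIE_SAMESITE=none requires COOKIE_SECURE=1")
    return settings
```

Calling `load_settings(os.environ)` early would surface a mismatched cookie configuration as a startup error instead of a silent login failure in the browser.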

	Generate the deployment handoff from the main repo to get the exact `SECRET_KEY`, worker secrets, Vercel environment variables, and final proof command:

	```powershell
	python scripts\deployment_handoff.py https://your-space.hf.space --origin https://your-vercel-app.vercel.app --code 1234
	```

	Keep `outputs\deployment-handoff.md` private because it contains deployment secrets.

	The compact process recommendation is in `docs/recommended-free-stack.md`. The machine-readable deployment decision card is at `docs/recommended-decision-card.json`, with a readable companion at `docs/recommended-decision-card.md`. The current practical default is: use PyMuPDF embedded text first; use `OCR_ENGINE=tesseract OCR_RENDER_ZOOM=2 TESSERACT_PSM=4` for scanned pages, which gave the most readable Arabic OCR in testing; use SILMA TTS as the first clean voice; and offer the audio as a download from the worker.
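
The "embedded text first, OCR second" default can be sketched as a small decision plus a Tesseract invocation. This is an illustration under assumptions, not the worker's code: `needs_ocr`, `render_dpi`, and `tesseract_command` are hypothetical helpers, the 20-character threshold is arbitrary, and it assumes `OCR_RENDER_ZOOM` scales the PDF's native 72 points per inch when a page is rendered to an image.

```python
def needs_ocr(embedded_text: str, min_chars: int = 20) -> bool:
    """Return True when a page's embedded text layer is too sparse to use.

    The threshold is an illustrative assumption, not a tested value.
    """
    visible = "".join(embedded_text.split())  # ignore all whitespace
    return len(visible) < min_chars


def render_dpi(zoom: float) -> float:
    """PDF user space is 72 units per inch, so zoom 2 renders at 144 DPI."""
    return 72 * zoom


def tesseract_command(image_path: str, psm: int = 4, lang: str = "ara") -> list[str]:
    """Build a Tesseract CLI call that prints recognized text to stdout.

    `-l` selects the language data; `--psm 4` treats the page as a single
    column of text of variable sizes.
    """
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
```

A page would go through `tesseract_command` only when `needs_ocr` returns `True` for its embedded text, which keeps born-digital PDFs off the slower OCR path.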

	Optional stronger-worker build args:

	```text
	INSTALL_QARI_OCR=1
	INSTALL_TAWKEED_OCR=1
	INSTALL_KATIB_OCR=1
	INSTALL_ARABIC_QWEN_OCR=1
	INSTALL_ARABIC_GLM_OCR=1
	INSTALL_BASEER_OCR=1
	INSTALL_PADDLEOCR_VL=1
	INSTALL_SUPERTONIC=1
	```

	Use `INSTALL_TAWKEED_OCR=1`, `INSTALL_KATIB_OCR=1`, `INSTALL_ARABIC_QWEN_OCR=1`, `INSTALL_ARABIC_GLM_OCR=1`, or `INSTALL_BASEER_OCR=1` first when you want an Arabic-trained OCR model. Use `INSTALL_QARI_OCR=1` when you want the strongest Arabic-book OCR and the worker has enough memory/GPU. Leave heavy options at `0` on free CPU Spaces unless a short benchmark proves the stronger model is worth the cold start, build time, memory, and runtime.
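
For the short benchmark mentioned above, one option is to build the image locally with a chosen set of flags before enabling them on the Space. The sketch below assembles a `docker build` command using the standard `--build-arg` option. `docker_build_command` is a hypothetical helper, and the tag name is a placeholder; the flag names are the ones listed in this README.

```python
KNOWN_BUILD_ARGS = {
    "INSTALL_QARI_OCR",
    "INSTALL_TAWKEED_OCR",
    "INSTALL_KATIB_OCR",
    "INSTALL_ARABIC_QWEN_OCR",
    "INSTALL_ARABIC_GLM_OCR",
    "INSTALL_BASEER_OCR",
    "INSTALL_PADDLEOCR_VL",
    "INSTALL_SUPERTONIC",
}


def docker_build_command(tag: str, options: dict[str, bool], context: str = ".") -> list[str]:
    """Return a `docker build` argument list with one --build-arg per option."""
    unknown = set(options) - KNOWN_BUILD_ARGS
    if unknown:
        raise ValueError(f"unknown build args: {sorted(unknown)}")
    cmd = ["docker", "build", "-t", tag]
    # Sort for a stable, reproducible command line.
    for name in sorted(options):
        cmd += ["--build-arg", f"{name}={1 if options[name] else 0}"]
    cmd.append(context)
    return cmd
```

Passing the result to `subprocess.run(cmd, check=True)` would run the build; comparing build time and image size across flag sets gives a rough cost estimate before touching the Space.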

	After the Space builds, verify it from your main repo:

	```powershell
	python scripts\verify_worker.py https://your-space.hf.space --code 1234 --origin https://your-vercel-app.vercel.app --require-cors --smoke-upload --smoke-scanned --smoke-ocr-engine arabic
	```
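
If the CORS part of verification fails, it helps to know what a browser needs for cross-site cookie requests: the response must echo the exact origin in `Access-Control-Allow-Origin` (a `*` wildcard is not accepted with credentials) and set `Access-Control-Allow-Credentials: true`. The sketch below checks a set of response headers against that rule. `cors_allows` is a hypothetical helper and is not taken from `verify_worker.py`, whose internals are not shown here.

```python
from typing import Mapping


def cors_allows(headers: Mapping[str, str], origin: str) -> bool:
    """Return True if the headers permit credentialed requests from `origin`."""
    # Header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    allow_origin = normalized.get("access-control-allow-origin")
    allow_credentials = normalized.get("access-control-allow-credentials")
    # With credentials, the origin must match exactly and the flag must be "true".
    return allow_origin == origin and allow_credentials == "true"
```

The headers from a real response (for example from `urllib.request.urlopen`) can be passed in as `dict(response.headers)`.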