# Deployment notes for Hugging Face Spaces

## 1) HF_TOKEN secret
- Create a Hugging Face token at https://huggingface.co/settings/tokens
- The token needs repository write permissions so the workflow can create and push to Spaces
- In GitHub, go to Settings -> Secrets and variables -> Actions -> New repository secret
- Name: HF_TOKEN
- Value: <your_token_here>
## 2) Streamlit compatibility
- The workflow creates the Space with `space_sdk='streamlit'` so it will run as a Streamlit app.
- Hugging Face Spaces will run `streamlit_app.py` or `app.py` by default; this repo contains `streamlit_app.py` to be explicit.
## 3) System dependencies
- Some OCR engines require system packages (e.g., the Tesseract binary, or shared libraries for PaddlePaddle). On the Streamlit SDK, apt packages can only be listed in a `packages.txt` file; there is no further control over the system image.
- If you need more control over system packages, use a Docker-based Space (set `space_sdk='docker'` and add a Dockerfile that installs the required system packages).
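For a Docker-based Space, the Dockerfile might look roughly like the sketch below. It assumes `streamlit_app.py` is the entry point and that the OCR libraries need `libgl1`/`libglib2.0-0` (common requirements for OpenCV-backed engines); Hugging Face Docker Spaces serve on port 7860 by default.

```dockerfile
FROM python:3.11-slim

# System packages the OCR engines need (Tesseract binary plus common shared libs)
RUN apt-get update && apt-get install -y --no-install-recommends \
        tesseract-ocr libgl1 libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Hugging Face Docker Spaces expect the app on port 7860
EXPOSE 7860
CMD ["streamlit", "run", "streamlit_app.py", "--server.port", "7860", "--server.address", "0.0.0.0"]
```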
## 4) LLM / Ollama
- The app optionally uses `ollama` for LLM features. Ollama is not installed in Spaces by default; LLM features are disabled when the `ollama` package isn't present.
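A graceful-degradation pattern for the optional import could look like the sketch below. The model name and prompt are assumptions, not taken from this repo; `ollama.chat` follows the Ollama Python client's API.

```python
# Optional LLM support: degrade gracefully when the ollama package is absent.
try:
    import ollama
    LLM_AVAILABLE = True
except ImportError:
    ollama = None
    LLM_AVAILABLE = False

def postprocess(text: str) -> str:
    """Clean OCR output with the LLM when available, else return it unchanged."""
    if not LLM_AVAILABLE:
        return text
    # Hypothetical model and prompt, for illustration only.
    response = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": f"Fix OCR errors in this text:\n{text}"}],
    )
    return response["message"]["content"]
```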
## 5) Tesseract
- Ensure Tesseract is available in the environment or use the Docker approach to install it.
## 6) Running CI/CD
- After you push to `main` with the `HF_TOKEN` secret set, the GitHub Actions workflow `.github/workflows/deploy_to_hf.yml` creates the Space and uploads the repository.
Note: This repository includes a `Dockerfile` and the CI workflow is configured to create a Docker-based Space (`space_sdk='docker'`). The Dockerfile installs system dependencies such as Tesseract so the OCR engines can run inside the Space container.
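Such a workflow might look roughly like the sketch below, using `huggingface_hub`'s `create_repo` and `upload_folder`. This is not the repository's actual workflow file; the repo ID `your-username/ocr-app` is a placeholder.

```yaml
name: Deploy to Hugging Face Spaces
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install huggingface_hub
      - name: Create Space and upload repository
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          python - <<'EOF'
          import os
          from huggingface_hub import HfApi
          api = HfApi(token=os.environ["HF_TOKEN"])
          # space_sdk='docker' matches the Dockerfile-based setup described above
          api.create_repo("your-username/ocr-app", repo_type="space",
                          space_sdk="docker", exist_ok=True)
          api.upload_folder(folder_path=".", repo_id="your-username/ocr-app",
                            repo_type="space")
          EOF
```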
## 7) Troubleshooting
- If a deployment fails, open the Actions run logs to find the error, then adjust the workflow or repository accordingly.