Andrej Janchevski

How to deploy to Hugging Face Spaces

Push a new version of the site to the HF Space Bani57/website. Everything is git-based; there is no build button. The Space rebuilds its Docker image from whatever is on the main branch of its repo. For the rationale and full design, see plans/deploy_huggingface_spaces.md.

Prerequisites

  • The Space Bani57/website exists with SDK = Docker, Hardware = CPU basic (free).
  • Space secrets and variables are set: DJANGO_SECRET_KEY, DJANGO_DEBUG=False, DJANGO_ALLOWED_HOSTS=bani57-website.hf.space. See Configuring Space secrets below.
  • The HF Hub model repo Bani57/checkpoints exists and contains the current weights. See Refreshing checkpoints below.
  • Local working tree is on the deployment branch (typically master) and tests pass.

1. Local container smoke test

Always reproduce production locally before pushing.

cp .env.example .env
python -c "import secrets; print('DJANGO_SECRET_KEY=' + secrets.token_urlsafe(50))" >> .env
docker compose up --build

First build is ~20–30 minutes (mamba solve, GPU torch, frontend bundle, checkpoint download). Subsequent builds reuse layer cache and the checkpoints named volume.

Open http://localhost:7860 and walk through:

  • Home page renders, nav works.
  • /cv renders.
  • Each demo page loads and a sample inference completes.
  • A bogus URL (/this/path/does/not/exist) shows the Lara Croft 404.
  • GET /api/v1/health returns 200 ok.

If anything fails, fix it locally; never push a broken image to the Space.
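The HTTP part of the checklist above can be scripted. The following is a minimal sketch, not a script that ships with the repo: `CHECKS`, `http_status`, and `run_checks` are hypothetical names, and it assumes that `/` and `/cv` answer a plain GET with status 200. The demo pages and the 404 page still need a browser, because the SPA catch-all may serve them with status 200 regardless of the route.

```python
import urllib.error
import urllib.request

# Paths taken from the checklist above.
# ASSUMPTION: "/" and "/cv" answer a plain GET with HTTP 200.
CHECKS = [
    ("/", 200),
    ("/cv", 200),
    ("/api/v1/health", 200),
]


def http_status(url, timeout=10.0):
    """Return the HTTP status of a GET request, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still carry a status code.
        return exc.code
    except urllib.error.URLError:
        return None


def run_checks(base_url, fetch=http_status):
    """Return (path, expected, actual) for every check that did not match."""
    failures = []
    for path, expected in CHECKS:
        actual = fetch(base_url.rstrip("/") + path)
        if actual != expected:
            failures.append((path, expected, actual))
    return failures
```

Calling `run_checks("http://localhost:7860")` while the container is up returns an empty list when all three endpoints respond as expected. The `fetch` parameter exists only so the logic can be exercised without a running server.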

2. Push to the Space

Add the HF git remote (one-time):

git remote add hf https://huggingface.co/spaces/Bani57/website

Then for each release:

git push hf master:main

HF Spaces builds from its main branch, so the refspec master:main maps the local master onto it. The Space's Build Logs tab in the web UI shows live build progress. The first build again takes 20–30 min; incremental builds reuse the layer cache and finish in under 5 min if requirements.txt and environment.yml haven't changed.

3. Verify the live Space

After the build succeeds:

curl https://bani57-website.hf.space/api/v1/health

Then in a browser:

  • Visit https://bani57-website.hf.space.
  • Hard-refresh on a deep route (e.g. https://bani57-website.hf.space/demos/coins). The SPA catch-all should serve index.html and Vue Router should resolve the route.
  • Visit https://bani57-website.hf.space/foo/bar and confirm the Lara 404 page appears.
  • Run the Postman collection (docs/postman/) against https://bani57-website.hf.space/api/v1.
  • Watch the Space's Logs tab for the first 5 minutes. Confirm [entrypoint] checkpoints ready and a clean gunicorn ... Listening at: http://0.0.0.0:7860.
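Because a fresh build or cold start can take a while, the health check above may need several attempts before it succeeds. The sketch below polls the health endpoint until it answers with 200. The function names, the attempt count, and the delay are illustrative assumptions, not values taken from the project.

```python
import time
import urllib.request


def is_healthy(url, timeout=10.0):
    """Return True if a GET on url answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError, HTTPError, timeouts, and connection resets.
        return False


def wait_until_healthy(url, attempts=30, delay=10.0, probe=is_healthy, sleep=time.sleep):
    """Poll url until it is healthy; return the attempt number that succeeded, or None."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            # Hypothetical fixed delay; tune it to the observed cold-start time.
            sleep(delay)
    return None
```

For example, `wait_until_healthy("https://bani57-website.hf.space/api/v1/health")` returns the number of the first successful attempt, or `None` if the Space never became healthy within the allotted attempts.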

4. Rollback

Spaces are fully versioned by git:

git push hf <previous-good-sha>:main --force

A force-push triggers an image rebuild from the rolled-back commit. Checkpoints live in the HF Hub model repo and are untouched by a rollback, so no data is lost. Use this only when a regression is bad enough that fixing forward isn't practical.

Refreshing checkpoints

Checkpoints live in Bani57/checkpoints on HF Hub, separate from the code repo. To publish new weights:

huggingface-cli login    # one-time, paste a write token
python scripts/upload_checkpoints.py --create

--create is safe to repeat; it's a no-op when the repo already exists. The script walks the local src/research/.../checkpoints/ dirs and uploads every *.tar / *.ckpt / *.pt / *.pth / *.bin / *.safetensors. Existing files with matching hashes are skipped. Use --dry-run to list what would be sent before committing to the upload.
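The selection rule described above (matching extensions, skipping files whose hash already matches) can be illustrated in a few lines. This is a sketch of the idea only, not the code of scripts/upload_checkpoints.py: `is_checkpoint` and `plan_upload` are hypothetical helpers, and the rule that a file must sit below a directory literally named `checkpoints` is an assumption.

```python
from pathlib import PurePosixPath

# Extensions listed in the paragraph above.
CHECKPOINT_SUFFIXES = {".tar", ".ckpt", ".pt", ".pth", ".bin", ".safetensors"}


def is_checkpoint(rel_path):
    """True if rel_path has a checkpoint extension and lies below a 'checkpoints' directory."""
    path = PurePosixPath(rel_path)
    # ASSUMPTION: only files under a directory named "checkpoints" qualify.
    return path.suffix in CHECKPOINT_SUFFIXES and "checkpoints" in path.parts[:-1]


def plan_upload(local, remote):
    """Return the sorted paths that would be sent.

    local and remote both map a relative path to a content hash.
    A file is skipped when the remote already holds the same hash.
    """
    return sorted(
        path
        for path, digest in local.items()
        if is_checkpoint(path) and remote.get(path) != digest
    )
```

The list returned by `plan_upload` corresponds to what a dry run would print: new files and files whose content changed, with unchanged files left out.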

After a checkpoint refresh, restart the Space (Settings → "Restart this Space"). The container's entrypoint re-runs snapshot_download and picks up the new files.
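For orientation, the download step could look roughly like the sketch below. It uses `snapshot_download` from the `huggingface_hub` package with the repo id named in this guide; the wrapper functions and the way the token is passed are assumptions, and the project's actual entrypoint may do this differently.

```python
import os


def snapshot_kwargs(local_dir, token=None):
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    kwargs = {
        "repo_id": "Bani57/checkpoints",
        "repo_type": "model",
        "local_dir": local_dir,
    }
    if token:
        # Optional read token; needed only when the repo is private.
        kwargs["token"] = token
    return kwargs


def fetch_checkpoints(local_dir, env=None):
    """Download the checkpoint snapshot into local_dir and return its path."""
    # Imported lazily so snapshot_kwargs stays usable without the package installed.
    from huggingface_hub import snapshot_download

    env = os.environ if env is None else env
    return snapshot_download(**snapshot_kwargs(local_dir, env.get("HF_TOKEN")))
```

Restarting the Space re-runs this kind of step from scratch, which is why newly uploaded weights appear only after a restart.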

Configuring Space secrets

In the Space Settings → Variables and secrets:

  • Secrets (encrypted, not exposed in logs):
    • DJANGO_SECRET_KEY: required. Generate with python -c "import secrets; print(secrets.token_urlsafe(50))".
    • HF_TOKEN: strongly recommended. A read-scope token (huggingface.co/settings/tokens → New token → Read) lifts anonymous rate limits and roughly triples checkpoint download throughput on cold starts. Required only if the checkpoint repo is private.
  • Variables (visible in logs and to viewers of the Space metadata):
    • DJANGO_DEBUG=False.
    • DJANGO_ALLOWED_HOSTS=bani57-website.hf.space.
    • CORS_ALLOWED_ORIGINS=https://bani57-website.hf.space (only matters if a third party calls /api/v1 directly; the SPA itself is same-origin).
    • TORCH_DEVICE=cpu (override on a paid GPU SKU).

Changing a variable or secret triggers an automatic Space restart.
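All of these values reach the container as strings, so the settings module has to parse them. A common pitfall is that `bool("False")` is `True` in Python, which would silently enable debug mode. The helpers below are a hypothetical sketch of such parsing; the project's real settings code is not shown here, and the comma-separated list format is an assumption.

```python
def env_bool(env, name, default=False):
    """Interpret an environment value such as 'False' or 'true' as a boolean."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_list(env, name):
    """Split a comma-separated environment value into a list, dropping empty items."""
    # ASSUMPTION: multiple hosts or origins are separated by commas.
    return [item.strip() for item in env.get(name, "").split(",") if item.strip()]
```

In a settings file this would be used along the lines of `DEBUG = env_bool(os.environ, "DJANGO_DEBUG")` and `ALLOWED_HOSTS = env_list(os.environ, "DJANGO_ALLOWED_HOSTS")`.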

Optional: persistent storage

Free Spaces have 50 GB of ephemeral disk that resets on restart, so every cold start re-downloads ~5.4 GB of checkpoints. The first request after a restart waits for snapshot_download to finish.

For ~$5/month, the Space's Persistent Storage tier puts /data on a permanent volume. To use it, set the variable CHECKPOINTS_ROOT=/data/checkpoints in the Space settings. The container's entrypoint.sh will write the snapshot there; subsequent restarts find it already populated.

By default, CHECKPOINTS_ROOT is the same as RESEARCH_ROOT=/app/research, so on the free tier checkpoints land alongside the bundled research code: clean for one-shot deploys, costly to re-pull on every restart.
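The fallback between the two variables can be summarised as a small sketch. The function names are hypothetical and the logic is inferred from the description above rather than copied from entrypoint.sh.

```python
from pathlib import PurePosixPath


def resolve_checkpoints_root(env):
    """Return CHECKPOINTS_ROOT if set, otherwise fall back to RESEARCH_ROOT."""
    research_root = env.get("RESEARCH_ROOT", "/app/research")
    return env.get("CHECKPOINTS_ROOT") or research_root


def uses_persistent_storage(root):
    """True if root lies on the /data volume provided by the Persistent Storage tier."""
    return PurePosixPath(root).parts[:2] == ("/", "data")
```

With no variables set, the root resolves to `/app/research`, which is ephemeral. With `CHECKPOINTS_ROOT=/data/checkpoints` it resolves to the persistent volume, so a restart can reuse the existing snapshot.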

See also