
Deploying to Hugging Face Spaces

The repo ships everything HF Spaces needs: a Dockerfile, requirements.txt, a README.md with the required Space front-matter, and the scripts that build the Chroma vector index at image-build time.

Prerequisites


Step-by-step

1. Create the Space

  1. Go to https://huggingface.co/new-space
  2. Name: e.g. shl-recommender-api
  3. Space SDK: select Docker (NOT Gradio / Streamlit)
  4. Hardware: Free CPU basic (16 GB RAM, plenty for bge-large)
  5. Visibility: Public
  6. Click Create Space

2. Push the code

Spaces are git repos. Add the Space as a remote and push:

cd /path/to/shl-asss

git init
git add .
git commit -m "SHL recommender - initial commit"

# HF requires a Personal Access Token with WRITE scope.
# Create one at https://huggingface.co/settings/tokens
# Then use it as the password when prompted by git push.

git remote add space https://huggingface.co/spaces/<USERNAME>/shl-recommender-api
git branch -M main
git push -u space main
# Username: <USERNAME>
# Password: paste the hf_... token

The Space picks up:

  • Dockerfile → builds the container
  • README.md front-matter → configures the Space (title, port, etc.)
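Before pushing, it can help to confirm that the README front-matter actually declares a Docker Space. The following is a minimal sketch, not part of the repo: the helper names are hypothetical, the parser only handles flat `key: value` lines (it is not a full YAML parser), and the sample title and port are made up. It relies on the documented Space config keys `sdk` and `app_port`, where `app_port` falls back to 7860 when omitted.

```python
DEFAULT_APP_PORT = 7860  # port Hugging Face assumes for Docker Spaces when app_port is omitted


def parse_front_matter(readme_text: str) -> dict[str, str]:
    """Return the top-level `key: value` pairs between the leading '---' lines."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Skip nested/indented entries and anything that is not `key: value`.
        if line.startswith((" ", "\t")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip("'\"")
    return {}  # no closing '---': treat as missing front-matter


def effective_port(fields: dict[str, str]) -> int:
    """Port the Space will route traffic to, given parsed front-matter."""
    return int(fields.get("app_port", DEFAULT_APP_PORT))


def check_docker_space(readme_text: str) -> list[str]:
    """List problems that would stop the README from configuring a Docker Space."""
    fields = parse_front_matter(readme_text)
    problems = []
    sdk = fields.get("sdk")
    if sdk != "docker":
        problems.append(f"sdk is {sdk!r}, expected 'docker'")
    port = fields.get("app_port", str(DEFAULT_APP_PORT))
    if not port.isdigit():
        problems.append(f"app_port is {port!r}, expected an integer")
    return problems


# Hypothetical front-matter; the real values live in the repo's README.md.
SAMPLE_README = """---
title: SHL Recommender API
sdk: docker
app_port: 8000
---
# SHL recommender
"""

print(check_docker_space(SAMPLE_README))
```

An empty list means the front-matter looks usable. The port reported by `effective_port` should match the port the server inside the container listens on.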

3. Set the environment

Open your Space → Settings → Variables and secrets.

| Type | Name | Value |
| --- | --- | --- |
| Variable | LLM_PROVIDER | openai |
| Variable | LLM_MODEL | gpt-5-mini |
| Secret | OPENAI_API_KEY | your sk-proj-... |
| Secret (optional) | LANGFUSE_PUBLIC_KEY | pk-lf-... |
| Secret (optional) | LANGFUSE_SECRET_KEY | sk-lf-... |
| Secret (optional) | LANGFUSE_BASE_URL | https://us.cloud.langfuse.com |

Each variable change triggers a rebuild, so set them all at once before the first push, or batch later changes.
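A missing variable or secret only shows up as a 500 at request time, so a startup check can surface it earlier. This is a sketch under assumptions: `missing_settings` and the provider-to-key mapping are hypothetical and not taken from the repo. The key names and the `gemini` default follow the configuration and troubleshooting tables in this guide.

```python
from collections.abc import Mapping

# Which secret each LLM provider needs (names as used elsewhere in this guide).
API_KEY_FOR_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Describe what is missing for the configured LLM provider, if anything."""
    # The guide states the code falls back to Gemini when LLM_PROVIDER is unset.
    provider = env.get("LLM_PROVIDER", "gemini").strip().lower()
    if provider not in API_KEY_FOR_PROVIDER:
        return [f"LLM_PROVIDER={provider!r} is not one of {sorted(API_KEY_FOR_PROVIDER)}"]
    key_name = API_KEY_FOR_PROVIDER[provider]
    if not env.get(key_name):
        return [f"{key_name} is not set (LLM_PROVIDER={provider})"]
    return []
```

Calling `missing_settings(os.environ)` when the app starts and logging the result would turn the first two rows of the troubleshooting table into a clear startup message.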

4. Wait for the build

The first build:

  • downloads ~600 MB of pip dependencies
  • downloads ~1.3 GB of bge-large-en-v1.5 weights
  • embeds 377 documents into a fresh data/chroma/ (the index is built during RUN python -m scripts.index, so no binary blobs live in git)

Expect 5–8 minutes for the first build. The Space dashboard streams logs in real time. Re-runs hit pip's cache and finish in ~2–3 min.

5. Verify

Your Space exposes an HTTPS URL like https://<USERNAME>-shl-recommender-api.hf.space.

curl https://<USERNAME>-shl-recommender-api.hf.space/health
# {"status":"healthy"}

curl -X POST https://<USERNAME>-shl-recommender-api.hf.space/recommend \
  -H "Content-Type: application/json" \
  -d '{"query":"hire java developers under 40 minutes"}'

Or open the auto-generated Swagger UI in a browser:

https://<USERNAME>-shl-recommender-api.hf.space/docs

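A running Space stays warm between requests, so cold starts are rare; free-tier Spaces do go to sleep after prolonged inactivity, and the first request after that is slower. Each /recommend call takes ~2 s (LLM rerank dominates).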
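The same checks can be scripted with the Python standard library. This is a minimal sketch: the function names are hypothetical, the slug rule in `space_base_url` (lowercase, non-alphanumerics collapsed to hyphens) is an assumption about how Hugging Face derives the subdomain, and the shape of the /recommend response is not documented here, so it is returned as parsed JSON without further interpretation.

```python
import json
import re
import urllib.request
from typing import Any


def space_base_url(username: str, space_name: str) -> str:
    """Build the https://<USERNAME>-<SPACE>.hf.space URL (slug rule is an assumption)."""
    slug = re.sub(r"[^a-z0-9]+", "-", f"{username}-{space_name}".lower()).strip("-")
    return f"https://{slug}.hf.space"


def build_recommend_request(base_url: str, query: str) -> urllib.request.Request:
    """Prepare the same POST that the curl example sends to /recommend."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/recommend",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def health(base_url: str, timeout: float = 10.0) -> Any:
    """GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.load(resp)


def recommend(base_url: str, query: str, timeout: float = 30.0) -> Any:
    """POST the query to /recommend and return the decoded JSON body."""
    request = build_recommend_request(base_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `recommend(space_base_url("<USERNAME>", "shl-recommender-api"), "hire java developers under 40 minutes")` mirrors the second curl command. The generous default timeout leaves room for a slow first request.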


Configuration knobs

All of these are environment variables; set them under the Space's Settings → Variables and secrets.

| Env var | Default | Notes |
| --- | --- | --- |
| EMBED_PROVIDER | local | local (sentence-transformers) or gemini |
| EMBED_MODEL | BAAI/bge-large-en-v1.5 | Pin a smaller model on RAM-constrained hosts |
| LLM_PROVIDER | gemini (set to openai in the Space) | openai or gemini |
| LLM_MODEL | varies by provider | e.g. gpt-5-mini, gpt-4o-mini, gemini-2.5-flash |
| OPENAI_BASE_URL | unset | Set for Azure / OpenRouter / proxy |

Memory profile (free tier sanity check)

| Component | RAM at idle |
| --- | --- |
| Python interpreter + libraries | ~200 MB |
| bge-large-en-v1.5 weights | ~1.3 GB |
| Chroma + BM25 index | ~30 MB |
| FastAPI / uvicorn | ~50 MB |
| Total at runtime | ~1.6 GB |
| HF Spaces free tier | 16 GB ✓ |
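The total can be reproduced from the component estimates. The figures below are the approximate values from the table, not measurements, and the sketch uses decimal units (1 GB = 1000 MB) for simplicity.

```python
# Approximate idle RAM per component, in MB, copied from the table above.
COMPONENTS_MB = {
    "Python interpreter + libraries": 200,
    "bge-large-en-v1.5 weights": 1300,
    "Chroma + BM25 index": 30,
    "FastAPI / uvicorn": 50,
}
FREE_TIER_MB = 16 * 1000  # 16 GB free tier

total_mb = sum(COMPONENTS_MB.values())
headroom_mb = FREE_TIER_MB - total_mb

print(f"total ~{total_mb / 1000:.2f} GB, headroom ~{headroom_mb / 1000:.1f} GB")
# prints: total ~1.58 GB, headroom ~14.4 GB
```

The 1.58 GB sum rounds to the ~1.6 GB in the table, which is roughly a tenth of the free-tier allowance.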

Updating the deployment

After any local change, just push to the connected branch:

git add ...
git commit -m "..."
git push space main

The Space auto-detects the push and redeploys.

If data/documents.jsonl changes (re-scrape or re-extract concepts), the Chroma index is rebuilt automatically during the next image build; no manual step is needed.


Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| 500 retrieval failed: GEMINI_API_KEY not set | LLM_PROVIDER not set, code defaults to Gemini | Add LLM_PROVIDER=openai Variable |
| 500 OPENAI_API_KEY not set | Forgot the secret | Add OPENAI_API_KEY Secret |
| Build hangs on RUN python -m scripts.index for >10 min | Embedding loop is genuinely slow on free CPU; tqdm doesn't flush | Wait it out. Look for collection 'shl_baseline' has 377 items to confirm completion. |
| Push rejected: binary files | Chroma binaries in git | They shouldn't be: .gitignore excludes data/chroma/. If anything else binary slipped in, remove with git rm --cached <file> |
| Push rejected: valid Hugging Face secrets | Token was committed somewhere | Search the repo: grep -rn 'hf_' . then strip and amend |
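The token search in the last row can also be done with a short script that skips .git and binary files. This is a sketch, not a repo utility: `find_token_like_strings` is a hypothetical name, and the pattern (`hf_` followed by at least 20 alphanumeric characters) is an assumption about what tokens look like, chosen to be narrower than a bare `hf_` match.

```python
import re
from pathlib import Path

# Assumed token shape: "hf_" followed by a long alphanumeric run.
TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9]{20,}")


def find_token_like_strings(root: Path) -> list[tuple[Path, int]]:
    """Return (file, line number) pairs for lines that look like they hold a token."""
    hits: list[tuple[Path, int]] = []
    for path in sorted(root.rglob("*")):
        # Ignore directories and anything inside the .git folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for line_no, line in enumerate(text.splitlines(), start=1):
            if TOKEN_PATTERN.search(line):
                hits.append((path, line_no))
    return hits


if __name__ == "__main__":
    for file_path, line_no in find_token_like_strings(Path(".")):
        print(f"{file_path}:{line_no}")
```

Any hit only covers the working tree. A token that was committed earlier still lives in history, so the commit has to be amended or rewritten before pushing, and revoking the token is the safer course.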