Deploying to Hugging Face Spaces
The repo ships everything HF Spaces needs: a Dockerfile, requirements.txt,
a README.md with the required Space front-matter, and the scripts that
build the Chroma vector index at image-build time.
Prerequisites
- Free HF account at https://huggingface.co/join
- An OpenAI API key (for the LLM rerank step)
- Optional: Langfuse keys for tracing
Step-by-step
1. Create the Space
- Go to https://huggingface.co/new-space
- Name: e.g. `shl-recommender-api`
- Space SDK: select Docker (NOT Gradio / Streamlit)
- Hardware: Free CPU basic (16 GB RAM, plenty for `bge-large`)
- Visibility: Public
- Click Create Space
2. Push the code
Spaces are git repos. Add it as a remote and push:
cd /path/to/shl-asss
git init
git add .
git commit -m "SHL recommender - initial commit"
# HF requires a Personal Access Token with WRITE scope.
# Create one at https://huggingface.co/settings/tokens
# Then use it as the password when prompted by git push.
git remote add space https://huggingface.co/spaces/<USERNAME>/shl-recommender-api
git branch -M main
git push -u space main
# Username: <USERNAME>
# Password: paste the hf_... token
The Space picks up:
- `Dockerfile`: builds the container
- `README.md` front-matter: configures the Space (title, port, etc.)
3. Set the environment
Open your Space → Settings → Variables and secrets.
| Type | Name | Value |
|---|---|---|
| Variable | `LLM_PROVIDER` | `openai` |
| Variable | `LLM_MODEL` | `gpt-5-mini` |
| Secret | `OPENAI_API_KEY` | your `sk-proj-...` |
| Secret (optional) | `LANGFUSE_PUBLIC_KEY` | `pk-lf-...` |
| Secret (optional) | `LANGFUSE_SECRET_KEY` | `sk-lf-...` |
| Secret (optional) | `LANGFUSE_BASE_URL` | `https://us.cloud.langfuse.com` |
Each variable change triggers a rebuild, so set them all at once before the first push and batch any later changes.
4. Wait for the build
The first build:
- downloads ~600 MB of pip dependencies
- downloads ~1.3 GB of `bge-large-en-v1.5` weights
- embeds 377 documents into a fresh `data/chroma/` (the index is built during `RUN python -m scripts.index`, so there are no binary blobs in git)
Expect 5–8 minutes for the first build. The Space dashboard streams logs in real time. Re-runs hit pip's cache and finish in ~2–3 min.
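If you would rather not watch the logs, a small polling loop can report when the Space comes up. The sketch below uses only the standard library; `probe_health` and `wait_until_healthy` are hypothetical helper names, not part of the repo, and it assumes `/health` answers with the `{"status":"healthy"}` body shown in step 5. The default budget of 40 attempts at 15 s covers roughly ten minutes.

```python
import json
import time
import urllib.request


def probe_health(base_url, timeout=10.0):
    """Return True if GET <base_url>/health answers {"status": "healthy"}."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and timeouts; ValueError covers
        # a non-JSON page (for example the "building" placeholder).
        return False
    return isinstance(body, dict) and body.get("status") == "healthy"


def wait_until_healthy(probe, attempts=40, delay=15.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries.

    Returns the number of probes it took, or None if the budget ran out.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

A call would look like `wait_until_healthy(lambda: probe_health("https://<USERNAME>-shl-recommender-api.hf.space"))`, with `<USERNAME>` replaced by your account name. Passing the probe in as a callable keeps the loop testable without network access.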
5. Verify
Your Space exposes an HTTPS URL like https://<USERNAME>-shl-recommender-api.hf.space.
curl https://<USERNAME>-shl-recommender-api.hf.space/health
# {"status":"healthy"}
curl -X POST https://<USERNAME>-shl-recommender-api.hf.space/recommend \
-H "Content-Type: application/json" \
-d '{"query":"hire java developers under 40 minutes"}'
Or open the auto-generated Swagger UI in a browser:
https://<USERNAME>-shl-recommender-api.hf.space/docs
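An active Space stays warm, so cold starts are rare; note that free-tier Spaces go to sleep after prolonged inactivity and wake on the next request. Each /recommend call takes ~2 s (the LLM rerank dominates).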
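The same smoke test can be scripted in Python, which also makes it easy to measure latency yourself. This is a minimal sketch using only the standard library; the helper names are hypothetical, the URL follows the `<USERNAME>-shl-recommender-api.hf.space` pattern above, and the response body is returned as parsed JSON without assuming its schema.

```python
import json
import time
import urllib.request


def space_base_url(username, space_name="shl-recommender-api"):
    """Build the Space URL following the <USERNAME>-<space>.hf.space pattern."""
    return f"https://{username}-{space_name}.hf.space"


def build_recommend_request(base_url, query):
    """Prepare the same POST /recommend call as the curl example."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/recommend",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def timed_recommend(base_url, query, timeout=30.0):
    """Send the request; return (parsed JSON body, elapsed seconds)."""
    request = build_recommend_request(base_url, query)
    started = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body, time.perf_counter() - started
```

For example, `timed_recommend(space_base_url("<USERNAME>"), "hire java developers under 40 minutes")` returns the response together with the wall-clock time, which you can compare against the ~2 s figure.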
Configuration knobs
These are all environment variables; set them under the Space's Settings → Variables and secrets.
| Env var | Default | Notes |
|---|---|---|
| `EMBED_PROVIDER` | `local` | `local` (sentence-transformers) or `gemini` |
| `EMBED_MODEL` | `BAAI/bge-large-en-v1.5` | Pin a smaller model on hosts with tight RAM |
| `LLM_PROVIDER` | `gemini` (set to `openai` in the Space) | `openai` or `gemini` |
| `LLM_MODEL` | varies by provider | e.g. `gpt-5-mini`, `gpt-4o-mini`, `gemini-2.5-flash` |
| `OPENAI_BASE_URL` | unset | Set for Azure / OpenRouter / proxy |
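To make the defaults concrete, here is one way such knobs could be resolved at startup. It is an illustrative sketch, not the repo's actual configuration code: `Settings` and `load_settings` are hypothetical names, the defaults are copied from the table, and the required key names come from the error messages in the Troubleshooting table. `LLM_MODEL` is left as `None` when unset because its default varies by provider.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# API key each LLM provider needs (names taken from the Troubleshooting table).
REQUIRED_KEY = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


@dataclass(frozen=True)
class Settings:
    embed_provider: str
    embed_model: str
    llm_provider: str
    llm_model: Optional[str]        # None means "use the provider's default"
    openai_base_url: Optional[str]  # None means "use the official endpoint"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve the knobs from the table, applying the documented defaults."""
    settings = Settings(
        embed_provider=env.get("EMBED_PROVIDER", "local"),
        embed_model=env.get("EMBED_MODEL", "BAAI/bge-large-en-v1.5"),
        llm_provider=env.get("LLM_PROVIDER", "gemini").lower(),
        llm_model=env.get("LLM_MODEL") or None,
        openai_base_url=env.get("OPENAI_BASE_URL") or None,
    )
    if settings.llm_provider not in REQUIRED_KEY:
        raise ValueError(f"LLM_PROVIDER must be one of {sorted(REQUIRED_KEY)}")
    key_name = REQUIRED_KEY[settings.llm_provider]
    if not env.get(key_name):
        # Fail at startup rather than with a 500 on the first request.
        raise ValueError(f"{key_name} not set (LLM_PROVIDER={settings.llm_provider})")
    return settings
```

Validating at startup is a design choice: a missing key then shows up in the build or boot logs instead of surfacing as a 500 on the first `/recommend` call.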
Memory profile (free tier sanity check)
| Component | RAM at idle |
|---|---|
| Python interpreter + libraries | ~200 MB |
| `bge-large-en-v1.5` weights | ~1.3 GB |
| Chroma + BM25 index | ~30 MB |
| FastAPI / uvicorn | ~50 MB |
| Total at runtime | ~1.6 GB |
| HF Spaces free tier | 16 GB ✓ |
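The total is simple arithmetic over the rows above. The snippet below re-derives it and shows the remaining headroom; the figures are the table's own estimates, in decimal units (1 GB = 1000 MB), and `memory_budget` is just an illustrative helper.

```python
# Idle RAM estimates in MB, copied from the table above.
COMPONENTS_MB = {
    "Python interpreter + libraries": 200,
    "bge-large-en-v1.5 weights": 1300,
    "Chroma + BM25 index": 30,
    "FastAPI / uvicorn": 50,
}
FREE_TIER_MB = 16 * 1000  # 16 GB, decimal units like the table


def memory_budget(components_mb, limit_mb):
    """Return (total MB, headroom MB, fraction of the limit in use)."""
    total = sum(components_mb.values())
    return total, limit_mb - total, total / limit_mb


total, headroom, used = memory_budget(COMPONENTS_MB, FREE_TIER_MB)
print(f"total ~{total / 1000:.1f} GB, headroom ~{headroom / 1000:.1f} GB, {used:.0%} of the free tier")
# prints: total ~1.6 GB, headroom ~14.4 GB, 10% of the free tier
```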
Updating the deployment
After any local change, just push to the connected branch:
git add ...
git commit -m "..."
git push space main
The Space auto-detects the push and redeploys.
If `data/documents.jsonl` changes (re-scrape or re-extract concepts), the
Chroma index gets rebuilt automatically during the next image build; no
manual step is needed.
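Before pushing a changed `data/documents.jsonl`, it can help to know how many records it holds, so you know what item count to expect in the build log. A minimal sketch, assuming the file contains one JSON object per non-blank line; `count_jsonl_records` is a hypothetical helper, not a script shipped in the repo.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count non-blank lines in a JSONL file, failing loudly on malformed JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
            count += 1
    return count
```

Running `count_jsonl_records("data/documents.jsonl")` locally gives the number the rebuilt collection should report once the build finishes.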
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| `500 retrieval failed: GEMINI_API_KEY not set` | `LLM_PROVIDER` not set, code defaults to Gemini | Add `LLM_PROVIDER=openai` Variable |
| `500 OPENAI_API_KEY not set` | Forgot the secret | Add `OPENAI_API_KEY` Secret |
| Build hangs on `RUN python -m scripts.index` for >10 min | Embedding loop is genuinely slow on free CPU; tqdm doesn't flush | Wait it out. Look for `collection 'shl_baseline' has 377 items` to confirm completion. |
| `Push rejected: binary files` | Chroma binaries in git | They shouldn't be there: `.gitignore` excludes `data/chroma/`. If anything else binary slipped in, remove it with `git rm --cached <file>` |
| `Push rejected: valid Hugging Face secrets` | Token was committed somewhere | Search the repo with `grep -rn 'hf_' .`, then strip the token and amend the commit |
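The last row's `grep` can also be run as a pre-push check in Python, which narrows matches to strings that actually look like tokens. This is a sketch under an assumption: the pattern below (an `hf_` prefix followed by at least 20 alphanumeric characters) is a guess at the token shape, not an official specification, and `find_token_like_strings` is a hypothetical helper. It scans the working tree only, not git history.

```python
import re
from pathlib import Path

# Assumed token shape: "hf_" followed by a long alphanumeric run.
TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{20,}")
SKIP_DIRS = {".git", "__pycache__", ".venv"}


def find_token_like_strings(root="."):
    """Return (path, line number) pairs for lines that look like they hold an HF token."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or SKIP_DIRS.intersection(path.relative_to(root).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for line_no, line in enumerate(text.splitlines(), start=1):
            if TOKEN_RE.search(line):
                hits.append((str(path), line_no))
    return hits
```

An empty list from `find_token_like_strings(".")` means no token-shaped strings were found in tracked or untracked files under the current directory.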