# Deploying to Hugging Face Spaces
The repo ships everything HF Spaces needs: a `Dockerfile`, `requirements.txt`,
a `README.md` with the required Space front-matter, and the scripts that
build the Chroma vector index at image-build time.
## Prerequisites
- Free HF account at https://huggingface.co/join
- An OpenAI API key (for the LLM rerank step)
- *Optional*: Langfuse keys for tracing
---
## Step-by-step
### 1. Create the Space
1. Go to https://huggingface.co/new-space
2. **Name**: e.g. `shl-recommender-api`
3. **Space SDK**: select **Docker** (NOT Gradio / Streamlit)
4. **Hardware**: Free CPU basic (16 GB RAM, plenty for `bge-large`)
5. **Visibility**: Public
6. Click **Create Space**
### 2. Push the code
Spaces are git repos. Add the Space as a remote and push:
```bash
cd /path/to/shl-asss
git init
git add .
git commit -m "SHL recommender: initial commit"
# HF requires a Personal Access Token with WRITE scope.
# Create one at https://huggingface.co/settings/tokens
# Then use it as the password when prompted by git push.
git remote add space https://huggingface.co/spaces/<USERNAME>/shl-recommender-api
git branch -M main
git push -u space main
# Username: <USERNAME>
# Password: paste the hf_... token
```
The Space picks up:
- `Dockerfile`: builds the container
- `README.md` front-matter: configures the Space (title, port, etc.)
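
Before pushing, it can help to confirm that the README front-matter really asks for a Docker Space. The sketch below is a minimal check and not part of the repo: `read_front_matter` and `check_docker_space` are hypothetical helper names, the parser only handles flat `key: value` lines, and the sample title and port are illustrative assumptions rather than the repo's actual values.

```python
def read_front_matter(readme_text: str) -> dict[str, str]:
    """Parse the flat `key: value` block between the leading '---' fences."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing fence reached
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip("'\"")
    return {}  # no closing fence: treat as missing front-matter


def check_docker_space(fields: dict[str, str]) -> list[str]:
    """Return readable problems; an empty list means the basics look right."""
    problems = []
    if fields.get("sdk") != "docker":
        problems.append("sdk should be 'docker'")
    port = fields.get("app_port", "")
    if port and not port.isdigit():
        problems.append("app_port should be an integer")
    return problems


# Illustrative README head; the values are assumptions, not the repo's real ones.
SAMPLE = """---
title: SHL Recommender API
sdk: docker
app_port: 7860
---
# SHL recommender
"""

print(check_docker_space(read_front_matter(SAMPLE)))
```

To check the real file, pass the contents of `README.md` instead of `SAMPLE`.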
### 3. Set the environment
Open your Space → **Settings** → **Variables and secrets**.
| Type | Name | Value |
|---|---|---|
| Variable | `LLM_PROVIDER` | `openai` |
| Variable | `LLM_MODEL` | `gpt-5-mini` |
| Secret | `OPENAI_API_KEY` | your `sk-proj-...` |
| Secret (optional) | `LANGFUSE_PUBLIC_KEY` | `pk-lf-...` |
| Secret (optional) | `LANGFUSE_SECRET_KEY` | `sk-lf-...` |
| Secret (optional) | `LANGFUSE_BASE_URL` | `https://us.cloud.langfuse.com` |
Each variable change triggers a rebuild, so set them all at
once before the first push, or batch later changes.
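
A small preflight can catch a missing secret before the first request does. This is a sketch under assumptions: `missing_settings` is a hypothetical helper, and the provider-to-key mapping is inferred from the table above and the error messages in the troubleshooting section, not taken from the app's code.

```python
import os
from collections.abc import Mapping

# Secret each provider needs; this mapping is an assumption about the app.
REQUIRED_KEY = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """List configuration problems that would make /recommend fail."""
    provider = env.get("LLM_PROVIDER", "gemini")  # documented default
    if provider not in REQUIRED_KEY:
        return [f"LLM_PROVIDER={provider!r} is not one of {sorted(REQUIRED_KEY)}"]
    key = REQUIRED_KEY[provider]
    if env.get(key):
        return []
    return [f"{key} is not set (LLM_PROVIDER={provider})"]


# Prints only variable names, never secret values.
print(missing_settings(os.environ))
```

An empty list means the provider and its key are both present. It does not prove the key is valid.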
### 4. Wait for the build
The first build:
- downloads ~600 MB of pip dependencies
- downloads ~1.3 GB of `bge-large-en-v1.5` weights
- embeds 377 documents into a fresh `data/chroma/` (the index is built
  during `RUN python -m scripts.index`, so no binary blobs live in git)
**Expect 5–8 minutes** for the first build. The Space dashboard streams
logs in real time. Re-runs hit pip's cache and finish in ~2–3 min.
### 5. Verify
Your Space exposes an HTTPS URL like
`https://<USERNAME>-shl-recommender-api.hf.space`.
```bash
curl https://<USERNAME>-shl-recommender-api.hf.space/health
# {"status":"healthy"}
curl -X POST https://<USERNAME>-shl-recommender-api.hf.space/recommend \
-H "Content-Type: application/json" \
-d '{"query":"hire java developers under 40 minutes"}'
```
Or open the auto-generated Swagger UI in a browser:
```
https://<USERNAME>-shl-recommender-api.hf.space/docs
```
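A Space stays warm while it receives traffic, but free-tier hardware goes to
sleep after a period of inactivity (48 hours at the time of writing), and the
first request after that pays a cold start. Once warm, each `/recommend` call
takes ~2 s (LLM rerank dominates).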
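
For a scripted check, the `curl` call above can be mirrored with the Python standard library. This is a minimal sketch: `build_recommend_request` and `call` are hypothetical helper names, and nothing is assumed about the response beyond it being JSON.

```python
import json
import urllib.request


def build_recommend_request(base_url: str, query: str) -> urllib.request.Request:
    """Build the same POST that the curl example sends to /recommend."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/recommend",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request: urllib.request.Request, timeout: float = 30.0) -> object:
    """Send the request and decode the JSON body (response shape not assumed)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Usage: `call(build_recommend_request("https://<USERNAME>-shl-recommender-api.hf.space", "hire java developers under 40 minutes"))`. The 30-second timeout is an arbitrary choice that leaves room for a cold start.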
---
## Configuration knobs
All env vars; set in the Space's Settings → Variables and secrets.
| Env var | Default | Notes |
|---|---|---|
| `EMBED_PROVIDER` | `local` | `local` (sentence-transformers) or `gemini` |
| `EMBED_MODEL` | `BAAI/bge-large-en-v1.5` | Pin smaller for tight RAM hosts |
| `LLM_PROVIDER` | `gemini` *(set to `openai` in Space)* | `openai` or `gemini` |
| `LLM_MODEL` | varies by provider | e.g. `gpt-5-mini`, `gpt-4o-mini`, `gemini-2.5-flash` |
| `OPENAI_BASE_URL` | unset | Set for Azure / OpenRouter / proxy |
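
The defaults in the table can be read as a small settings loader. The sketch below is an illustration and not the app's actual configuration code: `Settings` and `load_settings` are hypothetical names, and `LLM_MODEL` stays `None` when unset because the table only says its default varies by provider.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    embed_provider: str
    embed_model: str
    llm_provider: str
    llm_model: Optional[str]        # None: the provider-specific default applies
    openai_base_url: Optional[str]  # None: the client's own default endpoint


def load_settings(env: Mapping[str, str]) -> Settings:
    """Apply the table's defaults; reject values the table does not list."""
    settings = Settings(
        embed_provider=env.get("EMBED_PROVIDER", "local"),
        embed_model=env.get("EMBED_MODEL", "BAAI/bge-large-en-v1.5"),
        llm_provider=env.get("LLM_PROVIDER", "gemini"),
        llm_model=env.get("LLM_MODEL") or None,
        openai_base_url=env.get("OPENAI_BASE_URL") or None,
    )
    if settings.embed_provider not in {"local", "gemini"}:
        raise ValueError(f"unsupported EMBED_PROVIDER: {settings.embed_provider}")
    if settings.llm_provider not in {"openai", "gemini"}:
        raise ValueError(f"unsupported LLM_PROVIDER: {settings.llm_provider}")
    return settings


print(load_settings({"LLM_PROVIDER": "openai", "LLM_MODEL": "gpt-5-mini"}))
```

Failing fast on an unknown provider name surfaces a typo at startup instead of on the first `/recommend` call.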
---
## Memory profile (free tier sanity check)
| Component | RAM at idle |
|---|---|
| Python interpreter + libraries | ~200 MB |
| `bge-large-en-v1.5` weights | ~1.3 GB |
| Chroma + BM25 index | ~30 MB |
| FastAPI / uvicorn | ~50 MB |
| **Total at runtime** | **~1.6 GB** |
| HF Spaces free tier | 16 GB ✓ |
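
The total is the sum of the rows above. A quick check of the arithmetic, using decimal gigabytes as the table appears to:

```python
# Idle RAM per component in MB, copied from the table above.
COMPONENTS_MB = {
    "python + libraries": 200,
    "bge-large-en-v1.5 weights": 1300,
    "chroma + bm25 index": 30,
    "fastapi / uvicorn": 50,
}
FREE_TIER_GB = 16

total_gb = sum(COMPONENTS_MB.values()) / 1000  # 1580 MB
headroom_gb = FREE_TIER_GB - total_gb

print(f"total ~{total_gb:.1f} GB, headroom ~{headroom_gb:.1f} GB")
# total ~1.6 GB, headroom ~14.4 GB
```

These are idle figures; memory use during a request will sit somewhat above them.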
---
## Updating the deployment
After any local change, just push to the connected branch:
```bash
git add ...
git commit -m "..."
git push space main
```
The Space auto-detects the push and redeploys.
If `data/documents.jsonl` changes (re-scrape or re-extract concepts), the
Chroma index gets rebuilt automatically during the next image build, with no
manual step.
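
Since the index is rebuilt from `data/documents.jsonl`, counting its records locally gives a number to compare against the item count in the build log. The sketch below assumes one JSON object per line and one indexed item per record; `count_documents` is a hypothetical helper, not a script in the repo.

```python
import json
from pathlib import Path


def count_documents(path: Path) -> int:
    """Count non-blank lines that parse as JSON objects."""
    count = 0
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)  # raises on a malformed line
            if not isinstance(record, dict):
                raise ValueError(f"line {line_number} is not a JSON object")
            count += 1
    return count
```

Usage: `count_documents(Path("data/documents.jsonl"))`. A malformed line raises before the push instead of failing midway through the image build.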
---
## Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| `500 retrieval failed: GEMINI_API_KEY not set` | `LLM_PROVIDER` not set, code defaults to Gemini | Add `LLM_PROVIDER=openai` Variable |
| `500 OPENAI_API_KEY not set` | Forgot the secret | Add `OPENAI_API_KEY` Secret |
| Build hangs on `RUN python -m scripts.index` for >10 min | Embedding loop is genuinely slow on free CPU; tqdm doesn't flush | Wait it out. Look for `collection 'shl_baseline' has 377 items` to confirm completion. |
| Push rejected: `binary files` | Chroma binaries in git | They shouldn't be; `.gitignore` excludes `data/chroma/`. If anything else binary slipped in, remove with `git rm --cached <file>` |
| Push rejected: `valid Hugging Face secrets` | Token was committed somewhere | Search the repo: `grep -rn 'hf_' .` then strip and amend |
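
The `grep` in the last row matches every `hf_` prefix, including harmless identifiers such as `hf_hub_download`. A narrower scan is sketched below. The token shape (`hf_` followed by at least 20 alphanumeric characters) is an assumption, not a documented format, and `find_token_like_strings` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed token shape: "hf_" followed by a long alphanumeric run.
TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9]{20,}")


def find_token_like_strings(root: Path) -> list[tuple[Path, int]]:
    """Return (file, line number) pairs for lines that look like HF tokens."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or ".git" in path.parts:
            continue  # skip directories and git internals
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for line_number, line in enumerate(text.splitlines(), start=1):
            if TOKEN_PATTERN.search(line):
                hits.append((path, line_number))
    return hits
```

Usage: `find_token_like_strings(Path("."))`. It only inspects the working tree, so a token that was committed earlier still has to be removed from history.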