# Deploying ProBas RAG Assistant to Hugging Face Spaces
The Space ships the **prebuilt index** and loads it directly on startup: nothing
is re-embedded, and the 1.2 GB raw dataset does **not** need to be uploaded. The
app's `load_any_bundle()` fallback loads any bundle present under
`indexes/probas_rag/` even when the raw `probas_processes_by_classification_rag_json/`
directory is absent.
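The fallback described above can be pictured with a minimal sketch. This is not the app's actual `load_any_bundle()` code. The helper name `find_prebuilt_bundle` is hypothetical, and the sketch assumes that the record bundle and the embeddings file share the same suffix in place of the `*` in their filenames.

```python
from pathlib import Path
from typing import Optional, Tuple


def find_prebuilt_bundle(index_dir: Path) -> Optional[Tuple[Path, Path]]:
    """Return (record_bundle, embeddings) for the newest matching pair, or None.

    Hypothetical helper: it only locates files and never touches the raw
    dataset directory, which is why that directory may be absent.
    """
    json_prefix = "bundle_v3_"
    npy_prefix = "bundle_embeddings_v3_"
    # Newest record bundle first, judged by modification time.
    candidates = sorted(
        index_dir.glob(f"{json_prefix}*.json"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    for record_path in candidates:
        # Assumption: both files carry the same suffix after the prefix.
        suffix = record_path.stem[len(json_prefix):]
        emb_path = index_dir / f"{npy_prefix}{suffix}.npy"
        if emb_path.is_file():
            return record_path, emb_path
    return None
```

Calling `find_prebuilt_bundle(Path("indexes/probas_rag"))` returns `None` when no complete pair is present, which is the case in which an app would have to rebuild the index from the raw data.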
At query time the app still calls the OpenAI-compatible API for the query
embedding and the chat completion, so the Space needs `OPENAI_API_KEY` set as a
**secret**.
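Because a missing secret only surfaces on the first query, it can help to check for it at startup. The following is a minimal sketch and not part of the app as documented here. The function name `missing_settings` is hypothetical, and the only assumption is that the secret reaches the process as an environment variable.

```python
import os
from typing import List, Mapping


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of required settings that are unset or blank."""
    required = ("OPENAI_API_KEY",)
    return [name for name in required if not env.get(name, "").strip()]
```

At startup, `missing_settings(os.environ)` could be called and the app could exit with a clear message if the returned list is not empty.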
## What gets uploaded
| File | Purpose | Size |
|------|---------|------|
| `app.py` | the app | small |
| `requirements.txt` | deps | small |
| `README.md` | includes the Space metadata header | small |
| `.gitattributes` | LFS rules for the index | small |
| `check_progress.py` | optional build monitor | small |
| `indexes/probas_rag/bundle_v3_*.json` | record/BM25 bundle | ~489 MB (LFS) |
| `indexes/probas_rag/bundle_embeddings_v3_*.npy` | embeddings | ~227 MB (LFS) |
Do **not** upload `.env`, the raw dataset folder, or `.venv`.
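The table and the exclusion rule above can be turned into a quick check before committing. This is a sketch under stated assumptions: the helper `check_upload_tree` is hypothetical, `check_progress.py` is treated as optional, and the check is deliberately conservative because it flags forbidden items that exist in the folder even if `.gitignore` would keep them out of the push.

```python
from pathlib import Path
from typing import List

# Required entries, taken from the table above (globs where the name varies).
REQUIRED = [
    "app.py",
    "requirements.txt",
    "README.md",
    ".gitattributes",
    "indexes/probas_rag/bundle_v3_*.json",
    "indexes/probas_rag/bundle_embeddings_v3_*.npy",
]
# Items that must stay out of the Space.
FORBIDDEN = [".env", ".venv", "probas_processes_by_classification_rag_json"]


def check_upload_tree(root: Path) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for pattern in REQUIRED:
        if "*" in pattern:
            found = any(root.glob(pattern))
        else:
            found = (root / pattern).exists()
        if not found:
            problems.append(f"missing: {pattern}")
    for name in FORBIDDEN:
        if (root / name).exists():
            problems.append(f"must not be uploaded: {name}")
    return problems
```

Running `check_upload_tree(Path("."))` from inside the Space clone gives a list that should be empty before the first commit.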
## Steps
Start from a local clone of the Space. If you have not cloned it yet:
```bash
git clone https://huggingface.co/spaces/IPTS-PRODDEV/ProBas_RAG_Assistant
cd ProBas_RAG_Assistant
```
1. Copy the app files into the clone. From this project directory (adjust `SRC` and `DST` to your own paths):
```bash
SRC="/media/mohamed/New Volume/Leuphana_cousres/SA_Projects/Probas RAG Assistant"
DST="/media/mohamed/New Volume/Leuphana_cousres/SA_Projects/Probas RAG Assistant/ProBas_RAG_Assistant" # the HF clone
cp "$SRC/app.py" "$SRC/requirements.txt" "$SRC/README.md" \
"$SRC/.gitattributes" "$SRC/check_progress.py" "$DST/"
mkdir -p "$DST/indexes/probas_rag"
cp "$SRC/indexes/probas_rag/bundle_v3_"*.json "$DST/indexes/probas_rag/"
cp "$SRC/indexes/probas_rag/bundle_embeddings_v3_"*.npy "$DST/indexes/probas_rag/"
```
2. Make sure the Space does not ignore the index. Create `$DST/.gitignore` with:
```gitignore
.env
__pycache__/
*.pyc
.venv/
# NOTE: indexes/ is intentionally NOT ignored; the prebuilt bundle ships with the Space.
```
3. Enable LFS and stage the large files (the `.gitattributes` already tracks
`*.npy` and `indexes/probas_rag/*.json`):
```bash
cd "$DST"
git lfs install
git add .gitattributes
git add app.py requirements.txt README.md check_progress.py .gitignore
git add indexes/probas_rag/bundle_v3_*.json indexes/probas_rag/bundle_embeddings_v3_*.npy
git lfs ls-files # confirm both large files are LFS-tracked
```
4. Set the API key as a **Space secret** (Settings → Variables and secrets →
New secret) named `OPENAI_API_KEY`. Optionally also set `OPENAI_BASE_URL`
and `PROBAS_EMBEDDING_MODEL` as variables if you want to override the defaults.
5. Commit and push:
```bash
git commit -m "Deploy ProBas RAG Assistant with prebuilt index"
git push
```
The Space will build, install the requirements, and on boot load the prebuilt
bundle (~15 s) instead of re-embedding the dataset. The first chat request warms
the API connection.
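As a complement to `git lfs ls-files` in step 3, the LFS rules in `.gitattributes` can also be inspected directly. The sketch below assumes the usual line format written by `git lfs track` (a pattern followed by attributes such as `filter=lfs diff=lfs merge=lfs -text`). The function name `lfs_patterns` is hypothetical.

```python
from typing import Set


def lfs_patterns(gitattributes_text: str) -> Set[str]:
    """Return every path pattern that is routed through the LFS filter."""
    patterns = set()
    for line in gitattributes_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.add(pattern)
    return patterns
```

For this project the result should contain both `*.npy` and `indexes/probas_rag/*.json`. If either is missing, the large files would be committed as ordinary git blobs.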
## Hardware note
The bundle holds 23,172 records and a (23172, 2560) float32 embedding matrix
(~237 MB in RAM) plus the BM25 token lists. The free CPU Space tier (16 GB RAM)
is sufficient. If the Space is killed for running out of memory during startup,
upgrade to a larger CPU tier.
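The ~237 MB figure follows from the matrix shape and the 4-byte width of float32. A short sketch of the arithmetic, with a hypothetical helper name:

```python
def embedding_matrix_bytes(rows: int, cols: int, itemsize: int = 4) -> int:
    """Size in bytes of a dense rows x cols matrix (float32 has itemsize 4)."""
    return rows * cols * itemsize


size = embedding_matrix_bytes(23_172, 2_560)  # 237,281,280 bytes
size_mb = size / 1_000_000                    # decimal megabytes
size_mib = size / (1024 * 1024)               # binary mebibytes
```

This gives about 237.3 MB, or about 226.3 MiB, which is in line with the roughly 227 MB listed for the `.npy` file. The BM25 token lists and the record bundle come on top of this.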
## Security
An API key was previously committed to git history in this project. Rotate it
(revoke the old key and issue a new one) and use only the new key as the Space
secret.