Deploying ProBas RAG Assistant to Hugging Face Spaces

The Space ships the prebuilt index and loads it directly on startup — no re-embedding, and the 1.2 GB raw dataset does not need to be uploaded. The app's load_any_bundle() fallback loads any bundle present under indexes/probas_rag/ even when the raw probas_processes_by_classification_rag_json/ directory is absent.
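To make the fallback concrete, here is a minimal sketch of how a prebuilt bundle could be located without touching the raw dataset. It is not the app's actual load_any_bundle() code: the helper name find_prebuilt_bundle is hypothetical, and it assumes the record file and the embedding file share the same suffix after their bundle_v3_ and bundle_embeddings_v3_ prefixes.

```python
from pathlib import Path
from typing import Optional, Tuple


def find_prebuilt_bundle(index_dir: Path) -> Optional[Tuple[Path, Path]]:
    """Return (records_json, embeddings_npy) for the newest complete bundle, or None.

    Hypothetical helper. Assumes both files of a bundle share one suffix, e.g.
    bundle_v3_<suffix>.json and bundle_embeddings_v3_<suffix>.npy.
    """
    prefix = "bundle_v3_"
    # Newest record file first, so a rebuilt bundle wins over an older one.
    candidates = sorted(
        index_dir.glob(prefix + "*.json"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    for records in candidates:
        suffix = records.stem[len(prefix):]
        embeddings = index_dir / f"bundle_embeddings_v3_{suffix}.npy"
        # Only accept a bundle when its embedding matrix is present too.
        if embeddings.is_file():
            return records, embeddings
    return None
```

Calling find_prebuilt_bundle(Path("indexes/probas_rag")) returns None when no complete pair exists, which is the case in which an app would otherwise have to rebuild the index from the raw data.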

At query time the app still calls the OpenAI API (or the endpoint configured via OPENAI_BASE_URL) for the query embedding and the chat completion, so the Space needs OPENAI_API_KEY set as a secret.
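Spaces expose secrets and variables to the app as environment variables. A minimal sketch of a fail-fast check at startup might look like the following. The function name read_api_settings and the returned dictionary layout are assumptions for illustration, not the app's actual code. The two optional variable names are the ones mentioned in step 4 below.

```python
import os
from typing import Dict, Mapping, Optional


def read_api_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Optional[str]]:
    """Collect API settings, failing fast when the required secret is missing."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; add it under Settings -> Variables and secrets."
        )
    return {
        "api_key": api_key,
        # In this sketch, None means "fall back to the app's built-in default".
        "base_url": env.get("OPENAI_BASE_URL") or None,
        "embedding_model": env.get("PROBAS_EMBEDDING_MODEL") or None,
    }
```

Failing at startup rather than on the first chat request makes a missing secret visible in the Space build logs instead of in the chat UI.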

What gets uploaded

| File | Purpose | Size |
| --- | --- | --- |
| app.py | the app | small |
| requirements.txt | deps | small |
| README.md | includes the Space metadata header | small |
| .gitattributes | LFS rules for the index | small |
| check_progress.py | optional build monitor | small |
| indexes/probas_rag/bundle_v3_*.json | record/BM25 bundle | ~489 MB (LFS) |
| indexes/probas_rag/bundle_embeddings_v3_*.npy | embeddings | ~227 MB (LFS) |

Do not upload .env, the raw dataset folder, or .venv.
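As a safeguard against uploading any of these by accident, a small check over the list of tracked files can flag them before a push. The sketch below is illustrative only: the helper name forbidden_paths is hypothetical, and the raw dataset folder name is taken from the introduction.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Names from the "Do not upload" note; the dataset folder name comes from the intro.
FORBIDDEN_PARTS = {".env", ".venv", "probas_processes_by_classification_rag_json"}


def forbidden_paths(paths: Iterable[str]) -> List[str]:
    """Return the paths that have any component listed in FORBIDDEN_PARTS."""
    return [p for p in paths if FORBIDDEN_PARTS.intersection(PurePosixPath(p).parts)]
```

One way to use it is to pass in the lines printed by git ls-files inside the clone. An empty result means none of the excluded items are tracked.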

Steps

These steps assume the Space has already been cloned, for example with:

    git clone https://huggingface.co/spaces/IPTS-PRODDEV/ProBas_RAG_Assistant
    cd ProBas_RAG_Assistant

  1. Copy the app files in. From this project directory:

    SRC="/media/mohamed/New Volume/Leuphana_cousres/SA_Projects/Probas RAG Assistant"   # this project directory (adjust to your machine)
    DST="/media/mohamed/New Volume/Leuphana_cousres/SA_Projects/Probas RAG Assistant/ProBas_RAG_Assistant"   # the HF clone
    
    cp "$SRC/app.py" "$SRC/requirements.txt" "$SRC/README.md" \
       "$SRC/.gitattributes" "$SRC/check_progress.py" "$DST/"
    
    mkdir -p "$DST/indexes/probas_rag"
    cp "$SRC/indexes/probas_rag/bundle_v3_"*.json   "$DST/indexes/probas_rag/"
    cp "$SRC/indexes/probas_rag/bundle_embeddings_v3_"*.npy "$DST/indexes/probas_rag/"
    
  2. Make sure the Space does not ignore the index. Create $DST/.gitignore with:

    .env
    __pycache__/
    *.pyc
    .venv/
    # NOTE: indexes/ is intentionally NOT ignored — the prebuilt bundle ships with the Space.
    
  3. Enable LFS and stage the large files (the .gitattributes already tracks *.npy and indexes/probas_rag/*.json):

    cd "$DST"
    git lfs install
    git add .gitattributes
    git add app.py requirements.txt README.md check_progress.py .gitignore
    git add indexes/probas_rag/bundle_v3_*.json indexes/probas_rag/bundle_embeddings_v3_*.npy
    git lfs ls-files   # confirm both large files are LFS-tracked
    
  4. Set the API key as a Space secret (Settings → Variables and secrets → New secret), name OPENAI_API_KEY. Optionally also set OPENAI_BASE_URL and PROBAS_EMBEDDING_MODEL as variables if you want to override the defaults.

  5. Commit and push:

    git commit -m "Deploy ProBas RAG Assistant with prebuilt index"
    git push
    

The Space will build, install requirements, and on boot load the prebuilt bundle (~15 s) instead of embedding. The first chat request warms the API connection.
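Step 3 relies on .gitattributes already routing the two large files through LFS. As a rough cross-check next to git lfs ls-files, the rules can be read directly from the file. The sketch below assumes the usual lines written by git lfs track (a pattern followed by filter=lfs diff=lfs merge=lfs -text). The helper names are hypothetical, and fnmatch only approximates Git's own pattern matching.

```python
from fnmatch import fnmatch
from typing import List


def lfs_patterns(gitattributes_text: str) -> List[str]:
    """Return the path patterns that .gitattributes routes through Git LFS."""
    patterns = []
    for line in gitattributes_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in fields[1:]:
            patterns.append(fields[0])
    return patterns


def is_lfs_tracked(path: str, patterns: List[str]) -> bool:
    """Approximate check: patterns without "/" are matched against the file name only."""
    name = path.rsplit("/", 1)[-1]
    return any(
        fnmatch(path, pat) if "/" in pat else fnmatch(name, pat)
        for pat in patterns
    )
```

For example, with patterns = lfs_patterns(open(".gitattributes").read()), both bundle files under indexes/probas_rag/ should report True before the push in step 5.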

Hardware note

The bundle holds 23,172 records and a (23172, 2560) float32 embedding matrix (~237 MB in RAM) plus the BM25 token lists. The free CPU Space tier (16 GB RAM) is sufficient. If startup is killed for running out of memory, upgrade to a larger CPU tier.
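The memory figure follows directly from the matrix shape: rows times columns times 4 bytes per float32 value. A short worked calculation, using only the numbers stated above:

```python
ROWS, DIMS = 23_172, 2_560   # embedding matrix shape stated above
BYTES_PER_FLOAT32 = 4

size_bytes = ROWS * DIMS * BYTES_PER_FLOAT32
size_mb = size_bytes / 1_000_000   # decimal megabytes
size_mib = size_bytes / 2**20      # binary mebibytes

print(f"{size_bytes:,} bytes = {size_mb:.1f} MB = {size_mib:.1f} MiB")
# prints: 237,281,280 bytes = 237.3 MB = 226.3 MiB
```

The ~227 MB listed for the .npy file under "What gets uploaded" is probably the same quantity expressed in binary units and rounded up, since a float32 .npy file is essentially the raw matrix plus a small header.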

Security

An API key was previously committed to this project's git history. Rotate it (revoke the old key and issue a new one) before setting OPENAI_API_KEY as a Space secret, and use only the new key.