gdscript-assistant / DEPLOY.md

vivekchakraverty

GDScript RAG assistant: app + corpus (index added later via Colab)

777ea0e verified 2 days ago

Deploying the GDScript Assistant (Colab-built jina index)

The ~280 MB jina index is built on a free Colab GPU and uploaded straight to the Space, so it never crosses your local connection. From your machine you push only the app and chunks.jsonl (~90 MB), once.
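As a sanity check on the ~280 MB figure, here is a rough estimate of the raw vector storage. It assumes 768-dimensional float32 embeddings (the commonly documented output width of jina-embeddings-v2-base-code, not something this guide states) and a flat, uncompressed index; the chunk count is the one given in Phase 2.

```python
# Back-of-the-envelope size of a flat float32 vector index.
NUM_CHUNKS = 91_720        # chunk count stated in this guide
EMBED_DIM = 768            # assumption: embedding width of jina-embeddings-v2-base-code
BYTES_PER_FLOAT32 = 4


def flat_index_bytes(num_vectors: int, dim: int) -> int:
    """Raw vector storage only; the ID map and file header add a little on top."""
    return num_vectors * dim * BYTES_PER_FLOAT32


size_bytes = flat_index_bytes(NUM_CHUNKS, EMBED_DIM)
print(f"{size_bytes / 1e6:.0f} MB")  # prints "282 MB"
```

Under those assumptions the vectors alone account for nearly all of the index file, which is why building it remotely saves the upload.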

0. Prerequisites

HuggingFace account + write token (https://huggingface.co/settings/tokens).
git, git-lfs, and the huggingface_hub package (pip install huggingface_hub).
data/chunks.jsonl is already staged in this folder.

Phase 1 — Push the app + corpus (your machine)

The app tolerates a missing index (it answers without retrieval until the index is added), so deploy first:

huggingface-cli login            # write token
huggingface-cli repo create gdscript-assistant --type space --space_sdk gradio
cd hf-space/gdscript-assistant
git init -b main && git lfs install   # -b main (git >= 2.28) so the local branch matches the push below
git add . && git commit -m "GDScript RAG assistant (app + corpus)"
git remote add origin https://huggingface.co/spaces/<user>/gdscript-assistant
git push -u origin main           # ~90MB: chunks.jsonl (LFS) + code; if the push is rejected because the new Space already has an initial commit, retry with --force
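The git commands above can also be expressed with the huggingface_hub Python client, which some people find easier because it handles large files without a local git-lfs setup. This is a minimal sketch and not part of the repository: `space_repo_id` and `push_space` are hypothetical helper names, and it assumes you have already run `huggingface-cli login` so that a write token is stored locally.

```python
def space_repo_id(user: str, name: str = "gdscript-assistant") -> str:
    """Build the '<user>/<space>' identifier used by the Hub."""
    return f"{user}/{name}"


def push_space(user: str, folder: str = "hf-space/gdscript-assistant") -> None:
    """Create the Space if needed and upload the app + corpus in one commit."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up the token saved by `huggingface-cli login`
    repo_id = space_repo_id(user)
    api.create_repo(repo_id, repo_type="space", space_sdk="gradio", exist_ok=True)
    api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="space",
        commit_message="GDScript RAG assistant (app + corpus)",
    )


# Usage (hypothetical user name):
# push_space("your-username")
```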

Then in Space → Settings → Hardware → select "ZeroGPU".
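Phase 1 relies on the app running without an index. The actual logic lives in the app and rag.py and is not shown in this guide, so the following is only a sketch of how such a fallback might look. The function names and prompt wording are hypothetical; the two file paths are the ones this guide uses.

```python
from pathlib import Path
from typing import Optional

INDEX_PATH = Path("data/embeddings.faiss")   # uploaded by the Colab step
ID_MAP_PATH = Path("data/id_map.json")       # uploaded by the Colab step


def index_available(index_path: Path = INDEX_PATH,
                    id_map_path: Path = ID_MAP_PATH) -> bool:
    """Retrieval is only possible once both index files exist."""
    return index_path.is_file() and id_map_path.is_file()


def build_prompt(question: str, retrieved: Optional[list[str]]) -> str:
    """Fall back to the bare question when there is nothing to retrieve from."""
    if not retrieved:
        return question
    context = "\n\n".join(retrieved)
    return f"Use these GDScript references:\n{context}\n\nQuestion: {question}"
```

With this shape, the same request path serves both states: before Phase 2 the model answers from the question alone, and after the index lands the prompt gains retrieved context.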

Phase 2 — Build the jina index on Colab (free GPU, ~10 min)

Open https://colab.research.google.com → new notebook → Runtime → Change runtime type → T4 GPU.

Cell 1 (install):

!pip install -q "transformers<5" sentence-transformers einops faiss-cpu huggingface_hub

Cell 2: paste the contents of colab_build_index.py, set at the top:
```
SPACE_REPO = "<user>/gdscript-assistant"
HF_TOKEN   = "hf_...your_write_token..."
```
Run it. It pulls chunks.jsonl from the Space, embeds 91,720 chunks with jina-embeddings-v2-base-code on the GPU, builds the FAISS index, and uploads data/embeddings.faiss + data/id_map.json back to the Space.
The Space auto-restarts and now answers with full RAG + sources.
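The round trip in this phase (pull the corpus, embed in batches, push the results) can be sketched as below. This is not the contents of colab_build_index.py: the helper names are hypothetical, the embedding call itself is omitted, and the batch size is an arbitrary choice. The huggingface_hub calls are imported lazily inside the functions that need them.

```python
import json
from typing import Iterable, Iterator


def read_chunks(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Group items into lists of at most `size` so the GPU sees full batches."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter batch


def fetch_corpus(space_repo: str, token: str) -> str:
    """Download data/chunks.jsonl from the Space and return its local path."""
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=space_repo, filename="data/chunks.jsonl",
                           repo_type="space", token=token)


def publish(space_repo: str, token: str, local_path: str, path_in_repo: str) -> None:
    """Upload one built artifact (e.g. data/embeddings.faiss) back to the Space."""
    from huggingface_hub import HfApi
    HfApi(token=token).upload_file(path_or_fileobj=local_path,
                                   path_in_repo=path_in_repo,
                                   repo_id=space_repo, repo_type="space")
```

With an assumed batch size of 64, the 91,720 chunks would be processed in 1,434 batches, the last one holding 8 chunks.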

Phase 3 — Verify on the Space

Ask "Write a CharacterBody2D top-down movement script" → GDScript answer, a ✅ gdtoolkit validation badge, and a 📚 Retrieved sources list.
Force a mistake (for example, ask for a script that deliberately uses invalid syntax) to see the 🔧 auto-correct path.
Hitting ZeroGPU quota? HF PRO ($9/mo) gives much more GPU time.
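For readers curious how the validation badge and the auto-correct path mentioned above might fit together, here is a generic validate-and-retry loop. It is a sketch under assumptions, not the app's code: `generate` and `validate` are hypothetical callables (in the Space, the validator would presumably wrap gdtoolkit and the generator would wrap the model), and the retry prompt wording is invented for illustration.

```python
from typing import Callable, Optional


def generate_with_autocorrect(
    prompt: str,
    generate: Callable[[str], str],            # hypothetical: returns GDScript text
    validate: Callable[[str], Optional[str]],  # hypothetical: None if valid, else an error message
    max_attempts: int = 3,
) -> tuple[str, bool]:
    """Return (code, is_valid), feeding validator errors back into the next attempt."""
    request = prompt
    code = ""
    for _ in range(max_attempts):
        code = generate(request)
        error = validate(code)
        if error is None:
            return code, True
        # Ask again, showing the model what failed and why.
        request = (
            f"{prompt}\n\nThe previous answer failed validation:\n{error}\n"
            f"Previous answer:\n{code}\nReturn a corrected script."
        )
    return code, False  # last attempt, still invalid
```

Capping the number of attempts keeps a persistently invalid answer from consuming GPU quota indefinitely.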

Notes

Index format is built to match rag.py exactly (cosine IndexIDMap2, faiss_id == chunk id; id_map.json keyed by str(id)).
requirements.txt pins transformers~=4.45 so jina (query embedding) and Qwen2.5-Coder both load with no patches.
Validation checks syntax + style (gdtoolkit), not runtime/scene semantics.
Fallback (local build): if you ever build the index locally (python crawl_gdscript.py embed), run bash stage_index.sh and then push. Embedding with jina on this CPU takes ~50 h, so Colab is strongly preferred.
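The index format described in the first note can be illustrated as follows. This is a sketch under assumptions rather than the code in rag.py or colab_build_index.py: it assumes each chunk record carries an integer `id` field, and that "cosine" means an inner-product flat index over L2-normalized vectors wrapped in `IndexIDMap2`. faiss and numpy are imported lazily inside `build_index`.

```python
def build_id_map(chunks: list[dict]) -> dict[str, dict]:
    """Key each chunk by str(id), since JSON object keys must be strings."""
    return {str(chunk["id"]): chunk for chunk in chunks}  # assumes an "id" field


def build_index(vectors, ids):
    """Cosine-similarity index whose FAISS ids equal the chunk ids."""
    import faiss
    import numpy as np

    # Copy into a C-contiguous float32 array so normalization does not
    # modify the caller's data.
    vecs = np.array(vectors, dtype="float32", order="C")
    faiss.normalize_L2(vecs)  # unit vectors: inner product == cosine similarity
    index = faiss.IndexIDMap2(faiss.IndexFlatIP(vecs.shape[1]))
    index.add_with_ids(vecs, np.asarray(ids, dtype="int64"))
    return index


# Usage sketch: a search returns FAISS ids, which map back to chunks via str(id).
# scores, found = index.search(query_vecs, 5)
# sources = [id_map[str(i)] for i in found[0] if i != -1]
```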