Deploying the GDScript Assistant (Colab-built jina index)
The 280 MB jina index is built on a free Colab GPU and pushed straight to the
Space, so it never moves over your local connection. You only push the app +
chunks.jsonl (~90 MB) once.
0. Prerequisites
- HuggingFace account + write token (https://huggingface.co/settings/tokens).
- git, git-lfs, and pip install huggingface_hub.
- data/chunks.jsonl is already staged in this folder.
Phase 1: Push the app + corpus (your machine)
The app tolerates a missing index (it answers without retrieval until the index is added), so deploy first:
huggingface-cli login # write token
huggingface-cli repo create gdscript-assistant --type space --space_sdk gradio
cd hf-space/gdscript-assistant
git init && git lfs install
git add . && git commit -m "GDScript RAG assistant (app + corpus)"
git remote add origin https://huggingface.co/spaces/<user>/gdscript-assistant
git push -u origin main # ~90MB: chunks.jsonl (LFS) + code
Then in Space → Settings → Hardware → select "ZeroGPU".
Phase 2: Build the jina index on Colab (free GPU, ~10 min)
- Open https://colab.research.google.com → new notebook → Runtime → Change runtime type → T4 GPU.
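The missing-index tolerance mentioned at the start of this phase could be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names and the prompt layout are hypothetical, and the two file names are taken from the upload step in Phase 2.

```python
from pathlib import Path

# Index artifacts named in Phase 2; their location under data/ is assumed.
INDEX_FILES = ("embeddings.faiss", "id_map.json")


def index_available(data_dir="data"):
    """Return True only if every index artifact exists in data_dir."""
    root = Path(data_dir)
    return all((root / name).is_file() for name in INDEX_FILES)


def build_prompt(question, retrieved):
    """Prepend retrieved context when there is any; otherwise pass the question through."""
    if not retrieved:
        return question
    context = "\n\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}"


def answer_prompt(question, data_dir="data", retrieve=None):
    """Use retrieval only when the index is present.

    `retrieve` is a hypothetical callable: question -> list of chunk texts.
    """
    if retrieve is not None and index_available(data_dir):
        return build_prompt(question, retrieve(question))
    return build_prompt(question, [])
```

With this shape, the Space can start and answer before Phase 2 has run, and picks up retrieval after the index files appear and the Space restarts.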
- Cell 1 (install):
  !pip install -q "transformers<5" sentence-transformers einops faiss-cpu huggingface_hub
- Cell 2: paste the contents of colab_build_index.py and set these at the top:
  SPACE_REPO = "<user>/gdscript-assistant"
  HF_TOKEN = "hf_...your_write_token..."
  Run it. It pulls chunks.jsonl from the Space, embeds 91,720 chunks with jina-embeddings-v2-base-code on the GPU, builds the FAISS index, and uploads data/embeddings.faiss + data/id_map.json back to the Space.
- The Space auto-restarts and now answers with full RAG + sources.
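For orientation, the pull and upload ends of Cell 2 might look something like the sketch below. It is not the contents of colab_build_index.py: the helper names are hypothetical, the path data/chunks.jsonl inside the Space is assumed from the prerequisites, and the embedding and FAISS steps are left out. The huggingface_hub calls used are hf_hub_download and HfApi.upload_file.

```python
import json
from pathlib import Path

SPACE_REPO = "<user>/gdscript-assistant"  # placeholder, as in Cell 2
HF_TOKEN = "hf_...your_write_token..."    # placeholder, as in Cell 2


def load_chunks(path):
    """Read a JSONL file: one JSON object per line, blank lines skipped."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def pull_chunks(repo=SPACE_REPO, token=HF_TOKEN):
    """Download chunks.jsonl from the Space repo and parse it."""
    from huggingface_hub import hf_hub_download  # imported lazily; needs network

    local = hf_hub_download(
        repo_id=repo,
        filename="data/chunks.jsonl",  # assumed location in the Space
        repo_type="space",
        token=token,
    )
    return load_chunks(local)


def push_index(out_dir, repo=SPACE_REPO, token=HF_TOKEN):
    """Upload the two index artifacts from out_dir into data/ on the Space."""
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    for name in ("embeddings.faiss", "id_map.json"):
        api.upload_file(
            path_or_fileobj=str(Path(out_dir) / name),
            path_in_repo=f"data/{name}",
            repo_id=repo,
            repo_type="space",
        )
```

Each upload is a commit on the Space repo, which is what triggers the restart described in the last step.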
Phase 3: Verify on the Space
- Ask "Write a CharacterBody2D top-down movement script" → you should get a GDScript answer, a gdtoolkit validation badge, and a "Retrieved sources" list.
- Force a mistake to see the 🔧 auto-correct path.
- Hitting ZeroGPU quota? HF PRO ($9/mo) gives much more GPU time.
Notes
- Index format is built to match rag.py exactly (cosine IndexIDMap2, faiss_id == chunk id; id_map.json keyed by str(id)).
- requirements.txt pins transformers~=4.45 so jina (query embedding) and Qwen2.5-Coder both load with no patches.
- Validation checks syntax + style (gdtoolkit), not runtime/scene semantics.
- Fallback (local build): if you ever build the index locally (python crawl_gdscript.py embed), run bash stage_index.sh and then push. Jina on this CPU takes ~50h, so Colab is strongly preferred.
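The id contract in the first note (faiss_id == chunk id, id_map.json keyed by str(id)) can be illustrated without FAISS. The sketch below is a brute-force stand-in, not rag.py: it assumes each chunk record has "id" and "text" fields, and it assumes "cosine" means an inner product over unit-length vectors. All function names are hypothetical.

```python
import json
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def build_id_map(chunks):
    """Map str(chunk id) -> chunk record; JSON object keys are always strings."""
    return {str(c["id"]): c for c in chunks}


def save_id_map(id_map, path):
    """Write the map in the shape the note describes for id_map.json."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(id_map, fh)


def search(query_vec, vectors_by_id, id_map, k=3):
    """Brute-force cosine search; returns (score, chunk) pairs, best first.

    vectors_by_id maps an integer chunk id to its embedding, mirroring
    an index whose vector ids are the chunk ids.
    """
    q = normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(v))), cid)
        for cid, v in vectors_by_id.items()
    ]
    scored.sort(reverse=True)
    # The integer id from the search is converted with str() before lookup.
    return [(score, id_map[str(cid)]) for score, cid in scored[:k]]
```

The str() conversion matters because integer keys do not survive a JSON round trip: an id of 7 is stored as "7", so looking up the raw integer in the loaded map would fail.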