gdscript-assistant / DEPLOY.md
vivekchakraverty's picture
GDScript RAG assistant: app + corpus (index added later via Colab)
777ea0e verified

A newer version of the Gradio SDK is available: 6.15.2

Upgrade

Deploying the GDScript Assistant (Colab-built jina index)

The 280 MB jina index is built on a free Colab GPU and pushed straight to the Space, so it never moves over your local connection. You only push the app + chunks.jsonl (~90 MB) once.

0. Prerequisites

Phase 1 β€” Push the app + corpus (your machine)

The app tolerates a missing index (it answers without retrieval until the index is added), so deploy first:

huggingface-cli login            # write token
huggingface-cli repo create gdscript-assistant --type space --space_sdk gradio
cd hf-space/gdscript-assistant
git init && git lfs install
git add . && git commit -m "GDScript RAG assistant (app + corpus)"
git remote add origin https://huggingface.co/spaces/<user>/gdscript-assistant
git push -u origin main           # ~90MB: chunks.jsonl (LFS) + code

Then in Space β†’ Settings β†’ Hardware β†’ select "ZeroGPU".

Phase 2 β€” Build the jina index on Colab (free GPU, ~10 min)

  1. Open https://colab.research.google.com β†’ new notebook β†’ Runtime β†’ Change runtime type β†’ T4 GPU.
  2. Cell 1 (install):
    !pip install -q "transformers<5" sentence-transformers einops faiss-cpu huggingface_hub
    
  3. Cell 2: paste the contents of colab_build_index.py, set at the top:
    SPACE_REPO = "<user>/gdscript-assistant"
    HF_TOKEN   = "hf_...your_write_token..."
    
    Run it. It pulls chunks.jsonl from the Space, embeds 91,720 chunks with jina-embeddings-v2-base-code on the GPU, builds the FAISS index, and uploads data/embeddings.faiss + data/id_map.json back to the Space.
  4. The Space auto-restarts and now answers with full RAG + sources.

Phase 3 β€” Verify on the Space

  • Ask "Write a CharacterBody2D top-down movement script" β†’ GDScript answer, a βœ… gdtoolkit validation badge, and a πŸ“š Retrieved sources list.
  • Force a mistake to see the πŸ”§ auto-correct path.
  • Hitting ZeroGPU quota? HF PRO ($9/mo) gives much more GPU time.

Notes

  • Index format is built to match rag.py exactly (cosine IndexIDMap2, faiss_id == chunk id; id_map.json keyed by str(id)).
  • requirements.txt pins transformers~=4.45 so jina (query embedding) and Qwen2.5-Coder both load with no patches.
  • Validation checks syntax + style (gdtoolkit), not runtime/scene semantics.
  • Fallback (local build): if you ever build the index locally (python crawl_gdscript.py embed), run bash stage_index.sh then push β€” but jina on this CPU is ~50h, so Colab is strongly preferred.