# Deploying the GDScript Assistant (Colab-built jina index) The 280 MB jina index is built on a **free Colab GPU** and pushed straight to the Space, so it never moves over your local connection. You only push the app + `chunks.jsonl` (~90 MB) once. ## 0. Prerequisites - HuggingFace account + **write token** (https://huggingface.co/settings/tokens). - `git`, `git-lfs`, `pip install huggingface_hub`. - `data/chunks.jsonl` is already staged in this folder. ## Phase 1 — Push the app + corpus (your machine) The app tolerates a missing index (it answers without retrieval until the index is added), so deploy first: ```bash huggingface-cli login # write token huggingface-cli repo create gdscript-assistant --type space --space_sdk gradio cd hf-space/gdscript-assistant git init && git lfs install git add . && git commit -m "GDScript RAG assistant (app + corpus)" git remote add origin https://huggingface.co/spaces//gdscript-assistant git push -u origin main # ~90MB: chunks.jsonl (LFS) + code ``` Then in **Space → Settings → Hardware → select "ZeroGPU"**. ## Phase 2 — Build the jina index on Colab (free GPU, ~10 min) 1. Open https://colab.research.google.com → new notebook → **Runtime → Change runtime type → T4 GPU**. 2. Cell 1 (install): ```python !pip install -q "transformers<5" sentence-transformers einops faiss-cpu huggingface_hub ``` 3. Cell 2: paste the contents of **`colab_build_index.py`**, set at the top: ```python SPACE_REPO = "/gdscript-assistant" HF_TOKEN = "hf_...your_write_token..." ``` Run it. It pulls `chunks.jsonl` from the Space, embeds 91,720 chunks with `jina-embeddings-v2-base-code` on the GPU, builds the FAISS index, and **uploads `data/embeddings.faiss` + `data/id_map.json` back to the Space**. 4. The Space auto-restarts and now answers with full RAG + sources. ## Phase 3 — Verify on the Space - Ask *"Write a CharacterBody2D top-down movement script"* → GDScript answer, a **✅ gdtoolkit validation** badge, and a **📚 Retrieved sources** list. - Force a mistake to see the **🔧 auto-correct** path. - Hitting ZeroGPU quota? HF **PRO** ($9/mo) gives much more GPU time. ## Notes - Index format is built to match `rag.py` exactly (cosine `IndexIDMap2`, `faiss_id == chunk id`; `id_map.json` keyed by `str(id)`). - `requirements.txt` pins `transformers~=4.45` so jina (query embedding) and Qwen2.5-Coder both load with no patches. - Validation checks **syntax + style** (gdtoolkit), not runtime/scene semantics. - Fallback (local build): if you ever build the index locally (`python crawl_gdscript.py embed`), run `bash stage_index.sh` then push — but jina on this CPU is ~50h, so Colab is strongly preferred.