# Deploying the GDScript Assistant (Colab-built jina index)
The 280 MB jina index is built on a **free Colab GPU** and pushed straight to the
Space, so it never moves over your local connection. You only push the app +
`chunks.jsonl` (~90 MB) once.

## 0. Prerequisites
- HuggingFace account + **write token** (https://huggingface.co/settings/tokens).
- `git`, `git-lfs`, `pip install huggingface_hub`.
- `data/chunks.jsonl` is already staged in this folder.

## Phase 1: Push the app + corpus (your machine)
The app tolerates a missing index (it answers without retrieval until the index
is added), so deploy first:
```bash
huggingface-cli login                  # paste the write token when prompted
huggingface-cli repo create gdscript-assistant --type space --space_sdk gradio
cd hf-space/gdscript-assistant
git init && git lfs install
git add . && git commit -m "GDScript RAG assistant (app + corpus)"
git branch -M main                     # git init may default to "master"; the push below expects "main"
git remote add origin https://huggingface.co/spaces/<user>/gdscript-assistant
git push -u origin main                # ~90 MB: chunks.jsonl (LFS) + code
# If the push is rejected because the new Space already holds an initial commit,
# retrying with --force is one option (it overwrites that commit).
```
Then in **Space → Settings → Hardware → select "ZeroGPU"**.
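
The missing-index tolerance described at the start of this phase could be sketched roughly as follows. This is only an illustration, not the app's actual code: `load_index_or_none` and `retrieve_context` are hypothetical names, and the file paths are assumed from the upload step in Phase 2.

```python
import json
from pathlib import Path


def load_index_or_none(index_path="data/embeddings.faiss",
                       id_map_path="data/id_map.json"):
    """Return (index, id_map) when both files exist, otherwise None.

    Hypothetical helper: the real app may structure this differently.
    """
    index_file, id_map_file = Path(index_path), Path(id_map_path)
    if not (index_file.is_file() and id_map_file.is_file()):
        return None  # index not uploaded yet: the caller skips retrieval

    import faiss  # imported lazily so this sketch also runs where faiss is absent

    id_map = json.loads(id_map_file.read_text(encoding="utf-8"))
    return faiss.read_index(str(index_file)), id_map


def retrieve_context(question, retriever):
    """Return retrieved passages, or an empty list when no retriever is loaded."""
    if retriever is None:
        return []
    return retriever(question)
```

With this shape the Space can start right after Phase 1 and begin returning sources as soon as the two index files appear and the app restarts.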

## Phase 2: Build the jina index on Colab (free GPU, ~10 min)
1. Open https://colab.research.google.com → new notebook →
   **Runtime → Change runtime type → T4 GPU**.
2. Cell 1 (install):
   ```python
   !pip install -q "transformers<5" sentence-transformers einops faiss-cpu huggingface_hub
   ```
3. Cell 2: paste the contents of **`colab_build_index.py`** and set these two values at the top:
   ```python
   SPACE_REPO = "<user>/gdscript-assistant"
   HF_TOKEN = "hf_...your_write_token..."
   ```
   Run it. It pulls `chunks.jsonl` from the Space, embeds 91,720 chunks with
   `jina-embeddings-v2-base-code` on the GPU, builds the FAISS index, and
   **uploads `data/embeddings.faiss` + `data/id_map.json` back to the Space**.
4. The Space auto-restarts and now answers with full RAG + sources.
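
For orientation, the embed-and-index part of the build script might look something like the sketch below. It is not the contents of `colab_build_index.py`: the `id` and `text` field names, the `embed_batch` callable, the batch size, and the values stored in `id_map.json` are all assumptions made for illustration. Only the FAISS calls (`normalize_L2`, `IndexFlatIP`, `IndexIDMap2`, `add_with_ids`, `write_index`) are the library's real API.

```python
import json
from pathlib import Path


def batched(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_index(chunks, embed_batch, out_dir="data", batch_size=64):
    """Embed chunks in batches, then write a cosine FAISS index and an id map.

    `chunks` is a list of dicts with assumed keys "id" (int) and "text" (str).
    `embed_batch` is a hypothetical callable: list of str -> list of vectors.
    """
    import faiss  # lazy imports: only needed on the Colab runtime
    import numpy as np

    vectors = [
        np.asarray(embed_batch([c["text"] for c in batch]), dtype="float32")
        for batch in batched(chunks, batch_size)
    ]
    matrix = np.ascontiguousarray(np.vstack(vectors))
    faiss.normalize_L2(matrix)  # unit-length rows: inner product equals cosine

    ids = np.asarray([c["id"] for c in chunks], dtype="int64")
    index = faiss.IndexIDMap2(faiss.IndexFlatIP(matrix.shape[1]))
    index.add_with_ids(matrix, ids)  # FAISS id is the chunk id

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    faiss.write_index(index, str(out / "embeddings.faiss"))
    id_map = {str(c["id"]): c for c in chunks}  # JSON object keys must be strings
    (out / "id_map.json").write_text(json.dumps(id_map), encoding="utf-8")
    return index
```

Batching keeps GPU memory bounded while embedding the full corpus. Normalizing before adding to an inner-product index is the usual way to get cosine similarity out of FAISS.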

## Phase 3: Verify on the Space
- Ask *"Write a CharacterBody2D top-down movement script"* → GDScript answer, a
  **gdtoolkit validation** badge, and a **Retrieved sources** list.
- Force a mistake to see the **🔧 auto-correct** path.
- Hitting ZeroGPU quota? HF **PRO** ($9/mo) gives much more GPU time.
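
The validation badge and auto-correct path suggest a generate, validate, retry loop. The sketch below shows one plausible shape for such a loop. It makes no claim about how the app implements it: `generate` and `validate` are hypothetical callables (in the app they would presumably wrap the model and gdtoolkit), and the attempt limit is arbitrary.

```python
def generate_with_validation(question, generate, validate, max_attempts=3):
    """Generate code, validate it, and retry with the errors fed back in.

    `generate(question, feedback)` returns source code as a string.
    `validate(code)` returns a list of error messages; empty means valid.
    Both callables are hypothetical and supplied by the caller.
    """
    feedback = None
    code, errors = "", []
    for attempt in range(1, max_attempts + 1):
        code = generate(question, feedback)
        errors = validate(code)
        if not errors:
            return {"code": code, "valid": True, "attempts": attempt}
        feedback = "\n".join(errors)  # passed to the generator on the next attempt
    return {"code": code, "valid": False,
            "attempts": max_attempts, "errors": errors}
```

Passing the validator's messages back as feedback gives the generator something concrete to fix, and the attempt cap keeps GPU time per request bounded.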

## Notes
- Index format is built to match `rag.py` exactly (cosine `IndexIDMap2`,
  `faiss_id == chunk id`; `id_map.json` keyed by `str(id)`).
- `requirements.txt` pins `transformers~=4.45` so jina (query embedding) and
  Qwen2.5-Coder both load with no patches.
- Validation checks **syntax + style** (gdtoolkit), not runtime/scene semantics.
- Fallback (local build): if you ever build the index locally
  (`python crawl_gdscript.py embed`), run `bash stage_index.sh` and then push.
  Embedding with jina on this CPU takes ~50 h, so Colab is strongly preferred.
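
To illustrate the index format in the first note, here is a minimal sketch of the query-side lookup, assuming search results arrive as parallel lists of FAISS ids and scores. `resolve_hits` is a hypothetical name and is not taken from `rag.py`. The `str(...)` conversion is needed because JSON object keys are always strings.

```python
def resolve_hits(faiss_ids, scores, id_map):
    """Map FAISS result ids to chunk records via a str-keyed id map."""
    hits = []
    for faiss_id, score in zip(faiss_ids, scores):
        if faiss_id < 0:  # FAISS fills missing results with -1
            continue
        record = id_map.get(str(int(faiss_id)))
        if record is not None:
            hits.append({"id": int(faiss_id),
                         "score": float(score),
                         "chunk": record})
    return hits
```

Because `faiss_id == chunk id`, no separate position-to-id table is needed: the id returned by the search is itself the key into `id_map.json`.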