gdscript-assistant / DEPLOY.md

vivekchakraverty

GDScript RAG assistant: app + corpus (index added later via Colab)

777ea0e verified 2 days ago

	# Deploying the GDScript Assistant (Colab-built jina index)

	The 280 MB jina index is built on a free Colab GPU and pushed straight to the
	Space, so it never moves over your local connection. You only push the app +
	`chunks.jsonl` (~90 MB) once.
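
The 280 MB figure can be sanity-checked with simple arithmetic. The sketch below assumes the index stores raw float32 vectors and that `jina-embeddings-v2-base-code` produces 768-dimensional embeddings; neither is stated in this guide, so treat both as assumptions. The chunk count is the one quoted in Phase 2.

```python
# Back-of-the-envelope size of a flat (uncompressed) float32 vector index.
NUM_CHUNKS = 91_720      # chunk count quoted in Phase 2 of this guide
EMBED_DIM = 768          # assumption: embedding width of jina-embeddings-v2-base-code
BYTES_PER_FLOAT32 = 4    # assumption: vectors are stored as float32


def flat_index_bytes(num_vectors: int, dim: int,
                     bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw vector storage only; ID maps and file headers add a little on top."""
    return num_vectors * dim * bytes_per_value


size = flat_index_bytes(NUM_CHUNKS, EMBED_DIM)
print(f"{size:,} bytes = {size / 1e6:.1f} MB")  # 281,763,840 bytes = 281.8 MB
```

Under those assumptions the estimate lands within a couple of MB of the quoted size.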

	## 0. Prerequisites
	- HuggingFace account + write token (https://huggingface.co/settings/tokens).
	- `git` and `git-lfs` installed, plus the Hub client: `pip install huggingface_hub` (provides `huggingface-cli`).
	- `data/chunks.jsonl` is already staged in this folder.
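
The prerequisites above can be checked before pushing with a short script. This is only an illustrative sketch: the function name `missing_prerequisites` is hypothetical and not part of the repository, and it checks only what the list mentions (the two command-line tools and the staged corpus file).

```python
import shutil
from pathlib import Path


def missing_prerequisites(folder: str = ".") -> list[str]:
    """Return human-readable problems; an empty list means nothing obvious is missing."""
    problems = []
    # Command-line tools named in the prerequisites list.
    for tool in ("git", "git-lfs"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    # The staged corpus file.
    corpus = Path(folder) / "data" / "chunks.jsonl"
    if not corpus.is_file():
        problems.append(f"{corpus} is missing")
    elif corpus.stat().st_size == 0:
        problems.append(f"{corpus} is empty")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("PROBLEM:", problem)
```

Run it from the Space folder; no output means the tools and the corpus file were found.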

	## Phase 1 — Push the app + corpus (your machine)
	The app tolerates a missing index (it answers without retrieval until the index
	is added), so deploy first:
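	```bash
	huggingface-cli login # write token
	huggingface-cli repo create gdscript-assistant --type space --space_sdk gradio
	cd hf-space/gdscript-assistant
	git init -b main && git lfs install # -b main (git >= 2.28) so the branch matches the push below
	git lfs track "data/chunks.jsonl" # keep the ~90 MB corpus in LFS; harmless if already tracked
	git add . && git commit -m "GDScript RAG assistant (app + corpus)"
	git remote add origin https://huggingface.co/spaces/<user>/gdscript-assistant
	git push -u origin main # ~90MB: chunks.jsonl (LFS) + code
	# If the push is rejected because the new Space already has an initial commit, re-run with --force.
	```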
	Then in Space → Settings → Hardware → select "ZeroGPU".

	## Phase 2 — Build the jina index on Colab (free GPU, ~10 min)
	1. Open https://colab.research.google.com → new notebook →
	Runtime → Change runtime type → T4 GPU.
	2. Cell 1 (install):
	```python
	!pip install -q "transformers<5" sentence-transformers einops faiss-cpu huggingface_hub
	```
	3. Cell 2: paste the contents of `colab_build_index.py`, set at the top:
	```python
	SPACE_REPO = "<user>/gdscript-assistant"
	HF_TOKEN = "hf_...your_write_token..."
	```
	Run it. It pulls `chunks.jsonl` from the Space, embeds 91,720 chunks with
	`jina-embeddings-v2-base-code` on the GPU, builds the FAISS index, and
	uploads `data/embeddings.faiss` + `data/id_map.json` back to the Space.
	4. The Space auto-restarts and now answers with full RAG + sources.
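
The data flow in step 3 can be sketched roughly as follows. This is not the contents of `colab_build_index.py`; the helper names (`read_chunks`, `batched`, `fetch_corpus`, `push_artifact`) are hypothetical, and the embedding and FAISS steps are left out. The only library calls used are `hf_hub_download` and `HfApi.upload_file` from `huggingface_hub`, imported lazily so the helpers load without the package installed.

```python
import json
from typing import Iterator


def read_chunks(path: str) -> list[dict]:
    """Load one JSON object per non-blank line (the JSONL layout)."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def batched(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices so the GPU embeds a bounded batch at a time."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_corpus(space_repo: str, token: str) -> str:
    """Download data/chunks.jsonl from the Space and return its local path."""
    from huggingface_hub import hf_hub_download  # installed in Cell 1
    return hf_hub_download(repo_id=space_repo, filename="data/chunks.jsonl",
                           repo_type="space", token=token)


def push_artifact(space_repo: str, token: str, local_path: str, path_in_repo: str) -> None:
    """Upload one built file (e.g. data/embeddings.faiss) back to the Space."""
    from huggingface_hub import HfApi  # installed in Cell 1
    HfApi(token=token).upload_file(path_or_fileobj=local_path,
                                   path_in_repo=path_in_repo,
                                   repo_id=space_repo, repo_type="space")
```

In a notebook this would be used as `chunks = read_chunks(fetch_corpus(SPACE_REPO, HF_TOKEN))`, followed by embedding each batch from `batched(chunks, 64)`; the batch size of 64 is an arbitrary illustration.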

	## Phase 3 — Verify on the Space
	- Ask "Write a CharacterBody2D top-down movement script" → GDScript answer, a
	✅ gdtoolkit validation badge, and a 📚 Retrieved sources list.
	- Force a mistake to see the 🔧 auto-correct path.
	- Hitting ZeroGPU quota? HF PRO ($9/mo) gives much more GPU time.

	## Notes
	- Index format is built to match `rag.py` exactly (cosine `IndexIDMap2`,
	`faiss_id == chunk id`; `id_map.json` keyed by `str(id)`).
	- `requirements.txt` pins `transformers~=4.45` so jina (query embedding) and
	Qwen2.5-Coder both load with no patches.
	- Validation checks syntax + style (gdtoolkit), not runtime/scene semantics.
	- Fallback (local build): if you ever build the index locally
	(`python crawl_gdscript.py embed`), run `bash stage_index.sh` and then push. Embedding
	with jina on the author's CPU takes ~50 h, so Colab is strongly preferred.
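
The index convention in the first note (cosine similarity, `faiss_id == chunk id`, `id_map.json` keyed by `str(id)`) can be illustrated without FAISS. The pure-Python sketch below is a stand-in, not the code in `rag.py`: the chunk ids, the `source` field, and the vectors are invented for illustration. It relies on two facts: cosine similarity equals the inner product of L2-normalized vectors, and JSON object keys are always strings, which is why lookups go through `str(id)`.

```python
import json
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def inner_product(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# Hypothetical chunks: ids, fields, and vectors are invented for illustration.
chunks = [
    {"id": 7, "source": "docs/a.md", "vector": [3.0, 4.0]},
    {"id": 42, "source": "docs/b.md", "vector": [1.0, 0.0]},
]

# The vector store is keyed by the chunk id itself (faiss_id == chunk id).
stored = {c["id"]: l2_normalize(c["vector"]) for c in chunks}

# The id map is keyed by str(id); a JSON round trip confirms the keys are strings.
id_map = json.loads(json.dumps({str(c["id"]): {"source": c["source"]} for c in chunks}))

# Nearest neighbour by cosine similarity, then metadata lookup via str(id).
query = l2_normalize([6.0, 8.0])
best_id = max(stored, key=lambda i: inner_product(query, stored[i]))
print(best_id, id_map[str(best_id)]["source"])  # 7 docs/a.md
```

Looking up `id_map[best_id]` with the integer key would raise `KeyError`, which is the mismatch the note's `str(id)` convention avoids.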