# Deploying the GDScript Assistant (Colab-built jina index)

The 280 MB jina index is built on a **free Colab GPU** and pushed straight to the
Space, so it never moves over your local connection. You only push the app +
`chunks.jsonl` (~90 MB) once.

## 0. Prerequisites
- HuggingFace account + **write token** (https://huggingface.co/settings/tokens).
- `git`, `git-lfs`, `pip install huggingface_hub`.
- `data/chunks.jsonl` is already staged in this folder.

## Phase 1: Push the app + corpus (your machine)
The app tolerates a missing index (it answers without retrieval until the index
is added), so deploy first:
```bash
huggingface-cli login            # write token
huggingface-cli repo create gdscript-assistant --type space --space_sdk gradio
cd hf-space/gdscript-assistant
git init && git lfs install
git add . && git commit -m "GDScript RAG assistant (app + corpus)"
git branch -M main                # git init may default to "master"; the push below expects main
git remote add origin https://huggingface.co/spaces/<user>/gdscript-assistant
git push -u origin main           # ~90MB: chunks.jsonl (LFS) + code
```
Then in **Space → Settings → Hardware → select "ZeroGPU"**.
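
The "tolerates a missing index" behaviour described above can be pictured with a small guard like the one below. This is only a sketch of the idea, not the app's actual code: `index_ready`, `retrieve`, and the `search` callback are hypothetical names, and the real logic in `rag.py` may differ. The two file names are the ones the Colab step uploads into `data/`.

```python
from pathlib import Path

# Files the Colab build uploads into data/ (names taken from this guide).
INDEX_FILES = ("embeddings.faiss", "id_map.json")


def index_ready(data_dir="data"):
    """Return True only when every index artifact is present on disk."""
    d = Path(data_dir)
    return all((d / name).is_file() for name in INDEX_FILES)


def retrieve(query, data_dir="data", search=None):
    """Return retrieved chunks, or an empty list while the index is missing.

    `search` is a hypothetical callable standing in for the real FAISS lookup.
    """
    if search is None or not index_ready(data_dir):
        return []  # no index yet: the caller answers without retrieval
    return search(query)
```

With a guard of this shape, the Space can start and answer questions right after Phase 1, and retrieval switches on once both files appear.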

## Phase 2: Build the jina index on Colab (free GPU, ~10 min)
1. Open https://colab.research.google.com → new notebook →
   **Runtime → Change runtime type → T4 GPU**.
2. Cell 1 (install):
   ```python
   !pip install -q "transformers<5" sentence-transformers einops faiss-cpu huggingface_hub
   ```
3. Cell 2: paste the contents of **`colab_build_index.py`**, set at the top:
   ```python
   SPACE_REPO = "<user>/gdscript-assistant"
   HF_TOKEN   = "hf_...your_write_token..."
   ```
   Run it. It pulls `chunks.jsonl` from the Space, embeds 91,720 chunks with
   `jina-embeddings-v2-base-code` on the GPU, builds the FAISS index, and
   **uploads `data/embeddings.faiss` + `data/id_map.json` back to the Space**.
4. The Space auto-restarts and now answers with full RAG + sources.
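
For orientation, the build in step 3 roughly follows the shape below. This is a hedged sketch, not the contents of `colab_build_index.py`: the `id` and `text` field names, the batch size, storing the whole record as the `id_map.json` value, and the `jinaai/` organisation prefix on the model name are all assumptions made for illustration. The heavy imports sit inside `build_and_upload` so the parsing helper can be tried without a GPU stack.

```python
import json

SPACE_REPO = "<user>/gdscript-assistant"        # same placeholders as Cell 2
HF_TOKEN = "hf_...your_write_token..."
MODEL = "jinaai/jina-embeddings-v2-base-code"   # assumed Hub id for the jina model


def parse_chunks(lines):
    """Parse JSONL lines into (ids, texts, id_map).

    Assumes each record carries an integer "id" and a string "text" field.
    """
    ids, texts, id_map = [], [], {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        ids.append(int(rec["id"]))
        texts.append(rec["text"])
        id_map[str(rec["id"])] = rec  # keyed by str(id), as the Notes describe
    return ids, texts, id_map


def build_and_upload():
    """Embed the corpus on the GPU, build a cosine index, push it to the Space."""
    import faiss
    import numpy as np
    from huggingface_hub import HfApi, hf_hub_download
    from sentence_transformers import SentenceTransformer

    path = hf_hub_download(SPACE_REPO, "data/chunks.jsonl",
                           repo_type="space", token=HF_TOKEN)
    with open(path, encoding="utf-8") as fh:
        ids, texts, id_map = parse_chunks(fh)

    model = SentenceTransformer(MODEL, trust_remote_code=True, device="cuda")
    # Unit-length vectors make inner product equal to cosine similarity.
    vecs = model.encode(texts, batch_size=64, normalize_embeddings=True,
                        convert_to_numpy=True, show_progress_bar=True)
    vecs = vecs.astype("float32")

    # IndexIDMap2 stores our own ids, so faiss_id == chunk id.
    index = faiss.IndexIDMap2(faiss.IndexFlatIP(vecs.shape[1]))
    index.add_with_ids(vecs, np.asarray(ids, dtype="int64"))
    faiss.write_index(index, "embeddings.faiss")
    with open("id_map.json", "w", encoding="utf-8") as fh:
        json.dump(id_map, fh)

    api = HfApi(token=HF_TOKEN)
    for name in ("embeddings.faiss", "id_map.json"):
        api.upload_file(path_or_fileobj=name, path_in_repo=f"data/{name}",
                        repo_id=SPACE_REPO, repo_type="space")
```

Calling `build_and_upload()` in a Colab cell with a T4 runtime would run the whole pipeline; `parse_chunks` alone can be exercised on a few sample lines to check the assumed record layout first.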

## Phase 3: Verify on the Space
- Ask *"Write a CharacterBody2D top-down movement script"* → GDScript answer, a
  **✅ gdtoolkit validation** badge, and a **📚 Retrieved sources** list.
- Force a mistake to see the **🔧 auto-correct** path.
- Hitting ZeroGPU quota? HF **PRO** ($9/mo) gives much more GPU time.

## Notes
- Index format is built to match `rag.py` exactly (cosine `IndexIDMap2`,
  `faiss_id == chunk id`; `id_map.json` keyed by `str(id)`).
- `requirements.txt` pins `transformers~=4.45` so jina (query embedding) and
  Qwen2.5-Coder both load with no patches.
- Validation checks **syntax + style** (gdtoolkit), not runtime/scene semantics.
- Fallback (local build): if you ever build the index locally
  (`python crawl_gdscript.py embed`), run `bash stage_index.sh` then push, but
  jina on this CPU takes ~50h, so Colab is strongly preferred.
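
The index-format note above (`faiss_id == chunk id`, `id_map.json` keyed by `str(id)`) suggests a quick sanity check before pushing a locally built index. The helper below is a minimal sketch under the assumption that each line of `chunks.jsonl` has an integer `id` field; `check_id_map` is a hypothetical name and not part of the repository.

```python
import json


def check_id_map(chunk_lines, id_map):
    """Compare chunk ids with id_map keys.

    Returns (missing, extra): ids present in the chunks but absent from the
    map, and ids present in the map but absent from the chunks.
    Assumes every non-empty JSONL line has an integer "id" field.
    """
    chunk_ids = {int(json.loads(line)["id"])
                 for line in chunk_lines if line.strip()}
    map_ids = {int(key) for key in id_map}  # keys are str(id) in the JSON file
    missing = sorted(chunk_ids - map_ids)
    extra = sorted(map_ids - chunk_ids)
    return missing, extra
```

To use it, open `data/chunks.jsonl` as the line iterable, pass `json.load` of `data/id_map.json` as the map, and expect two empty lists when the files agree.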