--- title: GDScript Coding Assistant emoji: 🤖 colorFrom: purple colorTo: green sdk: gradio app_file: app.py pinned: false license: mit short_description: RAG GDScript assistant with gdtoolkit validation --- # 🤖 GDScript Coding Assistant A Godot 4 / GDScript coding assistant that answers using **RAG** over a curated **91,720-chunk** corpus crawled from the official docs, demo repos, tutorial sites and YouTube descriptions. Generated GDScript is **syntax-validated with `gdtoolkit`** before it's shown. ## How it works ``` question ─▶ jina query-embed (CPU) ─▶ FAISS top-k GDScript snippets ─▶ Qwen2.5-Coder-7B-Instruct (ZeroGPU) ─▶ answer ─▶ gdtoolkit parse + lint (CPU) ─▶ ✅/❌ + optional 1× self-fix ``` - **Retriever:** `jinaai/jina-embeddings-v2-base-code` (768-dim, code-tuned), prebuilt FAISS cosine index bundled via Git LFS (`data/embeddings.faiss`, `data/chunks.jsonl`). - **Generator:** `Qwen/Qwen2.5-Coder-7B-Instruct` on **ZeroGPU** (only the generation call uses the GPU). - **Validation:** `gdtoolkit` (`gdparse` syntax + `gdlint` style). Note: this checks *syntax and style*, not runtime/scene semantics. ## Setup (hardware) In **Space → Settings → Hardware**, select **ZeroGPU**. The `spaces` package + `@spaces.GPU` decorator in `generate.py` do the rest. ## Local dev ```bash pip install -r requirements.txt # fast UI/flow test without downloading the 7B model: GDRAG_STUB_LLM=1 python app.py # real retrieval needs data/embeddings.faiss + data/chunks.jsonl present python rag.py "how do I use @export and signals" python validate.py ``` ## Data provenance & licensing Snippets come from public Godot resources with **varying licenses** (docs CC-BY, repos MIT/Apache/GPL/…). Each retrieved snippet shows its source; respect the original licenses when reusing generated code.