Spaces: Undrick/NLP_Lab

apytel committed on May 17

Commit 11ba2bd

1 Parent(s): f9d0b31

Redesigns UI for FreeCAD RAG Python script generator

Replaces the generic chatbot interface with a custom Gradio

Files changed (20)

CLAUDE.md +53 -0
README.md +96 -8
app.py +118 -60
build_index.py +87 -0
freecad-docs +1 -0
requirements.txt +13 -0
src/__init__.py +0 -0
src/__pycache__/__init__.cpython-313.pyc +0 -0
src/__pycache__/chunk.cpython-313.pyc +0 -0
src/__pycache__/citations.cpython-313.pyc +0 -0
src/__pycache__/config.cpython-313.pyc +0 -0
src/__pycache__/generate.cpython-313.pyc +0 -0
src/__pycache__/ingest.cpython-313.pyc +0 -0
src/__pycache__/retrieve.cpython-313.pyc +0 -0
src/chunk.py +118 -0
src/citations.py +37 -0
src/config.py +26 -0
src/generate.py +185 -0
src/ingest.py +62 -0
src/retrieve.py +199 -0

CLAUDE.md ADDED

	@@ -0,0 +1,53 @@

+# CLAUDE.md
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+## Commands
+```bash
+# Install dependencies
+pip install -r requirements.txt
+# Build retrieval indices (one-time, requires freecad-docs/ to be cloned)
+git clone --depth 1 https://github.com/FreeCAD/FreeCAD-documentation freecad-docs
+python build_index.py --repo freecad-docs
+# Run the Gradio app
+python app.py
+```
+The app requires `data/chunks.parquet`, `data/index.faiss`, and `data/bm25.pkl` to exist before it will serve requests. `indices_ready()` in `src/retrieve.py` checks for these.
+## Architecture
+The system is a two-phase pipeline: **offline indexing** (`build_index.py`) and **online serving** (`app.py`).
+### Offline: `build_index.py`
+Reads FreeCAD wiki markdown from `freecad-docs/wiki/`, passes pages through `src/ingest.py` → `src/chunk.py`, then builds two indices written to `data/`:
+- **BM25** (`bm25s`, `bm25.pkl`) — tokenised with a custom camelCase/snake_case tokeniser in `src/retrieve.py:_tokenize`
+- **Dense** (`FAISS IndexFlatIP`, `index.faiss`) — embeddings from `BAAI/bge-small-en-v1.5`
+### Online: `app.py` → `src/retrieve.py` → `src/generate.py`
+1. `HybridRetriever.retrieve(query)` runs BM25 + dense search, fuses with Reciprocal Rank Fusion (k=60), optionally reranks with `BAAI/bge-reranker-base` cross-encoder, returns top-N `Citation` objects.
+2. `generate_response()` formats citations into a numbered context block, prepends the system prompt (with two few-shot examples), and calls the OpenAI chat API.
+3. The response is split into a `python` code block and a prose explanation with inline `[N]` citation references.
+### Key files
+- `src/config.py` — all tuneable constants (chunk size, top-K values, model names, file paths). Change retrieval hyperparameters here.
+- `src/chunk.py` — header-split + code-block-preserving chunker. Fenced code blocks are replaced with UUID placeholders before splitting so they are never broken mid-block.
+- `src/retrieve.py` — all retrieval logic including lazy model singletons (`_load_*` functions) that are cached at module level for the Gradio process lifetime.
+- `src/generate.py` — system prompt, two few-shot examples (parametric box, revolve), and the OpenAI call. The few-shot examples are the authoritative reference for expected script style.
+- `src/citations.py` — `Citation` dataclass, context block formatter, and citation markdown renderer.
+- `src/ingest.py` — walks `freecad-docs/wiki/*.md`, skips Category/Template/MediaWiki pages, and flags ~25 high-priority scripting pages for front-sorting.
+## FreeCAD script generation constraints
+All generated scripts must:
+- Target **FreeCAD 1.1** (released March 25, 2026)
+- Never import `*Gui` modules — they crash headless (`freecadcmd`)
+- Use `body.newObject(...)` not `doc.addObject(...)` for PartDesign features
+- Call `doc.recompute()` after every feature
+- Add dress-up features (Fillet, Chamfer) only after all additive/subtractive features
+- Reference geometry by index to minimise Topological Naming Problem risk
+These rules are encoded in `_SYSTEM_PROMPT` in `src/generate.py` and must stay consistent with any few-shot examples added there.
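
The Reciprocal Rank Fusion step named in the architecture notes above (BM25 and dense rankings fused with k=60) can be sketched as follows. This is a minimal illustration, not the code in `src/retrieve.py`, which is not shown here; the function name `rrf_fuse` and the sample chunk IDs are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[int]], k: int = 60, top_n: int = 5) -> list[tuple[int, float]]:
    """Fuse ranked lists of chunk IDs with score(d) = sum of 1 / (k + rank)."""
    scores: dict[int, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the best hit in each list contributes 1 / (k + 1).
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n]


# Hypothetical chunk IDs returned by the two retrievers, best hit first.
bm25_hits = [12, 7, 33]
dense_hits = [7, 41, 12]
fused = rrf_fuse([bm25_hits, dense_hits])
for chunk_id, score in fused:
    print(chunk_id, round(score, 5))
```

A chunk that appears in both lists (7 and 12 here) outranks one that appears in only one, which is why the fused list is a reasonable input for the optional cross-encoder rerank.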

README.md CHANGED

@@ -1,16 +1,104 @@
 ---
-title: NLP Lab
-emoji: 💬
-colorFrom: yellow
-colorTo: purple
 sdk: gradio
 sdk_version: 6.5.1
 app_file: app.py
 pinned: false
-hf_oauth: true
-hf_oauth_scopes:
-- inference-api
 license: apache-2.0
 ---
-An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).

 ---
+title: FreeCAD RAG Assistant
+emoji: 🛠️
+colorFrom: blue
+colorTo: indigo
 sdk: gradio
 sdk_version: 6.5.1
+python_version: "3.11"
 app_file: app.py
 pinned: false
 license: apache-2.0
+short_description: Generate parametric FreeCAD Python from natural language via RAG
 ---
+# FreeCAD RAG Assistant
+A RAG (Retrieval-Augmented Generation) system that generates complete, runnable **FreeCAD 1.1 Python scripts** from natural-language descriptions of parts.
+## Architecture
+```
+Query
+  │
+  ├─► BM25 retrieval (bm25s)          ─┐
+  │                                    ├─► RRF fusion ─► Cross-encoder rerank ─► Top-5 chunks
+  └─► Dense retrieval (bge-small-en)  ─┘
+                                                                │
+                                                  OpenAI (gpt-4o-mini) + system prompt
+                                                                │
+                                              Generated Python + inline citations
+```
+**Corpus**: [FreeCAD/FreeCAD-documentation](https://github.com/FreeCAD/FreeCAD-documentation) (CC0 1.0) — ~1,500 English wiki pages covering PartDesign, Sketcher, Python scripting API, and release notes.
+## Setup
+### 1. Install dependencies
+```bash
+pip install -r requirements.txt
+```
+### 2. Build the retrieval index (one-time, run locally)
+```bash
+# Clone the FreeCAD documentation repo
+git clone --depth 1 https://github.com/FreeCAD/FreeCAD-documentation freecad-docs
+# Build BM25 + FAISS indices (outputs to data/)
+python build_index.py --repo freecad-docs
+```
+This produces `data/chunks.parquet`, `data/index.faiss`, and `data/bm25.pkl`. Commit these to the repo before pushing to Hugging Face Spaces.
+### 3. Run
+```bash
+python app.py
+```
+Enter your OpenAI API key in the UI (it is never stored or logged).
+## Retrieval modes
+| Toggle | Method | Wins on |
+|--------|--------|---------|
+| BM25 | `bm25s` with camelCase/snake_case tokenisation | Exact API tokens: `addConstraint`, `Coincident`, `PartDesign::Pad` |
+| Dense | `BAAI/bge-small-en-v1.5` + FAISS IndexFlatIP | Paraphrased intent: "round the edges" → Fillet |
+| Rerank | `BAAI/bge-reranker-base` cross-encoder | Precision: re-scores top-30 fused candidates |
+| Hybrid (default) | Reciprocal Rank Fusion (k=60) | Best overall recall |
+## Project structure
+```
+├── app.py               # Gradio Blocks UI
+├── build_index.py       # One-off corpus ingestion + indexing
+├── requirements.txt
+├── src/
+│   ├── config.py        # All tuneable constants
+│   ├── ingest.py        # Markdown page loader
+│   ├── chunk.py         # Header-split + code-block-preserving chunker
+│   ├── retrieve.py      # BM25Retriever, DenseRetriever, RRF, HybridRetriever
+│   ├── generate.py      # System prompt, few-shots, OpenAI call
+│   └── citations.py     # Citation dataclass + rendering
+└── data/                # Pre-built indices (commit via git-LFS if > 100 MB)
+    ├── chunks.parquet
+    ├── index.faiss
+    └── bm25.pkl
+```
+## FreeCAD-specific notes
+- All generated scripts target **FreeCAD 1.1** (released March 25, 2026).
+- Scripts are safe to run with `freecadcmd` (headless) — `*Gui` modules are never imported.
+- The system prompt explicitly warns about the **Topological Naming Problem**: geometry is referenced by index where possible, and dress-up features (Fillet, Chamfer) are always added after all additive/subtractive features.
+- `doc.recompute()` is called after every feature to avoid silent failures.
+## Evaluation queries
+See section 12 of the technical report for the 12-query test set covering: parametric box, flange with bolt pattern, hex nut, L-bracket, threaded shaft, spreadsheet-driven gear, revolution, coincident constraint question, TNP question, linear pattern, helix sweep, and multi-loop sketch.
+## License
+Source code: Apache 2.0. Documentation corpus: [CC0 1.0](https://creativecommons.org/publicdomain/zero/1.0/) (FreeCAD Wiki). Attribution to FreeCAD Wiki (CC-BY 3.0) shown in the UI.
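
The README's retrieval table credits BM25 with matching exact API tokens thanks to camelCase/snake_case tokenisation. The real `_tokenize` lives in `src/retrieve.py`, which is not shown here, so the following is only a sketch of how such a tokeniser might work; `split_identifiers` and both regular expressions are assumptions.

```python
import re

_WORD_RE = re.compile(r"[A-Za-z0-9_]+")
# Splits "addConstraint" -> add, Constraint and "IndexFlatIP" -> Index, Flat, IP.
_CAMEL_RE = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_identifiers(text: str) -> list[str]:
    """Emit each identifier whole (lower-cased) plus its camelCase/snake_case parts."""
    tokens: list[str] = []
    for word in _WORD_RE.findall(text):
        tokens.append(word.lower())
        parts = [
            part.lower()
            for piece in word.split("_") if piece
            for part in _CAMEL_RE.findall(piece)
        ]
        # Only add sub-tokens when the identifier actually splits.
        if len(parts) > 1:
            tokens.extend(parts)
    return tokens


print(split_identifiers("sketch.addConstraint"))
print(split_identifiers("PartDesign::Pad"))
```

Keeping the whole identifier alongside its parts lets a query for `addConstraint` match exactly, while a query that only mentions "constraint" still scores against the same chunk.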

app.py CHANGED

@@ -1,68 +1,126 @@
 import gradio as gr
-from huggingface_hub import InferenceClient
-def respond(
-    message,
-    history: list[dict[str, str]],
-    system_message,
-    max_tokens,
-    temperature,
-    top_p,
-    hf_token: gr.OAuthToken,
-):
-    """
-    For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
-    """
-    client = InferenceClient(token=hf_token.token, model="openai/gpt-oss-20b")
-    messages = [{"role": "system", "content": system_message}]
-    messages.extend(history)
-    messages.append({"role": "user", "content": message})
-    response = ""
-    for message in client.chat_completion(
-        messages,
-        max_tokens=max_tokens,
-        stream=True,
-        temperature=temperature,
-        top_p=top_p,
-    ):
-        choices = message.choices
-        token = ""
-        if len(choices) and choices[0].delta.content:
-            token = choices[0].delta.content
-        response += token
-        yield response
-"""
-For information on how to customize the ChatInterface, peruse the gradio docs: https://www.gradio.app/docs/chatinterface
-"""
-chatbot = gr.ChatInterface(
-    respond,
-    additional_inputs=[
-        gr.Textbox(value="You are a friendly Chatbot.", label="System message"),
-        gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens"),
-        gr.Slider(minimum=0.1, maximum=4.0, value=0.7, step=0.1, label="Temperature"),
-        gr.Slider(
-            minimum=0.1,
-            maximum=1.0,
-            value=0.95,
-            step=0.05,
-            label="Top-p (nucleus sampling)",
-        ),
-    ],
 )
-with gr.Blocks() as demo:
-    with gr.Sidebar():
-        gr.LoginButton()
-    chatbot.render()
 if __name__ == "__main__":

+"""FreeCAD RAG Assistant — Gradio Blocks UI."""
 import gradio as gr
+from src.generate import generate_response
+from src.retrieve import HybridRetriever, indices_ready
+from src.config import HF_MODELS, DEFAULT_MODEL
+# ── example queries ───────────────────────────────────────────────────────────
+_EXAMPLES = [
+    ["Create a parametric box width=50 height=30 depth=20 with 5mm fillets on all vertical edges"],
+    ["Make a flange OD=80mm ID=40mm thickness=10mm with 4 M6 bolt holes on a 60mm PCD using PolarPattern"],
+    ["Create a hex nut for M10 thread conforming to ISO 4032 dimensions"],
+    ["Generate an L-bracket 60x40x4mm with two 6mm countersunk holes"],
+    ["Create a 20mm-diameter shaft with M20 thread for 30mm at one end"],
+    ["Make a parametric gear blank where the number of teeth is driven by a Spreadsheet cell"],
+    ["Create a wine-glass shape by revolving a profile around the Z axis"],
+    ["How do I add a coincident constraint between two endpoints in a Sketcher script?"],
+    ["What is the topological naming problem and how should I avoid it in generated scripts?"],
+    ["Linear pattern of 5 pockets along X with 15mm spacing"],
+    ["Sweep a circle along a helical path to make a spring"],
+    ["Create a Pad with a sketch containing an interior circular hole (multi-loop sketch)"],
+]
+_INDEX_WARNING = (
+    "> **Index not found.** Run `python build_index.py --repo <path-to-freecad-docs>` "
+    "to build the retrieval index before using this app."
 )
+# ── generation handler ────────────────────────────────────────────────────────
+def run(
+    prompt: str,
+    use_bm25: bool,
+    use_dense: bool,
+    use_rerank: bool,
+    top_n: int,
+    model: str,
+):
+    if not prompt.strip():
+        return "", "Please enter a request.", []
+    if not indices_ready():
+        return "", _INDEX_WARNING, []
+    retriever  = HybridRetriever(
+        use_bm25=use_bm25,
+        use_dense=use_dense,
+        use_rerank=use_rerank,
+        top_n=top_n,
+    )
+    try:
+        citations = retriever.retrieve(prompt)
+    except Exception as exc:  # noqa: BLE001
+        return "", f"Retrieval error: {exc}", []
+    code, explain, err = generate_response(
+        query=prompt,
+        citations=citations,
+        model=model,
+    )
+    if err:
+        return "", f"**Error:** {err}", []
+    chunk_rows = [
+        [c.id, c.page_title, c.section, c.source_url, f"{c.score:.4f}"]
+        for c in citations
+    ]
+    return code, explain, chunk_rows
+# ── UI ────────────────────────────────────────────────────────────────────────
+with gr.Blocks(title="FreeCAD RAG Assistant", analytics_enabled=False) as demo:
+    gr.Markdown(
+        "# FreeCAD Python Code Generator\n"
+        "Describe a parametric part and get a complete, runnable FreeCAD 1.1 Python script "
+        "retrieved from the official FreeCAD wiki documentation.\n\n"
+        "> Source: [FreeCAD Wiki](https://wiki.freecad.org), CC-BY 3.0"
+    )
+    with gr.Row():
+        # ── left column: inputs ───────────────────────────────────────────────
+        with gr.Column(scale=2):
+            prompt = gr.Textbox(
+                label="Describe the part or ask a scripting question",
+                lines=4,
+                placeholder="Create a parametric flange with 4 M6 bolt holes on a 60mm PCD…",
+            )
+            with gr.Accordion("Retrieval settings", open=False):
+                use_bm25   = gr.Checkbox(value=True,  label="Enable BM25 (keyword retrieval)")
+                use_dense  = gr.Checkbox(value=True,  label="Enable dense retrieval (semantic)")
+                use_rerank = gr.Checkbox(value=True,  label="Enable cross-encoder reranking")
+                top_n      = gr.Slider(minimum=3, maximum=10, value=5, step=1,
+                                       label="Final chunks passed to LLM (top-N)")
+                model      = gr.Dropdown(
+                    choices=HF_MODELS, value=DEFAULT_MODEL, label="HuggingFace model"
+                )
+            run_btn = gr.Button("Generate", variant="primary")
+            gr.Examples(examples=_EXAMPLES, inputs=[prompt], label="Example queries", cache_examples=False)
+        # ── right column: outputs ─────────────────────────────────────────────
+        with gr.Column(scale=3):
+            code_out    = gr.Code(label="Generated FreeCAD Python", language="python")
+            explain_out = gr.Markdown(label="Explanation & citations")
+            with gr.Accordion("Retrieved chunks", open=False):
+                chunks_out = gr.Dataframe(
+                    headers=["#", "Page", "Section", "URL", "Score"],
+                    wrap=True,
+                    label="Top retrieved chunks",
+                )
+    run_btn.click(
+        fn=run,
+        inputs=[prompt, use_bm25, use_dense, use_rerank, top_n, model],
+        outputs=[code_out, explain_out, chunks_out],
+    )
 if __name__ == "__main__":

build_index.py ADDED

	@@ -0,0 +1,87 @@

+"""
+One-off script: clone the FreeCAD docs repo, chunk, embed, and build indices.
+Usage:
+    git clone --depth 1 https://github.com/FreeCAD/FreeCAD-documentation freecad-docs
+    python build_index.py --repo freecad-docs
+Outputs written to data/:
+    chunks.parquet   — all chunk metadata + text
+    index.faiss      — FAISS IndexFlatIP of bge-small-en-v1.5 embeddings
+    bm25.pkl         — serialised bm25s index
+"""
+import argparse
+import os
+import pickle
+import bm25s
+import faiss
+import numpy as np
+import pandas as pd
+from sentence_transformers import SentenceTransformer
+from tqdm import tqdm
+from src.chunk import chunk_pages
+from src.config import BM25_FILE, CHUNKS_FILE, EMBED_MODEL, FAISS_FILE
+from src.ingest import load_freecad_docs
+def _embed_batched(model: SentenceTransformer, texts: list[str], batch_size: int = 64) -> np.ndarray:
+    all_vecs = []
+    for i in tqdm(range(0, len(texts), batch_size), desc="Embedding"):
+        batch = texts[i : i + batch_size]
+        vecs = model.encode(batch, normalize_embeddings=True, show_progress_bar=False)
+        all_vecs.append(vecs)
+    return np.vstack(all_vecs).astype("float32")
+def build(repo_root: str, data_dir: str = "data") -> None:
+    os.makedirs(data_dir, exist_ok=True)
+    print("Loading FreeCAD docs...")
+    pages  = load_freecad_docs(repo_root)
+    print(f"  {len(pages)} pages loaded")
+    print("Chunking...")
+    chunks = chunk_pages(pages)
+    print(f"  {len(chunks)} chunks produced")
+    df = pd.DataFrame(chunks).set_index("chunk_id")
+    df.to_parquet(CHUNKS_FILE)
+    print(f"  Saved {CHUNKS_FILE}")
+    texts = df["text"].tolist()
+    # ── BM25 index ────────────────────────────────────────────────────────────
+    print("Building BM25 index...")
+    from src.retrieve import _tokenize  # noqa: PLC0415
+    tokenized = bm25s.tokenize([" ".join(_tokenize(t)) for t in texts])
+    bm25_index = bm25s.BM25(method="bm25+")
+    bm25_index.index(tokenized)
+    with open(BM25_FILE, "wb") as f:
+        pickle.dump(bm25_index, f)
+    print(f"  Saved {BM25_FILE}")
+    # ── Dense index ───────────────────────────────────────────────────────────
+    print(f"Loading embedding model: {EMBED_MODEL}")
+    model = SentenceTransformer(EMBED_MODEL)
+    print("Embedding chunks (this may take a few minutes on CPU)...")
+    vecs = _embed_batched(model, texts)
+    dim   = vecs.shape[1]
+    index = faiss.IndexFlatIP(dim)
+    index.add(vecs)
+    faiss.write_index(index, FAISS_FILE)
+    print(f"  Saved {FAISS_FILE}  ({index.ntotal} vectors, dim={dim})")
+    print("\nDone. Commit the data/ directory to your Spaces repo.")
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--repo", default="freecad-docs",
+                        help="Path to the cloned FreeCAD-documentation repository")
+    parser.add_argument("--data-dir", default="data")
+    args = parser.parse_args()
+    build(args.repo, args.data_dir)
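
`build_index.py` encodes chunks with `normalize_embeddings=True` and stores them in a FAISS `IndexFlatIP`, an inner-product index. On unit-length vectors the inner product equals cosine similarity, which is presumably why this pairing was chosen. The pure-Python sketch below checks that identity on toy vectors; it uses neither FAISS nor the embedding model, and the helper names are assumptions.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (what normalize_embeddings=True does per row)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def inner_product(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


# Toy 3-dimensional "embeddings"; real bge-small-en-v1.5 vectors are much longer.
query = [1.0, 2.0, 2.0]
chunk = [2.0, 0.0, 1.0]

ip_on_normalized = inner_product(l2_normalize(query), l2_normalize(chunk))
print(ip_on_normalized, cosine(query, chunk))
```

The practical consequence is that query vectors need the same normalisation at search time, otherwise the scores returned by the index are no longer cosine similarities.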

freecad-docs ADDED

	@@ -0,0 +1 @@

+Subproject commit 0499378a238ce4c77c643b9cc4a03d0947381e45

requirements.txt ADDED

	@@ -0,0 +1,13 @@

+gradio>=6.5.1
+openai>=1.50.0
+sentence-transformers>=3.0.0
+bm25s>=0.2.6
+faiss-cpu>=1.8.0
+langchain-text-splitters>=0.3.0
+markdown-it-py>=3.0.0
+huggingface_hub>=0.25.0
+numpy>=2.0
+tiktoken>=0.7.0
+pyarrow>=15.0
+pandas>=2.0.0
+tqdm>=4.66.0

src/__init__.py ADDED

File without changes

src/__pycache__/__init__.cpython-313.pyc ADDED

Binary file (151 Bytes)

src/__pycache__/chunk.cpython-313.pyc ADDED

Binary file (5.46 kB)

src/__pycache__/citations.cpython-313.pyc ADDED

Binary file (2.49 kB)

src/__pycache__/config.cpython-313.pyc ADDED

Binary file (806 Bytes)

src/__pycache__/generate.cpython-313.pyc ADDED

Binary file (7.96 kB)

src/__pycache__/ingest.cpython-313.pyc ADDED

Binary file (3.34 kB)

src/__pycache__/retrieve.cpython-313.pyc ADDED

Binary file (10.4 kB)

src/chunk.py ADDED

	@@ -0,0 +1,118 @@

+"""Markdown-aware, code-block-preserving chunker for FreeCAD wiki pages."""
+import re
+import uuid
+from typing import Any
+import tiktoken
+from langchain_text_splitters import MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
+from src.config import CHUNK_SIZE, CHUNK_OVERLAP
+_enc = tiktoken.get_encoding("cl100k_base")
+# Matches fenced code blocks (``` or ~~~, with optional language tag)
+_FENCE_RE = re.compile(r"(```[\w]*\n.*?```|~~~[\w]*\n.*?~~~)", re.DOTALL)
+_HEADERS_TO_SPLIT = [("#", "h1"), ("##", "h2"), ("###", "h3")]
+_SPLITTER = RecursiveCharacterTextSplitter(
+    separators=["\n\n", "\n", ". ", " ", ""],
+    chunk_size=CHUNK_SIZE * 4,  # chars; ~4 chars per token
+    chunk_overlap=CHUNK_OVERLAP * 4,
+    length_function=len,
+)
+def _count_tokens(text: str) -> int:
+    return len(_enc.encode(text))
+def _protect_code_blocks(text: str) -> tuple[str, dict[str, str]]:
+    """Replace fenced code blocks with stable placeholders. Returns modified text + map."""
+    placeholders: dict[str, str] = {}
+    def replace(m: re.Match) -> str:
+        key = f"__CODEBLOCK_{uuid.uuid4().hex}__"
+        placeholders[key] = m.group(0)
+        return key
+    return _FENCE_RE.sub(replace, text), placeholders
+def _restore_code_blocks(text: str, placeholders: dict[str, str]) -> str:
+    for key, code in placeholders.items():
+        text = text.replace(key, code)
+    return text
+def _classify(text: str) -> str:
+    has_code = bool(_FENCE_RE.search(text)) or bool(re.search(r"^\s{4}", text, re.MULTILINE))
+    has_prose = bool(re.search(r"[a-zA-Z]{20,}", text))
+    if has_code and has_prose:
+        return "mixed"
+    if has_code:
+        return "code"
+    return "text"
+def chunk_page(page: dict[str, Any]) -> list[dict[str, Any]]:
+    """
+    Split one wiki page dict into a list of chunk dicts ready for embedding.
+    page keys: source_file, page_title, source_url, raw_text, priority
+    """
+    raw = page["raw_text"]
+    protected, placeholders = _protect_code_blocks(raw)
+    # Structural split on headers
+    header_splitter = MarkdownHeaderTextSplitter(
+        headers_to_split_on=_HEADERS_TO_SPLIT, strip_headers=False
+    )
+    sections = header_splitter.split_text(protected)
+    chunks: list[dict[str, Any]] = []
+    for sec in sections:
+        content = sec.page_content
+        meta = sec.metadata  # {"h1": ..., "h2": ..., "h3": ...}
+        # Restore code blocks before deciding whether to split further
+        restored = _restore_code_blocks(content, placeholders)
+        tok_len = _count_tokens(restored)
+        if tok_len <= CHUNK_SIZE:
+            candidates = [restored]
+        else:
+            # Re-protect code blocks for the recursive splitter
+            protected2, ph2 = _protect_code_blocks(restored)
+            raw_splits = _SPLITTER.split_text(protected2)
+            candidates = [_restore_code_blocks(s, ph2) for s in raw_splits]
+        section_label = meta.get("h3") or meta.get("h2") or meta.get("h1") or ""
+        for text in candidates:
+            text = text.strip()
+            if not text or _count_tokens(text) < 30:
+                continue
+            # Build preamble for BM25/embedding quality
+            preamble = f"[Page: {page['page_title']} | Section: {section_label}]\n" if section_label else f"[Page: {page['page_title']}]\n"
+            full_text = preamble + text
+            chunks.append({
+                "source_file": page["source_file"],
+                "source_url": page["source_url"],
+                "page_title": page["page_title"],
+                "section": section_label,
+                "type": _classify(text),
+                "text": full_text,
+                "token_len": _count_tokens(full_text),
+                "char_len": len(full_text),
+            })
+    return chunks
+def chunk_pages(pages: list[dict]) -> list[dict]:
+    all_chunks: list[dict] = []
+    for page in pages:
+        all_chunks.extend(chunk_page(page))
+    for i, c in enumerate(all_chunks):
+        c["chunk_id"] = i
+    return all_chunks
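
The placeholder trick in `src/chunk.py` can be seen in isolation with a small round trip. The sketch below mirrors `_protect_code_blocks` and `_restore_code_blocks` but is a standalone illustration: the `{3}` quantifiers are just an equivalent way of writing the fence pattern, and the naive blank-line split is an assumed stand-in for the real splitters.

```python
import re
import uuid

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
# Same idea as _FENCE_RE in src/chunk.py: backtick or tilde fenced blocks.
FENCE_RE = re.compile(r"(`{3}[\w]*\n.*?`{3}|~{3}[\w]*\n.*?~{3})", re.DOTALL)


def protect(text: str) -> tuple[str, dict[str, str]]:
    """Swap each fenced block for a unique placeholder token."""
    placeholders: dict[str, str] = {}

    def _swap(match: re.Match) -> str:
        key = f"__CODEBLOCK_{uuid.uuid4().hex}__"
        placeholders[key] = match.group(0)
        return key

    return FENCE_RE.sub(_swap, text), placeholders


def restore(text: str, placeholders: dict[str, str]) -> str:
    for key, code in placeholders.items():
        text = text.replace(key, code)
    return text


page = (
    "Intro text.\n\n"
    + FENCE + "python\nimport FreeCAD\n\ndoc = FreeCAD.newDocument()\n" + FENCE
    + "\n\nClosing text."
)

protected, mapping = protect(page)
pieces = protected.split("\n\n")  # naive stand-in for the real splitter
restored = [restore(piece, mapping) for piece in pieces]
print(len(page.split("\n\n")), len(pieces))
```

Splitting the unprotected page on blank lines yields four pieces and cuts the code block in two; splitting the protected text yields three, and the block comes back whole after restoring.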

src/citations.py ADDED

	@@ -0,0 +1,37 @@

+from dataclasses import dataclass, field
+from typing import Optional
+import re
+@dataclass
+class Citation:
+    id: int
+    chunk_id: int
+    source_url: str
+    page_title: str
+    section: str
+    snippet: str
+    score: float = 0.0
+def build_context_block(citations: list[Citation]) -> str:
+    """Format citations as numbered context for the LLM prompt."""
+    parts = []
+    for c in citations:
+        header = f"[{c.id}] (Page: {c.page_title} | Section: {c.section} | URL: {c.source_url})"
+        parts.append(f"{header}\n{c.snippet}")
+    return "\n\n---\n\n".join(parts)
+def extract_inline_refs(text: str) -> set[int]:
+    """Return the set of citation IDs referenced inline, e.g. [1], [2]."""
+    return {int(m) for m in re.findall(r"\[(\d+)\]", text)}
+def render_citation_markdown(citations: list[Citation], used_ids: Optional[set[int]] = None) -> str:
+    lines = ["### Sources"]
+    for c in citations:
+        if used_ids is not None and c.id not in used_ids:
+            continue
+        lines.append(f"{c.id}. [{c.page_title} — {c.section}]({c.source_url})")
+    return "\n".join(lines)
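
One behaviour of `extract_inline_refs` is worth seeing concretely: the pattern `\[(\d+)\]` also matches ordinary list indexing, so a generated line such as `body.Origin.OriginFeatures[3]` would count as a reference to source 3 when the whole response is scanned. The sketch below reproduces the function and shows one possible, deliberately naive refinement; `refs_in_prose_and_comments` is a hypothetical helper, not part of the commit.

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def extract_inline_refs(text: str) -> set[int]:
    """Same pattern as src/citations.py: every bracketed integer counts."""
    return {int(m) for m in re.findall(r"\[(\d+)\]", text)}


def refs_in_prose_and_comments(text: str) -> set[int]:
    """Hypothetical variant: inside fenced code, only look at '#' comments."""
    refs: set[int] = set()
    in_code = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_code = not in_code
            continue
        if in_code:
            # Naive: a '#' inside a string literal would be misread as a comment.
            _, _, line = line.partition("#")
        refs |= extract_inline_refs(line)
    return refs


response = "\n".join([
    FENCE + "python",
    'sketch.Support = (body.Origin.OriginFeatures[3], [""])  # XY_Plane [1]',
    FENCE,
    "The sketch is attached to the XY plane [1].",
])

print(extract_inline_refs(response))
print(refs_in_prose_and_comments(response))
```

Because `render_citation_markdown` only lists IDs that exist among the retrieved citations, the effect of a false positive is limited to a source being listed as used when the text never cited it.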

src/config.py ADDED

	@@ -0,0 +1,26 @@

+EMBED_MODEL   = "BAAI/bge-small-en-v1.5"
+RERANK_MODEL  = "BAAI/bge-reranker-base"
+CHUNK_SIZE    = 700   # tokens
+CHUNK_OVERLAP = 120
+TOP_K_BM25    = 20
+TOP_K_DENSE   = 20
+TOP_K_FUSED   = 30
+TOP_N_FINAL   = 5
+RRF_K         = 60
+DEFAULT_MODEL = "Qwen/Qwen2.5-Coder-32B-Instruct"
+HF_MODELS = [
+    "Qwen/Qwen2.5-Coder-32B-Instruct",
+    "meta-llama/Llama-3.3-70B-Instruct",
+    "meta-llama/Llama-3.1-8B-Instruct",
+    "mistralai/Mistral-7B-Instruct-v0.3",
+]
+DATA_DIR      = "data"
+CHUNKS_FILE   = "data/chunks.parquet"
+FAISS_FILE    = "data/index.faiss"
+BM25_FILE     = "data/bm25.pkl"
+WIKI_BASE_URL = "https://wiki.freecad.org"

src/generate.py ADDED

	@@ -0,0 +1,185 @@

+"""LLM generation: build prompt, call HuggingFace Inference API, return code + explanation."""
+from __future__ import annotations
+from huggingface_hub import InferenceClient
+from src.citations import Citation, build_context_block, extract_inline_refs, render_citation_markdown
+from src.config import DEFAULT_MODEL
+# ── few-shot examples ─────────────────────────────────────────────────────────
+_FEW_SHOT_BOX = '''
+### Example 1 — Parametric box with fillets
+User: Create a parametric box width=50, height=30, depth=20 with 5mm fillets on vertical edges.
+```python
+import FreeCAD as App
+import Part, Sketcher
+doc = App.newDocument("ParametricBox")
+body   = doc.addObject("PartDesign::Body", "Body")
+sketch = body.newObject("Sketcher::SketchObject", "RectSketch")
+sketch.Support = (body.Origin.OriginFeatures[3], [""])  # XY_Plane
+sketch.MapMode = "FlatFace"
+width, depth, height, fillet_r = 50.0, 20.0, 30.0, 5.0
+sketch.addGeometry(Part.LineSegment(App.Vector(0,0,0),     App.Vector(width,0,0)),    False)
+sketch.addGeometry(Part.LineSegment(App.Vector(width,0,0), App.Vector(width,depth,0)), False)
+sketch.addGeometry(Part.LineSegment(App.Vector(width,depth,0), App.Vector(0,depth,0)), False)
+sketch.addGeometry(Part.LineSegment(App.Vector(0,depth,0), App.Vector(0,0,0)),         False)
+sketch.addConstraint(Sketcher.Constraint("Coincident", 0,2, 1,1))
+sketch.addConstraint(Sketcher.Constraint("Coincident", 1,2, 2,1))
+sketch.addConstraint(Sketcher.Constraint("Coincident", 2,2, 3,1))
+sketch.addConstraint(Sketcher.Constraint("Coincident", 3,2, 0,1))
+sketch.addConstraint(Sketcher.Constraint("Horizontal", 0))
+sketch.addConstraint(Sketcher.Constraint("Horizontal", 2))
+sketch.addConstraint(Sketcher.Constraint("Vertical",   1))
+sketch.addConstraint(Sketcher.Constraint("Vertical",   3))
+sketch.addConstraint(Sketcher.Constraint("DistanceX", 0, 1, 0, 2, width))
+sketch.addConstraint(Sketcher.Constraint("DistanceY", 1, 1, 1, 2, depth))
+doc.recompute()
+pad = body.newObject("PartDesign::Pad", "Pad")
+pad.Profile = sketch
+pad.Length  = height
+doc.recompute()
+fillet = body.newObject("PartDesign::Fillet", "Fillet")
+fillet.Base   = (pad, ["Edge1","Edge2","Edge3","Edge4"])
+fillet.Radius = fillet_r
+doc.recompute()
+doc.saveAs("output.FCStd")
+```
+'''.strip()
+_FEW_SHOT_REVOLVE = '''
+### Example 2 — Revolved profile (cylinder / shaft)
+User: Create a 20mm-diameter, 60mm-long cylindrical shaft using Revolution.
+```python
+import FreeCAD as App
+import Part, Sketcher
+doc  = App.newDocument("Shaft")
+body = doc.addObject("PartDesign::Body", "Body")
+sketch = body.newObject("Sketcher::SketchObject", "Profile")
+sketch.Support = (body.Origin.OriginFeatures[4], [""])  # XZ_Plane
+sketch.MapMode = "FlatFace"
+radius, length = 10.0, 60.0
+sketch.addGeometry(Part.LineSegment(App.Vector(0,0,0),      App.Vector(radius,0,0)),      False)
+sketch.addGeometry(Part.LineSegment(App.Vector(radius,0,0), App.Vector(radius,length,0)), False)
+sketch.addGeometry(Part.LineSegment(App.Vector(radius,length,0), App.Vector(0,length,0)), False)
+sketch.addGeometry(Part.LineSegment(App.Vector(0,length,0), App.Vector(0,0,0)),           False)
+sketch.addConstraint(Sketcher.Constraint("Coincident", 0,2, 1,1))
+sketch.addConstraint(Sketcher.Constraint("Coincident", 1,2, 2,1))
+sketch.addConstraint(Sketcher.Constraint("Coincident", 2,2, 3,1))
+sketch.addConstraint(Sketcher.Constraint("Coincident", 3,2, 0,1))
+sketch.addConstraint(Sketcher.Constraint("DistanceX", 0, 1, 0, 2, radius))
+sketch.addConstraint(Sketcher.Constraint("DistanceY", 1, 1, 1, 2, length))
+sketch.addConstraint(Sketcher.Constraint("PointOnObject", 0, 1, -1))  # start point of line 0 on the sketch X axis (GeoId -1)
+doc.recompute()
+rev = body.newObject("PartDesign::Revolution", "Revolution")
+rev.Profile     = sketch
+rev.ReferenceAxis = (sketch, ["V_Axis"])
+rev.Angle       = 360.0
+doc.recompute()
+doc.saveAs("output.FCStd")
+```
+'''.strip()
+# ── system prompt ─────────────────────────────────────────────────────────────
+_SYSTEM_PROMPT = f"""You are an expert FreeCAD 1.1 Python scripting assistant specialised in \
+parametric solid modelling with the PartDesign and Sketcher workbenches.
+OUTPUT CONTRACT (strict):
+1. Return ONE complete, self-contained Python script enclosed in a single ```python ... ``` block, \
+runnable with `freecadcmd script.py`.
+2. The script MUST:
+   - import FreeCAD as App, Part, Sketcher (never import PartDesignGui / FreeCADGui / SketcherGui — they crash headless)
+   - call App.newDocument(...)
+   - create a PartDesign::Body BEFORE any Sketch
+   - attach every Sketch to a standard plane from body.Origin.OriginFeatures (index 3=XY, 4=XZ, 5=YZ)
+   - call doc.recompute() after EVERY feature creation
+   - end with doc.saveAs("output.FCStd")
+3. Use named variables for every dimension so the model is parametric.
+4. Reference geometry by INDEX where possible (e.g. Sketcher.Constraint("Coincident", 0, 2, 1, 1)), \
+NOT by topological name strings like "Face1" or "Edge3", to minimise Topological Naming Problem risk \
+(mitigated but NOT eliminated in FreeCAD 1.0/1.1).
+5. Add all dress-up features (Fillet, Chamfer, etc.) AFTER all additive/subtractive features.
+6. After the code block, write one short paragraph explaining the key design decisions.
+7. Cite the retrieved sources inline as [1], [2], etc. in comments and in the explanation.
+8. End with a numbered citation list: `1. <Page Title> — <URL>`
+KNOWN PITFALLS (never repeat these errors):
+- Missing doc.recompute() → silent failure
+- Mixing App.ActiveDocument and the captured doc variable
+- Creating PartDesign features via doc.addObject instead of body.newObject
+- Importing *Gui modules in headless scripts
+{_FEW_SHOT_BOX}
+{_FEW_SHOT_REVOLVE}
+""".strip()
+# ── main entry point ──────────────────────────────────────────────────────────
+def generate_response(
+    query: str,
+    citations: list[Citation],
+    model: str = DEFAULT_MODEL,
+) -> tuple[str, str, str]:
+    """
+    Returns (code_block, explanation_md, error_msg).
+    code_block: the raw python code (no fences).
+    explanation_md: explanation + citations markdown.
+    error_msg: non-empty string on failure.
+    """
+    if not citations:
+        return "", "", "No relevant documentation chunks were retrieved. Try broadening the query."
+    client = InferenceClient()
+    context = build_context_block(citations)
+    user_msg = f"RETRIEVED CONTEXT:\n{context}\n\nUSER REQUEST:\n{query}"
+    try:
+        resp = client.chat.completions.create(
+            model=model,
+            messages=[
+                {"role": "system", "content": _SYSTEM_PROMPT},
+                {"role": "user",   "content": user_msg},
+            ],
+            temperature=0.2,
+            max_tokens=2500,
+        )
+    except Exception as exc:  # noqa: BLE001
+        return "", "", f"HuggingFace API error: {exc}"
+    full_text = resp.choices[0].message.content or ""
+    import re  # noqa: PLC0415  (used below to split the code block from the rest of the response)
+    code_match = re.search(r"```python\n(.*?)```", full_text, re.DOTALL)
+    if code_match:
+        code = code_match.group(1).rstrip()
+        after_code = full_text[code_match.end():].strip()
+    else:
+        code = full_text  # no fenced block found: treat the whole response as code
+        after_code = ""
+    used_ids = extract_inline_refs(full_text)
+    cite_md  = render_citation_markdown(citations, used_ids or None)
+    explain  = (after_code + "\n\n" + cite_md).strip()
+    return code, explain, ""
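
Note on the response-splitting step above: a minimal standalone sketch of the same idea, useful for testing the parsing without calling the inference API. The helper name `split_code_and_explanation` and the sample text are hypothetical and not part of this commit. The fence marker is built from single backticks only so the example can sit inside a fenced block.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly so this example nests cleanly

# Same pattern as in generate_response: a python-tagged fence, lazily matched.
_CODE_FENCE_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def split_code_and_explanation(full_text: str) -> tuple[str, str]:
    """Return (code, text_after_code); fall back to (full_text, "") if no fence."""
    match = _CODE_FENCE_RE.search(full_text)
    if match is None:
        return full_text, ""
    return match.group(1).rstrip(), full_text[match.end():].strip()


# Hypothetical model output, for illustration only.
sample = f"Intro\n{FENCE}python\nx = 1\n{FENCE}\nUses a variable [1]."
code, rest = split_code_and_explanation(sample)
print(repr(code), repr(rest))
```

Only the first python-tagged fence is extracted; anything before it is dropped and anything after it is treated as explanation, which matches the behaviour of the function above.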

src/ingest.py ADDED Viewed

	@@ -0,0 +1,62 @@

+"""Parse the FreeCAD-documentation repo into a list of page dicts."""
+import os
+import re
+from pathlib import Path
+from typing import Iterator
+from src.config import WIKI_BASE_URL
+_SKIP_PATTERNS = re.compile(
+    r"(Category:|File:|Template:|Special:|MediaWiki:|User:|Talk:|^index$)",  # matched against the stem, which has no ".md" suffix
+    re.IGNORECASE,
+)
+_PRIORITY_PAGES = {
+    "Python_scripting_tutorial", "FreeCAD_Scripting_Basics", "Scripting_and_macros",
+    "Part_scripting", "Sketcher_scripting", "PartDesign_scripting",
+    "Topological_naming_problem", "Scripted_objects", "Scripted_objects_migration",
+    "PartDesign_Pad", "PartDesign_Pocket", "PartDesign_Revolution", "PartDesign_Body",
+    "PartDesign_Fillet", "PartDesign_Chamfer", "PartDesign_Hole", "PartDesign_Boolean",
+    "PartDesign_AdditiveLoft", "PartDesign_AdditivePipe", "PartDesign_Workbench",
+    "Sketcher_Workbench", "Release_notes_1.0", "Release_notes_1.1",
+    "Basic_Part_Design_Tutorial_019", "Creating_a_simple_part_with_PartDesign",
+    "Spreadsheet_Workbench",
+}
+def _page_title(stem: str) -> str:
+    return stem.replace("_", " ")
+def _source_url(stem: str) -> str:
+    return f"{WIKI_BASE_URL}/{stem}"
+def iter_pages(wiki_dir: str | Path) -> Iterator[dict]:
+    wiki_path = Path(wiki_dir)
+    for md_file in sorted(wiki_path.glob("*.md")):
+        stem = md_file.stem
+        if _SKIP_PATTERNS.search(stem):
+            continue
+        raw = md_file.read_text(encoding="utf-8", errors="replace")
+        if len(raw.strip()) < 200:
+            continue
+        yield {
+            "source_file": str(md_file),
+            "page_title": _page_title(stem),
+            "source_url": _source_url(stem),
+            "raw_text": raw,
+            "priority": stem in _PRIORITY_PAGES,
+        }
+def load_freecad_docs(repo_root: str) -> list[dict]:
+    wiki_dir = os.path.join(repo_root, "wiki")
+    if not os.path.isdir(wiki_dir):
+        raise FileNotFoundError(
+            f"Expected wiki/ directory at {wiki_dir}. "
+            "Clone https://github.com/FreeCAD/FreeCAD-documentation first."
+        )
+    pages = list(iter_pages(wiki_dir))
+    pages.sort(key=lambda p: (not p["priority"], p["page_title"]))
+    return pages
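
Note on the filtering and ordering above: a small sketch of how skip patterns and the priority flag interact, run on bare file stems instead of files on disk. The function name `select_pages`, the two-entry priority set, and the sample stems are illustrative assumptions. The sketch matches the index page as `index`, on the assumption that the pattern is applied to stems, which carry no `.md` suffix.

```python
import re

# Assumed skip rule, applied to file stems (for example "index", not "index.md").
SKIP = re.compile(
    r"(Category:|File:|Template:|Special:|MediaWiki:|User:|Talk:|^index$)",
    re.IGNORECASE,
)
PRIORITY = {"PartDesign_Pad", "Sketcher_scripting"}  # illustrative subset only


def select_pages(stems: list[str]) -> list[dict]:
    """Drop skipped stems, then order priority pages first, alphabetically within each group."""
    pages = [
        {"page_title": s.replace("_", " "), "priority": s in PRIORITY}
        for s in stems
        if not SKIP.search(s)
    ]
    # False sorts before True, so `not priority` places priority pages first.
    pages.sort(key=lambda p: (not p["priority"], p["page_title"]))
    return pages


stems = ["Arch_Wall", "index", "Sketcher_scripting", "Category:Part", "PartDesign_Pad"]
ordered = [p["page_title"] for p in select_pages(stems)]
print(ordered)
```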

src/retrieve.py ADDED Viewed

	@@ -0,0 +1,199 @@

+"""Retrieval: BM25 + Dense (FAISS) + RRF fusion + cross-encoder reranking."""
+from __future__ import annotations
+import os
+import pickle
+import re
+from typing import Optional
+import numpy as np
+import pandas as pd
+from src.citations import Citation
+from src.config import (
+    BM25_FILE, CHUNKS_FILE, EMBED_MODEL, FAISS_FILE,
+    RRF_K, RERANK_MODEL, TOP_K_BM25, TOP_K_DENSE, TOP_K_FUSED, TOP_N_FINAL,
+)
+# ── tokeniser ────────────────────────────────────────────────────────────────
+_TOKEN_RE  = re.compile(r"[A-Za-z_][A-Za-z0-9_.:]*|\d+")
+_CAMEL_RE  = re.compile(r"(?<!^)(?=[A-Z])")
+_STOP      = {"the","a","an","of","to","in","is","are","and","or","this","that","it","be"}
+def _tokenize(text: str) -> list[str]:
+    tokens = _TOKEN_RE.findall(text)
+    out: list[str] = []
+    for t in tokens:
+        tl = t.lower()
+        if tl in _STOP:
+            continue
+        out.append(tl)
+        parts = _CAMEL_RE.split(t)
+        if len(parts) > 1:
+            out.extend(p.lower() for p in parts if p and p.lower() not in _STOP)
+        for sub in re.split(r"[._:]+", t):
+            if sub and sub.lower() not in _STOP and sub.lower() != tl:
+                out.append(sub.lower())
+    return out
+# ── lazy singletons ───────────────────────────────────────────────────────────
+_chunks_df: Optional[pd.DataFrame]    = None
+_bm25_index                           = None
+_faiss_index                          = None
+_embed_model                          = None
+_rerank_model                         = None
+def _load_chunks() -> pd.DataFrame:
+    global _chunks_df
+    if _chunks_df is None:
+        if not os.path.exists(CHUNKS_FILE):
+            raise FileNotFoundError(
+                f"{CHUNKS_FILE} not found. Run `python build_index.py` first."
+            )
+        _chunks_df = pd.read_parquet(CHUNKS_FILE)
+    return _chunks_df
+def _load_bm25():
+    global _bm25_index
+    if _bm25_index is None:
+        if not os.path.exists(BM25_FILE):
+            raise FileNotFoundError(f"{BM25_FILE} not found.")
+        with open(BM25_FILE, "rb") as f:
+            _bm25_index = pickle.load(f)
+    return _bm25_index
+def _load_faiss():
+    global _faiss_index
+    if _faiss_index is None:
+        import faiss  # noqa: PLC0415
+        if not os.path.exists(FAISS_FILE):
+            raise FileNotFoundError(f"{FAISS_FILE} not found.")
+        _faiss_index = faiss.read_index(FAISS_FILE)
+    return _faiss_index
+def _load_embed():
+    global _embed_model
+    if _embed_model is None:
+        from sentence_transformers import SentenceTransformer  # noqa: PLC0415
+        _embed_model = SentenceTransformer(EMBED_MODEL)
+    return _embed_model
+def _load_reranker():
+    global _rerank_model
+    if _rerank_model is None:
+        from sentence_transformers import CrossEncoder  # noqa: PLC0415
+        _rerank_model = CrossEncoder(RERANK_MODEL)
+    return _rerank_model
+def indices_ready() -> bool:
+    return all(os.path.exists(p) for p in (CHUNKS_FILE, BM25_FILE, FAISS_FILE))
+# ── retrieval methods ─────────────────────────────────────────────────────────
+def _bm25_search(query: str, top_k: int) -> list[tuple[int, float]]:
+    """Returns [(chunk_id, score), ...]."""
+    import bm25s  # noqa: PLC0415
+    bm25 = _load_bm25()
+    query_tokens_arr = bm25s.tokenize([" ".join(_tokenize(query))])
+    results, scores = bm25.retrieve(query_tokens_arr, k=top_k)
+    return list(zip(results[0].tolist(), scores[0].tolist()))
+def _dense_search(query: str, top_k: int) -> list[tuple[int, float]]:
+    """Returns [(chunk_id, score), ...]."""
+    model  = _load_embed()
+    index  = _load_faiss()
+    # BGE models expect a query prefix
+    vec = model.encode(f"Represent this sentence for searching relevant passages: {query}",
+                       normalize_embeddings=True).reshape(1, -1).astype("float32")
+    scores, ids = index.search(vec, top_k)
+    return [(int(i), float(s)) for i, s in zip(ids[0], scores[0]) if i >= 0]
+def _rrf_fuse(
+    bm25_hits: list[tuple[int, float]],
+    dense_hits: list[tuple[int, float]],
+    k: int = RRF_K,
+    top_n: int = TOP_K_FUSED,
+) -> list[tuple[int, float]]:
+    scores: dict[int, float] = {}
+    # Reciprocal Rank Fusion: each list contributes 1 / (k + rank), with rank counted from 1.
+    for rank, (cid, _) in enumerate(bm25_hits):
+        scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank + 1)
+    for rank, (cid, _) in enumerate(dense_hits):
+        scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank + 1)
+    ranked = sorted(scores.items(), key=lambda x: x[1], reverse=True)
+    return ranked[:top_n]
+def _rerank(query: str, hits: list[tuple[int, float]], top_n: int, df: pd.DataFrame) -> list[tuple[int, float]]:
+    reranker = _load_reranker()
+    pairs = [(query, df.loc[cid, "text"]) for cid, _ in hits]
+    scores = reranker.predict(pairs)
+    ranked = sorted(zip([cid for cid, _ in hits], scores), key=lambda x: x[1], reverse=True)
+    return [(int(cid), float(s)) for cid, s in ranked[:top_n]]
+# ── public API ────────────────────────────────────────────────────────────────
+class HybridRetriever:
+    def __init__(
+        self,
+        use_bm25:   bool = True,
+        use_dense:  bool = True,
+        use_rerank: bool = True,
+        top_n:      int  = TOP_N_FINAL,
+    ):
+        self.use_bm25   = use_bm25
+        self.use_dense  = use_dense
+        self.use_rerank = use_rerank
+        self.top_n      = top_n
+    def retrieve(self, query: str) -> list[Citation]:
+        df = _load_chunks()
+        bm25_hits:  list[tuple[int, float]] = []
+        dense_hits: list[tuple[int, float]] = []
+        if self.use_bm25:
+            bm25_hits = _bm25_search(query, TOP_K_BM25)
+        if self.use_dense:
+            dense_hits = _dense_search(query, TOP_K_DENSE)
+        if self.use_bm25 and self.use_dense:
+            fused = _rrf_fuse(bm25_hits, dense_hits)
+        elif self.use_bm25:
+            fused = bm25_hits[:TOP_K_FUSED]
+        elif self.use_dense:
+            fused = dense_hits[:TOP_K_FUSED]
+        else:
+            return []
+        if self.use_rerank and len(fused) > 0:
+            final = _rerank(query, fused, self.top_n, df)
+        else:
+            final = fused[:self.top_n]
+        citations: list[Citation] = []
+        for rank, (cid, score) in enumerate(final, start=1):
+            row = df.loc[cid]
+            citations.append(Citation(
+                id=rank,
+                chunk_id=int(cid),
+                source_url=str(row["source_url"]),
+                page_title=str(row["page_title"]),
+                section=str(row.get("section", "")),
+                snippet=str(row["text"])[:600],
+                score=float(score),
+            ))
+        return citations
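
Note on the fusion step in `_rrf_fuse`: a worked example of Reciprocal Rank Fusion on two tiny rankings. The value `k=60` is an assumption (the actual `RRF_K` lives in `src/config.py`, which is not shown here), and the chunk ids are made up. The function takes plain id lists instead of `(id, score)` tuples, since RRF uses only rank positions.

```python
def rrf_fuse(rankings: list[list[int]], k: int = 60, top_n: int = 5) -> list[tuple[int, float]]:
    """Fuse ranked id lists: each list adds 1 / (k + rank) per id, rank counted from 1."""
    scores: dict[int, float] = {}
    for hits in rankings:
        for rank, cid in enumerate(hits):  # enumerate is 0-based, hence the +1
            scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)[:top_n]


bm25_ids = [7, 3, 9]   # hypothetical BM25 ranking, best first
dense_ids = [3, 5, 7]  # hypothetical dense ranking, best first
fused = rrf_fuse([bm25_ids, dense_ids])
print([cid for cid, _ in fused])
```

Chunk 3 ends up first because it sits near the top of both lists (1/62 + 1/61), narrowly ahead of chunk 7 (1/61 + 1/63). Ids that appear in only one list trail behind, so agreement between the two retrievers is rewarded without comparing their raw scores.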