CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Commands
# Install dependencies
pip install -r requirements.txt
# Build retrieval indices (one-time, requires freecad-docs/ to be cloned)
git clone --depth 1 https://github.com/FreeCAD/FreeCAD-documentation freecad-docs
python build_index.py --repo freecad-docs
# Run the Gradio app
python app.py
The app requires data/chunks.parquet, data/index.faiss, and data/bm25.pkl to exist before it will serve requests. indices_ready() in src/retrieve.py checks for these.
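The readiness check can be pictured as an existence test over the three artifacts. The sketch below is an assumption about how such a check might look, not the code in src/retrieve.py; the data_dir parameter and the REQUIRED_ARTIFACTS name are hypothetical and added for illustration.

```python
from pathlib import Path

# File names come from the text above; everything else here is an assumed sketch.
REQUIRED_ARTIFACTS = ("chunks.parquet", "index.faiss", "bm25.pkl")


def indices_ready(data_dir: str = "data") -> bool:
    """Return True only if every index artifact exists under data_dir."""
    base = Path(data_dir)
    return all((base / name).is_file() for name in REQUIRED_ARTIFACTS)
```

A check like this lets the app show a "build the indices first" message instead of failing on the first query.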
Architecture
The system is a two-phase pipeline: offline indexing (build_index.py) and online serving (app.py).
Offline: build_index.py
Reads FreeCAD wiki markdown from freecad-docs/wiki/, passes pages through src/ingest.py → src/chunk.py, then builds two indices written to data/:
- BM25 (bm25s, bm25.pkl): tokenised with a custom camelCase/snake_case tokeniser in src/retrieve.py:_tokenize
- Dense (FAISS IndexFlatIP, index.faiss): embeddings from BAAI/bge-small-en-v1.5
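To illustrate why a camelCase/snake_case tokeniser helps BM25 on API-heavy text, here is a minimal sketch. It is not the actual _tokenize from src/retrieve.py; the function name tokenize and both regexes are assumptions chosen for illustration.

```python
import re

# Zero-width boundaries: lower/digit -> upper ("addObject"), and the end of an
# acronym before a capitalised word ("CADGui" -> "CAD" | "Gui").
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")
_WORD = re.compile(r"[A-Za-z0-9_]+")


def tokenize(text: str) -> list[str]:
    """Split identifiers on underscores and camelCase boundaries, lowercased."""
    tokens = []
    for word in _WORD.findall(text):
        for part in word.split("_"):
            for piece in _CAMEL_BOUNDARY.split(part):
                if piece:
                    tokens.append(piece.lower())
    return tokens


print(tokenize("doc.addObject"))  # ['doc', 'add', 'object']
print(tokenize("make_box"))       # ['make', 'box']
```

With this kind of splitting, a query such as "add object" can match a chunk that only contains the identifier addObject.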
Online: app.py → src/retrieve.py → src/generate.py
- HybridRetriever.retrieve(query) runs BM25 + dense search, fuses with Reciprocal Rank Fusion (k=60), optionally reranks with the BAAI/bge-reranker-base cross-encoder, and returns top-N Citation objects.
- generate_response() formats citations into a numbered context block, prepends the system prompt (with two few-shot examples), and calls the OpenAI chat API.
- The response is split into a python code block and a prose explanation with inline [N] citation references.
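The fusion step can be sketched as follows. This is a generic Reciprocal Rank Fusion implementation under the stated k=60, not the code in src/retrieve.py; the function name and the use of plain string IDs are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


bm25_hits = ["a", "b", "c"]   # hypothetical chunk IDs, best first
dense_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'd', 'c']
```

Because RRF uses only ranks, the BM25 and inner-product scores never need to be normalised onto a common scale.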
Key files
- src/config.py: all tuneable constants (chunk size, top-K values, model names, file paths). Change retrieval hyperparameters here.
- src/chunk.py: header-split + code-block-preserving chunker. Fenced code blocks are replaced with UUID placeholders before splitting so they are never broken mid-block.
- src/retrieve.py: all retrieval logic, including lazy model singletons (_load_* functions) that are cached at module level for the Gradio process lifetime.
- src/generate.py: system prompt, two few-shot examples (parametric box, revolve), and the OpenAI call. The few-shot examples are the authoritative reference for expected script style.
- src/citations.py: Citation dataclass, context block formatter, and citation markdown renderer.
- src/ingest.py: walks freecad-docs/wiki/*.md, skips Category/Template/MediaWiki pages, and flags ~25 high-priority scripting pages for front-sorting.
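The placeholder technique described for src/chunk.py can be sketched roughly like this. It is an illustrative reconstruction, not the repository's chunker: the function name, the CODEBLOCK- prefix, and the header regex are all assumptions, and size-based splitting is omitted.

```python
import re
import uuid

FENCE = "`" * 3  # built indirectly so the example can sit inside a fenced block
_FENCED_BLOCK = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
_HEADER_START = re.compile(r"^(?=#{1,6} )", re.MULTILINE)


def split_preserving_code(markdown: str) -> list[str]:
    """Split on markdown headers without ever cutting a fenced code block."""
    placeholders: dict[str, str] = {}

    def stash(match: re.Match) -> str:
        # Swap the whole fenced block for an opaque token that cannot look like a header.
        key = f"CODEBLOCK-{uuid.uuid4().hex}"
        placeholders[key] = match.group(0)
        return key

    protected = _FENCED_BLOCK.sub(stash, markdown)
    sections = [s for s in _HEADER_START.split(protected) if s.strip()]

    chunks = []
    for section in sections:
        # Put the original code blocks back after splitting.
        for key, block in placeholders.items():
            section = section.replace(key, block)
        chunks.append(section.strip())
    return chunks
```

Without the placeholder pass, a Python comment line such as "# not a header" inside a code block would be mistaken for a level-1 heading and the block would be split in two.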
FreeCAD script generation constraints
All generated scripts must:
- Target FreeCAD 1.1 (released March 25, 2026)
- Never import *Gui modules: they crash headless (freecadcmd)
- Use body.newObject(...) not doc.addObject(...) for PartDesign features
- Call doc.recompute() after every feature
- Add dress-up features (Fillet, Chamfer) only after all additive/subtractive features
- Reference geometry by index to minimise Topological Naming Problem risk
These rules are encoded in _SYSTEM_PROMPT in src/generate.py and must stay consistent with any few-shot examples added there.
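Two of the rules above lend themselves to a quick static check on generated scripts. The sketch below is a hypothetical helper, not part of the repository: check_script and its regexes are assumptions, it presumes the document variable is named doc, and it allows doc.addObject("PartDesign::Body", ...) because the body itself is created at document level.

```python
import re

# (pattern, message) pairs; regex-based and deliberately approximate.
_RULES = [
    (
        re.compile(r"^\s*(?:import|from)\b.*\b\w*Gui\b", re.MULTILINE),
        "imports a *Gui module (crashes under freecadcmd)",
    ),
    (
        re.compile(r"\bdoc\.addObject\(\s*['\"]PartDesign::(?!Body\b)"),
        "creates a PartDesign feature with doc.addObject instead of body.newObject",
    ),
]


def check_script(source: str) -> list[str]:
    """Return a message for each rule the script appears to violate."""
    return [message for pattern, message in _RULES if pattern.search(source)]


bad = 'import FreeCADGui\npad = doc.addObject("PartDesign::Pad", "Pad")\n'
for problem in check_script(bad):
    print(problem)
```

A check like this could run on model output before it is shown to the user, though it is no substitute for executing the script under freecadcmd.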