---
title: DevDocs AI
emoji: 🤖
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: "5.9.1"
python_version: "3.10"
app_file: app.py
pinned: false
---
# DevDocs AI – Codebase RAG Assistant

A production-quality **Retrieval-Augmented Generation** system for querying codebases with natural language. Upload any ZIP archive, index it once, and ask questions about the code.

## Architecture

```
User Query
     │
     ▼
[Query Rewriter] ← optional rule-based or LLM rewrite
     │
     ▼
[Retriever] ← similarity search OR MMR (configurable)
     │         ChromaDB + HuggingFace all-MiniLM-L6-v2 embeddings
     ▼
[Retrieved Chunks]
     │
     ├──→ [LLM Generator] → Answer (gpt-4.1-nano, 1 call)
     │
     └──→ [Evaluator]
           ├── Retrieval Metrics (Recall@K, MRR, nDCG) → FREE
           └── LLM Judge (Accuracy, Completeness, Relevance, Groundedness) → 1 call
```
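
The stages above map onto the modules listed under Project Structure. The stand-ins below illustrate the data flow only; the real entry points live in `retrieval/query_rewriter.py`, `retrieval/retriever.py`, and `llm/generator.py` and will differ in name and signature:

```python
# Illustrative stand-ins showing how one query moves through the pipeline.

def rewrite_query(q: str) -> str:
    # e.g. rule-based abbreviation expansion (free, no LLM call)
    return q.replace("repo", "repository")

def retrieve(q: str, k: int = 5) -> list[str]:
    # similarity or MMR search against ChromaDB (local, free)
    return [f"chunk {i} for {q!r}" for i in range(k)]

def generate_answer(q: str, chunks: list[str]) -> str:
    # single gpt-4.1-nano call, grounded in the retrieved chunks
    return f"answer to {q!r} grounded in {len(chunks)} chunks"

query = rewrite_query("where is the repo unzipped?")
print(generate_answer(query, retrieve(query)))
```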

## Cost Model

| | Operation | Cost | |
| |----------------------|------------------| |
| | Embedding (indexing) | **FREE** (local) | |
| | Embedding (query) | **FREE** (local) | |
| | Answer generation | ~$0.0001 / query | |
| | LLM judge evaluation | ~$0.0001 / query | |
| | Query rewriting (LLM)| ~$0.00005 / query| |

> On a $5 budget you can run ~25,000 queries with full evaluation enabled.
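
The arithmetic behind that estimate, spelled out:

```python
# Per-query cost with full evaluation: answer generation + LLM judge.
cost_per_query = 0.0001 + 0.0001
print(int(5.00 / cost_per_query))  # 25000 queries on a $5 budget
```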

## Project Structure

```
devdocs-ai/
├── app.py                  # Gradio UI (3 tabs)
├── config.py               # All configuration in one place
├── requirements.txt
├── .env.example
│
├── ingestion/
│   ├── __init__.py
│   ├── loader.py           # ZIP extraction + file reading
│   ├── chunker.py          # AST-aware Python chunking + generic splitter
│   └── indexer.py          # HuggingFace embeddings + ChromaDB persistence
│
├── retrieval/
│   ├── __init__.py
│   ├── retriever.py        # Similarity + MMR search
│   └── query_rewriter.py   # Rule-based + optional LLM rewrite
│
├── llm/
│   ├── __init__.py
│   └── generator.py        # Grounded answer generation via litellm
│
├── evaluation/
│   ├── __init__.py
│   ├── metrics.py          # Recall@K, MRR, nDCG (free, keyword-based)
│   └── judge.py            # LLM-as-judge (Accuracy/Completeness/Relevance/Groundedness)
│
├── utils/
│   ├── __init__.py
│   └── helpers.py          # Logging, display formatters
│
└── data/
    ├── uploads/            # Extracted ZIP contents (auto-created)
    └── vector_db/          # ChromaDB persistent storage (auto-created)
```

## Quick Start

### 1. Clone / download the project

```bash
cd devdocs-ai
```

### 2. Create a virtual environment

```bash
python -m venv venv
source venv/bin/activate    # Linux/macOS
# venv\Scripts\activate     # Windows
```

### 3. Install dependencies

```bash
pip install -r requirements.txt
```

> The first run downloads the `all-MiniLM-L6-v2` embedding model (~90 MB) automatically.

### 4. Set your OpenAI API key

```bash
cp .env.example .env
# Edit .env and set OPENAI_API_KEY=sk-...
```

Or export directly:

```bash
export OPENAI_API_KEY="sk-your-key-here"
```
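
To confirm the key is visible to Python, a quick check (this assumes `python-dotenv` is installed, which the `.env` workflow implies):

```python
import os
from dotenv import load_dotenv  # assumes python-dotenv is in requirements.txt

load_dotenv()  # picks up .env from the working directory
print("OPENAI_API_KEY set:", bool(os.getenv("OPENAI_API_KEY")))
```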

### 5. Launch the app

```bash
python app.py
```

Open **http://localhost:7860** in your browser.

---

## Usage Guide

### Tab 1 – Index Repository

1. Click **Upload ZIP file** and select your repository archive.
2. Click **Index Repository**.
3. Wait for the status message; indexing is one-time per repository.

> Re-indexing a new ZIP clears the previous index automatically.
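
Under the hood, indexing boils down to embedding chunks locally and persisting them to ChromaDB. A minimal sketch, assuming the LangChain-style API that the MMR settings in `config.py` suggest; `ingestion/indexer.py` is the authoritative version:

```python
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")  # local, free
docs = [Document(page_content="def add(a, b):\n    return a + b",
                 metadata={"source": "math_utils.py"})]  # hypothetical chunk
store = Chroma.from_documents(docs, embeddings, persist_directory="data/vector_db")
```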

### Tab 2 – Ask Questions

1. Type a natural language question.
2. Configure retrieval options:
   - **Top-K**: number of chunks to retrieve (default 5)
   - **Use MMR**: diversity-aware retrieval that avoids redundant chunks (see the sketch below)
   - **Use query rewriting**: expands abbreviations before retrieval
   - **Run evaluation**: computes all metrics (costs 1 extra LLM call)
3. Click **Ask**.
4. View the **Answer**, **Retrieved Chunks**, and **Metrics Panel**.
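
The two retrieval modes differ only in how candidates are picked from the vector store. A sketch of both, again assuming a LangChain-style store (`retrieval/retriever.py` may differ):

```python
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
store = Chroma(persist_directory="data/vector_db", embedding_function=embeddings)

query = "where is the uploaded ZIP extracted?"
plain = store.similarity_search(query, k=5)    # top-K by embedding similarity
diverse = store.max_marginal_relevance_search( # MMR: fetch 20 candidates, keep 5 diverse
    query, k=5, fetch_k=20, lambda_mult=0.5)
```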

### Tab 3 – Compare Modes

Run both **Similarity** and **MMR** retrieval side by side for the same question to compare answer quality and chunk diversity.

---

## Configuration Reference

All parameters are in `config.py`:

| | Parameter | Default | Description | |
| |------------------------|-----------------------|------------------------------------------| |
| | `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | HuggingFace sentence-transformer model | |
| | `CHUNK_SIZE` | `400` tokens | Target chunk size | |
| | `CHUNK_OVERLAP` | `60` tokens | Overlap between consecutive chunks | |
| | `DEFAULT_TOP_K` | `5` | Chunks retrieved per query | |
| | `MMR_FETCH_K` | `20` | Candidate pool size for MMR | |
| `MMR_LAMBDA_MULT`       | `0.5`                 | MMR diversity/relevance balance (0–1)     |
| | `LLM_MODEL` | `openai/gpt-4.1-nano` | LLM for answer generation | |
| | `LLM_MAX_TOKENS` | `1024` | Max tokens in LLM response | |
| | `ALLOWED_EXTENSIONS` | `.py .js .ts .md ...` | File types included in indexing | |
| | `MAX_FILE_SIZE_MB` | `2` | Files larger than this are skipped | |
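
As an illustration, the table translates into module-level constants along these lines (this excerpt is reconstructed from the table; `config.py` itself is authoritative):

```python
# config.py (illustrative excerpt -- names match the table above)
EMBEDDING_MODEL = "all-MiniLM-L6-v2"
CHUNK_SIZE = 400           # tokens per chunk
CHUNK_OVERLAP = 60         # tokens shared by consecutive chunks
DEFAULT_TOP_K = 5
MMR_FETCH_K = 20           # candidates fetched before MMR re-ranking
MMR_LAMBDA_MULT = 0.5      # 1.0 = pure relevance, 0.0 = maximum diversity
LLM_MODEL = "openai/gpt-4.1-nano"
LLM_MAX_TOKENS = 1024
MAX_FILE_SIZE_MB = 2       # larger files are skipped during indexing
```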

---

## Evaluation Metrics Explained

### Retrieval Metrics (free, keyword-based proxy)

| | Metric | Formula | Range | |
| |------------|--------------------------------------------------|-------| |
| Recall@K   | relevant retrieved / K                           | 0–1   |
| MRR        | 1 / rank of first relevant doc                   | 0–1   |
| nDCG@K     | DCG / IDCG using binary relevance                | 0–1   |

> Relevance is determined by keyword overlap between query and chunk (≥2 shared tokens).
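
Put together, the three metrics over that keyword proxy fit in a few lines. A compact sketch (`evaluation/metrics.py` may differ in tokenization details):

```python
import math

def is_relevant(query: str, chunk: str) -> bool:
    # binary relevance: >= 2 tokens shared between query and chunk
    return len(set(query.lower().split()) & set(chunk.lower().split())) >= 2

def retrieval_metrics(query: str, chunks: list[str]) -> dict:
    rels = [is_relevant(query, c) for c in chunks]
    recall_at_k = sum(rels) / len(chunks)                      # relevant retrieved / K
    mrr = next((1 / (i + 1) for i, r in enumerate(rels) if r), 0.0)
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    idcg = sum(1 / math.log2(i + 2) for i in range(sum(rels)))  # all relevant ranked first
    return {"recall@k": recall_at_k, "mrr": mrr,
            "ndcg@k": dcg / idcg if idcg else 0.0}
```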

### Answer Quality (LLM judge, 1 call)

| | Dimension | Meaning | Scale | |
| |---------------|---------------------------------------------------|-------| |
| Accuracy      | Every claim is factually correct given context     | 1–5   |
| Completeness  | All parts of the question are addressed            | 1–5   |
| Relevance     | Answer is focused and on-topic                     | 1–5   |
| Groundedness  | All claims are directly supported by context       | 1–5   |
| Overall       | Mean of the four scores                            | 1–5   |
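
A sketch of the single judge call via `litellm` (which the generator already uses); the actual prompt and parsing live in `evaluation/judge.py`:

```python
import json
import litellm

PROMPT = """Score the answer 1-5 on accuracy, completeness, relevance and
groundedness, judging only against the provided context. Reply as JSON, e.g.
{{"accuracy": 5, "completeness": 4, "relevance": 5, "groundedness": 5}}

Question: {q}
Context: {ctx}
Answer: {ans}"""

def judge(q: str, ctx: str, ans: str) -> dict:
    resp = litellm.completion(
        model="openai/gpt-4.1-nano",
        messages=[{"role": "user",
                   "content": PROMPT.format(q=q, ctx=ctx, ans=ans)}],
    )
    scores = json.loads(resp.choices[0].message.content)
    scores["overall"] = sum(scores.values()) / 4  # mean of the four dimensions
    return scores
```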

---

## Supported File Types

`.py` `.js` `.ts` `.jsx` `.tsx` `.md` `.txt` `.java` `.go` `.rs` `.cpp` `.c` `.h`

---

## Chunking Strategy

| | File Type | Strategy | |
| |---------------|-----------------------------------------------------------------| |
| | `.py` | AST-based: one chunk per top-level function/class | |
| All others    | Recursive character splitter (400-token chunks, 60-token overlap) |

Python files that fail AST parsing (e.g. syntax errors) fall back to the generic splitter automatically.
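
A minimal sketch of that strategy using the standard-library `ast` module; `ingestion/chunker.py` is the real implementation, and the character counts below merely stand in for the token targets:

```python
import ast

def chunk_python(source: str) -> list[str]:
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return chunk_generic(source)  # fallback for unparseable files
    lines = source.splitlines()
    # one chunk per top-level function/class definition
    return ["\n".join(lines[n.lineno - 1 : n.end_lineno])
            for n in tree.body
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))]

def chunk_generic(source: str, size: int = 1600, overlap: int = 240) -> list[str]:
    # simple character splitter approximating the 400-token / 60-token targets
    return [source[i : i + size] for i in range(0, len(source), size - overlap)]
```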

---

## Troubleshooting

**"Vector store is empty" error**
→ Index a repository first via Tab 1.

**Slow first query**
→ The embedding model is downloaded on first use (~90 MB). Subsequent runs are fast.

**"No API key" warnings**
→ Set `OPENAI_API_KEY` in `.env` or as an environment variable.

**ChromaDB dimension mismatch error**
→ Delete `data/vector_db/` and re-index. This happens if you switch embedding models mid-session.

```bash
rm -rf data/vector_db/
```

**Out of memory on large repos**
→ Lower `MAX_FILE_SIZE_MB` in `config.py` or reduce `CHUNK_SIZE`.