---
title: DevDocs AI
emoji: 🤖
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: "5.9.1"
python_version: "3.10"
app_file: app.py
pinned: false
---
# DevDocs AI – Codebase RAG Assistant
A production-quality **Retrieval-Augmented Generation** system for querying codebases with natural language. Upload any ZIP archive, index it once, and ask questions about the code.

## Architecture
```
User Query
    │
    ▼
[Query Rewriter] ← optional rule-based or LLM rewrite
    │
    ▼
[Retriever] ← similarity search OR MMR (configurable)
    │   ChromaDB + HuggingFace all-MiniLM-L6-v2 embeddings
    ▼
[Retrieved Chunks]
    │
    ├──► [LLM Generator] → Answer (gpt-4.1-nano, 1 call)
    │
    └──► [Evaluator]
          ├── Retrieval Metrics (Recall@K, MRR, nDCG) → FREE
          └── LLM Judge (Accuracy, Completeness, Relevance, Groundedness) → 1 call
```
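The same flow as a minimal Python sketch. The module paths follow the project layout below, but the imported function names are hypothetical stand-ins, not the actual API:
```python
# Hypothetical glue code: module paths match the project structure,
# but these function names are illustrative, not the real signatures.
from retrieval.query_rewriter import rewrite_query
from retrieval.retriever import retrieve
from llm.generator import generate_answer
from evaluation.metrics import retrieval_metrics
from evaluation.judge import judge_answer

def answer_question(query: str, top_k: int = 5, use_mmr: bool = False):
    query = rewrite_query(query)                     # optional rewrite step
    chunks = retrieve(query, k=top_k, mmr=use_mmr)   # similarity or MMR search
    answer = generate_answer(query, chunks)          # 1 LLM call (gpt-4.1-nano)
    scores = retrieval_metrics(query, chunks)        # Recall@K, MRR, nDCG (free)
    quality = judge_answer(query, chunks, answer)    # 1 more LLM call
    return answer, scores, quality
```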
## Cost Model
| Operation | Cost |
|----------------------|------------------|
| Embedding (indexing) | **FREE** (local) |
| Embedding (query) | **FREE** (local) |
| Answer generation | ~$0.0001 / query |
| LLM judge evaluation | ~$0.0001 / query |
| Query rewriting (LLM)| ~$0.00005 / query|
> On a $5 budget you can run ~25,000 fully evaluated queries (answer generation + LLM judge; rule-based rewriting is free).
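The arithmetic behind that estimate, as a quick sanity check:
```python
# Per-query cost with full evaluation enabled (LLM rewriting off)
generation = 0.0001
judge = 0.0001
print(5.00 / (generation + judge))  # -> 25000.0 queries on a $5 budget
```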
## Project Structure
```
devdocs-ai/
├── app.py                  # Gradio UI (3 tabs)
├── config.py               # All configuration in one place
├── requirements.txt
├── .env.example
│
├── ingestion/
│   ├── __init__.py
│   ├── loader.py           # ZIP extraction + file reading
│   ├── chunker.py          # AST-aware Python chunking + generic splitter
│   └── indexer.py          # HuggingFace embeddings + ChromaDB persistence
│
├── retrieval/
│   ├── __init__.py
│   ├── retriever.py        # Similarity + MMR search
│   └── query_rewriter.py   # Rule-based + optional LLM rewrite
│
├── llm/
│   ├── __init__.py
│   └── generator.py        # Grounded answer generation via litellm
│
├── evaluation/
│   ├── __init__.py
│   ├── metrics.py          # Recall@K, MRR, nDCG (free, keyword-based)
│   └── judge.py            # LLM-as-judge (Accuracy/Completeness/Relevance/Groundedness)
│
├── utils/
│   ├── __init__.py
│   └── helpers.py          # Logging, display formatters
│
└── data/
    ├── uploads/            # Extracted ZIP contents (auto-created)
    └── vector_db/          # ChromaDB persistent storage (auto-created)
```
## Quick Start
### 1. Clone / download the project
```bash
cd devdocs-ai
```
### 2. Create virtual environment
```bash
python -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate # Windows
```
### 3. Install dependencies
```bash
pip install -r requirements.txt
```
> First run will download the `all-MiniLM-L6-v2` model (~90 MB) automatically.
### 4. Set your OpenAI API key
```bash
cp .env.example .env
# Edit .env and set OPENAI_API_KEY=sk-...
```
Or export directly:
```bash
export OPENAI_API_KEY="sk-your-key-here"
```
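litellm reads `OPENAI_API_KEY` from the environment, so a quick sanity check before launching looks like this:
```python
import os

# Fails fast if the key was neither exported nor loaded from .env
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"
```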
### 5. Launch the app
```bash
python app.py
```
Open **http://localhost:7860** in your browser.
---
## Usage Guide
### Tab 1 – Index Repository

1. Click **Upload ZIP file** and select your repository archive.
2. Click **Index Repository**.
3. Wait for the status message – indexing is one-time per repository.
> Re-indexing a new ZIP clears the previous index automatically.
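Under the hood, indexing embeds every chunk locally and persists it to ChromaDB. A minimal sketch, assuming LangChain-style wrappers (`indexer.py` may differ in the details):
```python
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")  # local, free

def index_chunks(chunks: list[str]) -> Chroma:
    # Clear any previous index, then embed and persist the new chunks
    Chroma(persist_directory="data/vector_db",
           embedding_function=embeddings).delete_collection()
    return Chroma.from_documents(
        [Document(page_content=c) for c in chunks],
        embeddings,
        persist_directory="data/vector_db",
    )
```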
### Tab 2 – Ask Questions
1. Type a natural language question.
2. Configure retrieval options:
- **Top-K**: number of chunks to retrieve (default 5)
- **Use MMR**: diversity-aware retrieval that avoids redundant chunks (see the sketch after this list)
- **Use query rewriting**: expands abbreviations before retrieval
- **Run evaluation**: computes all metrics (costs 1 extra LLM call)
3. Click **Ask**.
4. View the **Answer**, **Retrieved Chunks**, and **Metrics Panel**.

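The two retrieval modes correspond to two vector-store calls. A minimal sketch, again assuming a LangChain-style Chroma store; the knobs map to `DEFAULT_TOP_K`, `MMR_FETCH_K`, and `MMR_LAMBDA_MULT` from `config.py`:
```python
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

store = Chroma(
    persist_directory="data/vector_db",
    embedding_function=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2"),
)

# Plain similarity search: the k nearest chunks, possibly redundant
docs = store.similarity_search("how is the ZIP archive extracted?", k=5)

# MMR: fetch a larger candidate pool, then trade relevance for diversity
docs = store.max_marginal_relevance_search(
    "how is the ZIP archive extracted?", k=5, fetch_k=20, lambda_mult=0.5
)
```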
### Tab 3 – Compare Modes
Run both **Similarity** and **MMR** retrieval side-by-side for the same question to compare answer quality and chunk diversity.

---
## Configuration Reference
All parameters are in `config.py`:
| Parameter | Default | Description |
|------------------------|-----------------------|------------------------------------------|
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | HuggingFace sentence-transformer model |
| `CHUNK_SIZE` | `400` tokens | Target chunk size |
| `CHUNK_OVERLAP` | `60` tokens | Overlap between consecutive chunks |
| `DEFAULT_TOP_K` | `5` | Chunks retrieved per query |
| `MMR_FETCH_K` | `20` | Candidate pool size for MMR |
| `MMR_LAMBDA_MULT`       | `0.5`                 | MMR diversity/relevance balance (0–1)    |
| `LLM_MODEL` | `openai/gpt-4.1-nano` | LLM for answer generation |
| `LLM_MAX_TOKENS` | `1024` | Max tokens in LLM response |
| `ALLOWED_EXTENSIONS` | `.py .js .ts .md ...` | File types included in indexing |
| `MAX_FILE_SIZE_MB` | `2` | Files larger than this are skipped |
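Concretely, `config.py` reduces to module-level constants along these lines (a sketch mirroring the defaults above; the real file may organize them differently):
```python
# config.py (sketch; values mirror the documented defaults)
EMBEDDING_MODEL = "all-MiniLM-L6-v2"
CHUNK_SIZE = 400        # target chunk size, in tokens
CHUNK_OVERLAP = 60      # overlap between consecutive chunks
DEFAULT_TOP_K = 5       # chunks retrieved per query
MMR_FETCH_K = 20        # candidate pool size for MMR
MMR_LAMBDA_MULT = 0.5   # 1.0 = pure relevance, 0.0 = maximum diversity
LLM_MODEL = "openai/gpt-4.1-nano"
LLM_MAX_TOKENS = 1024
ALLOWED_EXTENSIONS = {".py", ".js", ".ts", ".jsx", ".tsx", ".md", ".txt",
                      ".java", ".go", ".rs", ".cpp", ".c", ".h"}
MAX_FILE_SIZE_MB = 2    # larger files are skipped
```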
---
## Evaluation Metrics Explained
### Retrieval Metrics (free, keyword-based proxy)
| Metric | Formula | Range |
|------------|--------------------------------------------------|-------|
| Recall@K   | relevant retrieved / K                           | 0–1   |
| MRR        | 1 / rank of first relevant doc                   | 0–1   |
| nDCG@K     | DCG / IDCG using binary relevance                | 0–1   |
> Relevance is a keyword-overlap proxy: a chunk counts as relevant if it shares ≥ 2 tokens with the query. (Note that `relevant retrieved / K` is a precision-style formula; it is used here because the full set of relevant chunks is unknown.)
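A compact sketch of how those three formulas combine with the keyword proxy (illustrative; `metrics.py` may differ in details):
```python
import math

def is_relevant(query: str, chunk: str, min_overlap: int = 2) -> bool:
    # Keyword proxy: relevant iff query and chunk share >= 2 tokens
    return len(set(query.lower().split()) & set(chunk.lower().split())) >= min_overlap

def retrieval_metrics(query: str, chunks: list[str]) -> dict:
    rel = [is_relevant(query, c) for c in chunks]            # binary relevance
    k = len(chunks)
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rel))
    idcg = sum(1 / math.log2(i + 2) for i in range(sum(rel)))
    return {
        "recall@k": sum(rel) / k if k else 0.0,
        "mrr": next((1 / (i + 1) for i, r in enumerate(rel) if r), 0.0),
        "ndcg@k": dcg / idcg if idcg else 0.0,
    }
```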
### Answer Quality (LLM judge, 1 call)
| Dimension | Meaning | Scale |
|---------------|---------------------------------------------------|-------|
| Accuracy      | Every claim is factually correct given context     | 1–5   |
| Completeness  | All parts of the question are addressed            | 1–5   |
| Relevance     | Answer is focused and on-topic                     | 1–5   |
| Groundedness  | All claims are directly supported by context       | 1–5   |
| Overall       | Mean of the four scores                            | 1–5   |
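A sketch of the judging call via litellm, which `generator.py` already uses; the prompt wording and JSON handling in `judge.py` may differ:
```python
import json
from litellm import completion

JUDGE_PROMPT = """Rate the answer on a 1-5 scale for accuracy, completeness,
relevance, and groundedness, judging ONLY against the provided context.
Reply as JSON: {{"accuracy": n, "completeness": n, "relevance": n, "groundedness": n}}

Question: {question}
Context: {context}
Answer: {answer}"""

def judge_answer(question: str, context: str, answer: str) -> dict:
    resp = completion(
        model="openai/gpt-4.1-nano",
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, context=context, answer=answer)}],
        max_tokens=1024,
    )
    scores = json.loads(resp.choices[0].message.content)
    scores["overall"] = sum(scores.values()) / 4   # mean of the four scores
    return scores
```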
---

## Supported File Types
`.py` `.js` `.ts` `.jsx` `.tsx` `.md` `.txt` `.java` `.go` `.rs` `.cpp` `.c` `.h`
---
## Chunking Strategy
| File Type | Strategy |
|---------------|-----------------------------------------------------------------|
| `.py` | AST-based: one chunk per top-level function/class |
| All others | Recursive character splitter (400-token chunks, 60-token overlap)|
Python files that fail AST parsing (e.g. syntax errors) fall back to the generic splitter automatically.
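A sketch of the AST path using the standard-library `ast` module; `generic_split` below is a stand-in for the project's recursive character splitter (sizes here are characters, not tokens):
```python
import ast

def generic_split(source: str, size: int = 1600, overlap: int = 240) -> list[str]:
    # Stand-in for the recursive character splitter used for non-Python files
    step = size - overlap
    return [source[i:i + size] for i in range(0, max(len(source), 1), step)]

def chunk_python_source(source: str) -> list[str]:
    """One chunk per top-level function/class; fall back on parse errors."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return generic_split(source)
    lines = source.splitlines()
    chunks = [
        "\n".join(lines[node.lineno - 1 : node.end_lineno])
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    return chunks or generic_split(source)
```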
---
## Troubleshooting
**"Vector store is empty" error**
→ Index a repository first via Tab 1.
**Slow first query**
→ The embedding model is downloaded on first use (~90 MB). Subsequent runs are fast.
**"No API key" warnings**
→ Set `OPENAI_API_KEY` in `.env` or as an environment variable.
**ChromaDB dimension mismatch error**
→ Delete `data/vector_db/` and re-index. This happens if you switch embedding models mid-session.
```bash
rm -rf data/vector_db/
```
**Out of memory on large repos**
→ Lower `MAX_FILE_SIZE_MB` in `config.py` or reduce `CHUNK_SIZE`.