---
title: DevDocs AI
emoji: 🤖
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: "5.9.1"
python_version: "3.10"
app_file: app.py
pinned: false
---
# DevDocs AI — Codebase RAG Assistant
A production-quality **Retrieval-Augmented Generation** system for querying codebases with natural language. Upload any ZIP archive, index it once, and ask questions about the code.
![DevDocs AI app overview](one.png)
## Architecture
```
User Query
    │
    ▼
[Query Rewriter]  ← optional rule-based or LLM rewrite
    │
    ▼
[Retriever]       ← similarity search OR MMR (configurable)
    │               ChromaDB + HuggingFace all-MiniLM-L6-v2 embeddings
    ▼
[Retrieved Chunks]
    │
    ├──→ [LLM Generator] → Answer (gpt-4.1-nano, 1 call)
    │
    └──→ [Evaluator]
          ├── Retrieval Metrics (Recall@K, MRR, nDCG) — FREE
          └── LLM Judge (Accuracy, Completeness, Relevance, Groundedness) — 1 call
```
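For orientation, here is a minimal sketch of the query path using the stack named above. The specific packages (`langchain_chroma`, `langchain_huggingface`) and the prompt wording are assumptions for illustration; the app's own modules wrap this flow.

```python
# Minimal sketch of the query path (similarity retrieval + one generation call).
# Package choices and prompt wording are illustrative assumptions, not the app's code.
import litellm
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = Chroma(persist_directory="data/vector_db", embedding_function=embeddings)

def answer(query: str, k: int = 5) -> str:
    # Retrieve: plain similarity search (the MMR path swaps this call out)
    chunks = store.similarity_search(query, k=k)
    context = "\n\n".join(c.page_content for c in chunks)
    # Generate: a single grounded LLM call
    response = litellm.completion(
        model="openai/gpt-4.1-nano",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return response.choices[0].message.content
```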
## Cost Model
| Operation | Cost |
|----------------------|------------------|
| Embedding (indexing) | **FREE** (local) |
| Embedding (query) | **FREE** (local) |
| Answer generation | ~$0.0001 / query |
| LLM judge evaluation | ~$0.0001 / query |
| Query rewriting (LLM)| ~$0.00005 / query|
> On a $5 budget you can run ~25,000 queries with full evaluation enabled.
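The arithmetic behind that estimate, using the per-query costs from the table:

```python
# Budget sanity check with the per-query costs from the table (approximate).
generation, judge, rewrite = 0.0001, 0.0001, 0.00005
budget = 5.00
print(budget / (generation + judge))            # ~25,000 queries with evaluation
print(budget / (generation + judge + rewrite))  # ~20,000 if LLM rewriting is on too
```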
## Project Structure
```
devdocs-ai/
├── app.py                 # Gradio UI (3 tabs)
├── config.py              # All configuration in one place
├── requirements.txt
├── .env.example
│
├── ingestion/
│   ├── __init__.py
│   ├── loader.py          # ZIP extraction + file reading
│   ├── chunker.py         # AST-aware Python chunking + generic splitter
│   └── indexer.py         # HuggingFace embeddings + ChromaDB persistence
│
├── retrieval/
│   ├── __init__.py
│   ├── retriever.py       # Similarity + MMR search
│   └── query_rewriter.py  # Rule-based + optional LLM rewrite
│
├── llm/
│   ├── __init__.py
│   └── generator.py       # Grounded answer generation via litellm
│
├── evaluation/
│   ├── __init__.py
│   ├── metrics.py         # Recall@K, MRR, nDCG (free, keyword-based)
│   └── judge.py           # LLM-as-judge (Accuracy/Completeness/Relevance/Groundedness)
│
├── utils/
│   ├── __init__.py
│   └── helpers.py         # Logging, display formatters
│
└── data/
    ├── uploads/           # Extracted ZIP contents (auto-created)
    └── vector_db/         # ChromaDB persistent storage (auto-created)
```
## Quick Start
### 1. Clone / download the project
```bash
cd devdocs-ai
```
### 2. Create virtual environment
```bash
python -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate # Windows
```
### 3. Install dependencies
```bash
pip install -r requirements.txt
```
> First run will download the `all-MiniLM-L6-v2` model (~90 MB) automatically.
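To pre-fetch the model instead of waiting on first launch (optional; assumes `sentence-transformers` is installed via `requirements.txt`):

```python
# Optional: download the embedding model ahead of time.
from sentence_transformers import SentenceTransformer

SentenceTransformer("all-MiniLM-L6-v2")
```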
### 4. Set your OpenAI API key
```bash
cp .env.example .env
# Edit .env and set OPENAI_API_KEY=sk-...
```
Or export directly:
```bash
export OPENAI_API_KEY="sk-your-key-here"
```
### 5. Launch the app
```bash
python app.py
```
Open **http://localhost:7860** in your browser.
---
## Usage Guide
### Tab 1 — Index Repository
![Index Repository tab](two.png)
1. Click **Upload ZIP file** and select your repository archive.
2. Click **🚀 Index Repository**.
3. Wait for the status message — indexing is one-time per repository.
> Re-indexing a new ZIP clears the previous index automatically.
### Tab 2 — Ask Questions
1. Type a natural language question.
2. Configure retrieval options:
- **Top-K**: number of chunks to retrieve (default 5)
- **Use MMR**: diversity-aware retrieval (avoids redundant chunks)
   - **Use query rewriting**: expands abbreviations before retrieval (see the toy sketch below)
- **Run evaluation**: computes all metrics (costs 1 extra LLM call)
3. Click **πŸ” Ask**.
4. View the **Answer**, **Retrieved Chunks**, and **Metrics Panel**.
![Ask Questions tab](three.png)
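The **Use query rewriting** option applies a small rule-based expansion before retrieval. A toy sketch of the idea (the abbreviation map here is invented; the real rules live in `retrieval/query_rewriter.py`):

```python
# Toy rule-based query rewrite; the abbreviation map is a made-up example.
ABBREVIATIONS = {"db": "database", "auth": "authentication", "repo": "repository"}

def rewrite(query: str) -> str:
    return " ".join(ABBREVIATIONS.get(word.lower(), word) for word in query.split())

print(rewrite("how does db auth work"))
# -> "how does database authentication work"
```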
### Tab 3 — Compare Modes
Run both **Similarity** and **MMR** retrieval side-by-side for the same question to compare answer quality and chunk diversity.
![Compare Modes tab](four.png)
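The two modes differ only in the retrieval call. A sketch with LangChain-style retrievers over the persisted store (setup mirrors the Architecture sketch; whether the app uses these exact wrappers is an assumption):

```python
# Side-by-side retrieval sketch; store setup as in the Architecture section.
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = Chroma(persist_directory="data/vector_db", embedding_function=embeddings)

question = "How is authentication handled?"
sim_chunks = store.as_retriever(search_kwargs={"k": 5}).invoke(question)
mmr_chunks = store.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 5, "fetch_k": 20, "lambda_mult": 0.5},  # defaults from config.py
).invoke(question)
```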
---
## Configuration Reference
All parameters are in `config.py`:
| Parameter | Default | Description |
|------------------------|-----------------------|------------------------------------------|
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | HuggingFace sentence-transformer model |
| `CHUNK_SIZE` | `400` tokens | Target chunk size |
| `CHUNK_OVERLAP` | `60` tokens | Overlap between consecutive chunks |
| `DEFAULT_TOP_K` | `5` | Chunks retrieved per query |
| `MMR_FETCH_K` | `20` | Candidate pool size for MMR |
| `MMR_LAMBDA_MULT`       | `0.5`                 | MMR diversity/relevance balance (0–1)    |
| `LLM_MODEL` | `openai/gpt-4.1-nano` | LLM for answer generation |
| `LLM_MAX_TOKENS` | `1024` | Max tokens in LLM response |
| `ALLOWED_EXTENSIONS` | `.py .js .ts .md ...` | File types included in indexing |
| `MAX_FILE_SIZE_MB` | `2` | Files larger than this are skipped |
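Reconstructed from the table, `config.py` looks roughly like this (names and defaults are taken from the docs; the real file may contain more):

```python
# config.py, sketched from the table above (the actual file may differ).
EMBEDDING_MODEL = "all-MiniLM-L6-v2"   # HuggingFace sentence-transformer
CHUNK_SIZE = 400                       # target chunk size, tokens
CHUNK_OVERLAP = 60                     # overlap between consecutive chunks, tokens
DEFAULT_TOP_K = 5                      # chunks retrieved per query
MMR_FETCH_K = 20                       # candidate pool size for MMR
MMR_LAMBDA_MULT = 0.5                  # diversity/relevance balance (0-1)
LLM_MODEL = "openai/gpt-4.1-nano"
LLM_MAX_TOKENS = 1024
ALLOWED_EXTENSIONS = {".py", ".js", ".ts", ".jsx", ".tsx", ".md", ".txt",
                      ".java", ".go", ".rs", ".cpp", ".c", ".h"}
MAX_FILE_SIZE_MB = 2                   # larger files are skipped
```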
---
## Evaluation Metrics Explained
### Retrieval Metrics (free, keyword-based proxy)
| Metric | Formula | Range |
|------------|--------------------------------------------------|-------|
| Recall@K   | relevant retrieved / K                           | 0–1   |
| MRR        | 1 / rank of first relevant doc                   | 0–1   |
| nDCG@K     | DCG / IDCG using binary relevance                | 0–1   |
> Relevance is determined by keyword overlap between query and chunk (≥2 shared tokens).
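A compact sketch of all three metrics under that proxy (illustrative, not `metrics.py` itself). Note the documented Recall@K divides by K, which makes it a precision-style proxy rather than classic recall:

```python
# Retrieval metrics under the binary keyword-overlap proxy (>=2 shared tokens).
# Illustrative sketch; the project's metrics.py may differ in detail.
import math

def is_relevant(query: str, chunk: str) -> bool:
    return len(set(query.lower().split()) & set(chunk.lower().split())) >= 2

def retrieval_metrics(query: str, chunks: list[str]) -> dict[str, float]:
    rel = [is_relevant(query, c) for c in chunks]
    k = len(rel)
    recall_at_k = sum(rel) / k if k else 0.0                    # relevant retrieved / K
    mrr = next((1.0 / (i + 1) for i, r in enumerate(rel) if r), 0.0)
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rel))  # binary relevance
    idcg = sum(1 / math.log2(i + 2) for i in range(sum(rel)))
    return {"recall@k": recall_at_k, "mrr": mrr,
            "ndcg@k": dcg / idcg if idcg else 0.0}
```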
### Answer Quality (LLM judge, 1 call)
| Dimension | Meaning | Scale |
|---------------|---------------------------------------------------|-------|
| Accuracy      | Every claim is factually correct given context     | 1–5   |
| Completeness  | All parts of the question are addressed            | 1–5   |
| Relevance     | Answer is focused and on-topic                     | 1–5   |
| Groundedness  | All claims are directly supported by context       | 1–5   |
| Overall       | Mean of the four scores                            | 1–5   |
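A single judge call in this shape might look like the sketch below; the prompt wording, JSON parsing, and use of `response_format` are assumptions rather than the project's `judge.py`:

```python
# LLM-as-judge sketch: one call scoring the four dimensions, 1-5 each.
# Prompt and parsing are illustrative assumptions.
import json
import litellm

def judge(question: str, context: str, answer: str) -> dict[str, float]:
    prompt = (
        "Rate the answer 1-5 on accuracy, completeness, relevance, and "
        "groundedness given the context. Reply as JSON with those four keys.\n\n"
        f"Question: {question}\n\nContext:\n{context}\n\nAnswer:\n{answer}"
    )
    response = litellm.completion(
        model="openai/gpt-4.1-nano",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    scores = json.loads(response.choices[0].message.content)
    scores["overall"] = sum(scores.values()) / 4  # mean of the four dimensions
    return scores
```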
---
![Evaluation screenshot](<Screenshot 2026-03-28 113804.png>)
## Supported File Types
`.py` `.js` `.ts` `.jsx` `.tsx` `.md` `.txt` `.java` `.go` `.rs` `.cpp` `.c` `.h`
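A sketch of the kind of filter the loader applies (extensions from the list above, size cap from `config.py`; the actual logic in `ingestion/loader.py` may differ):

```python
# Illustrative file filter mirroring ALLOWED_EXTENSIONS and MAX_FILE_SIZE_MB.
from pathlib import Path

ALLOWED = {".py", ".js", ".ts", ".jsx", ".tsx", ".md", ".txt",
           ".java", ".go", ".rs", ".cpp", ".c", ".h"}

def indexable(path: Path, max_mb: float = 2.0) -> bool:
    return path.suffix.lower() in ALLOWED and path.stat().st_size <= max_mb * 1024 ** 2
```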
---
## Chunking Strategy
| File Type | Strategy |
|---------------|-----------------------------------------------------------------|
| `.py` | AST-based: one chunk per top-level function/class |
| All others | Recursive character splitter (400-token chunks, 60-token overlap)|
Python files that fail AST parsing (e.g. syntax errors) fall back to the generic splitter automatically.
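In sketch form (illustrative, not `chunker.py`; the character-based fallback approximates the 400/60-token splitter at roughly 4 characters per token):

```python
# AST chunking with fallback, as described above. Illustrative sketch.
import ast

def generic_split(source: str, size: int = 1600, overlap: int = 240) -> list[str]:
    # Crude character-based stand-in for the 400-token/60-token recursive splitter
    step = size - overlap
    return [source[i:i + size] for i in range(0, max(len(source), 1), step)]

def chunk_python(source: str) -> list[str]:
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return generic_split(source)  # fallback for unparseable files
    chunks = [
        ast.get_source_segment(source, node)  # one chunk per top-level def/class
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    return [c for c in chunks if c] or generic_split(source)
```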
---
## Troubleshooting
**"Vector store is empty" error**
β†’ Index a repository first via Tab 1.
**Slow first query**
β†’ The embedding model is downloaded on first use (~90 MB). Subsequent runs are fast.
**"No API key" warnings**
β†’ Set `OPENAI_API_KEY` in `.env` or as an environment variable.
**ChromaDB dimension mismatch error**
β†’ Delete `data/vector_db/` and re-index. This happens if you switch embedding models mid-session.
```bash
rm -rf data/vector_db/
```
**Out of memory on large repos**
β†’ Lower `MAX_FILE_SIZE_MB` in `config.py` or reduce `CHUNK_SIZE`.