Space: boy177/DevDocs


manan75 committed on Mar 28

Commit f9e2c6d · 1 parent: 7773c49

initial commit

Files changed (20)

.gitignore +23 -0
README.md +232 -9
app.py +643 -0
config.py +75 -0
dev.zip +3 -0
evaluation/__init__.py +1 -0
evaluation/judge.py +126 -0
evaluation/metrics.py +131 -0
ingestion/__init__.py +1 -0
ingestion/chunker.py +133 -0
ingestion/indexer.py +106 -0
ingestion/loader.py +85 -0
llm/__init__.py +1 -0
llm/generator.py +105 -0
requirements.txt +25 -0
retrieval/__init__.py +1 -0
retrieval/query_rewriter.py +103 -0
retrieval/retriever.py +64 -0
utils/__init__.py +1 -0
utils/helpers.py +100 -0

.gitignore ADDED

	@@ -0,0 +1,23 @@

+# Python
+__pycache__/
+*.py[cod]
+*.pyo
+venv/
+.venv/
+*.egg-info/
+# Environment
+.env
+# Data (generated at runtime)
+data/uploads/
+data/vector_db/
+# IDE
+.vscode/
+.idea/
+*.swp
+# OS
+.DS_Store
+Thumbs.db

README.md CHANGED

@@ -1,12 +1,235 @@
 ---
-title: DevDocs
-emoji: ⚡
-colorFrom: green
-colorTo: red
-sdk: gradio
-sdk_version: 6.10.0
-app_file: app.py
-pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# DevDocsAI
+# DevDocs AI — Codebase RAG Assistant
+A production-quality **Retrieval-Augmented Generation** system for querying codebases with natural language. Upload any ZIP archive, index it once, and ask questions about the code.
+![alt text](one.png)
+## Architecture
+```
+User Query
+    │
+    ▼
+[Query Rewriter]  ← optional rule-based or LLM rewrite
+    │
+    ▼
+[Retriever]  ← similarity search OR MMR (configurable)
+    │         ChromaDB + HuggingFace all-MiniLM-L6-v2 embeddings
+    ▼
+[Retrieved Chunks]
+    │
+    ├──→ [LLM Generator]  → Answer  (gpt-4.1-nano, 1 call)
+    │
+    └──→ [Evaluator]
+              ├── Retrieval Metrics (Recall@K, MRR, nDCG) — FREE
+              └── LLM Judge (Accuracy, Completeness, Relevance, Groundedness) — 1 call
+```
+## Cost Model
+| Operation            | Cost             |
+|----------------------|------------------|
+| Embedding (indexing) | **FREE** (local) |
+| Embedding (query)    | **FREE** (local) |
+| Answer generation    | ~$0.0001 / query |
+| LLM judge evaluation | ~$0.0001 / query |
+| Query rewriting (LLM)| ~$0.00005 / query|
+> At $5 budget you can run ~25,000 queries with full evaluation enabled.
+## Project Structure
+```
+devdocs-ai/
+├── app.py                    # Gradio UI (3 tabs)
+├── config.py                 # All configuration in one place
+├── requirements.txt
+├── .env.example
+│
+├── ingestion/
+│   ├── __init__.py
+│   ├── loader.py             # ZIP extraction + file reading
+│   ├── chunker.py            # AST-aware Python chunking + generic splitter
+│   └── indexer.py            # HuggingFace embeddings + ChromaDB persistence
+│
+├── retrieval/
+│   ├── __init__.py
+│   ├── retriever.py          # Similarity + MMR search
+│   └── query_rewriter.py     # Rule-based + optional LLM rewrite
+│
+├── llm/
+│   ├── __init__.py
+│   └── generator.py          # Grounded answer generation via litellm
+│
+├── evaluation/
+│   ├── __init__.py
+│   ├── metrics.py            # Recall@K, MRR, nDCG (free, keyword-based)
+│   └── judge.py              # LLM-as-judge (Accuracy/Completeness/Relevance/Groundedness)
+│
+├── utils/
+│   ├── __init__.py
+│   └── helpers.py            # Logging, display formatters
+│
+└── data/
+    ├── uploads/              # Extracted ZIP contents (auto-created)
+    └── vector_db/            # ChromaDB persistent storage (auto-created)
+```
+## Quick Start
+### 1. Clone / download the project
+```bash
+cd devdocs-ai
+```
+### 2. Create virtual environment
+```bash
+python -m venv venv
+source venv/bin/activate        # Linux/macOS
+# venv\Scripts\activate         # Windows
+```
+### 3. Install dependencies
+```bash
+pip install -r requirements.txt
+```
+> First run will download the `all-MiniLM-L6-v2` model (~90 MB) automatically.
+### 4. Set your OpenAI API key
+```bash
+cp .env.example .env
+# Edit .env and set OPENAI_API_KEY=sk-...
+```
+Or export directly:
+```bash
+export OPENAI_API_KEY="sk-your-key-here"
+```
+### 5. Launch the app
+```bash
+python app.py
+```
+Open **http://localhost:7860** in your browser.
+---
+## Usage Guide
+### Tab 1 — Index Repository
+![alt text](two.png)
+1. Click **Upload ZIP file** and select your repository archive.
+2. Click **🚀 Index Repository**.
+3. Wait for the status message — indexing is one-time per repository.
+> Re-indexing a new ZIP clears the previous index automatically.
+### Tab 2 — Ask Questions
+1. Type a natural language question.
+2. Configure retrieval options:
+   - **Top-K**: number of chunks to retrieve (default 5)
+   - **Use MMR**: diversity-aware retrieval (avoids redundant chunks)
+   - **Use query rewriting**: expands abbreviations before retrieval
+   - **Run evaluation**: computes all metrics (costs 1 extra LLM call)
+3. Click **🔍 Ask**.
+4. View the **Answer**, **Retrieved Chunks**, and **Metrics Panel**.
+ ![alt text](<three.png>)
+### Tab 3 — Compare Modes
+Run both **Similarity** and **MMR** retrieval side-by-side for the same question to compare answer quality and chunk diversity.
+ ![alt text](<four.png>)
+---
+## Configuration Reference
+All parameters are in `config.py`:
+| Parameter              | Default               | Description                              |
+|------------------------|-----------------------|------------------------------------------|
+| `EMBEDDING_MODEL`      | `all-MiniLM-L6-v2`   | HuggingFace sentence-transformer model   |
+| `CHUNK_SIZE`           | `400` tokens          | Target chunk size                        |
+| `CHUNK_OVERLAP`        | `60` tokens           | Overlap between consecutive chunks      |
+| `DEFAULT_TOP_K`        | `5`                   | Chunks retrieved per query               |
+| `MMR_FETCH_K`          | `20`                  | Candidate pool size for MMR              |
+| `MMR_LAMBDA_MULT`      | `0.5`                 | MMR diversity/relevance balance (0–1)    |
+| `LLM_MODEL`            | `openai/gpt-4.1-nano` | LLM for answer generation                |
+| `LLM_MAX_TOKENS`       | `1024`                | Max tokens in LLM response               |
+| `ALLOWED_EXTENSIONS`   | `.py .js .ts .md ...` | File types included in indexing          |
+| `MAX_FILE_SIZE_MB`     | `2`                   | Files larger than this are skipped       |
+---
+## Evaluation Metrics Explained
+### Retrieval Metrics (free, keyword-based proxy)
+| Metric     | Formula                                          | Range |
+|------------|--------------------------------------------------|-------|
+| Recall@K   | relevant retrieved / K                           | 0–1   |
+| MRR        | 1 / rank of first relevant doc                   | 0–1   |
+| nDCG@K     | DCG / IDCG using binary relevance                | 0–1   |
+> Relevance is determined by keyword overlap between query and chunk (≥2 shared tokens).
+### Answer Quality (LLM judge, 1 call)
+| Dimension     | Meaning                                           | Scale |
+|---------------|---------------------------------------------------|-------|
+| Accuracy      | Every claim is factually correct given context    | 1–5   |
+| Completeness  | All parts of the question are addressed           | 1–5   |
+| Relevance     | Answer is focused and on-topic                    | 1–5   |
+| Groundedness  | All claims are directly supported by context      | 1–5   |
+| Overall       | Mean of the four scores                           | 1–5   |
 ---
+![alt text](<Screenshot 2026-03-28 113804.png>)
+## Supported File Types
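+`.py` `.js` `.ts` `.jsx` `.tsx` `.mjs` `.cjs` `.php` `.php3` `.php4` `.php5` `.phtml` `.java` `.kt` `.kts` `.c` `.cpp` `.h` `.hpp` `.cc` `.go` `.rs` `.rb` `.rake` `.cs` `.sh` `.bash` `.zsh` `.md` `.txt` `.yaml` `.yml` `.toml` `.json` `.html` `.css` `.scss` `.sql`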
 ---
+## Chunking Strategy
+| File Type     | Strategy                                                        |
+|---------------|-----------------------------------------------------------------|
+| `.py`         | AST-based: one chunk per top-level function/class               |
+| All others    | Recursive character splitter (400-token chunks, 60-token overlap)|
+Python files that fail AST parsing (e.g. syntax errors) fall back to the generic splitter automatically.
+---
+## Troubleshooting
+**"Vector store is empty" error**
+→ Index a repository first via Tab 1.
+**Slow first query**
+→ The embedding model is downloaded on first use (~90 MB). Subsequent runs are fast.
+**"No API key" warnings**
+→ Set `OPENAI_API_KEY` in `.env` or as an environment variable.
+**ChromaDB dimension mismatch error**
+→ Delete `data/vector_db/` and re-index. This happens if you switch embedding models mid-session.
+```bash
+rm -rf data/vector_db/
+```
+**Out of memory on large repos**
+→ Lower `MAX_FILE_SIZE_MB` in `config.py` or reduce `CHUNK_SIZE`.
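
Note on the README's retrieval metrics: `evaluation/metrics.py` is not shown in this part of the diff, so the following is only a minimal sketch of how the three keyword-based proxies described in the README table could be computed. The function names, the regex tokenizer, and the choice to compute IDCG over the retrieved list itself are assumptions for illustration, not the committed implementation. The "Recall@K" formula follows the README (relevant retrieved / K), which is the quantity more commonly called precision@K.

```python
import math
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase word tokens (hypothetical tokenizer, not the project's)."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def relevance_flags(query: str, chunks: List[str], min_shared: int = 2) -> List[int]:
    """Flag a chunk as relevant (1) when it shares >= min_shared tokens with the query."""
    query_tokens = tokenize(query)
    return [1 if len(query_tokens & tokenize(c)) >= min_shared else 0 for c in chunks]


def recall_at_k(flags: List[int]) -> float:
    """README formula: relevant retrieved / K, where K = len(flags)."""
    return sum(flags) / len(flags) if flags else 0.0


def mrr(flags: List[int]) -> float:
    """Reciprocal rank of the first relevant chunk, or 0.0 if none is relevant."""
    for rank, flag in enumerate(flags, start=1):
        if flag:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(flags: List[int]) -> float:
    """DCG / IDCG with binary relevance; the ideal ranking reorders the same flags."""
    def dcg(values: List[int]) -> float:
        return sum(v / math.log2(i + 1) for i, v in enumerate(values, start=1))

    ideal = dcg(sorted(flags, reverse=True))
    return dcg(flags) / ideal if ideal > 0 else 0.0


example_flags = relevance_flags(
    "how does user login work",
    ["def login(user): pass", "unrelated text about colors", "user login flow handled here"],
)
```

Because relevance here is inferred from token overlap rather than labelled ground truth, scores from a sketch like this are best read as a rough sanity signal, not as a benchmark.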

app.py ADDED

	@@ -0,0 +1,643 @@

+"""app.py — Gradio UI for DevDocs AI: Codebase RAG Assistant.
+A polished, product-like interface with a softer visual language,
+modern typography, improved spacing, and clearer output cards.
+Dashboard tabs:
+  1. Index Repository — upload ZIP, trigger ingestion pipeline.
+  2. Ask Questions    — query the indexed codebase with configurable retrieval.
+  3. Compare Modes    — side-by-side similarity vs MMR retrieval.
+"""
+import logging
+import shutil
+from pathlib import Path
+from typing import Tuple
+import gradio as gr
+from config import UPLOAD_DIR, DEFAULT_TOP_K
+from ingestion.loader import extract_zip, load_files
+from ingestion.chunker import chunk_documents
+from ingestion.indexer import index_documents, is_index_populated
+from retrieval.retriever import retrieve
+from retrieval.query_rewriter import rewrite_query
+from llm.generator import generate_answer
+from evaluation.metrics import compute_retrieval_metrics
+from evaluation.judge import judge_answer
+from utils.helpers import setup_logging, format_chunks_for_display, format_metrics_for_display
+setup_logging(logging.INFO)
+logger = logging.getLogger(__name__)
+# ──────────────────────────────────────────────────────────────────────────────
+# Pipeline functions
+# ──────────────────────────────────────────────────────────────────────────────
+def run_indexing(zip_file) -> str:
+    """Gradio handler: extract ZIP → load files → chunk → embed → index."""
+    if zip_file is None:
+        return "❌ Please upload a ZIP file first."
+    try:
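+        # gr.File(type="filepath") passes a path string; fall back to .name for file-like objects.
+        src = Path(getattr(zip_file, "name", zip_file))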
+        dest = UPLOAD_DIR / src.name
+        shutil.copy2(src, dest)
+        gr.Info("📦 Extracting ZIP archive...")
+        extract_dir = extract_zip(str(dest))
+        gr.Info("📂 Loading source files...")
+        raw_docs = load_files(extract_dir)
+        if not raw_docs:
+            return "⚠️ No supported source files found in the ZIP."
+        gr.Info(f"✂️ Chunking {len(raw_docs)} files...")
+        chunks = chunk_documents(raw_docs)
+        gr.Info(f"🧠 Embedding and indexing {len(chunks)} chunks...")
+        index_documents(chunks)
+        return (
+            f"✅ Indexing complete!\n\n"
+            f"Files processed: {len(raw_docs)}\n"
+            f"Chunks indexed: {len(chunks)}\n"
+            f"Status: Ready to query"
+        )
+    except Exception as e:
+        logger.exception("Indexing failed")
+        return f"❌ Indexing failed: {e}"
+def run_query(
+    query: str,
+    use_mmr: bool,
+    use_rewriting: bool,
+    top_k: int,
+    run_evaluation: bool,
+) -> Tuple[str, str, str]:
+    """Gradio handler: rewrite query → retrieve → generate answer → evaluate."""
+    if not query.strip():
+        return "❌ Please enter a question.", "", ""
+    if not is_index_populated():
+        return "❌ No index found. Please index a repository first.", "", ""
+    try:
+        effective_query = query
+        if use_rewriting:
+            gr.Info("🔄 Rewriting query...")
+            effective_query = rewrite_query(query, use_llm=False)
+        search_type = "mmr" if use_mmr else "similarity"
+        gr.Info(f"🔍 Retrieving with {search_type.upper()}...")
+        docs, scores = retrieve(effective_query, search_type=search_type, top_k=int(top_k))
+        context_display = format_chunks_for_display(docs, scores)
+        if effective_query != query:
+            context_display = f"🔄 Rewritten query: \"{effective_query}\"\n\n" + context_display
+        gr.Info("💬 Generating answer...")
+        answer, _source_files = generate_answer(query, docs)
+        metrics_display = ""
+        if run_evaluation:
+            gr.Info("📊 Running evaluation...")
+            retrieval_metrics = compute_retrieval_metrics(query, docs)
+            answer_scores = judge_answer(query, docs, answer)
+            metrics_display = format_metrics_for_display(retrieval_metrics, answer_scores)
+        else:
+            metrics_display = "ℹ️ Enable 'Run evaluation' to see metrics."
+        return answer, context_display, metrics_display
+    except Exception as e:
+        logger.exception("Query failed")
+        return f"❌ Error: {e}", "", ""
+def run_comparison(query: str, top_k: int) -> Tuple[str, str, str, str]:
+    """Gradio handler: run both similarity and MMR side-by-side."""
+    if not query.strip():
+        return "❌ Please enter a question.", "", "", ""
+    if not is_index_populated():
+        msg = "❌ No index found."
+        return msg, "", msg, ""
+    try:
+        k = int(top_k)
+        sim_docs, sim_scores = retrieve(query, search_type="similarity", top_k=k)
+        mmr_docs, mmr_scores = retrieve(query, search_type="mmr", top_k=k)
+        sim_answer, _ = generate_answer(query, sim_docs)
+        mmr_answer, _ = generate_answer(query, mmr_docs)
+        sim_context = format_chunks_for_display(sim_docs, sim_scores)
+        mmr_context = format_chunks_for_display(mmr_docs, mmr_scores)
+        return sim_answer, sim_context, mmr_answer, mmr_context
+    except Exception as e:
+        logger.exception("Comparison failed")
+        err = f"❌ Error: {e}"
+        return err, "", err, ""
+# ──────────────────────────────────────────────────────────────────────────────
+# Theme + Styling
+# ──────────────────────────────────────────────────────────────────────────────
+THEME = gr.themes.Soft(
+    primary_hue="indigo",
+    secondary_hue="cyan",
+    neutral_hue="slate",
+    font=gr.themes.GoogleFont("Inter"),
+)
+CSS = """
+:root {
+  --bg-0: #0b1020;
+  --bg-1: #11162a;
+  --bg-2: #151b31;
+  --card: rgba(17, 24, 39, 0.72);
+  --card-strong: rgba(15, 23, 42, 0.92);
+  --card-border: rgba(148, 163, 184, 0.14);
+  --text-main: #e5e7eb;
+  --text-soft: #94a3b8;
+  --accent: #8b5cf6;
+  --accent-2: #22c55e;
+  --accent-3: #38bdf8;
+  --danger: #f87171;
+  --shadow: 0 20px 60px rgba(0, 0, 0, 0.25);
+}
+html, body {
+  background:
+    radial-gradient(circle at top left, rgba(139,92,246,0.18), transparent 28%),
+    radial-gradient(circle at top right, rgba(56,189,248,0.14), transparent 22%),
+    linear-gradient(180deg, var(--bg-0), var(--bg-1) 45%, #0a0f1d 100%) !important;
+  color: var(--text-main) !important;
+}
+.gradio-container {
+  max-width: 1240px !important;
+  margin: 0 auto !important;
+}
+/* Main shell */
+#app-shell {
+  border: 1px solid var(--card-border);
+  background: linear-gradient(180deg, rgba(17,24,39,0.84), rgba(15,23,42,0.74));
+  box-shadow: var(--shadow);
+  border-radius: 28px;
+  padding: 22px;
+  backdrop-filter: blur(18px);
+}
+/* Hero */
+.hero-wrap {
+  display: grid;
+  grid-template-columns: 1.4fr 0.8fr;
+  gap: 18px;
+  align-items: stretch;
+  margin-bottom: 18px;
+}
+.hero-card, .mini-card, .section-card {
+  background: var(--card);
+  border: 1px solid var(--card-border);
+  border-radius: 24px;
+  box-shadow: 0 12px 30px rgba(0, 0, 0, 0.16);
+  backdrop-filter: blur(14px);
+}
+.hero-card {
+  padding: 24px 24px 22px;
+}
+.hero-kicker {
+  display: inline-flex;
+  align-items: center;
+  gap: 8px;
+  padding: 8px 12px;
+  border-radius: 999px;
+  background: rgba(139,92,246,0.14);
+  color: #d8b4fe;
+  font-size: 0.82rem;
+  font-weight: 600;
+  letter-spacing: 0.02em;
+  margin-bottom: 14px;
+}
+.hero-title {
+  margin: 0;
+  font-size: clamp(2rem, 3vw, 3.1rem);
+  line-height: 1.05;
+  letter-spacing: -0.03em;
+  color: #f8fafc;
+}
+.hero-subtitle {
+  margin-top: 12px;
+  color: var(--text-soft);
+  font-size: 1rem;
+  line-height: 1.65;
+  max-width: 68ch;
+}
+.hero-badges {
+  display: flex;
+  flex-wrap: wrap;
+  gap: 10px;
+  margin-top: 18px;
+}
+.badge-pill {
+  display: inline-flex;
+  align-items: center;
+  gap: 8px;
+  padding: 9px 12px;
+  border-radius: 999px;
+  font-size: 0.86rem;
+  color: #e2e8f0;
+  background: rgba(15,23,42,0.55);
+  border: 1px solid rgba(148,163,184,0.16);
+}
+.mini-card {
+  padding: 18px;
+  display: flex;
+  flex-direction: column;
+  justify-content: space-between;
+}
+.mini-card h4 {
+  margin: 0 0 8px;
+  color: #f8fafc;
+  font-size: 1rem;
+}
+.mini-card p {
+  margin: 0;
+  color: var(--text-soft);
+  line-height: 1.6;
+  font-size: 0.95rem;
+}
+.mini-grid {
+  display: grid;
+  grid-template-columns: 1fr 1fr;
+  gap: 10px;
+  margin-top: 14px;
+}
+.stat {
+  border-radius: 18px;
+  padding: 14px;
+  background: rgba(15,23,42,0.72);
+  border: 1px solid rgba(148,163,184,0.12);
+}
+.stat .label {
+  color: var(--text-soft);
+  font-size: 0.78rem;
+  margin-bottom: 6px;
+}
+.stat .value {
+  color: #f8fafc;
+  font-size: 1rem;
+  font-weight: 700;
+}
+/* Tabs */
+.tab-nav {
+  margin-top: 8px !important;
+}
+.gradio-tabs .tab-nav button {
+  border-radius: 999px !important;
+  border: 1px solid rgba(148,163,184,0.14) !important;
+  background: rgba(15,23,42,0.55) !important;
+  color: #cbd5e1 !important;
+  padding: 10px 14px !important;
+  transition: all 0.2s ease !important;
+}
+.gradio-tabs .tab-nav button.selected {
+  background: linear-gradient(135deg, rgba(139,92,246,0.95), rgba(59,130,246,0.85)) !important;
+  color: white !important;
+  box-shadow: 0 12px 24px rgba(91, 33, 182, 0.25) !important;
+}
+/* Sections and widgets */
+.section-card {
+  padding: 18px;
+  margin-bottom: 14px;
+}
+.section-title {
+  margin: 0 0 6px;
+  font-size: 1.05rem;
+  color: #f8fafc;
+  letter-spacing: -0.01em;
+}
+.section-desc {
+  margin: 0;
+  color: var(--text-soft);
+  font-size: 0.95rem;
+  line-height: 1.6;
+}
+textarea, input, .wrap, .prose, .markdown, .svelte-textbox, .svelte-slider, .svelte-checkbox {
+  font-family: Inter, ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif !important;
+}
+textarea, .gr-textbox textarea, .gr-textbox input, .gr-file, .gr-number input {
+  background: rgba(15,23,42,0.72) !important;
+  color: var(--text-main) !important;
+  border: 1px solid rgba(148,163,184,0.14) !important;
+  border-radius: 18px !important;
+}
+.gr-textbox label, .gr-slider label, .gr-checkbox label, .gr-file label {
+  color: #e2e8f0 !important;
+  font-weight: 600 !important;
+}
+.gr-button {
+  border-radius: 16px !important;
+  border: 1px solid rgba(255,255,255,0.08) !important;
+  padding: 12px 16px !important;
+  font-weight: 700 !important;
+  letter-spacing: 0.01em;
+}
+.gr-button.primary {
+  background: linear-gradient(135deg, #8b5cf6, #3b82f6) !important;
+  color: white !important;
+  box-shadow: 0 16px 30px rgba(59,130,246,0.22) !important;
+}
+.gr-button:hover {
+  transform: translateY(-1px);
+}
+/* Outputs */
+.answer-box, .metric-box, .chunk-box, .output-card {
+  border-radius: 22px !important;
+  border: 1px solid rgba(148,163,184,0.14) !important;
+  background: rgba(2, 6, 23, 0.48) !important;
+  box-shadow: 0 12px 30px rgba(0,0,0,0.14);
+}
+.answer-box {
+  padding: 16px !important;
+  line-height: 1.75 !important;
+}
+.answer-box h1, .answer-box h2, .answer-box h3, .answer-box h4 {
+  color: #f8fafc !important;
+  letter-spacing: -0.02em;
+}
+.answer-box p, .answer-box li {
+  color: #e2e8f0 !important;
+}
+.answer-box code, .chunk-box code, .metric-box code {
+  background: rgba(15,23,42,0.9) !important;
+  color: #e2e8f0 !important;
+  border-radius: 8px !important;
+  padding: 0.12rem 0.35rem !important;
+}
+.chunk-box, .metric-box {
+  padding: 14px !important;
+  white-space: pre-wrap !important;
+  color: #cbd5e1 !important;
+  line-height: 1.7 !important;
+}
+/* Make the built-in markdown areas feel cleaner */
+.prose, .markdown {
+  color: #e2e8f0 !important;
+}
+.prose h1, .prose h2, .prose h3, .markdown h1, .markdown h2, .markdown h3 {
+  color: #f8fafc !important;
+}
+footer { display: none !important; }
+/* Responsive */
+@media (max-width: 1000px) {
+  .hero-wrap { grid-template-columns: 1fr; }
+}
+"""
+# ──────────────────────────────────────────────────────────────────────────────
+# UI helpers
+# ──────────────────────────────────────────────────────────────────────────────
+def hero_panel() -> str:
+    return """
+<div class="hero-wrap">
+  <div class="hero-card">
+    <div class="hero-kicker">✨ DevDocs AI · Codebase RAG Assistant</div>
+    <h1 class="hero-title">A calm, premium workspace for exploring your codebase.</h1>
+    <p class="hero-subtitle">
+      Upload a repository ZIP, index it once, and ask natural-language questions with a cleaner
+      reading experience. The interface keeps the workflow fast while feeling intentionally designed,
+      not template-generated.
+    </p>
+    <div class="hero-badges">
+      <span class="badge-pill">⚡ Fast indexing flow</span>
+      <span class="badge-pill">🧠 Query rewriting</span>
+      <span class="badge-pill">🔎 Similarity + MMR</span>
+      <span class="badge-pill">📊 Built-in evaluation</span>
+    </div>
+  </div>
+  <div class="mini-card">
+    <div>
+      <h4>What this interface emphasizes</h4>
+      <p>
+        Clear hierarchy, softer contrast, rounded surfaces, better spacing, and output cards that are easier to scan.
+      </p>
+    </div>
+    <div class="mini-grid">
+      <div class="stat">
+        <div class="label">Primary feel</div>
+        <div class="value">Modern glass UI</div>
+      </div>
+      <div class="stat">
+        <div class="label">Typography</div>
+        <div class="value">Inter</div>
+      </div>
+      <div class="stat">
+        <div class="label">Tone</div>
+        <div class="value">Soft + premium</div>
+      </div>
+      <div class="stat">
+        <div class="label">Outputs</div>
+        <div class="value">Readable cards</div>
+      </div>
+    </div>
+  </div>
+</div>
+"""
+def section_block(title: str, desc: str) -> str:
+    return f"""
+<div class="section-card">
+  <div class="section-title">{title}</div>
+  <p class="section-desc">{desc}</p>
+</div>
+"""
+# ──────────────────────────────────────────────────────────────────────────────
+# Build UI
+# ──────────────────────────────────────────────────────────────────────────────
+def build_ui() -> gr.Blocks:
+    with gr.Blocks(theme=THEME, css=CSS, title="DevDocs AI") as demo:
+        with gr.Column(elem_id="app-shell"):
+            gr.HTML(hero_panel())
+            with gr.Tabs(elem_classes=["tab-nav"]):
+                # ── Tab 1: Index ──────────────────────────────────────────────
+                with gr.Tab("📦 Index Repository"):
+                    gr.HTML(section_block(
+                        "Step 1 — Add your codebase",
+                        "Upload a ZIP file, extract it, chunk the files, and build the local vector index."
+                    ))
+                    with gr.Row():
+                        with gr.Column(scale=2):
+                            zip_input = gr.File(
+                                label="Upload ZIP file",
+                                file_types=[".zip"],
+                                type="filepath",
+                            )
+                            index_btn = gr.Button("🚀 Index Repository", variant="primary", size="lg")
+                        with gr.Column(scale=3):
+                            index_status = gr.Textbox(
+                                label="Indexing Status",
+                                lines=9,
+                                interactive=False,
+                                placeholder="Status will appear here after indexing...",
+                            )
+                    index_btn.click(
+                        fn=run_indexing,
+                        inputs=[zip_input],
+                        outputs=[index_status],
+                    )
+                # ── Tab 2: Query ──────────────────────────────────────────────
+                with gr.Tab("💬 Ask Questions"):
+                    gr.HTML(section_block(
+                        "Step 2 — Ask about the code",
+                        "Use retrieval settings to control how the assistant searches the indexed repository."
+                    ))
+                    with gr.Row():
+                        with gr.Column(scale=3):
+                            query_input = gr.Textbox(
+                                label="Your Question",
+                                placeholder="e.g. How does the authentication flow work?",
+                                lines=2,
+                            )
+                        with gr.Column(scale=1):
+                            top_k_slider = gr.Slider(
+                                minimum=1,
+                                maximum=15,
+                                value=DEFAULT_TOP_K,
+                                step=1,
+                                label="Top-K chunks",
+                            )
+                    with gr.Row():
+                        use_mmr_toggle = gr.Checkbox(label="Use MMR retrieval", value=False)
+                        use_rewrite_toggle = gr.Checkbox(label="Use query rewriting", value=False)
+                        run_eval_toggle = gr.Checkbox(label="Run evaluation (costs 1 LLM call)", value=True)
+                        query_btn = gr.Button("🔍 Ask", variant="primary")
+                    with gr.Row():
+                        with gr.Column(scale=2):
+                            gr.HTML('<div class="section-title">Answer</div>')
+                            answer_output = gr.Markdown(elem_classes=["answer-box"])
+                        with gr.Column(scale=1):
+                            metrics_output = gr.Textbox(
+                                label="📊 Evaluation Metrics",
+                                lines=18,
+                                interactive=False,
+                                elem_classes=["metric-box"],
+                            )
+                    gr.HTML('<div class="section-title">Retrieved Context</div>')
+                    context_output = gr.Textbox(
+                        label="",
+                        lines=15,
+                        interactive=False,
+                        elem_classes=["chunk-box"],
+                    )
+                    query_btn.click(
+                        fn=run_query,
+                        inputs=[query_input, use_mmr_toggle, use_rewrite_toggle, top_k_slider, run_eval_toggle],
+                        outputs=[answer_output, context_output, metrics_output],
+                    )
+                # ── Tab 3: Compare ────────────────────────────────────────────
+                with gr.Tab("⚖️ Compare: Similarity vs MMR"):
+                    gr.HTML(section_block(
+                        "Step 3 — Compare retrieval styles",
+                        "Run similarity and MMR side-by-side to inspect how the context and answer change."
+                    ))
+                    with gr.Row():
+                        cmp_query = gr.Textbox(
+                            label="Question",
+                            placeholder="e.g. Where is database initialisation handled?",
+                            lines=2,
+                            scale=4,
+                        )
+                        cmp_top_k = gr.Slider(
+                            minimum=1,
+                            maximum=10,
+                            value=4,
+                            step=1,
+                            label="Top-K",
+                            scale=1,
+                        )
+                    cmp_btn = gr.Button("⚖️ Compare", variant="primary")
+                    with gr.Row():
+                        with gr.Column():
+                            gr.HTML('<div class="section-title">Similarity Search</div>')
+                            sim_answer_out = gr.Markdown(elem_classes=["answer-box"])
+                            sim_context_out = gr.Textbox(
+                                lines=10,
+                                interactive=False,
+                                label="Chunks",
+                                elem_classes=["chunk-box"],
+                            )
+                        with gr.Column():
+                            gr.HTML('<div class="section-title">MMR Search</div>')
+                            mmr_answer_out = gr.Markdown(elem_classes=["answer-box"])
+                            mmr_context_out = gr.Textbox(
+                                lines=10,
+                                interactive=False,
+                                label="Chunks",
+                                elem_classes=["chunk-box"],
+                            )
+                    cmp_btn.click(
+                        fn=run_comparison,
+                        inputs=[cmp_query, cmp_top_k],
+                        outputs=[sim_answer_out, sim_context_out, mmr_answer_out, mmr_context_out],
+                    )
+            gr.Markdown(
+                """
+                <div style="margin-top: 18px; padding: 14px 6px 0; color: #94a3b8; font-size: 0.9rem; line-height: 1.7;">
+                    <strong style="color:#e2e8f0;">DevDocs AI</strong> · Embeddings: <code>all-MiniLM-L6-v2</code> ·
+                    LLM: <code>gpt-4.1-nano</code> · Vector DB: <code>ChromaDB</code>
+                </div>
+                """
+            )
+        return demo
+# if __name__ == "__main__":
+#     ui = build_ui()
+#     ui.launch(
+#         server_name="127.0.0.1",
+#         server_port=7860,
+#         share=False,
+#         show_error=True,
+#     )
+if __name__ == "__main__":
+    ui = build_ui()
+    ui.launch()
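
Note on `run_comparison`: the handler contrasts plain similarity search with MMR, but `retrieval/retriever.py` is not shown in this part of the diff. As a rough illustration of why the two modes can return different chunks, here is a standalone sketch of the standard greedy MMR selection rule. The function names and toy vectors are hypothetical; `lambda_mult` plays the role of `MMR_LAMBDA_MULT`, and the candidate list corresponds to the pool sized by `MMR_FETCH_K`.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query_vec: Sequence[float],
    candidate_vecs: List[Sequence[float]],
    k: int,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedy MMR: repeatedly pick the candidate that best trades off
    relevance to the query against similarity to already selected items."""
    selected: List[int] = []
    remaining = list(range(len(candidate_vecs)))

    while remaining and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query_vec, candidate_vecs[i])
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data: candidates 0 and 1 are near-duplicates, candidate 2 is distinct.
query = [1.0, 1.0]
candidates = [[1.0, 0.9], [1.0, 0.8], [0.2, 1.0]]
diverse = mmr_select(query, candidates, k=2, lambda_mult=0.5)
pure_relevance = mmr_select(query, candidates, k=2, lambda_mult=1.0)
```

With `lambda_mult=1.0` the rule reduces to ranking by relevance alone, so the near-duplicate candidate is kept; at `0.5` the redundancy penalty pushes the more distinct candidate into the result instead.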

config.py ADDED

	@@ -0,0 +1,75 @@

+"""
+config.py — Centralised configuration for DevDocs AI.
+All tuneable parameters live here so the rest of the codebase imports from one place.
+"""
+import os
+from pathlib import Path
+from dotenv import load_dotenv
+# ─── Paths ────────────────────────────────────────────────────────────────────
+BASE_DIR = Path(__file__).parent
+DATA_DIR = BASE_DIR / "data"
+VECTOR_DB_DIR = DATA_DIR / "vector_db"
+UPLOAD_DIR = DATA_DIR / "uploads"
+DATA_DIR.mkdir(parents=True, exist_ok=True)
+VECTOR_DB_DIR.mkdir(parents=True, exist_ok=True)
+UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
+# ─── Ingestion ────────────────────────────────────────────────────────────────
+ALLOWED_EXTENSIONS = {
+    # Python
+    ".py",
+    # JavaScript / TypeScript
+    ".js", ".ts", ".jsx", ".tsx", ".mjs", ".cjs",
+    # PHP
+    ".php", ".php3", ".php4", ".php5", ".phtml",
+    # Java / Kotlin
+    ".java", ".kt", ".kts",
+    # C / C++
+    ".c", ".cpp", ".h", ".hpp", ".cc",
+    # Systems
+    ".go", ".rs",
+    # Ruby
+    ".rb", ".rake",
+    # C# / .NET
+    ".cs",
+    # Shell
+    ".sh", ".bash", ".zsh",
+    # Docs / Config
+    ".md", ".txt", ".yaml", ".yml", ".toml", ".json",
+    # HTML / CSS (if you want frontend code)
+    ".html", ".css", ".scss",
+    # SQL
+    ".sql",
+}
+MAX_FILE_SIZE_MB = 2  # skip files larger than this
+# ─── Chunking ─────────────────────────────────────────────────────────────────
+CHUNK_SIZE = 400          # tokens (approx characters / 4)
+CHUNK_OVERLAP = 60        # token overlap between chunks
+CHUNK_SIZE_CHARS = CHUNK_SIZE * 4      # character approximation
+CHUNK_OVERLAP_CHARS = CHUNK_OVERLAP * 4
+# ─── Embeddings ───────────────────────────────────────────────────────────────
+EMBEDDING_MODEL = "all-MiniLM-L6-v2"
+EMBEDDING_DEVICE = "cpu"
+# ─── Chroma ───────────────────────────────────────────────────────────────────
+CHROMA_COLLECTION_NAME = "devdocs"
+# ─── Retrieval ────────────────────────────────────────────────────────────────
+DEFAULT_TOP_K = 5
+DEFAULT_SEARCH_TYPE = "similarity"   # "similarity" | "mmr"
+MMR_FETCH_K = 20                     # candidate pool for MMR
+MMR_LAMBDA_MULT = 0.5                # diversity vs relevance balance
+# ─── LLM ──────────────────────────────────────────────────────────────────────
+LLM_MODEL = "openai/gpt-4.1-nano"   # via litellm
+LLM_MAX_TOKENS = 1024
+LLM_TEMPERATURE = 0.1
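+load_dotenv(dotenv_path=BASE_DIR / ".env")  # returns a bool, so it must not be assigned as the key
+OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")  # empty string when no key is configured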
+# ─── Evaluation ───────────────────────────────────────────────────────────────
+JUDGE_MODEL = "openai/gpt-4.1-nano"
+EVAL_TOP_K = 5
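
The modules that import `OPENAI_API_KEY` guard on it with `if not OPENAI_API_KEY`, so they expect a string that is empty when no key is configured. A minimal sketch of reading it that way, assuming the key is exposed through the `OPENAI_API_KEY` environment variable (the name used in the warning in `llm/generator.py`); `read_api_key` is a hypothetical helper, not part of this commit:

```python
import os
from typing import Mapping


def read_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenAI key from the environment, or "" if it is not set."""
    # Hypothetical helper: strips stray whitespace so a blank value counts as missing.
    return env.get("OPENAI_API_KEY", "").strip()


# A missing or blank key is falsy, so `if not key:` guards behave as intended.
print(bool(read_api_key({})))                            # False
print(read_api_key({"OPENAI_API_KEY": " sk-example "}))  # sk-example
```

Passing the mapping in as a parameter keeps the lookup testable without touching the real process environment.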

dev.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c43c3856f7b4b9f495e20dab4ab8ecb6d0aba1195fece864b364ecf8befa0b1c
+size 1629940

evaluation/__init__.py ADDED

	@@ -0,0 +1 @@


+# metrics package

evaluation/judge.py ADDED

	@@ -0,0 +1,126 @@

+"""
+judge.py — LLM-as-a-judge answer quality evaluation.
+Scores the generated answer on four dimensions (each 1–5):
+  - Accuracy     : Is the answer factually correct given the context?
+  - Completeness : Does it fully address the question?
+  - Relevance    : Is the answer focused and on-topic?
+  - Groundedness : Is every claim supported by the retrieved context?
+Uses a single structured LLM call returning JSON to minimise cost.
+"""
+import json
+import logging
+from typing import List
+import litellm
+from pydantic import BaseModel, Field
+from config import JUDGE_MODEL, OPENAI_API_KEY
+logger = logging.getLogger(__name__)
+# ─── Pydantic output model ────────────────────────────────────────────────────
+class AnswerQualityScores(BaseModel):
+    """Structured LLM-judge evaluation scores."""
+    # 0 is reserved for "not evaluated" (no API key, or the judge call failed); real scores are 1–5.
+    accuracy: int = Field(..., ge=0, le=5, description="Factual accuracy (1–5; 0 = not evaluated)")
+    completeness: int = Field(..., ge=0, le=5, description="How fully the question is answered (1–5; 0 = not evaluated)")
+    relevance: int = Field(..., ge=0, le=5, description="Relevance to the question (1–5; 0 = not evaluated)")
+    groundedness: int = Field(..., ge=0, le=5, description="Claims backed by retrieved context (1–5; 0 = not evaluated)")
+    overall: float = Field(..., description="Mean of the four scores")
+    reasoning: str = Field(..., description="One-sentence justification from the judge")
+_JUDGE_SYSTEM = """You are a strict, impartial evaluator of AI-generated answers about codebases.
+Given:
+- A user question
+- Retrieved code context
+- A generated answer
+Score the answer on FOUR criteria, each from 1 to 5:
+  accuracy     : Is every claim factually correct based on the context?
+  completeness : Does the answer fully address all parts of the question?
+  relevance    : Is the answer focused on the question without padding?
+  groundedness : Are all claims directly supported by the retrieved context?
+Respond ONLY with valid JSON matching exactly this schema (no extra keys):
+{
+  "accuracy": <int 1-5>,
+  "completeness": <int 1-5>,
+  "relevance": <int 1-5>,
+  "groundedness": <int 1-5>,
+  "reasoning": "<one sentence justification>"
+}"""
+def judge_answer(
+    query: str,
+    context_docs: List,
+    answer: str,
+) -> AnswerQualityScores:
+    """
+    Evaluate an LLM-generated answer using an LLM judge.
+    This consumes 1 LLM call. Results are returned as a Pydantic model.
+    Args:
+        query: The user's original question.
+        context_docs: LangChain Documents used as context.
+        answer: The generated answer to evaluate.
+    Returns:
+        AnswerQualityScores with per-dimension scores and overall mean.
+    """
+    if not OPENAI_API_KEY:
+        # Return neutral scores when no API key is configured.
+        return AnswerQualityScores(
+            accuracy=0, completeness=0, relevance=0, groundedness=0,
+            overall=0.0, reasoning="No API key — evaluation skipped."
+        )
+    context_text = "\n\n".join(
+        f"[{i+1}] {d.page_content[:400]}" for i, d in enumerate(context_docs)
+    )
+    user_msg = (
+        f"Question: {query}\n\n"
+        f"Retrieved Context:\n{context_text}\n\n"
+        f"Generated Answer:\n{answer}"
+    )
+    try:
+        response = litellm.completion(
+            model=JUDGE_MODEL,
+            messages=[
+                {"role": "system", "content": _JUDGE_SYSTEM},
+                {"role": "user", "content": user_msg},
+            ],
+            max_tokens=200,
+            temperature=0.0
+        )
+        raw = response.choices[0].message.content.strip()
+        # Strip potential markdown fences
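+        # removeprefix/removesuffix drop the exact fence strings; lstrip/rstrip would treat them as character sets.
+        raw = raw.removeprefix("```json").removeprefix("```").removesuffix("```").strip()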
+        data = json.loads(raw)
+        scores_sum = data["accuracy"] + data["completeness"] + data["relevance"] + data["groundedness"]
+        return AnswerQualityScores(
+            accuracy=data["accuracy"],
+            completeness=data["completeness"],
+            relevance=data["relevance"],
+            groundedness=data["groundedness"],
+            overall=round(scores_sum / 4, 2),
+            reasoning=data.get("reasoning", ""),
+        )
+    except Exception as e:
+        logger.error(f"Judge evaluation failed: {e}")
+        return AnswerQualityScores(
+            accuracy=0, completeness=0, relevance=0, groundedness=0,
+            overall=0.0, reasoning=f"Evaluation failed: {e}"
+        )
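
The parsing step in `judge_answer` (strip an optional markdown fence, decode the JSON, average the four scores) can be sketched on its own with the standard library. This is an illustrative sketch, not the repo's code: `parse_judge_reply` is a hypothetical name, and the range check stands in for the Pydantic validation.

```python
import json

FENCE = "`" * 3  # markdown code-fence marker
CRITERIA = ("accuracy", "completeness", "relevance", "groundedness")


def parse_judge_reply(raw: str) -> dict:
    """Parse the judge's JSON reply and attach the mean of the four scores."""
    text = raw.strip()
    # Drop an optional leading fence (with or without a "json" tag) and a trailing fence.
    text = text.removeprefix(FENCE + "json").removeprefix(FENCE).removesuffix(FENCE).strip()
    data = json.loads(text)
    for key in CRITERIA:
        if not 1 <= data[key] <= 5:
            raise ValueError(f"{key} out of range: {data[key]}")
    data["overall"] = round(sum(data[key] for key in CRITERIA) / 4, 2)
    return data


reply = (
    FENCE
    + 'json\n{"accuracy": 5, "completeness": 4, "relevance": 3, '
    + '"groundedness": 4, "reasoning": "ok"}\n'
    + FENCE
)
print(parse_judge_reply(reply)["overall"])  # 4.0
```

Keeping the parse separate from the network call makes it possible to unit-test the fence handling and the averaging without spending an LLM call.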

evaluation/metrics.py ADDED

	@@ -0,0 +1,131 @@

+"""
+metrics.py — Retrieval quality metrics.
+Implements:
+  - Recall@K  : fraction of relevant docs retrieved in top-K
+  - MRR       : Mean Reciprocal Rank of the first relevant doc
+  - nDCG      : Normalized Discounted Cumulative Gain
+Relevance is determined by keyword matching between the query and chunk content.
+This is a proxy measure used when ground-truth labels are unavailable.
+"""
+import math
+import re
+import logging
+from typing import List
+from langchain_core.documents import Document
+from pydantic import BaseModel, Field
+logger = logging.getLogger(__name__)
+# ─── Pydantic output model ────────────────────────────────────────────────────
+class RetrievalMetrics(BaseModel):
+    """Structured container for retrieval evaluation scores."""
+    recall_at_k: float = Field(..., ge=0.0, le=1.0, description="Recall@K")
+    mrr: float = Field(..., ge=0.0, le=1.0, description="Mean Reciprocal Rank")
+    ndcg: float = Field(..., ge=0.0, le=1.0, description="nDCG@K")
+    top_k: int = Field(..., description="K used for evaluation")
+    num_relevant: int = Field(..., description="Number of docs judged relevant")
+# ─── Relevance oracle ─────────────────────────────────────────────────────────
+def _extract_keywords(text: str) -> set:
+    """Extract lowercase alphabetic tokens (length ≥ 3) from text."""
+    return set(re.findall(r"\b[a-zA-Z]{3,}\b", text.lower()))
+def _is_relevant(query: str, doc: Document, threshold: int = 2) -> bool:
+    """
+    Determine if a document is relevant to the query via keyword overlap.
+    Args:
+        query: User question.
+        doc: Retrieved document.
+        threshold: Minimum number of shared keywords to count as relevant.
+    Returns:
+        True if overlap ≥ threshold.
+    """
+    q_keywords = _extract_keywords(query)
+    d_keywords = _extract_keywords(doc.page_content)
+    overlap = len(q_keywords & d_keywords)
+    return overlap >= threshold
+# ─── Metric functions ─────────────────────────────────────────────────────────
+def _compute_relevance_flags(query: str, docs: List[Document]) -> List[int]:
+    """Return binary relevance list (1 = relevant, 0 = not)."""
+    return [1 if _is_relevant(query, doc) else 0 for doc in docs]
+def recall_at_k(relevance: List[int]) -> float:
+    """
+    Recall@K proxy: fraction of the retrieved docs that are judged relevant.
+    Without ground-truth labels the total number of relevant docs is unknown,
+    so the retrieved set is the denominator; strictly, this is Precision@K.
+    """
+    num_relevant = sum(relevance)
+    if num_relevant == 0:
+        return 0.0
+    return num_relevant / len(relevance)
+def mean_reciprocal_rank(relevance: List[int]) -> float:
+    """
+    MRR: 1/rank of the first relevant document.
+    Returns 0.0 if no relevant document is found.
+    """
+    for rank, rel in enumerate(relevance, 1):
+        if rel == 1:
+            return 1.0 / rank
+    return 0.0
+def ndcg_at_k(relevance: List[int]) -> float:
+    """
+    nDCG@K using binary relevance.
+    Args:
+        relevance: Binary relevance list ordered by retrieval rank.
+    Returns:
+        nDCG score in [0, 1].
+    """
+    def dcg(rels: List[int]) -> float:
+        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
+    actual_dcg = dcg(relevance)
+    ideal_dcg = dcg(sorted(relevance, reverse=True))
+    if ideal_dcg == 0:
+        return 0.0
+    return actual_dcg / ideal_dcg
+def compute_retrieval_metrics(query: str, docs: List[Document]) -> RetrievalMetrics:
+    """
+    Compute all retrieval metrics for a query–result pair.
+    Args:
+        query: User's natural language question.
+        docs: Retrieved documents in retrieval rank order.
+    Returns:
+        RetrievalMetrics Pydantic model.
+    """
+    relevance = _compute_relevance_flags(query, docs)
+    return RetrievalMetrics(
+        recall_at_k=round(recall_at_k(relevance), 4),
+        mrr=round(mean_reciprocal_rank(relevance), 4),
+        ndcg=round(ndcg_at_k(relevance), 4),
+        top_k=len(docs),
+        num_relevant=sum(relevance),
+    )
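
The metrics above use keyword overlap as a stand-in for relevance labels. For comparison, here is a sketch of Recall@K when a labelled set of relevant chunk identifiers is available; `recall_at_k_with_labels` and the file-name identifiers are assumptions for illustration and are not part of this commit.

```python
from typing import Iterable, Sequence


def recall_at_k_with_labels(
    retrieved_ids: Sequence[str],
    relevant_ids: Iterable[str],
    k: int,
) -> float:
    """Fraction of the labelled relevant items that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # nothing to recall
    hits = len(set(retrieved_ids[:k]) & relevant)
    return hits / len(relevant)


# Two chunks are labelled relevant; only one of them is in the top 3.
print(recall_at_k_with_labels(["a.py", "b.py", "c.py"], {"a.py", "d.py"}, k=3))  # 0.5
```

The denominator here is the size of the labelled set rather than the number of retrieved chunks, which is what separates recall from precision.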

ingestion/__init__.py ADDED

	@@ -0,0 +1 @@


+# ingestion package

ingestion/chunker.py ADDED

	@@ -0,0 +1,133 @@

+"""
+chunker.py — Code-aware document chunking.
+Strategy:
+  1. For Python files: split by top-level functions/classes using AST.
+  2. For all other files: fall back to character-level sliding window chunks.
+Each chunk is a LangChain Document with rich metadata.
+"""
+import ast
+import logging
+from typing import List, Dict
+from langchain_core.documents import Document
+from langchain_text_splitters import RecursiveCharacterTextSplitter
+from config import CHUNK_SIZE_CHARS, CHUNK_OVERLAP_CHARS
+logger = logging.getLogger(__name__)
+def _chunk_python_by_ast(content: str, file_path: str) -> List[Document]:
+    """
+    Parse Python source and extract top-level functions and classes as chunks.
+    Falls back to generic chunking if AST parsing fails.
+    Args:
+        content: Raw Python source code.
+        file_path: Source file path for metadata.
+    Returns:
+        List of Documents, one per function/class (or fallback chunks).
+    """
+    try:
+        tree = ast.parse(content)
+    except SyntaxError:
+        logger.warning(f"AST parse failed for {file_path}, using fallback chunker.")
+        return _chunk_generic(content, file_path)
+    lines = content.splitlines(keepends=True)
+    documents: List[Document] = []
+    # Iterate tree.body rather than ast.walk(tree): walk() also yields nested
+    # functions and methods, which would duplicate code already inside a parent chunk.
+    for node in tree.body:
+        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
+            continue
+        start = node.lineno - 1
+        end = node.end_lineno
+        chunk_text = "".join(lines[start:end])
+        kind = "class" if isinstance(node, ast.ClassDef) else "function"
+        documents.append(Document(
+            page_content=chunk_text,
+            metadata={
+                "file_path": file_path,
+                "symbol_name": node.name,
+                "symbol_type": kind,
+                "start_line": node.lineno,
+                "end_line": node.end_lineno,
+            }
+        ))
+    if not documents:
+        # File has no top-level definitions (e.g. script) — use fallback
+        return _chunk_generic(content, file_path)
+    return documents
+def _chunk_generic(content: str, file_path: str, extension: str = "") -> List[Document]:
+    """
+    Generic recursive character splitter for non-Python or unparseable files.
+    Args:
+        content: Raw file content.
+        file_path: Source file path for metadata.
+        extension: File extension hint (unused currently, reserved).
+    Returns:
+        List of overlapping text chunk Documents.
+    """
+    splitter = RecursiveCharacterTextSplitter(
+        chunk_size=CHUNK_SIZE_CHARS,
+        chunk_overlap=CHUNK_OVERLAP_CHARS,
+        separators=["\n\n", "\n", " ", ""],
+    )
+    texts = splitter.split_text(content)
+    return [
+        Document(
+            page_content=text,
+            metadata={
+                "file_path": file_path,
+                "symbol_name": "",
+                "symbol_type": "chunk",
+                "chunk_index": i,
+            }
+        )
+        for i, text in enumerate(texts)
+    ]
+def chunk_documents(raw_docs: List[Dict]) -> List[Document]:
+    """
+    Dispatch each loaded file to the appropriate chunker.
+    Args:
+        raw_docs: List of dicts from loader.load_files().
+    Returns:
+        Flat list of LangChain Document objects ready for embedding.
+    """
+    all_chunks: List[Document] = []
+    for doc in raw_docs:
+        content = doc["content"]
+        file_path = doc["file_path"]
+        ext = doc.get("extension", "")
+        if not content.strip():
+            continue
+        if ext == ".py":
+            chunks = _chunk_python_by_ast(content, file_path)
+        else:
+            chunks = _chunk_generic(content, file_path, ext)
+        all_chunks.extend(chunks)
+    logger.info(f"Produced {len(all_chunks)} chunks from {len(raw_docs)} files.")
+    return all_chunks
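
Strategy step 1 (one chunk per top-level function or class) can be tried in isolation with only the standard library. The sketch below is illustrative; `top_level_symbols` and `SAMPLE` are hypothetical names, and it returns plain tuples instead of LangChain Documents.

```python
import ast

SAMPLE = '''
class Greeter:
    def hello(self):
        return "hi"

def main():
    def helper():
        return 1
    return helper()
'''


def top_level_symbols(source: str) -> list[tuple[str, str]]:
    """Return (name, source_text) for each top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    symbols = []
    # tree.body holds only module-level statements, so methods and nested
    # functions stay inside the chunk of their enclosing class or function.
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            text = "".join(lines[node.lineno - 1 : node.end_lineno])
            symbols.append((node.name, text))
    return symbols


print([name for name, _ in top_level_symbols(SAMPLE)])  # ['Greeter', 'main']
```

The method `hello` and the nested `helper` do not get chunks of their own; their source is carried inside `Greeter` and `main` respectively.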

ingestion/indexer.py ADDED

	@@ -0,0 +1,106 @@

+"""
+indexer.py — Embeds chunks and persists them in ChromaDB.
+Uses HuggingFace all-MiniLM-L6-v2 (free, 384-dim).
+ChromaDB is stored locally so embeddings are never recomputed
+unless the collection is explicitly cleared.
+"""
+import logging
+from typing import List
+from langchain_core.documents import Document
+from langchain_chroma import Chroma
+from langchain_huggingface import HuggingFaceEmbeddings
+from config import VECTOR_DB_DIR, EMBEDDING_MODEL, EMBEDDING_DEVICE, CHROMA_COLLECTION_NAME
+logger = logging.getLogger(__name__)
+# Module-level singleton so the embedding model is loaded only once per process.
+_embedding_model: HuggingFaceEmbeddings | None = None
+def get_embedding_model() -> HuggingFaceEmbeddings:
+    """
+    Return (or lazily create) the shared HuggingFace embedding model.
+    Returns:
+        HuggingFaceEmbeddings instance for all-MiniLM-L6-v2.
+    """
+    global _embedding_model
+    if _embedding_model is None:
+        logger.info(f"Loading embedding model: {EMBEDDING_MODEL}")
+        _embedding_model = HuggingFaceEmbeddings(
+            model_name=EMBEDDING_MODEL,
+            model_kwargs={"device": EMBEDDING_DEVICE},
+            encode_kwargs={"normalize_embeddings": True},
+        )
+    return _embedding_model
+def get_vectorstore() -> Chroma:
+    """
+    Open (or create) the persistent Chroma vector store.
+    Returns:
+        Chroma instance backed by the local vector_db directory.
+    """
+    return Chroma(
+        collection_name=CHROMA_COLLECTION_NAME,
+        embedding_function=get_embedding_model(),
+        persist_directory=str(VECTOR_DB_DIR),
+    )
+def index_documents(chunks: List[Document]) -> Chroma:
+    """
+    Embed and insert document chunks into ChromaDB.
+    Existing documents in the collection are cleared before re-indexing
+    so that re-uploading a ZIP starts fresh.
+    Args:
+        chunks: LangChain Documents produced by the chunker.
+    Returns:
+        The populated Chroma vector store.
+    """
+    if not chunks:
+        raise ValueError("No chunks to index.")
+    embeddings = get_embedding_model()
+    # Clear previous collection to avoid stale data on re-index.
+    vectorstore = Chroma(
+        collection_name=CHROMA_COLLECTION_NAME,
+        embedding_function=embeddings,
+        persist_directory=str(VECTOR_DB_DIR),
+    )
+    vectorstore.delete_collection()
+    # Recreate and populate.
+    vectorstore = Chroma.from_documents(
+        documents=chunks,
+        embedding=embeddings,
+        collection_name=CHROMA_COLLECTION_NAME,
+        persist_directory=str(VECTOR_DB_DIR),
+    )
+    logger.info(f"Indexed {len(chunks)} chunks into Chroma collection '{CHROMA_COLLECTION_NAME}'.")
+    return vectorstore
+def is_index_populated() -> bool:
+    """
+    Check whether the Chroma collection contains any documents.
+    Returns:
+        True if at least one document is stored, False otherwise.
+    """
+    try:
+        vs = get_vectorstore()
+        count = vs._collection.count()
+        return count > 0
+    except Exception:
+        return False
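
`get_embedding_model` relies on a lazily initialised module-level singleton. The pattern is shown below in isolation, with a stand-in class that counts constructions; `FakeModel` and `get_model` are hypothetical and exist only for this sketch.

```python
from typing import Optional


class FakeModel:
    """Stand-in for an expensive-to-load model; counts how often it is built."""

    loads = 0

    def __init__(self) -> None:
        FakeModel.loads += 1


_model: Optional[FakeModel] = None


def get_model() -> FakeModel:
    """Create the model on first use and return the same instance afterwards."""
    global _model
    if _model is None:
        _model = FakeModel()
    return _model


first = get_model()
second = get_model()
print(first is second, FakeModel.loads)  # True 1
```

Deferring construction to the first call keeps import time low, which matters when the real model takes seconds to load.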

ingestion/loader.py ADDED

	@@ -0,0 +1,85 @@

+"""
+loader.py — Handles ZIP extraction and file loading.
+Responsibilities:
+  - Extract uploaded ZIP archives
+  - Filter files by allowed extensions
+  - Read file contents safely
+  - Return a list of raw document dicts
+"""
+import zipfile
+import os
+import logging
+from pathlib import Path
+from typing import List, Dict
+from config import ALLOWED_EXTENSIONS, MAX_FILE_SIZE_MB, UPLOAD_DIR
+logger = logging.getLogger(__name__)
+def extract_zip(zip_path: str) -> Path:
+    """
+    Extract a ZIP archive to a subdirectory of UPLOAD_DIR named after the archive.
+    Re-uploading an archive with the same name reuses the same directory.
+    Args:
+        zip_path: Path to the uploaded .zip file.
+    Returns:
+        Path to the extraction directory.
+    """
+    zip_path = Path(zip_path)
+    extract_dir = UPLOAD_DIR / zip_path.stem
+    extract_dir.mkdir(parents=True, exist_ok=True)
+    with zipfile.ZipFile(zip_path, "r") as zf:
+        zf.extractall(extract_dir)
+    logger.info(f"Extracted ZIP to: {extract_dir}")
+    return extract_dir
+def load_files(extract_dir: Path) -> List[Dict]:
+    """
+    Walk the extraction directory and load allowed source files.
+    Each returned dict contains:
+        - content (str): raw file text
+        - file_path (str): relative path within the archive
+        - extension (str): file extension
+    Args:
+        extract_dir: Directory containing extracted files.
+    Returns:
+        List of raw document dicts.
+    """
+    documents: List[Dict] = []
+    max_bytes = MAX_FILE_SIZE_MB * 1024 * 1024
+    for root, _dirs, files in os.walk(extract_dir):
+        for filename in files:
+            full_path = Path(root) / filename
+            ext = full_path.suffix.lower()
+            if ext not in ALLOWED_EXTENSIONS:
+                continue
+            if full_path.stat().st_size > max_bytes:
+                logger.warning(f"Skipping large file: {full_path}")
+                continue
+            try:
+                content = full_path.read_text(encoding="utf-8", errors="replace")
+                relative_path = str(full_path.relative_to(extract_dir))
+                documents.append({
+                    "content": content,
+                    "file_path": relative_path,
+                    "extension": ext,
+                })
+            except Exception as e:
+                logger.warning(f"Failed to read {full_path}: {e}")
+    logger.info(f"Loaded {len(documents)} files from {extract_dir}")
+    return documents
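
The extract-then-filter flow can be exercised end to end against a throwaway archive. The sketch below uses only the standard library; `ALLOWED`, `MAX_BYTES`, `list_loadable_files`, and `demo` are illustrative stand-ins rather than the repo's names, and the extension set is a small subset of the real one.

```python
import os
import tempfile
import zipfile
from pathlib import Path

ALLOWED = {".py", ".md"}      # illustrative subset of the allowed extensions
MAX_BYTES = 2 * 1024 * 1024   # 2 MB size cap, mirroring the config value


def list_loadable_files(root: Path) -> list[str]:
    """Return relative paths of files that pass the extension and size filters."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            if path.suffix.lower() not in ALLOWED:
                continue
            if path.stat().st_size > MAX_BYTES:
                continue
            found.append(path.relative_to(root).as_posix())
    return sorted(found)


def demo() -> list[str]:
    """Build a small ZIP in a temp directory, extract it, and list what would load."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        archive = tmp_path / "repo.zip"
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("src/app.py", "print('hi')\n")
            zf.writestr("README.md", "# demo\n")
            zf.writestr("logo.png", "not really an image")
        out_dir = tmp_path / "repo"
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out_dir)
        return list_loadable_files(out_dir)


print(demo())  # ['README.md', 'src/app.py']
```

The `.png` entry is extracted but never loaded, which is the same outcome the extension filter in `load_files` produces.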

llm/__init__.py ADDED

	@@ -0,0 +1 @@


+# llm package

llm/generator.py ADDED

	@@ -0,0 +1,105 @@

+"""
+generator.py — LLM-based answer generation from retrieved context.
+Uses litellm so the model can be swapped by changing config.LLM_MODEL.
+The prompt is designed to:
+  - Ground the answer strictly in retrieved context
+  - Reference source files by name
+  - Decline gracefully when context is insufficient
+"""
+import logging
+from typing import List, Tuple
+from langchain_core.documents import Document
+import litellm
+from config import LLM_MODEL, LLM_MAX_TOKENS, LLM_TEMPERATURE, OPENAI_API_KEY
+logger = logging.getLogger(__name__)
+_SYSTEM_PROMPT = """You are DevDocs AI, an expert assistant that answers questions about codebases.
+Rules:
+1. Answer ONLY using the provided code context. Do NOT hallucinate.
+2. If the context is insufficient, say so clearly.
+3. Always cite the source file(s) at the end of your answer under a "Sources:" heading.
+4. Be concise and precise. Use code snippets when helpful.
+5. Format code blocks with triple backticks and the appropriate language tag.
+"""
+def _build_context_block(docs: List[Document]) -> str:
+    """
+    Format retrieved documents into a structured context string for the prompt.
+    Args:
+        docs: Retrieved LangChain Documents.
+    Returns:
+        Formatted context string.
+    """
+    parts = []
+    for i, doc in enumerate(docs, 1):
+        meta = doc.metadata
+        file_path = meta.get("file_path", "unknown")
+        symbol = meta.get("symbol_name", "")
+        symbol_type = meta.get("symbol_type", "chunk")
+        header = f"[{i}] File: {file_path}"
+        if symbol:
+            header += f" | {symbol_type}: {symbol}"
+        parts.append(f"{header}\n```\n{doc.page_content.strip()}\n```")
+    return "\n\n".join(parts)
+def generate_answer(
+    query: str,
+    docs: List[Document],
+) -> Tuple[str, List[str]]:
+    """
+    Generate a grounded answer from retrieved documents.
+    Args:
+        query: The user's natural language question.
+        docs: Retrieved Document chunks (context).
+    Returns:
+        Tuple of (answer_text, source_file_list).
+    Raises:
+        RuntimeError: If the LLM call fails.
+    """
+    if not OPENAI_API_KEY:
+        return (
+            "⚠️ No OpenAI API key configured. Set the OPENAI_API_KEY environment variable.",
+            [],
+        )
+    context_block = _build_context_block(docs)
+    source_files = list({doc.metadata.get("file_path", "") for doc in docs})
+    user_message = (
+        f"Question: {query}\n\n"
+        f"Context (retrieved code):\n{context_block}"
+    )
+    try:
+        response = litellm.completion(
+            model=LLM_MODEL,
+            messages=[
+                {"role": "system", "content": _SYSTEM_PROMPT},
+                {"role": "user", "content": user_message},
+            ],
+            max_tokens=LLM_MAX_TOKENS,
+            temperature=LLM_TEMPERATURE
+        )
+        answer = response.choices[0].message.content.strip()
+        logger.info(f"Generated answer ({len(answer)} chars) for: '{query[:60]}'")
+        return answer, source_files
+    except Exception as e:
+        logger.error(f"LLM generation failed: {e}")
+        raise RuntimeError(f"LLM generation failed: {e}") from e
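
To see what the prompt context looks like, the formatting done by `_build_context_block` can be reproduced with plain dictionaries. This is a sketch under that assumption: `build_context` is a hypothetical name and the `text` key stands in for `Document.page_content`.

```python
FENCE = "`" * 3  # markdown code-fence marker


def build_context(chunks: list[dict]) -> str:
    """Number each chunk, label it with its file and symbol, and fence its text."""
    parts = []
    for i, chunk in enumerate(chunks, 1):
        header = f"[{i}] File: {chunk.get('file_path', 'unknown')}"
        if chunk.get("symbol_name"):
            header += f" | {chunk.get('symbol_type', 'chunk')}: {chunk['symbol_name']}"
        parts.append(f"{header}\n{FENCE}\n{chunk['text'].strip()}\n{FENCE}")
    return "\n\n".join(parts)


example = build_context([
    {"file_path": "auth.py", "symbol_name": "login", "symbol_type": "function",
     "text": "def login(user):\n    return True\n"},
    {"file_path": "README.md", "symbol_name": "", "symbol_type": "chunk",
     "text": "# Project\n"},
])
print(example)
```

The numbered headers give the model stable labels to cite, which supports rule 3 of the system prompt (listing sources at the end of the answer).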

requirements.txt ADDED

	@@ -0,0 +1,25 @@

+# ── Core LangChain (split into focused packages as of v0.2+) ──────────────────
+langchain>=0.3.0
+langchain-core>=0.3.0
+langchain-text-splitters>=0.3.0
+langchain-community>=0.3.0
+langchain-chroma>=0.1.4
+# ── Embeddings (FREE, local) ──────────────────────────────────────────────────
+langchain-huggingface>=0.1.0    # replaces langchain_community.embeddings.HuggingFaceEmbeddings
+# Core (langchain and langchain-community are already pinned above)
+gradio>=4.36.0
+chromadb>=0.5.0
+sentence-transformers>=2.7.0
+litellm>=1.40.0
+pydantic>=2.0.0
+# File handling
+python-dotenv>=1.0.0
+# Optional: faster tokenisation for chunking
+tiktoken>=0.7.0

retrieval/__init__.py ADDED

	@@ -0,0 +1 @@


+# retrieval package

retrieval/query_rewriter.py ADDED

	@@ -0,0 +1,103 @@

+"""
+query_rewriter.py — Lightweight query reformulation before retrieval.
+Two modes:
+  1. Rule-based (free): simple heuristic expansions (default, zero cost).
+  2. LLM-based (optional): one cheap LLM call to reformulate the query.
+The LLM path is only invoked when explicitly requested to keep costs minimal.
+"""
+import re
+import logging
+import litellm
+from config import LLM_MODEL, LLM_TEMPERATURE, OPENAI_API_KEY
+logger = logging.getLogger(__name__)
+# Heuristic keyword expansions (extend as needed).
+_EXPANSIONS = {
+    r"\bauth\b": "authentication authorization",
+    r"\bdb\b": "database",
+    r"\bapi\b": "API endpoint route handler",
+    r"\bconfig\b": "configuration settings",
+    r"\berror\b": "error exception handling",
+    r"\btest\b": "unit test test case",
+    r"\bdeploy\b": "deployment CI CD pipeline",
+}
+def rule_based_rewrite(query: str) -> str:
+    """
+    Apply simple regex-based expansions to common abbreviations.
+    Args:
+        query: Original user query.
+    Returns:
+        Slightly expanded query string.
+    """
+    rewritten = query
+    for pattern, expansion in _EXPANSIONS.items():
+        rewritten = re.sub(pattern, expansion, rewritten, flags=re.IGNORECASE)
+    if rewritten != query:
+        logger.debug(f"Rule-based rewrite: '{query}' → '{rewritten}'")
+    return rewritten
+def llm_rewrite(query: str) -> str:
+    """
+    Use a cheap LLM call to reformulate the query for better retrieval.
+    This is optional and costs ~1 LLM call per query.
+    Args:
+        query: Original user query.
+    Returns:
+        Reformulated query optimised for semantic code search.
+    """
+    if not OPENAI_API_KEY:
+        logger.warning("No API key set; falling back to rule-based rewrite.")
+        return rule_based_rewrite(query)
+    system_prompt = (
+        "You are a search query optimizer for code repositories. "
+        "Rewrite the user's question into a concise, keyword-rich query "
+        "that will best match relevant code chunks. "
+        "Output ONLY the rewritten query — no explanation."
+    )
+    try:
+        response = litellm.completion(
+            model=LLM_MODEL,
+            messages=[
+                {"role": "system", "content": system_prompt},
+                {"role": "user", "content": query},
+            ],
+            max_tokens=80,
+            temperature=LLM_TEMPERATURE,
+            api_key=OPENAI_API_KEY,
+        )
+        rewritten = response.choices[0].message.content.strip()
+        logger.info(f"LLM rewrite: '{query}' → '{rewritten}'")
+        return rewritten
+    except Exception as e:
+        logger.warning(f"LLM rewrite failed ({e}); falling back to rule-based.")
+        return rule_based_rewrite(query)
+def rewrite_query(query: str, use_llm: bool = False) -> str:
+    """
+    Entry point for query rewriting.
+    Args:
+        query: Raw user question.
+        use_llm: If True, invoke LLM rewrite (costs 1 LLM call).
+    Returns:
+        Rewritten query string.
+    """
+    if use_llm:
+        return llm_rewrite(query)
+    return rule_based_rewrite(query)
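
A worked example of the rule-based mode, reduced to two of the expansions so the effect of the word-boundary anchors is easy to see. `EXPANSIONS` and `expand` are illustrative names for this sketch.

```python
import re

# Two of the heuristic expansions; \b ensures only whole words are replaced.
EXPANSIONS = {
    r"\bauth\b": "authentication authorization",
    r"\bdb\b": "database",
}


def expand(query: str) -> str:
    """Replace known abbreviations with longer, keyword-rich phrases."""
    for pattern, replacement in EXPANSIONS.items():
        query = re.sub(pattern, replacement, query, flags=re.IGNORECASE)
    return query


print(expand("Where is auth wired to the DB?"))
# Where is authentication authorization wired to the database?
print(expand("Who is the author?"))
# Who is the author?
```

Because of the `\b` anchors, `author` is left alone even though it begins with `auth`, and `re.IGNORECASE` lets `DB` match the lowercase pattern.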

retrieval/retriever.py ADDED

	@@ -0,0 +1,64 @@

+"""
+retriever.py — Configurable retrieval over the Chroma vector store.
+Supports:
+  - Similarity search (cosine distance ranking)
+  - MMR (Maximum Marginal Relevance) for diversity-aware retrieval
+Returns LangChain Documents with scores where applicable.
+"""
+import logging
+from typing import List, Tuple
+from langchain_core.documents import Document
+from langchain_chroma import Chroma  # same Chroma class that ingestion/indexer.py constructs
+from config import DEFAULT_TOP_K, MMR_FETCH_K, MMR_LAMBDA_MULT
+from ingestion.indexer import get_vectorstore
+logger = logging.getLogger(__name__)
+def retrieve(
+    query: str,
+    search_type: str = "similarity",
+    top_k: int = DEFAULT_TOP_K,
+) -> Tuple[List[Document], List[float]]:
+    """
+    Retrieve the most relevant document chunks for a query.
+    Args:
+        query: Natural language question from the user.
+        search_type: "similarity" or "mmr".
+        top_k: Number of chunks to return.
+    Returns:
+        Tuple of (documents, scores).
+        Scores are cosine-similarity floats for similarity search;
+        a list of zeros for MMR (Chroma does not expose MMR scores).
+    Raises:
+        RuntimeError: If the vector store is empty.
+    """
+    vectorstore: Chroma = get_vectorstore()
+    if vectorstore._collection.count() == 0:
+        raise RuntimeError("Vector store is empty. Please index a repository first.")
+    if search_type == "mmr":
+        docs = vectorstore.max_marginal_relevance_search(
+            query=query,
+            k=top_k,
+            fetch_k=max(MMR_FETCH_K, top_k * 4),
+            lambda_mult=MMR_LAMBDA_MULT,
+        )
+        scores = [0.0] * len(docs)
+    else:
+        results = vectorstore.similarity_search_with_score(query=query, k=top_k)
+        docs = [d for d, _ in results]
+        # Chroma's default space reports squared L2 distance. The embeddings are
+        # unit-normalised, so cosine similarity = 1 - distance / 2.
+        scores = [max(0.0, 1.0 - s / 2.0) for _, s in results]
+    logger.info(f"[{search_type.upper()}] Retrieved {len(docs)} chunks for: '{query[:60]}'")
+    return docs, scores
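
The relationship between distance and similarity for normalised embeddings can be checked numerically. The sketch assumes the collection uses Chroma's default `l2` space, which reports squared Euclidean distance; the helper names and the two-dimensional vectors are made up for the example.

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_unit(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two unit vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))


a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
distance = squared_l2(a, b)

# For unit vectors: |a - b|^2 = 2 - 2 * cos(a, b), so cos(a, b) = 1 - distance / 2.
print(math.isclose(1.0 - distance / 2.0, cosine_unit(a, b)))  # True
```

If the collection were created with a different distance space, the conversion would need to change accordingly.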

utils/__init__.py ADDED

	@@ -0,0 +1 @@


+# utils package

utils/helpers.py ADDED

	@@ -0,0 +1,100 @@

+"""
+helpers.py — Shared utility functions used across the project.
+"""
+import logging
+import sys
+from pathlib import Path
+from typing import List
+from langchain_core.documents import Document
+def setup_logging(level: int = logging.INFO) -> None:
+    """Configure root logger with a clean, consistent format."""
+    logging.basicConfig(
+        level=level,
+        format="%(asctime)s | %(levelname)-8s | %(name)s — %(message)s",
+        datefmt="%H:%M:%S",
+        handlers=[logging.StreamHandler(sys.stdout)],
+    )
+def format_chunks_for_display(docs: List[Document], scores: List[float]) -> str:
+    """
+    Format retrieved chunks into a human-readable string for the Gradio UI.
+    Args:
+        docs: Retrieved LangChain Documents.
+        scores: Corresponding similarity scores.
+    Returns:
+        Formatted multi-line string.
+    """
+    parts = []
+    for i, (doc, score) in enumerate(zip(docs, scores), 1):
+        meta = doc.metadata
+        file_path = meta.get("file_path", "unknown")
+        symbol = meta.get("symbol_name", "")
+        symbol_type = meta.get("symbol_type", "chunk")
+        score_str = f"{score:.3f}" if score > 0 else "N/A (MMR)"
+        header = f"━━━ [{i}] {file_path}"
+        if symbol:
+            header += f" › {symbol_type}:{symbol}"
+        header += f"  (score: {score_str}) ━━━"
+        parts.append(f"{header}\n{doc.page_content.strip()}")
+    return "\n\n".join(parts)
+def format_metrics_for_display(retrieval_metrics, answer_scores) -> str:
+    """
+    Format all evaluation metrics into a readable dashboard string.
+    Args:
+        retrieval_metrics: RetrievalMetrics Pydantic model.
+        answer_scores: AnswerQualityScores Pydantic model.
+    Returns:
+        Formatted metrics string.
+    """
+    lines = [
+        "╔══════════════════════════════════════╗",
+        "║       EVALUATION METRICS PANEL       ║",
+        "╠══════════════════════════════════════╣",
+        "║  RETRIEVAL METRICS                   ║",
+        f"║  Recall@{retrieval_metrics.top_k:<2}       : {retrieval_metrics.recall_at_k:.4f}          ║",
+        f"║  MRR              : {retrieval_metrics.mrr:.4f}          ║",
+        f"║  nDCG@{retrieval_metrics.top_k:<2}         : {retrieval_metrics.ndcg:.4f}          ║",
+        f"║  Relevant chunks  : {retrieval_metrics.num_relevant}/{retrieval_metrics.top_k}              ║",
+        "╠══════════════════════════════════════╣",
+        "║  ANSWER QUALITY (LLM Judge)          ║",
+        f"║  Accuracy         : {answer_scores.accuracy}/5              ║",
+        f"║  Completeness     : {answer_scores.completeness}/5              ║",
+        f"║  Relevance        : {answer_scores.relevance}/5              ║",
+        f"║  Groundedness     : {answer_scores.groundedness}/5              ║",
+        f"║  Overall Score    : {answer_scores.overall:.2f}/5.00         ║",
+        "╠══════════════════════════════════════╣",
+        f"║  Reasoning: {answer_scores.reasoning[:38]:<38}",
+        "╚══════════════════════════════════════╝",
+    ]
+    return "\n".join(lines)
+def save_temp_file(file_bytes: bytes, filename: str) -> Path:
+    """
+    Save raw bytes to the uploads directory.
+    Args:
+        file_bytes: Raw file content.
+        filename: Target filename.
+    Returns:
+        Path to the saved file.
+    """
+    from config import UPLOAD_DIR
+    dest = UPLOAD_DIR / filename
+    dest.write_bytes(file_bytes)
+    return dest
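
`save_temp_file` joins the caller-supplied `filename` directly onto `UPLOAD_DIR`. If that name can come from an untrusted upload, a cautious variant would keep only its final path component first. The sketch below is one way to do that; `safe_upload_name` is a hypothetical helper and is not part of this commit.

```python
from pathlib import PurePosixPath


def safe_upload_name(filename: str) -> str:
    """Return only the final path component, rejecting empty or dot-only names."""
    # Treat backslashes as separators too, so Windows-style paths are handled.
    name = PurePosixPath(filename.replace("\\", "/")).name
    if name in ("", ".", ".."):
        raise ValueError(f"unusable filename: {filename!r}")
    return name


print(safe_upload_name("../../etc/passwd"))  # passwd
print(safe_upload_name("repo.zip"))          # repo.zip
```

Writing to `UPLOAD_DIR / safe_upload_name(filename)` would then keep the destination inside the uploads directory.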