---
title: DeveloperDocs RAG
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_file: app.py
pinned: false
---
> Production-grade RAG system that answers questions using official tech-stack documentation (e.g., FastAPI).
[![Deployed on HuggingFace](https://img.shields.io/badge/🤗-HuggingFace%20Spaces-blue)](https://huggingface.co/spaces)
[![Docker](https://img.shields.io/badge/Docker-Ready-2496ED?logo=docker&logoColor=white)](https://www.docker.com/)
[![Python 3.10+](https://img.shields.io/badge/Python-3.10+-3776AB?logo=python&logoColor=white)](https://www.python.org/)
## 🎯 What This Project Demonstrates
This is a **production-style RAG (Retrieval-Augmented Generation)** system that showcases:
- ✅ **Professional documentation ingestion pipeline** with chunking strategies
- ✅ **Semantic search** using vector embeddings (ChromaDB)
- ✅ **Source attribution** with clickable citations
- ✅ **RAG evaluation metrics** (RAGAS framework)
- ✅ **Dockerized deployment** ready for cloud platforms
- ✅ **Production-grade error handling** and logging
## 🏗️ Architecture
```
    ┌─────────────┐
    │    User     │
    │  Question   │
    └──────┬──────┘
           │
           ▼
┌─────────────────────────────────────┐
│  1. Query Embedding                 │
│     (sentence-transformers)         │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  2. Vector Search (ChromaDB)        │
│     - Top 5 relevant chunks         │
│     - Metadata: source, section     │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  3. Context Assembly                │
│     - Format chunks                 │
│     - Add instructions              │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  4. LLM Generation (HF Inference)   │
│     - Answer with citations         │
│     - Code examples preserved       │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  5. Response + Source Links         │
└─────────────────────────────────────┘
```
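The five stages in the diagram can be sketched end to end in plain Python. This is an illustrative sketch only, not the code in `src/rag_pipeline.py`: the `Chunk` dataclass, the function names, and the prompt wording are all hypothetical, a toy bag-of-words vector stands in for the sentence-transformers embedding, an in-memory list stands in for ChromaDB, and the LLM is passed in as a `generate` callable.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str   # URL of the documentation page (hypothetical field)
    section: str  # section title within that page (hypothetical field)


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[Chunk], top_k: int = 5) -> list[Chunk]:
    """Steps 1-2: embed the query and keep the top_k most similar chunks."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, hits: list[Chunk]) -> str:
    """Step 3: number each chunk so the model can cite it as [n]."""
    context = "\n\n".join(
        f"[{i}] ({c.section}) {c.text}" for i, c in enumerate(hits, start=1)
    )
    return (
        "Answer using only the context below and cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def answer(query: str, chunks: list[Chunk], generate) -> dict:
    """Steps 4-5: call the LLM and return its answer with source links."""
    hits = search(query, chunks)
    return {
        "answer": generate(build_prompt(query, hits)),
        "sources": [c.source for c in hits],
    }
```

Passing `generate` as a parameter keeps the retrieval and prompt-assembly steps testable without a network call; in the real app it would wrap the HF Inference request.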
### Local Setup
```bash
# Clone the repository
git clone https://github.com/aishwarya30998/DeveloperDocs-AI-Copilot-RAG.git
cd DeveloperDocs-AI-Copilot-RAG
# Create virtual environment
python -m venv venv
source venv/bin/activate
# On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Create .env from the template, then add your HF_TOKEN to it
cp .env.example .env
# Run the application
python app.py
```
Visit `http://localhost:7860` in your browser.
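If the app starts without a token, the first LLM call is where it usually fails. A startup check along the following lines surfaces the problem earlier. This is a sketch under assumptions: `load_hf_token` is a hypothetical helper, not a function from `src/config.py`, and it expects `HF_TOKEN` to already be in the process environment (exported in the shell, or loaded from `.env` by whatever loader the project uses).

```python
import os
from typing import Mapping, Optional


def load_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return HF_TOKEN from the environment, failing early if it is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it to .env or export it before starting the app"
        )
    return token
```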
## 📦 Project Structure
```
fastapi-docs-copilot/
├── app.py                  # Gradio UI application
├── Dockerfile              # Container configuration
├── docker-compose.yml      # Local container orchestration
├── requirements.txt        # Python dependencies
├── .env.example            # Environment variables template
│
├── src/
│   ├── __init__.py
│   ├── config.py           # Configuration management
│   ├── chunking.py         # Document chunking strategies
│   ├── embeddings.py       # Embedding generation
│   ├── retriever.py        # Vector search logic
│   ├── rag_pipeline.py     # Main RAG orchestration
│   └── prompts.py          # Prompt templates
│
├── scripts/
│   ├── ingest_docs.py      # Documentation ingestion
│   ├── evaluate_rag.py     # RAG metrics evaluation
│   └── test_retrieval.py   # Test retrieval quality
│
├── data/
│   ├── raw/                # Downloaded documentation
│   ├── processed/          # Chunked documents
│   └── vectordb/           # ChromaDB storage
│
├── tests/
│   ├── test_chunking.py
│   ├── test_retriever.py
│   └── test_rag_pipeline.py
│
└── evals/
    ├── test_queries.json   # Evaluation dataset
    └── results/            # Evaluation outputs
```
## 🎯 Key Features
### 1. Smart Chunking
- **Semantic chunking** with overlap for context preservation
- **Metadata enrichment** (section titles, URLs, code blocks)
- **Configurable chunk sizes** (300-800 tokens)
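To make the overlap idea concrete, here is a minimal sliding-window chunker. Note the simplifications: it uses a fixed-size window rather than the semantic chunking described above, it counts whitespace-separated words as a rough proxy for tokens, and `chunk_words` with its defaults is a hypothetical name, not the API of `src/chunking.py`.

```python
def chunk_words(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size words that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Each chunk repeats the last `overlap` words of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.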
### 2. Retrieval Quality
- **Hybrid search** (semantic + keyword)
- **Reranking** for improved relevance
- **Source attribution** with confidence scores
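This README does not say how the semantic and keyword rankings are merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the method used in `src/retriever.py`. The function name and the chunk IDs are hypothetical; `k = 60` is the constant conventionally used with RRF.

```python
def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Merge ranked lists of chunk IDs; each list contributes 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Usage: fuse a semantic ranking with a keyword ranking.
semantic = ["a", "b", "c"]
keyword = ["b", "d", "a"]
fused = reciprocal_rank_fusion([semantic, keyword])
```

RRF only needs ranks, not raw scores, so it avoids having to normalize cosine similarities against keyword scores that live on a different scale.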
### 3. Answer Generation
- **Code-aware formatting** (preserves indentation)
- **Inline citations** with source links
- **Fallback handling** for low-confidence results
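A sketch of how the citation and fallback behaviour could fit together, assuming each retrieved hit is a dict with `url` and `score` keys where a higher score means a better match. The key names, the `0.35` threshold, and the fallback wording are illustrative assumptions, not values taken from this project.

```python
FALLBACK_MESSAGE = "I could not find this in the indexed documentation."


def format_answer(answer: str, hits: list[dict], min_score: float = 0.35) -> str:
    """Append numbered source links, or fall back when retrieval looks weak."""
    # If even the best hit is below the threshold, do not present an answer.
    if not hits or max(h["score"] for h in hits) < min_score:
        return FALLBACK_MESSAGE
    # Numbering matches the [n] markers the model was asked to use inline.
    links = "\n".join(f"[{i}] {h['url']}" for i, h in enumerate(hits, start=1))
    return f"{answer}\n\nSources:\n{links}"
```

Returning a fixed message instead of a weakly grounded answer trades coverage for trust, which is usually the right call for a documentation assistant.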
### 4. Production Features
- **Health check endpoint** (`/health`)
- **Query logging** for analytics
- **Rate limiting** (basic throttling)
- **Error recovery** with graceful degradation
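Basic throttling can be done in-process with a sliding window per client. The class below is a sketch of that idea; the name `SlidingWindowLimiter`, the default limits, and the use of a client ID string are assumptions, and the project's actual limiter may work differently.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client_id -> request timestamps

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

State lives in memory, so this only throttles a single process; that matches the small concurrency a free-tier Space handles, but a multi-worker deployment would need shared storage.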
## 📊 RAG Evaluation
We use the **RAGAS** framework to measure:
| Metric | Description | Target Score |
| --------------------- | --------------------------- | ------------ |
| **Faithfulness** | Answer accuracy vs. context | > 0.8 |
| **Answer Relevancy** | Response relevance to query | > 0.7 |
| **Context Precision** | Retrieval accuracy | > 0.75 |
| **Context Recall** | Context completeness | > 0.8 |
Run evaluations:
```bash
python scripts/evaluate_rag.py
```
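The target scores in the table can double as a regression gate. The sketch below assumes the evaluation produces a flat dict of metric name to score; that shape, and the `failing_metrics` helper, are hypothetical and not necessarily what `evaluate_rag.py` writes to `evals/results/`.

```python
TARGETS = {
    "faithfulness": 0.8,
    "answer_relevancy": 0.7,
    "context_precision": 0.75,
    "context_recall": 0.8,
}


def failing_metrics(scores: dict, targets: dict = TARGETS) -> list[str]:
    """Return metrics that are missing or do not exceed their target score."""
    # The table uses strict '>' targets, so a score equal to the target fails.
    return [name for name, target in targets.items() if scores.get(name, 0.0) <= target]


# Usage with made-up scores (not real results from this project):
example_scores = {"faithfulness": 0.85, "answer_relevancy": 0.65, "context_precision": 0.8}
failed = failing_metrics(example_scores)
```

A missing metric counts as a failure, so a partially completed evaluation run cannot pass by accident.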
## 🐳 Docker Deployment
### Build and run locally:
```bash
docker build -t developerdocs-rag .
docker run -p 7860:7860 --name developerdocs-rag-container developerdocs-rag
```
### Deploy to HuggingFace Spaces:
1. Create a new Space on HuggingFace
2. Enable Docker SDK
3. Push this repository
4. Add `HF_TOKEN` as a Space secret
5. Deploy automatically
## 🧪 Testing
```bash
# Run all tests
pytest tests/ -v
# Test chunking strategy
pytest tests/test_chunking.py -v
# Test retrieval quality
python scripts/test_retrieval.py
```
## 📈 Performance Benchmarks
On HuggingFace Spaces (free tier):
- **Query latency**: ~2-3 seconds
- **Vector DB size**: ~150MB (FastAPI docs)
- **Memory usage**: ~800MB
- **Concurrent users**: 5-10
## 🛠️ Technology Stack
| Component | Technology | Why? |
| -------------- | ---------------------------------------- | ---------------------------------- |
| **Embeddings** | `sentence-transformers/all-MiniLM-L6-v2` | Fast, lightweight, good quality |
| **Vector DB** | ChromaDB | Easy setup, persistent storage |
| **LLM** | HuggingFace Inference API (Mistral-7B) | Free tier, good code understanding |
| **Framework** | LangChain | Industry standard, modular |
| **UI** | Gradio | Rapid prototyping, HF integration |
| **Deployment** | Docker + HF Spaces | Free, scalable, shareable |
## 🔮 Future Enhancements
- [ ] Multi-documentation support (React, Django, etc.)
- [ ] Conversation memory for follow-up questions
- [ ] Advanced retrieval (HyDE, Multi-Query)
- [ ] User feedback loop for continuous improvement
- [ ] Analytics dashboard for query patterns
## 📝 License
MIT License - feel free to use for your portfolio!
## 🤝 Contributing
This is a portfolio project, but suggestions are welcome via issues.
## 📧 Contact
Built by Aishwarya as a portfolio demonstration of production RAG systems.
- Portfolio: https://aishwarya30998.github.io/projects.html
- LinkedIn: https://www.linkedin.com/in/aishwarya-pentyala/
---
⭐ If this helped you understand production RAG, give it a star!