---
title: DeveloperDocs RAG
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_file: app.py
pinned: false
---

# DeveloperDocs RAG

A production-grade RAG system that answers developer questions from official tech-stack documentation (e.g., FastAPI).

Deployed on Hugging Face Spaces (Docker SDK) · Python 3.10+

## 🎯 What This Project Demonstrates

This is a production-style RAG (Retrieval-Augmented Generation) system that showcases:

- ✅ Professional documentation ingestion pipeline with chunking strategies
- ✅ Semantic search using vector embeddings (ChromaDB)
- ✅ Source attribution with clickable citations
- ✅ RAG evaluation metrics (RAGAS framework)
- ✅ Dockerized deployment ready for cloud platforms
- ✅ Production-grade error handling and logging

๐Ÿ—๏ธ Architecture

```
┌─────────────┐
│    User     │
│  Question   │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────────┐
│  1. Query Embedding                 │
│     (sentence-transformers)         │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  2. Vector Search (ChromaDB)        │
│     - Top 5 relevant chunks         │
│     - Metadata: source, section     │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  3. Context Assembly                │
│     - Format chunks                 │
│     - Add instructions              │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  4. LLM Generation (HF Inference)   │
│     - Answer with citations         │
│     - Code examples preserved       │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  5. Response + Source Links         │
└─────────────────────────────────────┘
```
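As a rough illustration of the flow, here is a self-contained sketch of steps 2 and 3 in plain Python. It substitutes toy bag-of-words vectors for sentence-transformers embeddings and skips the LLM call; `retrieve` and `assemble_context` are illustrative names, not the repo's actual API.

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words counts (stand-in for sentence-transformers).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = lambda v: math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm(a) * norm(b)) if a and b else 0.0

# Step 2: vector search over a tiny corpus, with source metadata attached.
docs = [
    {"text": "FastAPI path parameters are declared with Python type hints.",
     "source": "fastapi/path-params"},
    {"text": "Dependency injection in FastAPI uses the Depends function.",
     "source": "fastapi/dependencies"},
]

def retrieve(query, k=1):
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d["text"])), reverse=True)
    return ranked[:k]

# Step 3: context assembly with numbered citation markers.
def assemble_context(chunks):
    return "\n".join(f"[{i+1}] ({c['source']}) {c['text']}"
                     for i, c in enumerate(chunks))

hits = retrieve("How do I declare path parameters?")
context = assemble_context(hits)
print(context)
```

In the real pipeline, `context` would then be wrapped in a prompt template (`src/prompts.py`) and sent to the LLM in step 4.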

## Local Setup

```bash
# Clone the repository
git clone https://github.com/aishwarya30998/DeveloperDocs-AI-Copilot-RAG.git
cd DeveloperDocs-AI-Copilot-RAG

# Create a virtual environment
python -m venv venv
source venv/bin/activate
# On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env and add your HF_TOKEN
cp .env.example .env

# Run the application
python app.py
```

Visit http://localhost:7860 in your browser.

## 📦 Project Structure

```
fastapi-docs-copilot/
├── app.py                      # Gradio UI application
├── Dockerfile                  # Container configuration
├── docker-compose.yml          # Local container orchestration
├── requirements.txt            # Python dependencies
├── .env.example                # Environment variables template
│
├── src/
│   ├── __init__.py
│   ├── config.py               # Configuration management
│   ├── chunking.py             # Document chunking strategies
│   ├── embeddings.py           # Embedding generation
│   ├── retriever.py            # Vector search logic
│   ├── rag_pipeline.py         # Main RAG orchestration
│   └── prompts.py              # Prompt templates
│
├── scripts/
│   ├── ingest_docs.py          # Documentation ingestion
│   ├── evaluate_rag.py         # RAG metrics evaluation
│   └── test_retrieval.py       # Test retrieval quality
│
├── data/
│   ├── raw/                    # Downloaded documentation
│   ├── processed/              # Chunked documents
│   └── vectordb/               # ChromaDB storage
│
├── tests/
│   ├── test_chunking.py
│   ├── test_retriever.py
│   └── test_rag_pipeline.py
│
└── evals/
    ├── test_queries.json       # Evaluation dataset
    └── results/                # Evaluation outputs
```

## 🎯 Key Features

### 1. Smart Chunking

- Semantic chunking with overlap for context preservation
- Metadata enrichment (section titles, URLs, code blocks)
- Configurable chunk sizes (300-800 tokens)
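A minimal sketch of overlapping chunking, using word count as a rough proxy for tokens (the real implementation in `src/chunking.py` may differ):

```python
def chunk_words(words, size=300, overlap=50):
    """Split a word list into chunks of `size` words with `overlap` words shared
    between consecutive chunks, so context at chunk boundaries is preserved."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end
    return chunks

# 700 words with size=300 / overlap=50 gives chunks covering
# words 0-300, 250-550, and 500-700.
text = "word " * 700
chunks = chunk_words(text.split(), size=300, overlap=50)
```

In practice the chunker would also carry metadata (section title, URL, whether the chunk contains a code block) alongside each chunk.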

### 2. Retrieval Quality

- Hybrid search (semantic + keyword)
- Reranking for improved relevance
- Source attribution with confidence scores
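Hybrid search can be sketched as a linear blend of a semantic score and a keyword-overlap score. The `alpha` weight and both scoring functions below are illustrative assumptions, not the repo's actual implementation:

```python
def keyword_score(query, text):
    # Fraction of query terms that appear verbatim in the candidate text.
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def hybrid_rank(query, candidates, alpha=0.5):
    """candidates: list of (text, semantic_score) pairs.
    Blend semantic and keyword scores, highest combined score first."""
    return sorted(
        candidates,
        key=lambda c: alpha * c[1] + (1 - alpha) * keyword_score(query, c[0]),
        reverse=True,
    )

# The exact-keyword match outranks the slightly higher semantic score alone.
candidates = [
    ("FastAPI dependency injection with Depends", 0.62),
    ("Declaring query parameters in FastAPI", 0.58),
]
top = hybrid_rank("query parameters", candidates)
```

A reranker (e.g. a cross-encoder) would then rescore this shortlist before the chunks reach the prompt.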

### 3. Answer Generation

- Code-aware formatting (preserves indentation)
- Inline citations with source links
- Fallback handling for low-confidence results
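Fallback handling might look like the sketch below: chunks under a confidence threshold are dropped, and a polite refusal is returned when nothing survives. The threshold, field names, and message are assumptions, not the repo's actual values:

```python
CONFIDENCE_FLOOR = 0.35  # illustrative threshold, not the repo's actual value

def answer_or_fallback(chunks):
    """chunks: dicts with 'text', 'url', 'score'.
    Return cited lines, or a fallback message if no chunk is confident enough."""
    confident = [c for c in chunks if c["score"] >= CONFIDENCE_FLOOR]
    if not confident:
        return "I couldn't find this in the indexed docs. Try rephrasing your question."
    return "\n".join(f"{c['text']} [source]({c['url']})" for c in confident)

good = answer_or_fallback(
    [{"text": "Use Depends for injection.", "url": "https://example.com/deps", "score": 0.8}]
)
bad = answer_or_fallback(
    [{"text": "Unrelated chunk.", "url": "https://example.com/x", "score": 0.1}]
)
```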

### 4. Production Features

- Health check endpoint (`/health`)
- Query logging for analytics
- Rate limiting (basic throttling)
- Error recovery with graceful degradation
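Basic throttling can be implemented as a token bucket using only the standard library; this is a generic sketch, not the repo's actual rate limiter:

```python
import time

class TokenBucket:
    """Allow at most `rate` requests per `per` seconds (illustrative only)."""

    def __init__(self, rate, per):
        self.capacity = rate
        self.tokens = float(rate)   # start full
        self.per = per
        self.last = time.monotonic()

    def allow(self):
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last) * self.capacity / self.per,
        )
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# 5 requests per minute: the first 5 pass, the 6th is throttled.
bucket = TokenBucket(rate=5, per=60)
results = [bucket.allow() for _ in range(6)]
```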

## 📊 RAG Evaluation

We use the RAGAS framework to measure:

| Metric | Description | Target Score |
|--------|-------------|--------------|
| Faithfulness | Answer accuracy vs. context | > 0.8 |
| Answer Relevancy | Response relevance to query | > 0.7 |
| Context Precision | Retrieval accuracy | > 0.75 |
| Context Recall | Context completeness | > 0.8 |

Run evaluations:

```bash
python scripts/evaluate_rag.py
```
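For intuition: with binary relevance labels, context precision and recall reduce to set overlap. Note that RAGAS itself computes these with LLM-based judgments, so the version below is only a simplified illustration:

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved chunks that are actually relevant (simplified)."""
    if not retrieved:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(retrieved)

def context_recall(retrieved, relevant):
    """Fraction of relevant chunks that were retrieved (simplified)."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(relevant)

# One of three retrieved chunks is relevant; one of two relevant chunks was found.
retrieved = ["chunk_a", "chunk_b", "chunk_c"]
relevant = ["chunk_a", "chunk_d"]
p = context_precision(retrieved, relevant)
r = context_recall(retrieved, relevant)
```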

๐Ÿณ Docker Deployment

Build and run locally:

```bash
docker build -t developerdocs-rag .
docker run -p 7860:7860 --name developerdocs-rag-container developerdocs-rag
```

Deploy to Hugging Face Spaces:

1. Create a new Space on Hugging Face
2. Select the Docker SDK
3. Push this repository
4. Add `HF_TOKEN` as a Space secret
5. The Space builds and deploys automatically

## 🧪 Testing

```bash
# Run all tests
pytest tests/ -v

# Test chunking strategy
pytest tests/test_chunking.py -v

# Test retrieval quality
python scripts/test_retrieval.py
```

## 📈 Performance Benchmarks

On Hugging Face Spaces (free tier):

- Query latency: ~2-3 seconds
- Vector DB size: ~150 MB (FastAPI docs)
- Memory usage: ~800 MB
- Concurrent users: 5-10

## 🛠️ Technology Stack

| Component | Technology | Why? |
|-----------|------------|------|
| Embeddings | `sentence-transformers/all-MiniLM-L6-v2` | Fast, lightweight, good quality |
| Vector DB | ChromaDB | Easy setup, persistent storage |
| LLM | Hugging Face Inference API (Mistral-7B) | Free tier, good code understanding |
| Framework | LangChain | Industry standard, modular |
| UI | Gradio | Rapid prototyping, HF integration |
| Deployment | Docker + HF Spaces | Free, scalable, shareable |

## 🔮 Future Enhancements

- Multi-documentation support (React, Django, etc.)
- Conversation memory for follow-up questions
- Advanced retrieval (HyDE, Multi-Query)
- User feedback loop for continuous improvement
- Analytics dashboard for query patterns

## 📝 License

MIT License - feel free to use for your portfolio!

## 🤝 Contributing

This is a portfolio project, but suggestions are welcome via issues.

## 📧 Contact

Built by Aishwarya as a portfolio demonstration of production RAG systems.


⭐ If this helped you understand production RAG, give it a star!