---
title: DeveloperDocs RAG
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_file: app.py
pinned: false
---

> Production-grade RAG system that answers questions using official tech-stack documentation (e.g., FastAPI).

[Hugging Face Spaces](https://huggingface.co/spaces) · [Docker](https://www.docker.com/) · [Python](https://www.python.org/)

## 🎯 What This Project Demonstrates

This is a **production-style RAG (Retrieval-Augmented Generation)** system that showcases:

- ✅ **Professional documentation ingestion pipeline** with chunking strategies
- ✅ **Semantic search** using vector embeddings (ChromaDB)
- ✅ **Source attribution** with clickable citations
- ✅ **RAG evaluation metrics** (RAGAS framework)
- ✅ **Dockerized deployment** ready for cloud platforms
- ✅ **Production-grade error handling** and logging
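
To make the semantic-search bullet concrete, here is a minimal, dependency-free sketch of top-k retrieval by cosine similarity. The three-dimensional vectors and the `top_k` helper are illustrative assumptions; in the project, embeddings would come from a sentence-transformers model and the nearest-neighbour lookup would be handled by ChromaDB.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: list[float],
    chunk_vecs: dict[str, list[float]],
    k: int = 5,
) -> list[tuple[str, float]]:
    """Return (chunk_id, score) pairs for the k most similar chunks."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical, hand-made embeddings purely for illustration.
chunks = {
    "path-params": [0.9, 0.1, 0.0],
    "query-params": [0.7, 0.3, 0.1],
    "deployment": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], chunks, k=2)
print(results)
```

A real vector store uses approximate nearest-neighbour indexes rather than this linear scan, but the ranking criterion is the same idea.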

## 🏗️ Architecture

```
┌─────────────┐
│    User     │
│  Question   │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────────┐
│  1. Query Embedding                 │
│     (sentence-transformers)         │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  2. Vector Search (ChromaDB)        │
│     - Top 5 relevant chunks         │
│     - Metadata: source, section     │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  3. Context Assembly                │
│     - Format chunks                 │
│     - Add instructions              │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  4. LLM Generation (HF Inference)   │
│     - Answer with citations         │
│     - Code examples preserved       │
└──────────┬──────────────────────────┘
           │
           ▼
┌─────────────────────────────────────┐
│  5. Response + Source Links         │
└─────────────────────────────────────┘
```
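
Step 3 of the diagram (context assembly) can be sketched as below. This is only an illustration under assumptions: the `build_prompt` name, the chunk keys (`text`, `source`, `section`), and the instruction wording are hypothetical and are not taken from the project's `prompts.py`.

```python
def build_prompt(question: str, chunks: list[dict]) -> str:
    """Format retrieved chunks as numbered context and append instructions."""
    context_blocks = []
    for i, chunk in enumerate(chunks, start=1):
        # The bracketed number lets the model cite a chunk as [1], [2], ...
        header = f"[{i}] {chunk['section']} ({chunk['source']})"
        context_blocks.append(f"{header}\n{chunk['text']}")
    context = "\n\n".join(context_blocks)
    return (
        "Answer the question using only the context below. "
        "Cite sources with their bracketed numbers, e.g. [1]. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical retrieved chunk; real ones would come from the vector store.
retrieved = [
    {
        "text": "Declare path parameters with the same syntax as Python format strings.",
        "source": "https://fastapi.tiangolo.com/tutorial/path-params/",
        "section": "Path Parameters",
    },
]
prompt = build_prompt("How do I declare a path parameter?", retrieved)
print(prompt)
```

Numbering the chunks in the prompt is one simple way to map citations in the answer back to source links in step 5.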

### Local Setup

```bash
# Clone the repository
git clone https://github.com/aishwarya30998/DeveloperDocs-AI-Copilot-RAG.git
cd DeveloperDocs-AI-Copilot-RAG

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env from the template, then add your HF_TOKEN
cp .env.example .env

# Run the application
python app.py
```

Visit `http://localhost:7860` in your browser.
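
For reference, here is one way the application could pick up `HF_TOKEN` at startup. This is a stdlib-only sketch under assumptions: `load_env_file` and `get_hf_token` are hypothetical helpers, and the project's `config.py` may well use a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(env_path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    token = os.environ.get("HF_TOKEN") or load_env_file(env_path).get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to .env or the environment")
    return token
```

Checking the process environment first means the same code works on a hosted platform, where secrets are typically injected as environment variables rather than read from a file.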

## 📦 Project Structure

```
fastapi-docs-copilot/
├── app.py                  # Gradio UI application
├── Dockerfile              # Container configuration
├── docker-compose.yml      # Local container orchestration
├── requirements.txt        # Python dependencies
├── .env.example            # Environment variables template
│
├── src/
│   ├── __init__.py
│   ├── config.py           # Configuration management
│   ├── chunking.py         # Document chunking strategies
│   ├── embeddings.py       # Embedding generation
│   ├── retriever.py        # Vector search logic
│   ├── rag_pipeline.py     # Main RAG orchestration
│   └── prompts.py          # Prompt templates
│
├── scripts/
│   ├── ingest_docs.py      # Documentation ingestion
│   ├── evaluate_rag.py     # RAG metrics evaluation
│   └── test_retrieval.py   # Test retrieval quality
│
├── data/
│   ├── raw/                # Downloaded documentation
│   ├── processed/          # Chunked documents
│   └── vectordb/           # ChromaDB storage
│
├── tests/
│   ├── test_chunking.py
│   ├── test_retriever.py
│   └── test_rag_pipeline.py
│
└── evals/
    ├── test_queries.json   # Evaluation dataset
    └── results/            # Evaluation outputs
```

## 🎯 Key Features

### 1. Smart Chunking

- **Semantic chunking** with overlap for context preservation
- **Metadata enrichment** (section titles, URLs, code blocks)
- **Configurable chunk sizes** (300-800 tokens)
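
The overlap idea can be illustrated with a plain sliding window. Note the assumptions: this sketch uses fixed-size windows over whitespace "tokens" rather than semantic boundaries or a real tokenizer, and `chunk_tokens` with its default sizes is a hypothetical helper, not the code in `src/chunking.py`.

```python
def chunk_tokens(
    tokens: list[str],
    chunk_size: int = 500,
    overlap: int = 50,
) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Whitespace-style "tokens" stand in for a real tokenizer here.
words = [f"w{i}" for i in range(10)]
chunks = chunk_tokens(words, chunk_size=4, overlap=1)
print(chunks)
```

Each chunk repeats the last `overlap` tokens of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when it is shorter than the overlap.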

### 2. Retrieval Quality

- **Hybrid search** (semantic + keyword)
- **Reranking** for improved relevance
- **Source attribution** with confidence scores
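
The README does not state how semantic and keyword results are combined. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the project's method; the chunk IDs are made up and `k = 60` is simply the constant commonly used with RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]],
    k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from a semantic and a keyword retriever.
semantic = ["chunk-a", "chunk-b", "chunk-c"]
keyword = ["chunk-b", "chunk-d", "chunk-a"]
fused = reciprocal_rank_fusion([semantic, keyword])
print([doc_id for doc_id, _ in fused])
```

RRF only needs ranks, so it avoids having to normalise cosine similarities against keyword scores that live on a different scale.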

### 3. Answer Generation

- **Code-aware formatting** (preserves indentation)
- **Inline citations** with source links
- **Fallback handling** for low-confidence results
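
A simple form of fallback handling is to refuse to answer when no retrieved chunk clears a score threshold. The sketch below assumes each hit carries a `url` and a `score`; the `answer_or_fallback` name, the `0.35` threshold, and the fallback wording are hypothetical. It appends a numbered source list rather than placing citations inline.

```python
FALLBACK_MESSAGE = "I could not find this in the indexed documentation."


def answer_or_fallback(answer: str, hits: list[dict], min_score: float = 0.35) -> str:
    """Return the answer with a numbered source list, or a fallback message."""
    # Keep only hits whose retrieval score clears the threshold.
    confident = [hit for hit in hits if hit["score"] >= min_score]
    if not confident:
        return FALLBACK_MESSAGE
    sources = "\n".join(
        f"[{i}] {hit['url']}" for i, hit in enumerate(confident, start=1)
    )
    return f"{answer}\n\nSources:\n{sources}"
```

A sensible threshold depends on the embedding model and distance metric, so it would need tuning against the evaluation set.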

### 4. Production Features

- **Health check endpoint** (`/health`)
- **Query logging** for analytics
- **Rate limiting** (basic throttling)
- **Error recovery** with graceful degradation
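
Basic throttling could look like the sliding-window limiter below. This is a sketch under assumptions: the `SlidingWindowLimiter` class is hypothetical, it keeps state in memory for a single process, and the README does not say which throttling scheme the project actually uses.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls per window_seconds for each client key."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls: dict[str, deque] = {}

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        calls = self._calls.setdefault(client_id, deque())
        # Drop timestamps that have fallen out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

A request handler would call `limiter.allow(client_id)` before running the pipeline and return a "please retry later" message when it yields `False`.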

## 📊 RAG Evaluation

We use the **RAGAS** framework to measure:

| Metric                | Description                                          | Target Score |
| --------------------- | ---------------------------------------------------- | ------------ |
| **Faithfulness**      | Whether the answer is supported by the context       | > 0.8        |
| **Answer Relevancy**  | Relevance of the response to the query               | > 0.7        |
| **Context Precision** | Share of retrieved chunks that are relevant          | > 0.75       |
| **Context Recall**    | How completely the context covers the needed content | > 0.8        |

Run evaluations:

```bash
python scripts/evaluate_rag.py
```
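
Once scores are available, comparing them against the targets in the table is straightforward. The sketch below is illustrative only: `check_targets` is a hypothetical helper, the snake_case metric keys are chosen here, and the example scores are made up rather than measured results.

```python
TARGETS = {
    "faithfulness": 0.8,
    "answer_relevancy": 0.7,
    "context_precision": 0.75,
    "context_recall": 0.8,
}


def check_targets(
    scores: dict[str, float],
    targets: dict[str, float] = TARGETS,
) -> dict[str, bool]:
    """Map each metric to True when its score strictly exceeds the target."""
    # A metric missing from `scores` counts as 0.0 and therefore fails.
    return {name: scores.get(name, 0.0) > target for name, target in targets.items()}


# Made-up scores for illustration only; these are not measured results.
example_scores = {
    "faithfulness": 0.85,
    "answer_relevancy": 0.70,
    "context_precision": 0.90,
}
report = check_targets(example_scores)
print(report)
```

The comparison is strict because the table states the targets as "greater than", so a score exactly at the target does not pass.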

## 🐳 Docker Deployment

### Build and run locally:

```bash
docker build -t developerdocs-rag .
docker run -p 7860:7860 --name developerdocs-rag-container developerdocs-rag
```

### Deploy to HuggingFace Spaces:

1. Create a new Space on HuggingFace
2. Select Docker as the Space SDK
3. Push this repository to the Space
4. Add `HF_TOKEN` as a Space secret
5. The Space builds and deploys automatically

## 🧪 Testing

```bash
# Run all tests
pytest tests/ -v

# Test chunking strategy
pytest tests/test_chunking.py -v

# Test retrieval quality
python scripts/test_retrieval.py
```

## 📈 Performance Benchmarks

On HuggingFace Spaces (free tier):

- **Query latency**: ~2-3 seconds
- **Vector DB size**: ~150MB (FastAPI docs)
- **Memory usage**: ~800MB
- **Concurrent users**: 5-10

## 🛠️ Technology Stack

| Component      | Technology                               | Why?                               |
| -------------- | ---------------------------------------- | ---------------------------------- |
| **Embeddings** | `sentence-transformers/all-MiniLM-L6-v2` | Fast, lightweight, good quality    |
| **Vector DB**  | ChromaDB                                 | Easy setup, persistent storage     |
| **LLM**        | HuggingFace Inference API (Mistral-7B)   | Free tier, good code understanding |
| **Framework**  | LangChain                                | Industry standard, modular         |
| **UI**         | Gradio                                   | Rapid prototyping, HF integration  |
| **Deployment** | Docker + HF Spaces                       | Free, scalable, shareable          |

## 🔮 Future Enhancements

- [ ] Multi-documentation support (React, Django, etc.)
- [ ] Conversation memory for follow-up questions
- [ ] Advanced retrieval (HyDE, Multi-Query)
- [ ] User feedback loop for continuous improvement
- [ ] Analytics dashboard for query patterns

## 📝 License

MIT License - feel free to use for your portfolio!

## 🤝 Contributing

This is a portfolio project, but suggestions are welcome via issues.

## 📧 Contact

Built by Aishwarya as a portfolio demonstration of production RAG systems.

- Portfolio: https://aishwarya30998.github.io/projects.html
- LinkedIn: https://www.linkedin.com/in/aishwarya-pentyala/

---

⭐ If this helped you understand production RAG, give it a star!