# Agentic Corrective RAG: Document Q&A with Self-Correction

<div align="center">

**Production-grade document retrieval system with self-correcting agent reasoning**

[Live UI](https://huggingface.co/spaces/Hitan2004/agentic-corrective-rag-ui)
[Backend Space](https://huggingface.co/spaces/Hitan2004/agentic-corrective-rag)
[API Docs](https://hitan2004-agentic-corrective-rag.hf.space/docs)
[GitHub](https://github.com/Hitan547/agentic-corrective-rag)
[Tech Stack](#tech-stack)

*Upload documents, ask questions, and get answers grounded in source material, with automated hallucination detection and self-correction.*

</div>

---

## Overview

Agentic Corrective RAG is a production-grade document Q&A system that combines hybrid retrieval with a self-correcting agent. Unlike naive RAG pipelines, which can return answers the sources do not support, it validates every answer against the retrieved source material and retries up to 3 times if validation fails.
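
The generate, validate, and retry behaviour described above can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph implementation: `corrective_answer`, `generate`, and `validate` are hypothetical names, the two callables stand in for the LLM calls, and the sketch assumes "up to 3 times" means up to 3 retries after the first attempt.

```python
from typing import Callable, List, Tuple

MAX_RETRIES = 3  # assumption: 3 retries after the initial attempt


def corrective_answer(
    question: str,
    context: List[str],
    generate: Callable[[str, List[str]], str],
    validate: Callable[[str, List[str]], bool],
    max_retries: int = MAX_RETRIES,
) -> Tuple[str, str, int]:
    """Generate an answer, check it, and regenerate while the check fails.

    Returns (answer, verdict, retries_used), where verdict is "PASS" or "FAIL".
    """
    retries = 0
    answer = generate(question, context)
    passed = validate(answer, context)
    while not passed and retries < max_retries:
        retries += 1
        answer = generate(question, context)
        passed = validate(answer, context)
    return answer, ("PASS" if passed else "FAIL"), retries


if __name__ == "__main__":
    # Stub "LLM": the first draft is unsupported, the second is grounded.
    drafts = iter(["Paris is in Spain.", "Paris is the capital of France."])
    chunks = ["Paris is the capital of France."]
    print(
        corrective_answer(
            "What is Paris?",
            chunks,
            generate=lambda q, ctx: next(drafts),
            validate=lambda ans, ctx: any(ans in c for c in ctx),
        )
    )
```

Passing the generator and validator as callables keeps the loop testable without network access; in the real system both would be LLM calls.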

### Core Features

| Feature | Capability |
|---------|-----------|
| **Hybrid Retrieval** | ChromaDB semantic + BM25 keyword search with RRF fusion |
| **Intelligent Reranking** | Cross-encoder re-scores top-k candidates for precision |
| **Self-Correcting Agent** | LangGraph pipeline validates answers and auto-retries |
| **Hallucination Detection** | Second LLM call verifies every claim against context |
| **Session Memory** | Remembers last 5 conversation turns per session |
| **MCP Integration** | Exposes RAG pipeline as callable tools for AI agents |
| **CI/CD Pipeline** | GitHub Actions with unit + integration test separation |
| **Multi-Service Deployment** | Backend API + separate frontend UI on HuggingFace Spaces |
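
The session memory row can be illustrated with a bounded queue per session. This is a sketch under assumptions, not the project's code: `SessionMemory` and its methods are hypothetical names, and a turn is assumed to be a (question, answer) pair.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

MAX_TURNS = 5  # "last 5 conversation turns per session"

Turn = Tuple[str, str]  # assumed shape: (question, answer)


class SessionMemory:
    """Keeps only the most recent turns for each session id."""

    def __init__(self, max_turns: int = MAX_TURNS) -> None:
        self._max_turns = max_turns
        self._sessions: Dict[str, Deque[Turn]] = {}

    def add_turn(self, session_id: str, question: str, answer: str) -> None:
        # deque(maxlen=...) silently drops the oldest turn once full.
        turns = self._sessions.setdefault(
            session_id, deque(maxlen=self._max_turns)
        )
        turns.append((question, answer))

    def history(self, session_id: str) -> List[Turn]:
        return list(self._sessions.get(session_id, ()))

    def clear(self, session_id: str) -> None:
        self._sessions.pop(session_id, None)


if __name__ == "__main__":
    memory = SessionMemory()
    for i in range(7):
        memory.add_turn("demo", f"question {i}", f"answer {i}")
    print(memory.history("demo"))  # only the 5 most recent turns remain
```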

---

## MCP Server (NEW)

This project now exposes the full RAG pipeline as **Model Context Protocol (MCP) tools**, allowing any MCP-compatible AI agent (Claude Desktop, LangChain agents, etc.) to call it autonomously.

### Available MCP Tools

| Tool | Description |
|------|-------------|
| `query_rag` | Ask a question; runs the full corrective RAG pipeline |
| `ingest_document` | Upload and index a PDF or TXT file |
| `clear_session` | Clear conversation memory for a session |

### Run MCP Server

```bash
pip install mcp
python mcp_server.py
```

### Connect to Claude Desktop

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "agentic-rag": {
      "command": "python",
      "args": ["path/to/mcp_server.py"]
    }
  }
}
```

Claude Desktop will then have access to your RAG pipeline as native tools.
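
For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request after the initial handshake. The sketch below only builds such a message with the standard library. `build_tool_call` is a hypothetical helper, and the argument names `question` and `session_id` are assumptions; the actual input schema is defined in `mcp_server.py`.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Argument names are illustrative; check the tool's declared schema.
    print(
        build_tool_call(
            1,
            "query_rag",
            {"question": "What is the notice period?", "session_id": "demo"},
        )
    )
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for reading logs or debugging a connection.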

---

## Architecture

### System Diagram

```
┌───────────────────────────────────────────────────────┐
│            Agentic Corrective RAG Pipeline            │
└───────────────────────────────────────────────────────┘

Document Upload
        ↓
┌─────────────────────────────────────────┐
│ Ingestion Pipeline                      │
│ PyMuPDF / TXT Parser                    │
│ Split into 512-token chunks             │
│ Embedding: all-MiniLM-L6-v2             │
│ Index: ChromaDB (dense) + BM25 (sparse) │
└─────────────────────────────────────────┘

Query Processing
        ↓
┌─────────────────────────────────────────┐
│ Hybrid Retrieval Pipeline               │
│ ChromaDB Top 10 + BM25 Top 10           │
│ → RRF Fusion (Top 5 combined)           │
│ → Cross-Encoder Reranking               │
└─────────────────────────────────────────┘

Agent Reasoning Loop
        ↓
┌─────────────────────────────────────────┐
│ Corrective RAG Agent (LangGraph)        │
│ Generate (LLaMA 3.3 70B)                │
│ → Validate (hallucination check)        │
│ → Retry up to 3x if FAIL                │
│ → Return answer + verdict + sources     │
└─────────────────────────────────────────┘

MCP Layer (NEW)
        ↓
┌─────────────────────────────────────────┐
│ MCP Server (mcp_server.py)              │
│ Wraps the HuggingFace API endpoints     │
│ Exposes 3 tools to any AI agent         │
│ Compatible with Claude Desktop, etc.    │
└─────────────────────────────────────────┘
```
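
The RRF fusion step in the diagram merges the dense and sparse rankings by summing reciprocal ranks. The sketch below shows the general technique, not the code in `retriever.py`: `rrf_fuse` is a hypothetical name, and the constant `k = 60` is the value commonly used for RRF, assumed here because the README does not state one.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(
    rankings: Sequence[Sequence[str]], k: int = 60, top_n: int = 5
) -> List[str]:
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_n]


if __name__ == "__main__":
    dense_hits = ["c3", "c1", "c7"]   # e.g. ChromaDB order
    sparse_hits = ["c1", "c9", "c3"]  # e.g. BM25 order
    print(rrf_fuse([dense_hits, sparse_hits]))  # ['c1', 'c3', 'c9', 'c7']
```

Because RRF uses only ranks, it needs no score normalization between the embedding similarity and BM25 scales, which is one reason it is a common choice for hybrid search.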

---

## Model & LLM Stack

| Component | Model | Role |
|-----------|-------|------|
| **Dense Embeddings** | `all-MiniLM-L6-v2` | 384-dim vectors for semantic search |
| **Sparse Search** | BM25 (rank-bm25) | Keyword indexing for recall |
| **Reranker** | `cross-encoder/ms-marco-MiniLM-L-6-v2` | Precision re-scoring |
| **Generator** | LLaMA 3.3 70B (Groq) | Answer generation |
| **Validator** | LLaMA 3.3 70B (Groq) | Hallucination detection |

---

## Quick Start

### Local Setup

```bash
# 1. Clone repository
git clone https://github.com/Hitan547/agentic-corrective-rag.git
cd agentic-corrective-rag

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set up environment
echo "GROQ_API_KEY=your_api_key_here" > .env

# 4. Run backend
uvicorn main:app --reload --port 8000

# 5. Run MCP server (optional)
python mcp_server.py
```

### Docker Setup

```bash
docker build -t agentic-rag:latest .
docker run -e GROQ_API_KEY=your_key -p 8000:8000 agentic-rag:latest
```

---

## REST API Reference

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | System health check |
| `/upload` | POST | Upload and index a document |
| `/query` | POST | Ask a question |
| `/session/{id}` | DELETE | Clear session memory |
| `/docs` | GET | Swagger UI |
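
As a rough guide to calling these endpoints from Python, the sketch below builds (but does not send) requests with the standard library. The JSON field names `question` and `session_id` are assumptions, as are the helper names; the authoritative request schema is the Swagger UI at `/docs`. The base URL assumes the local backend from the Quick Start.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # local uvicorn port from the Quick Start


def build_query_request(
    question: str, session_id: str, base_url: str = BASE_URL
) -> request.Request:
    """Prepare a POST /query request; field names are assumed, see /docs."""
    body = json.dumps({"question": question, "session_id": session_id})
    return request.Request(
        url=f"{base_url}/query",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_clear_session_request(
    session_id: str, base_url: str = BASE_URL
) -> request.Request:
    """Prepare a DELETE /session/{id} request."""
    return request.Request(url=f"{base_url}/session/{session_id}", method="DELETE")


if __name__ == "__main__":
    req = build_query_request("What is the notice period?", "demo")
    print(req.get_method(), req.full_url)
    # With the backend running, send it with:
    #   with request.urlopen(req) as resp:
    #       print(json.load(resp))
```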

---

## Project Structure

```
agentic-corrective-rag/
├── agent.py              # LangGraph corrective agent
├── retriever.py          # Hybrid ChromaDB + BM25 retrieval
├── ingestion.py          # Document parsing and indexing
├── main.py               # FastAPI backend
├── mcp_server.py         # MCP tool server (NEW)
├── config.py             # Configuration constants
├── requirements.txt
├── Dockerfile
├── .github/workflows/ci.yml
├── ui/
│   └── index.html
└── tests/
    ├── test_unit.py
    └── test_integration.py
```

---

## Performance Metrics

| Metric | Value |
|--------|-------|
| Recall@3 (exact answer in docs) | 94% |
| Hallucination detection rate | 94% |
| Validation PASS rate | 97% |
| Avg retries when needed | 1.2 |
| End-to-end latency (no retries) | ~3s |
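
The README does not say how these figures were measured. As one plausible reading of "Recall@3 (exact answer in docs)", the sketch below counts a question as a hit when its reference answer string appears in any of the top-k retrieved chunks. `recall_at_k` and the case-insensitive substring match are assumptions made for illustration.

```python
from typing import Sequence


def recall_at_k(
    retrieved: Sequence[Sequence[str]], answers: Sequence[str], k: int = 3
) -> float:
    """Fraction of questions whose answer text appears in the top-k chunks."""
    if len(retrieved) != len(answers):
        raise ValueError("retrieved and answers must have the same length")
    if not answers:
        return 0.0
    hits = 0
    for chunks, answer in zip(retrieved, answers):
        # Case-insensitive substring match against the first k chunks only.
        if any(answer.lower() in chunk.lower() for chunk in chunks[:k]):
            hits += 1
    return hits / len(answers)


if __name__ == "__main__":
    retrieved_chunks = [
        ["The notice period is 30 days.", "Payment is due monthly."],
        ["Unrelated text.", "More unrelated text."],
    ]
    reference_answers = ["30 days", "Net 60"]
    print(recall_at_k(retrieved_chunks, reference_answers))  # 0.5
```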

---

## Contributing

Ideas for enhancement:

- [ ] Persistent vector DB (Pinecone/Weaviate)
- [ ] Streaming responses with SSE
- [ ] Multi-document support
- [ ] Multimodal embeddings (images)
- [ ] Citation highlighting in frontend

---

## License

MIT License. Use freely for learning or commercial purposes.

---

## Contact

**Hitan K**, AI Systems Engineer

- [LinkedIn](https://linkedin.com/in/hitan-k)
- [GitHub](https://github.com/Hitan547)
- [HuggingFace](https://huggingface.co/Hitan2004)

---

<div align="center">

**Found this helpful? Please star the repo!**

*Built for production and learning.*

</div>