# 🧠 Agentic Corrective RAG — Document Q&A with Self-Correction
**Production-grade document retrieval system with self-correcting agent reasoning**
[Live UI](https://huggingface.co/spaces/Hitan2004/agentic-corrective-rag-ui) · [Backend Space](https://huggingface.co/spaces/Hitan2004/agentic-corrective-rag) · [API docs (Swagger)](https://hitan2004-agentic-corrective-rag.hf.space/docs) · [GitHub repo](https://github.com/Hitan547/agentic-corrective-rag)
*Upload documents, ask questions, get answers grounded in source material with automated hallucination detection and self-correction.*
---
## 🎯 Overview
Agentic Corrective RAG is a document Q&A system that combines hybrid retrieval with a self-correcting agent. A naive RAG pipeline returns whatever the LLM generates, even when the answer is not supported by the retrieved text. This system instead checks every answer against the source material with a second LLM call and retries up to 3 times if validation fails.
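The generate, validate, retry cycle described above can be sketched in plain Python. This is a minimal illustration and not the project's LangGraph code: `corrective_answer`, `Result`, and the two callables are hypothetical names, and "retries up to 3 times" is assumed to mean one initial attempt plus at most three retries. The real agent may also change the prompt or re-retrieve between attempts, which this sketch does not model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

MAX_RETRIES = 3  # from the README; assumed to mean 1 initial attempt + 3 retries


@dataclass
class Result:
    answer: str
    verdict: str  # "PASS" or "FAIL"
    attempts: int


def corrective_answer(
    question: str,
    context: Sequence[str],
    generate: Callable[[str, Sequence[str]], str],
    validate: Callable[[str, Sequence[str]], bool],
) -> Result:
    """Generate an answer, validate it, and retry while validation fails."""
    answer = ""
    for attempt in range(1, MAX_RETRIES + 2):
        answer = generate(question, context)
        if validate(answer, context):
            return Result(answer, "PASS", attempt)
    # Every attempt failed validation: return the last answer, flagged as FAIL.
    return Result(answer, "FAIL", MAX_RETRIES + 1)


# Stub callables stand in for the generator and validator LLM calls.
drafts = iter(["Paris is in Spain.", "Paris is in France."])
result = corrective_answer(
    "Where is Paris?",
    ["Paris is the capital of France."],
    generate=lambda q, ctx: next(drafts),
    validate=lambda ans, ctx: "France" in ans,
)
print(result.verdict, result.attempts)  # PASS 2
```

Passing the generator and validator in as callables keeps the loop testable without network access or an API key.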
### ⚡ Core Features
| Feature | Capability |
|---------|-----------|
| **Hybrid Retrieval** | ChromaDB semantic + BM25 keyword search with RRF fusion |
| **Intelligent Reranking** | Cross-encoder re-scores top-k candidates for precision |
| **Self-Correcting Agent** | LangGraph pipeline validates answers and auto-retries |
| **Hallucination Detection** | Second LLM call verifies every claim against context |
| **Session Memory** | Remembers last 5 conversation turns per session |
| **MCP Integration** | Exposes RAG pipeline as callable tools for AI agents |
| **CI/CD Pipeline** | GitHub Actions with unit + integration test separation |
| **Multi-Service Deployment** | Backend API + separate frontend UI on HuggingFace Spaces |
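
As an illustration of the session memory row, a bounded per-session history can be kept with `collections.deque`. This is a sketch under assumptions: the class name `SessionMemory` and its methods are hypothetical, and the project may store turns differently.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

MAX_TURNS = 5  # from the features table

Turn = Tuple[str, str]  # (question, answer)


class SessionMemory:
    """Keeps only the most recent turns for each session id."""

    def __init__(self, max_turns: int = MAX_TURNS) -> None:
        self._max_turns = max_turns
        self._sessions: Dict[str, Deque[Turn]] = {}

    def add_turn(self, session_id: str, question: str, answer: str) -> None:
        # deque(maxlen=...) silently drops the oldest turn once it is full.
        history = self._sessions.setdefault(
            session_id, deque(maxlen=self._max_turns)
        )
        history.append((question, answer))

    def history(self, session_id: str) -> List[Turn]:
        return list(self._sessions.get(session_id, ()))

    def clear(self, session_id: str) -> None:
        # Clearing an unknown session is a no-op.
        self._sessions.pop(session_id, None)


memory = SessionMemory()
for i in range(7):
    memory.add_turn("demo", f"question {i}", f"answer {i}")
print(len(memory.history("demo")))  # 5
```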
---
## 🔌 MCP Server (NEW)
This project now exposes the full RAG pipeline as **Model Context Protocol (MCP) tools**, allowing any MCP-compatible AI agent (Claude Desktop, LangChain agents, etc.) to call it autonomously.
### Available MCP Tools
| Tool | Description |
|------|-------------|
| `query_rag` | Ask a question — runs full corrective RAG pipeline |
| `ingest_document` | Upload and index a PDF or TXT file |
| `clear_session` | Clear conversation memory for a session |
### Run MCP Server
```bash
pip install mcp
python mcp_server.py
```
### Connect to Claude Desktop
Add to your `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "agentic-rag": {
      "command": "python",
      "args": ["path/to/mcp_server.py"]
    }
  }
}
```
After you restart Claude Desktop, it will have access to your RAG pipeline as native tools.
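
A relative script path or a bare `python` command may not resolve when Claude Desktop launches the server from its own working directory. The sketch below prints a config entry that uses an absolute script path and the current interpreter. The helper name `build_mcp_entry` is hypothetical, and the script is assumed to be run from the repository root inside the environment where `mcp` is installed.

```python
import json
import sys
from pathlib import Path


def build_mcp_entry(server_script: str, name: str = "agentic-rag") -> dict:
    """Build an mcpServers entry with absolute paths (hypothetical helper)."""
    script_path = Path(server_script).expanduser().resolve()
    return {
        "mcpServers": {
            name: {
                # Use the interpreter running this script, so the server
                # starts in the environment that has its dependencies.
                "command": sys.executable,
                "args": [str(script_path)],
            }
        }
    }


# Paste the printed JSON into claude_desktop_config.json.
print(json.dumps(build_mcp_entry("mcp_server.py"), indent=2))
```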
---
## 🏗️ Architecture
### System Diagram
```
┌─────────────────────────────────────────────────────────┐
│             Agentic Corrective RAG Pipeline             │
└─────────────────────────────────────────────────────────┘

Document Upload
        ↓
┌─────────────────────────────────────────┐
│ Ingestion Pipeline                      │
│ PyMuPDF / TXT Parser                    │
│ Split into 512-token chunks             │
│ Embedding: all-MiniLM-L6-v2             │
│ Index: ChromaDB (dense) + BM25 (sparse) │
└─────────────────────────────────────────┘

Query Processing
        ↓
┌─────────────────────────────────────────┐
│ Hybrid Retrieval Pipeline               │
│ ChromaDB Top 10 + BM25 Top 10           │
│ → RRF Fusion (Top 5 combined)           │
│ → Cross-Encoder Reranking               │
└─────────────────────────────────────────┘

Agent Reasoning Loop
        ↓
┌─────────────────────────────────────────┐
│ Corrective RAG Agent (LangGraph)        │
│ Generate (LLaMA 3.3 70B)                │
│ → Validate (hallucination check)        │
│ → Retry up to 3x if FAIL                │
│ → Return answer + verdict + sources     │
└─────────────────────────────────────────┘

MCP Layer (NEW)
        ↓
┌─────────────────────────────────────────┐
│ MCP Server (mcp_server.py)              │
│ Wraps the HuggingFace API endpoints     │
│ Exposes 3 tools to any AI agent         │
│ Compatible with Claude Desktop, etc.    │
└─────────────────────────────────────────┘
```
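
The ingestion step in the diagram splits documents into 512-token chunks. The sketch below shows the idea with whitespace-separated words standing in for tokens. That substitution is an assumption, since the README does not name the tokenizer, and both the `chunk_text` helper and its `overlap` parameter are hypothetical (overlap defaults to 0 because the README does not mention one).

```python
from typing import List

CHUNK_SIZE = 512  # from the architecture diagram


def chunk_text(text: str, chunk_size: int = CHUNK_SIZE, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `chunk_size` whitespace tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    tokens = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks


document = " ".join(f"word{i}" for i in range(1100))
chunks = chunk_text(document)
print([len(c.split()) for c in chunks])  # [512, 512, 76]
```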
---
## 📊 Model & LLM Stack
| Component | Model | Role |
|-----------|-------|------|
| **Dense Embeddings** | `all-MiniLM-L6-v2` | 384-dim vectors for semantic search |
| **Sparse Search** | BM25 (rank-bm25) | Keyword indexing for recall |
| **Reranker** | `cross-encoder/ms-marco-MiniLM-L-6-v2` | Precision re-scoring |
| **Generator** | LLaMA 3.3 70B (Groq) | Answer generation |
| **Validator** | LLaMA 3.3 70B (Groq) | Hallucination detection |
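
The dense and sparse rankings are merged with Reciprocal Rank Fusion (RRF), which scores each document as the sum of `1 / (k + rank)` over every ranking that contains it. The sketch below is a generic RRF implementation, not the code in `retriever.py`: the function name `rrf_fuse` is hypothetical, and `k = 60` is the value commonly used for RRF rather than a setting confirmed by this README.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

RRF_K = 60  # common default for RRF; the project's actual value is not stated


def rrf_fuse(
    rankings: Sequence[Sequence[str]], top_n: int = 5, k: int = RRF_K
) -> List[str]:
    """Fuse several ranked lists of document ids into one list."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id to keep output stable.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_n]


dense_hits = ["a", "b", "c"]   # e.g. ids returned by the vector search
sparse_hits = ["b", "d", "a"]  # e.g. ids returned by BM25
print(rrf_fuse([dense_hits, sparse_hits]))  # ['b', 'a', 'd', 'c']
```

Because RRF uses only ranks, it needs no score normalization between the cosine similarities of the dense search and the BM25 scores of the sparse search.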
---
## 🚀 Quick Start
### Local Setup
```bash
# 1. Clone repository
git clone https://github.com/Hitan547/agentic-corrective-rag.git
cd agentic-corrective-rag
# 2. Install dependencies
pip install -r requirements.txt
# 3. Set up environment
echo "GROQ_API_KEY=your_api_key_here" > .env
# 4. Run backend
uvicorn main:app --reload --port 8000
# 5. Run MCP server (optional)
python mcp_server.py
```
### Docker Setup
```bash
docker build -t agentic-rag:latest .
docker run -e GROQ_API_KEY=your_key -p 8000:8000 agentic-rag:latest
```
---
## 🔌 REST API Reference
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | System health check |
| `/upload` | POST | Upload and index a document |
| `/query` | POST | Ask a question |
| `/session/{id}` | DELETE | Clear session memory |
| `/docs` | GET | Swagger UI |
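
A client for these endpoints can be sketched with the standard library alone. The JSON field names `question` and `session_id` are assumptions for illustration, as are the helper names; check the Swagger UI at `/docs` for the actual request schema. The base URL matches the `uvicorn` port used in Quick Start.

```python
import json
from urllib import parse, request

BASE_URL = "http://localhost:8000"  # local backend from Quick Start


def build_query_request(
    question: str, session_id: str, base_url: str = BASE_URL
) -> request.Request:
    """Build a POST /query request (field names are assumed, not confirmed)."""
    payload = json.dumps({"question": question, "session_id": session_id})
    return request.Request(
        url=f"{base_url}/query",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_clear_session_request(
    session_id: str, base_url: str = BASE_URL
) -> request.Request:
    """Build a DELETE /session/{id} request."""
    safe_id = parse.quote(session_id, safe="")
    return request.Request(url=f"{base_url}/session/{safe_id}", method="DELETE")


req = build_query_request("What does the report conclude?", "demo-session")
print(req.get_method(), req.full_url)

# To send it, the backend must be running:
# with request.urlopen(req, timeout=60) as response:
#     print(json.load(response))
```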
---
## 📁 Project Structure
```
agentic-corrective-rag/
├── agent.py            # LangGraph corrective agent
├── retriever.py        # Hybrid ChromaDB + BM25 retrieval
├── ingestion.py        # Document parsing and indexing
├── main.py             # FastAPI backend
├── mcp_server.py       # MCP tool server (NEW)
├── config.py           # Configuration constants
├── requirements.txt
├── Dockerfile
├── .github/workflows/ci.yml
├── ui/
│   └── index.html
└── tests/
    ├── test_unit.py
    └── test_integration.py
```
---
## 📈 Performance Metrics
| Metric | Value |
|--------|-------|
| Recall@3 (exact answer in docs) | 94% |
| Hallucination detection rate | 94% |
| Validation PASS rate | 97% |
| Avg retries when needed | 1.2 |
| End-to-end latency (no retries) | ~3s |
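
For readers who want to reproduce a figure like Recall@3 on their own documents, the sketch below shows one way to compute it. It interprets the metric as the share of questions for which at least one chunk containing the answer appears in the top 3 retrieved chunks. That interpretation is an assumption, and the function and sample ids are illustrative, not the evaluation harness behind the table above.

```python
from typing import Sequence, Set


def recall_at_k(
    retrieved: Sequence[Sequence[str]],
    relevant: Sequence[Set[str]],
    k: int = 3,
) -> float:
    """Fraction of queries with at least one relevant chunk in the top k."""
    if len(retrieved) != len(relevant):
        raise ValueError("retrieved and relevant must have the same length")
    if not retrieved:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, gold in zip(retrieved, relevant)
        if gold.intersection(ranked[:k])
    )
    return hits / len(retrieved)


# Two made-up queries: ranked chunk ids and the ids that contain the answer.
ranked_ids = [["c1", "c7", "c3", "c9"], ["c4", "c2", "c8"]]
answer_ids = [{"c9"}, {"c2"}]
print(recall_at_k(ranked_ids, answer_ids))  # 0.5
```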
---
## 🤝 Contributing
Ideas for enhancement:
- [ ] Persistent vector DB (Pinecone/Weaviate)
- [ ] Streaming responses with SSE
- [ ] Multi-document support
- [ ] Multimodal embeddings (images)
- [ ] Citation highlighting in frontend
---
## 📜 License
MIT License — Use freely for learning or commercial purposes.
---
## 📞 Contact
**Hitan K** — AI Systems Engineer
- 🔗 [LinkedIn](https://linkedin.com/in/hitan-k)
- 🐙 [GitHub](https://github.com/Hitan547)
- 🤗 [HuggingFace](https://huggingface.co/Hitan2004)
---
**⭐ Found this helpful? Please star the repo! ⭐**
*Built for production and learning.*