Spaces: Abeshith/rag-chatbot

Abeshith committed on Jan 22

Commit 7c3a93a (1 parent: b66add6)

Simplify README with clear flow and user-friendly explanations

Files changed (1)

README.md +129 -164

@@ -10,210 +10,175 @@ pinned: false
 # RAG Chatbot with Advanced Retrieval
-Enterprise-grade Retrieval-Augmented Generation (RAG) chatbot built with LangChain, FastAPI, and modern AI technologies.
-## 🚀 Features
-- **Hybrid Retrieval**: Combines BM25 and vector search for optimal document retrieval
-- **Reranking**: FlashRank reranker for improved result quality
-- **Streaming Responses**: Real-time chat with Server-Sent Events (SSE)
-- **Conversation Memory**: Redis-backed chat history
-- **Smart Caching**: Semantic caching with RAG/non-RAG distinction
-- **Document Processing**: Support for PDF, DOCX, and TXT files
-- **Background Processing**: Celery workers for async document processing
-- **Real-time Updates**: MongoDB change streams for live notifications
-- **Vector Database**: Qdrant for scalable vector storage
-## 🏗️ Architecture
 ```
-├── app/                    # Main application
-│   ├── api/               # FastAPI routes and middleware
-│   ├── core/              # RAG components (retriever, reranker, generator)
-│   ├── db/                # Database clients (MongoDB, Redis, Qdrant)
-│   ├── models/            # Pydantic schemas
-│   ├── services/          # MongoDB watcher
-│   ├── tasks/             # Celery background tasks
-│   └── utils/             # Utilities (logger, errors, prompts)
-├── ingestion/             # Document processing pipeline
-├── frontend/              # Web interface (HTML/CSS/JS)
-├── config/                # YAML configurations
-├── tests/                 # Test suite
-└── prompts/               # LLM prompt templates
 ```
-## 📦 Tech Stack
-- **Framework**: FastAPI + Uvicorn
-- **LLM**: Groq API (llama-3.1-70b)
-- **Embeddings**: FastEmbed (BAAI/bge-small-en-v1.5)
-- **Vector Store**: Qdrant Cloud
-- **Databases**: MongoDB Atlas, Redis Cloud
-- **Reranking**: FlashRank (ms-marco-MiniLM-L-12-v2)
-- **Background Jobs**: Celery
-- **LangChain**: Version 0.3.13 with LangGraph 0.2.58
-## 🛠️ Installation
-### Local Setup
-1. **Clone the repository**
-```bash
-git clone https://github.com/YOUR_USERNAME/rag-chatbot.git
-cd rag-chatbot
 ```
-2. **Create virtual environment**
-```bash
-python -m venv venv
-source venv/bin/activate  # On Windows: venv\Scripts\activate
 ```
-3. **Install dependencies**
-```bash
-pip install -r requirements.txt
-```
-4. **Configure environment**
-Create a `.env` file in the root directory:
-```env
-GROQ_API_KEY=your_groq_api_key
-QDRANT_API_KEY=your_qdrant_api_key
-REDIS_PASSWORD=your_redis_password
-```
-5. **Update configuration**
-Edit `config/database.yaml` with your MongoDB, Redis, and Qdrant URLs.
-6. **Run the application**
-```bash
-uvicorn app.main:app --host 0.0.0.0 --port 7860
-```
-Visit `http://localhost:7860` to access the chat interface.
-### Docker Setup
-1. **Build and run with Docker Compose**
-```bash
-docker-compose up -d
-```
-2. **View logs**
-```bash
-docker-compose logs -f app
-```
-3. **Stop services**
-```bash
-docker-compose down
-```
-## 🧪 Testing
-Run the test suite:
-```bash
-pytest tests/ -v
-```
-Run specific test categories:
-```bash
-# Unit tests only
-pytest tests/ -m unit
-# Integration tests only
-pytest tests/ -m integration
-# Skip slow tests
-pytest tests/ -m "not slow"
-```
-## 🚀 Deployment
-### Hugging Face Spaces
-1. **Create a new Space** on [Hugging Face](https://huggingface.co/spaces)
-2. **Select Docker SDK** as the space type
-3. **Add secrets** in Space settings:
-   - `GROQ_API_KEY`
-   - `QDRANT_API_KEY`
-   - `REDIS_PASSWORD`
-4. **Push code** to the Space repository
-5. **Automatic deployment** via GitHub Actions (see `.github/workflows/deploy.yml`)
-### Manual Deployment
-```bash
-# Build Docker image
-docker build -t rag-chatbot .
-# Run container
-docker run -p 7860:7860 \
-  -e GROQ_API_KEY=your_key \
-  -e QDRANT_API_KEY=your_key \
-  -e REDIS_PASSWORD=your_password \
-  rag-chatbot
 ```
-## 📚 Usage
-### Document Upload
-1. Click "Upload Document" in the sidebar
-2. Select a PDF, DOCX, or TXT file
-3. Wait for processing (documents are chunked and embedded)
-4. Document appears in the sidebar
-### Chat
-1. Toggle RAG on/off using the switch
-2. Type your question in the input field
-3. Press Enter or click Send
-4. Receive streaming responses in real-time
-### RAG vs Non-RAG
-- **RAG ON**: Answers based on your uploaded documents
-- **RAG OFF**: Answers from LLM's general knowledge
-## 🔧 Configuration
-All configuration is in `config/*.yaml` files:
-- `app.yaml` - Server and upload settings
-- `database.yaml` - Database connections
-- `models.yaml` - LLM, embedding, reranker configs
-- `rag.yaml` - Retrieval and chunking parameters
-- `security.yaml` - CORS, rate limiting, JWT
-- `celery.yaml` - Background worker settings
-- `langchain.yaml` - LangChain tracing
-## 🤝 Contributing
-Contributions are welcome! Please:
-1. Fork the repository
-2. Create a feature branch (`git checkout -b feature/amazing-feature`)
-3. Commit your changes (`git commit -m 'Add amazing feature'`)
-4. Push to the branch (`git push origin feature/amazing-feature`)
-5. Open a Pull Request
-## 📄 License
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-## 🙏 Acknowledgments
-- LangChain for the RAG framework
-- Groq for fast LLM inference
-- Qdrant for vector storage
-- FlashRank for efficient reranking
-- FastEmbed for lightweight embeddings
-## 📧 Contact
-For questions or support, please open an issue on GitHub.
----
-**Built with ❤️ using LangChain, FastAPI, and modern AI technologies**

 # RAG Chatbot with Advanced Retrieval
+A question-answering system that lets you upload documents and ask questions about them. The system retrieves the relevant passages from your documents and generates answers grounded in that content.
+## How It Works
+### When You Upload a Document
 ```
+1. Upload File (PDF/DOCX/TXT)
+        ↓
+2. Extract Text
+        ↓
+3. Split into Chunks (512 tokens each)
+        ↓
+4. Convert to Embeddings (384D vectors)
+        ↓
+5. Store in Vector Database (Qdrant)
+        ↓
+6. Save Metadata in MongoDB
 ```
+**What happens:** Your document is split into small chunks. Each chunk is converted into a numerical vector that captures its meaning, and the vectors are stored in Qdrant for fast similarity search.
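A minimal sketch of step 3 (splitting into chunks), assuming whitespace-separated words stand in for tokens. The README does not say which tokenizer or how much overlap the project uses, so `chunk_tokens`, `chunk_size`, and `overlap` are hypothetical names chosen for illustration only.

```python
from typing import List


def chunk_tokens(text: str, chunk_size: int = 512, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `chunk_size` whitespace tokens.

    Assumption: a "token" here is a whitespace-separated word, not a
    model tokenizer token. Consecutive chunks share `overlap` tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    tokens = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

Each returned chunk would then be passed to the embedder (step 4). A non-zero overlap is a common way to avoid cutting a sentence's context in half at a chunk boundary.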
+### When You Ask a Question
 ```
+1. Type Your Question
+        ↓
+2. Check Cache (answered before?)
+        ↓
+3. Search Documents (if RAG is ON)
+   - BM25: Find keyword matches
+   - Vector: Find similar meanings
+        ↓
+4. Rerank Results (pick top 5 most relevant)
+        ↓
+5. Build Context from Chunks
+        ↓
+6. Generate Answer with LLM
+        ↓
+7. Stream Response to You
 ```
+**What happens:** The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context.
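A minimal sketch of how the BM25 and vector results in step 3 could be merged into one candidate list before reranking. The README does not say how the two rankings are combined. Reciprocal rank fusion (RRF) is used here as an assumed technique, and `reciprocal_rank_fusion`, `top_candidates`, and the constant `k=60` are illustrative choices rather than the project's code.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of chunk IDs into one list, best first.

    Each chunk scores 1 / (k + rank) in every list it appears in, and
    the scores are summed across lists.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def top_candidates(
    bm25_ranked: Sequence[str],
    vector_ranked: Sequence[str],
    n_candidates: int = 20,
) -> List[str]:
    """Return the fused candidates that would be handed to the reranker."""
    return reciprocal_rank_fusion([bm25_ranked, vector_ranked])[:n_candidates]
```

A chunk that ranks well in both lists beats one that ranks first in only one of them, which is the behaviour a hybrid retriever is after. The reranker in step 4 would then score these candidates and keep the top 5.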
+## Key Components
+### Document Processing
+**DocumentProcessor** - Main coordinator for document uploads
+- Validates file type and size
+- Calls the right loader for PDF, DOCX, or TXT files
+- Manages the entire processing pipeline
+**Embedder** - Converts text to vectors
+- Uses FastEmbed with BAAI/bge-small-en-v1.5 model
+- Generates 384-dimensional vectors for semantic search
+- Each chunk becomes a searchable vector
+**Qdrant Vector Store** - Stores embeddings
+- Fast similarity search across millions of vectors
+- Returns most relevant chunks for any query
+- Handles all vector operations
+### Question Answering
+**HybridRetriever** - Finds relevant information
+- **BM25**: Traditional keyword search (good for exact matches)
+- **Vector Search**: Semantic search (understands meaning)
+- Combines both for better results
+**Reranker** - Improves search quality
+- Uses FlashRank model to score relevance
+- Keeps the 5 best chunks out of 20 candidates
+- Ensures only the most relevant context is used
+**Generator** - Creates answers
+- Uses Groq LLM (llama-3.1-70b)
+- Streams responses in real-time
+- Bases answers on retrieved context when RAG is ON
+- Uses general knowledge when RAG is OFF
+**Semantic Cache** - Speeds up responses
+- Remembers previous questions and answers
+- Returns the cached response if the same question is asked again
+- Separate caches for RAG ON vs RAG OFF
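A minimal sketch of the "separate caches for RAG ON vs RAG OFF" idea, assuming an exact match on the normalized question text. A real semantic cache would more likely compare question embeddings, and the project stores its cache in Redis. The `ResponseCache` class below is a hypothetical in-memory stand-in that only illustrates the key namespacing.

```python
import hashlib
from typing import Dict, Optional


class ResponseCache:
    """In-memory answer cache keyed by (RAG mode, normalized question)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    @staticmethod
    def _key(question: str, rag_enabled: bool) -> str:
        # Normalize case and whitespace so trivial variations hit the cache.
        normalized = " ".join(question.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        # The namespace prefix keeps RAG and non-RAG answers apart.
        namespace = "rag" if rag_enabled else "no_rag"
        return f"{namespace}:{digest}"

    def get(self, question: str, rag_enabled: bool) -> Optional[str]:
        return self._store.get(self._key(question, rag_enabled))

    def set(self, question: str, rag_enabled: bool, answer: str) -> None:
        self._store[self._key(question, rag_enabled)] = answer
```

Keeping the RAG mode in the key matters because the same question can have a different correct answer depending on whether uploaded documents were consulted.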
+### Memory & Storage
+**Conversation Memory** - Remembers chat history
+- Stores last 10 messages in Redis
+- Enables follow-up questions
+- Each session has independent history
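A minimal sketch of a per-session history capped at the last 10 messages. The project keeps this in Redis. The in-memory `ConversationMemory` class below is a hypothetical stand-in so the example runs without a server, and the `(role, content)` message shape is an assumption.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content); assumed shape


class ConversationMemory:
    """Keeps only the most recent `max_messages` messages per session."""

    def __init__(self, max_messages: int = 10) -> None:
        # deque(maxlen=...) silently drops the oldest item when full.
        self._sessions: Dict[str, Deque[Message]] = defaultdict(
            lambda: deque(maxlen=max_messages)
        )

    def add(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append((role, content))

    def history(self, session_id: str) -> List[Message]:
        # .get avoids creating an empty entry for unknown sessions.
        return list(self._sessions.get(session_id, ()))
```

With Redis, the same cap is commonly achieved by appending with `RPUSH` and then trimming with `LTRIM key -10 -1`. Whether the project does it that way is not stated in the README.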
+**MongoDB** - Document metadata
+- Tracks uploaded documents
+- Stores file info, upload time, chunk count
+- Links to vectors in Qdrant
+**Redis** - Fast caching
+- Stores conversation history
+- Caches LLM responses
+- In-memory for instant access
+## Technology Stack
+- **LangChain 0.3.13** - RAG framework
+- **Groq API** - Fast LLM (llama-3.1-70b)
+- **FastEmbed** - Embedding generation
+- **FlashRank** - Result reranking
+- **Qdrant** - Vector database
+- **MongoDB** - Document storage
+- **Redis** - Caching layer
+- **FastAPI** - Web framework
+## Quick Start
+### Installation
+```bash
+# Clone and install
+git clone https://github.com/Abeshith/RAG.git
+cd RAG
+pip install -r requirements.txt
 ```
+### Configuration
+Create a `.env` file:
+```env
+GROQ_API_KEY=your_groq_key
+MONGODB_URI=your_mongodb_uri
+REDIS_URL=your_redis_url
+QDRANT_URL=your_qdrant_url
+QDRANT_API_KEY=your_qdrant_key
+JWT_SECRET_KEY=your_secret_key
+```
+### Run
+```bash
+uvicorn app.main:app --host 0.0.0.0 --port 7860
+```
+Open: http://localhost:7860
+## Usage
+1. **Upload Documents**: Click upload and select a PDF, DOCX, or TXT file
+2. **Ask Questions**: Type your question in the chat box
+3. **Toggle RAG**:
+   - ON = answers from your documents
+   - OFF = general knowledge answers
+4. **View Sources**: See which document chunks were used
+## API Endpoints
+```
+GET    /health/          - Check system status
+POST   /chat/stream      - Send question, get streaming answer
+POST   /documents/upload - Upload new document
+GET    /documents/       - List all documents
+GET    /documents/stats  - Get document statistics
+DELETE /documents/{id}   - Delete specific document
+```
+## Docker Deployment
+```bash
+docker build -t rag-chatbot .
+docker run -p 7860:7860 --env-file .env rag-chatbot
+```