---
title: RAG System with PDF Documents
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false
app_port: 8501
---
# 🤖 Conversational AI RAG System
A comprehensive Retrieval-Augmented Generation (RAG) system with advanced guard rails, built with Streamlit, FAISS, and Hugging Face models.
## 🚀 Features
- **Hybrid Search**: Combines dense (FAISS) and sparse (BM25) retrieval for optimal results
- **Advanced Guard Rails**: Comprehensive safety and security measures
- **Multiple Models**: Support for Qwen 2.5 1.5B and distilgpt2 fallback
- **PDF Processing**: Intelligent document chunking and processing
- **Real-time Monitoring**: Performance metrics and system health checks
- **Docker Support**: Containerized deployment with Docker Compose
- **Hugging Face Spaces Ready**: Optimized for HF Spaces deployment
## 🏗️ Architecture
```
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  Streamlit UI   │─────▶│   RAG System    │─────▶│   Guard Rails   │
└─────────────────┘      └─────────────────┘      └─────────────────┘
                                  │
                                  ▼
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  PDF Processor  │      │   FAISS Index   │      │ Language Model  │
└─────────────────┘      └─────────────────┘      └─────────────────┘
```
## 🛠️ Technology Stack
### Core Technologies
- **🔍 Vector Database**: FAISS for efficient similarity search
- **🔎 Sparse Retrieval**: BM25 for keyword-based search
- **🧠 Embedding Model**: all-MiniLM-L6-v2 for document embeddings
- **🤖 Generative Model**: Qwen 2.5 1.5B for answer generation
- **🎨 UI Framework**: Streamlit for interactive interface
- **🐳 Containerization**: Docker for deployment
### Supporting Libraries
- **📊 Data Processing**: Pandas, NumPy for data manipulation
- **📄 PDF Handling**: PyPDF for document processing
- **🔧 ML Utilities**: Scikit-learn for preprocessing
- **📝 Logging**: Loguru for structured logging
- **⚡ Optimization**: Accelerate for model optimization
## 🚀 Quick Start
### Local Development
1. **Clone and Setup**:
```bash
git clone <repository-url>
cd convAI
pip install -r requirements.txt
```
2. **Run the Application**:
```bash
streamlit run app.py
```
3. **Upload PDFs and Start Chatting**!
### Docker Deployment
1. **Build and Run**:
```bash
docker-compose up --build
```
2. **Access at**: http://localhost:8501
## 🌐 Hugging Face Spaces Deployment
This application is optimized for deployment on Hugging Face Spaces. The system automatically:
- Uses `/tmp` directories for cache storage (writable in HF Spaces)
- Configures environment variables for HF Spaces compatibility
- Handles permission issues automatically
- Optimizes model loading for HF Spaces environment
### HF Spaces Configuration
The application includes:
- **Cache Management**: All model caches stored in `/tmp` directories
- **Permission Handling**: Automatic fallback to writable directories
- **Environment Detection**: Adapts to HF Spaces runtime environment
- **Resource Optimization**: Efficient memory and CPU usage
### Deploy to HF Spaces
1. **Create a new Space** on Hugging Face
2. **Choose Docker** as the SDK
3. **Upload all files** from this repository
4. **The system will automatically**:
- Set up cache directories in `/tmp`
- Download and cache models
- Initialize the RAG system with guard rails
- Start the Streamlit interface
### HF Spaces Environment Variables
The system automatically configures:
```bash
HF_HOME=/tmp/huggingface
TRANSFORMERS_CACHE=/tmp/huggingface/transformers
TORCH_HOME=/tmp/torch
XDG_CACHE_HOME=/tmp
HF_HUB_CACHE=/tmp/huggingface/hub
```
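Applied in Python, this setup might look like the sketch below (illustrative; the exact mechanism in `app.py` may differ). The key point is that the variables must be set and the directories created before `transformers` or `torch` are imported, since those libraries read the cache locations at import time.

```python
import os

# Point all model and tooling caches at writable /tmp locations
# (HF Spaces containers restrict writes elsewhere).
CACHE_VARS = {
    "HF_HOME": "/tmp/huggingface",
    "TRANSFORMERS_CACHE": "/tmp/huggingface/transformers",
    "TORCH_HOME": "/tmp/torch",
    "XDG_CACHE_HOME": "/tmp",
    "HF_HUB_CACHE": "/tmp/huggingface/hub",
}

for name, path in CACHE_VARS.items():
    os.environ.setdefault(name, path)          # keep explicit overrides
    os.makedirs(os.environ[name], exist_ok=True)
```

Using `setdefault` rather than plain assignment lets a locally set variable win, which keeps the same code usable outside HF Spaces.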
## 📖 Usage Guide
### Document Upload
- **Automatic Loading**: PDF documents in the container are loaded automatically
- **Manual Upload**: Use the sidebar to upload additional PDF documents
- **Supported Formats**: PDF files with text content
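The chunking step behind document processing can be sketched as a simple overlapping word splitter. This is illustrative only: the function name `chunk_text` is not from the codebase, and the real processor works on text extracted with PyPDF and supports multiple chunk sizes.

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
    """Split extracted PDF text into overlapping word-based chunks.

    Overlap preserves context across chunk boundaries so a sentence
    split in two is still retrievable from either side.
    """
    words = text.split()
    step = max(chunk_size - overlap, 1)
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break
    return chunks
```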
### Search Methods
- **🔄 Hybrid**: Combines vector similarity and keyword matching (recommended)
- **🎯 Dense**: Uses only vector similarity search
- **🔎 Sparse**: Uses only keyword-based BM25 search
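One common way to fuse the two score lists in hybrid mode is min-max normalization followed by a weighted sum. The sketch below assumes that approach; the actual fusion formula in `rag_system.py` may differ.

```python
def hybrid_scores(dense: dict[str, float], sparse: dict[str, float],
                  alpha: float = 0.5) -> dict[str, float]:
    """Fuse dense (FAISS) and sparse (BM25) scores: min-max normalise
    each list onto [0, 1], then take a weighted sum per document."""
    def normalise(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    d, s = normalise(dense), normalise(sparse)
    return {doc: alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0)
            for doc in set(d) | set(s)}
```

Setting `alpha` closer to 1.0 favors vector similarity; closer to 0.0 favors keyword matches.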
### Query Interface
- **Natural Language**: Ask questions in plain English
- **Context Awareness**: System uses retrieved documents for context
- **Confidence Scores**: See how confident the system is in its answers
- **Source Citations**: View which documents were used for the answer
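A confidence score can be derived from the retrieval scores themselves, for example by softmaxing them and reporting the top probability. This is an assumption for illustration; the app's actual confidence formula is not documented here.

```python
import math

def confidence(scores: list[float]) -> float:
    """Softmax the retrieval scores and return the top probability:
    one clearly best chunk gives high confidence, a flat score
    distribution gives low confidence."""
    if not scores:
        return 0.0
    exps = [math.exp(s - max(scores)) for s in scores]  # stable softmax
    return max(exps) / sum(exps)
```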
## ⚙️ Configuration
### Environment Variables
```bash
# Model Configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct
# Chunk Sizes
CHUNK_SIZES=100,400
# Vector Store Path
VECTOR_STORE_PATH=./vector_store
# Streamlit Configuration
STREAMLIT_SERVER_PORT=8501
STREAMLIT_SERVER_ADDRESS=0.0.0.0
```
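Reading these variables in Python might look like the following sketch (the real config handling may differ; defaults mirror the values above). Note that `CHUNK_SIZES` is a comma-separated list and needs parsing.

```python
import os

EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
GENERATIVE_MODEL = os.getenv("GENERATIVE_MODEL", "Qwen/Qwen2.5-1.5B-Instruct")
VECTOR_STORE_PATH = os.getenv("VECTOR_STORE_PATH", "./vector_store")

# "100,400" -> [100, 400]: one index is built per chunk size
CHUNK_SIZES = [int(s) for s in os.getenv("CHUNK_SIZES", "100,400").split(",")]
```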
### Performance Tuning
- **Chunk Sizes**: Adjust for different document types (smaller for technical docs, larger for narratives)
- **Top-k Results**: Increase for more comprehensive answers, decrease for faster responses
- **Model Selection**: Choose between Qwen 2.5 1.5B and distilgpt2 based on performance needs
## 📊 Performance
### Optimization Features
- **Parallel Processing**: Documents are loaded concurrently for faster initialization
- **Optimized Search**: Hybrid retrieval combines the best of vector and keyword search
- **Memory Efficient**: Uses CPU-optimized models for deployment compatibility
- **Caching**: FAISS index and metadata are cached for faster subsequent queries
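The concurrent loading step can be sketched with a thread pool, which suits I/O-bound PDF reading. `load_pdf` below is a hypothetical stand-in for the real PyPDF extraction function.

```python
from concurrent.futures import ThreadPoolExecutor

def load_pdf(path: str) -> str:
    # Placeholder for the real PyPDF text extraction.
    return f"text of {path}"

def load_all(paths: list[str], workers: int = 4) -> list[str]:
    """Extract text from many PDFs concurrently; results come back
    in the same order as the input paths."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_pdf, paths))
```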
### Expected Performance
- **Document Loading**: ~2-5 seconds per PDF (depending on size)
- **Query Response**: ~1-3 seconds for typical questions
- **Memory Usage**: ~2-4GB RAM for typical document collections
- **Storage**: ~100MB per 1000 document chunks
## 🔧 Development
### Project Structure
```
convAI/
├── app.py                # Main Streamlit application
├── rag_system.py         # Core RAG system implementation
├── pdf_processor.py      # PDF processing utilities
├── requirements.txt      # Python dependencies
├── Dockerfile            # Container configuration
├── docker-compose.yml    # Multi-container setup
├── README.md             # This file
├── DEPLOYMENT_GUIDE.md   # Detailed deployment instructions
├── test_deployment.py    # Deployment testing script
├── test_docker.py        # Docker testing script
└── src/
    └── streamlit_app.py  # Sample Streamlit app
```
### Testing
```bash
# Test deployment readiness
python test_deployment.py
# Test Docker configuration
python test_docker.py
# Run local tests
streamlit run app.py
```
## 🐛 Troubleshooting
### Common Issues
1. **Model Loading Errors**
- Check internet connectivity for model downloads
- Verify sufficient disk space
- Try the fallback model (distilgpt2)
2. **Memory Issues**
- Reduce chunk sizes
- Use smaller embedding models
- Limit the number of documents
3. **Performance Issues**
- Adjust top-k parameter
- Use sparse search for keyword-heavy queries
- Consider hardware upgrades
4. **Docker Issues**
- Check Docker installation
- Verify port availability
- Check container logs
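The fallback to distilgpt2 mentioned above can be sketched generically: try loaders in order and keep the first that succeeds. `load_first_available` is a hypothetical helper; the real code calls the `transformers` loading APIs.

```python
def load_first_available(loaders):
    """Try (name, loader) pairs in order and return the first model
    that loads; collect failures so the final error is informative."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:   # e.g. network or disk failures
            errors.append((name, exc))
    raise RuntimeError(f"all model loads failed: {errors}")
```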
### Getting Help
- Check the logs in your Space's "Logs" tab
- Review the deployment guide for common solutions
- Create an issue in the project repository
## 🤝 Contributing
We welcome contributions! Please see our contributing guidelines for:
- Code style and standards
- Testing requirements
- Documentation updates
- Feature requests and bug reports
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 🙏 Acknowledgments
- **Hugging Face** for providing the platform and models
- **FAISS** team for the efficient vector search library
- **Streamlit** team for the excellent web framework
- **OpenAI** for inspiring the RAG architecture
---
*Built with ❤️ for efficient document question-answering*
**Ready to explore your documents? Start asking questions! 🚀**