---
title: RAG System with PDF Documents
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
app_port: 8501
---
# 🤖 Conversational AI RAG System

A comprehensive Retrieval-Augmented Generation (RAG) system with advanced guard rails, built with Streamlit, FAISS, and Hugging Face models.

## 🌟 Features

- **Hybrid Search**: Combines dense (FAISS) and sparse (BM25) retrieval for better results
- **Advanced Guard Rails**: Comprehensive safety and security measures
- **Multiple Models**: Qwen 2.5 1.5B with a distilgpt2 fallback
- **PDF Processing**: Intelligent document chunking and processing
- **Real-time Monitoring**: Performance metrics and system health checks
- **Docker Support**: Containerized deployment with Docker Compose
- **Hugging Face Spaces Ready**: Optimized for HF Spaces deployment
## 🏗️ Architecture

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Streamlit UI   │────▶│   RAG System    │────▶│   Guard Rails   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │
                                 ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  PDF Processor  │     │   FAISS Index   │     │ Language Model  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
```
## 🛠️ Technology Stack

### Core Technologies

- **🔍 Vector Database**: FAISS for efficient similarity search
- **🗂️ Sparse Retrieval**: BM25 for keyword-based search
- **🧠 Embedding Model**: all-MiniLM-L6-v2 for document embeddings
- **🤖 Generative Model**: Qwen 2.5 1.5B for answer generation
- **🌐 UI Framework**: Streamlit for the interactive interface
- **🐳 Containerization**: Docker for deployment

### Supporting Libraries

- **📊 Data Processing**: Pandas and NumPy for data manipulation
- **📄 PDF Handling**: pypdf for document processing
- **🔧 ML Utilities**: scikit-learn for preprocessing
- **📝 Logging**: Loguru for structured logging
- **⚡ Optimization**: Accelerate for model optimization
## 🚀 Quick Start

### Local Development

1. **Clone and set up**:
   ```bash
   git clone <repository-url>
   cd convAI
   pip install -r requirements.txt
   ```
2. **Run the application**:
   ```bash
   streamlit run app.py
   ```
3. **Upload PDFs and start chatting!**

### Docker Deployment

1. **Build and run**:
   ```bash
   docker-compose up --build
   ```
2. **Access at**: http://localhost:8501
## 🌐 Hugging Face Spaces Deployment

This application is optimized for deployment on Hugging Face Spaces. The system automatically:

- Uses `/tmp` directories for cache storage (writable in HF Spaces)
- Configures environment variables for HF Spaces compatibility
- Handles permission issues automatically
- Optimizes model loading for the HF Spaces environment

### HF Spaces Configuration

The application includes:

- **Cache Management**: All model caches are stored in `/tmp` directories
- **Permission Handling**: Automatic fallback to writable directories
- **Environment Detection**: Adapts to the HF Spaces runtime environment
- **Resource Optimization**: Efficient memory and CPU usage
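The permission-handling fallback can be sketched roughly as follows. This is a minimal illustration rather than the app's actual code; the `writable_cache_dir` name and the `/app/cache` default are assumptions:

```python
import os
import tempfile
from pathlib import Path

def writable_cache_dir(preferred: str = "/app/cache") -> str:
    """Return `preferred` if it is creatable and writable, else fall back to /tmp."""
    fallback = os.path.join(tempfile.gettempdir(), "convai-cache")
    for candidate in (preferred, fallback):
        try:
            Path(candidate).mkdir(parents=True, exist_ok=True)
            probe = Path(candidate) / ".write_test"   # verify we can actually write
            probe.touch()
            probe.unlink()
            return candidate
        except OSError:
            continue  # read-only or permission-denied: try the next candidate
    return tempfile.gettempdir()  # last resort: /tmp is writable in HF Spaces
```

The same probe-and-fall-back pattern applies to any directory the container cannot write, which is why the caches end up under `/tmp` on Spaces.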
### Deploy to HF Spaces

1. **Create a new Space** on Hugging Face
2. **Choose Docker** as the SDK
3. **Upload all files** from this repository
4. **The system will then automatically**:
   - Set up cache directories in `/tmp`
   - Download and cache models
   - Initialize the RAG system with guard rails
   - Start the Streamlit interface

### HF Spaces Environment Variables

The system automatically configures:

```bash
HF_HOME=/tmp/huggingface
TRANSFORMERS_CACHE=/tmp/huggingface/transformers
TORCH_HOME=/tmp/torch
XDG_CACHE_HOME=/tmp
HF_HUB_CACHE=/tmp/huggingface/hub
```
## 📖 Usage Guide

### Document Upload

- **Automatic Loading**: PDF documents bundled in the container are loaded automatically
- **Manual Upload**: Use the sidebar to upload additional PDF documents
- **Supported Formats**: PDF files with extractable text content
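To illustrate the chunking step, here is a minimal word-based chunker with overlap. The real `pdf_processor.py` logic may differ; the `chunk_text` helper and its defaults are assumptions (the sizes echo the `CHUNK_SIZES` setting):

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks that overlap, so context spans chunk borders."""
    words = text.split()
    if not words:
        return []
    step = max(chunk_size - overlap, 1)  # advance less than a full chunk each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk already reaches the end of the document
    return chunks
```

Each chunk is then embedded and indexed; the overlap keeps sentences that straddle a boundary retrievable from at least one chunk.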
### Search Methods

- **🔀 Hybrid**: Combines vector similarity and keyword matching (recommended)
- **🎯 Dense**: Uses only vector similarity search
- **📝 Sparse**: Uses only keyword-based BM25 search
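The hybrid mode can be sketched as a weighted fusion of normalized dense and sparse scores. This is a simplified illustration, not the system's actual fusion code; the `hybrid_scores` helper and the `alpha` weight are assumptions:

```python
def hybrid_scores(dense: dict[int, float],
                  sparse: dict[int, float],
                  alpha: float = 0.5) -> dict[int, float]:
    """Fuse dense (vector) and sparse (BM25) scores: min-max normalize each, then weight."""
    def normalize(scores: dict[int, float]) -> dict[int, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
        return {doc: (s - lo) / span for doc, s in scores.items()}

    d, s = normalize(dense), normalize(sparse)
    # Documents found by only one retriever contribute 0 from the other.
    return {doc: alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0)
            for doc in set(d) | set(s)}
```

Sorting the fused scores in descending order and keeping the top-k entries yields the final ranking passed to the language model.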
### Query Interface

- **Natural Language**: Ask questions in plain English
- **Context Awareness**: The system grounds answers in the retrieved documents
- **Confidence Scores**: See how confident the system is in each answer
- **Source Citations**: View which documents were used for the answer
## ⚙️ Configuration

### Environment Variables

```bash
# Model configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct

# Chunk sizes (comma-separated word counts)
CHUNK_SIZES=100,400

# Vector store path
VECTOR_STORE_PATH=./vector_store

# Streamlit configuration
STREAMLIT_SERVER_PORT=8501
STREAMLIT_SERVER_ADDRESS=0.0.0.0
```
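A minimal sketch of how such settings might be read at startup; the defaults mirror the values above, and `load_config` is a hypothetical helper rather than the app's actual loader:

```python
import os

def load_config() -> dict:
    """Read configuration from the environment, falling back to the documented defaults."""
    return {
        "embedding_model": os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        "generative_model": os.getenv("GENERATIVE_MODEL", "Qwen/Qwen2.5-1.5B-Instruct"),
        # CHUNK_SIZES is a comma-separated list, e.g. "100,400"
        "chunk_sizes": [int(s) for s in os.getenv("CHUNK_SIZES", "100,400").split(",")],
        "vector_store_path": os.getenv("VECTOR_STORE_PATH", "./vector_store"),
    }
```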
### Performance Tuning

- **Chunk Sizes**: Adjust for the document type (smaller for technical docs, larger for narrative text)
- **Top-k Results**: Increase for more comprehensive answers, decrease for faster responses
- **Model Selection**: Choose between Qwen 2.5 1.5B and distilgpt2 based on quality vs. speed needs

## 📈 Performance

### Optimization Features

- **Parallel Processing**: Documents are loaded concurrently for faster initialization
- **Optimized Search**: Hybrid retrieval combines the strengths of vector and keyword search
- **Memory Efficient**: Uses CPU-optimized models for deployment compatibility
- **Caching**: The FAISS index and metadata are cached for faster subsequent queries
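The concurrent loading can be sketched with a thread pool. This is illustrative only, with `load_pdf` standing in for the real per-document loader:

```python
from concurrent.futures import ThreadPoolExecutor

def load_all(paths, load_pdf, max_workers: int = 4) -> list:
    """Load documents concurrently; PDF parsing is I/O-bound, so threads help."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with `paths`.
        return list(pool.map(load_pdf, paths))
```

Threads suit this workload because most time is spent reading files; for CPU-heavy parsing a process pool would be the alternative.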
### Expected Performance

- **Document Loading**: ~2-5 seconds per PDF, depending on size
- **Query Response**: ~1-3 seconds for typical questions
- **Memory Usage**: ~2-4 GB RAM for typical document collections
- **Storage**: ~100 MB per 1,000 document chunks
## 🔧 Development

### Project Structure

```
convAI/
├── app.py                 # Main Streamlit application
├── rag_system.py          # Core RAG system implementation
├── pdf_processor.py       # PDF processing utilities
├── requirements.txt       # Python dependencies
├── Dockerfile             # Container configuration
├── docker-compose.yml     # Multi-container setup
├── README.md              # This file
├── DEPLOYMENT_GUIDE.md    # Detailed deployment instructions
├── test_deployment.py     # Deployment testing script
├── test_docker.py         # Docker testing script
└── src/
    └── streamlit_app.py   # Sample Streamlit app
```
### Testing

```bash
# Test deployment readiness
python test_deployment.py

# Test the Docker configuration
python test_docker.py

# Run the app locally
streamlit run app.py
```
## 🔍 Troubleshooting

### Common Issues

1. **Model Loading Errors**
   - Check internet connectivity for model downloads
   - Verify sufficient disk space
   - Try the fallback model (distilgpt2)
2. **Memory Issues**
   - Reduce chunk sizes
   - Use a smaller embedding model
   - Limit the number of documents
3. **Performance Issues**
   - Lower the top-k parameter
   - Use sparse search for keyword-heavy queries
   - Consider hardware upgrades
4. **Docker Issues**
   - Check the Docker installation
   - Verify port availability
   - Check the container logs

### Getting Help

- Check the logs in your Space's "Logs" tab
- Review the deployment guide for common solutions
- Create an issue in the project repository
## 🤝 Contributing

We welcome contributions! Please see our contributing guidelines for:

- Code style and standards
- Testing requirements
- Documentation updates
- Feature requests and bug reports

## 📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

## 🙏 Acknowledgments

- **Hugging Face** for providing the platform and models
- **FAISS** team for the efficient vector search library
- **Streamlit** team for the excellent web framework
- **OpenAI** for inspiring the RAG architecture

---

*Built with ❤️ for efficient document question-answering*

**Ready to explore your documents? Start asking questions! 🚀**