---
title: RAG System with PDF Documents
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false
app_port: 8501
---
# πŸ€– Conversational AI RAG System
A comprehensive Retrieval-Augmented Generation (RAG) system with advanced guard rails, built with Streamlit, FAISS, and Hugging Face models.
## πŸš€ Features
- **Hybrid Search**: Combines dense (FAISS) and sparse (BM25) retrieval for optimal results
- **Advanced Guard Rails**: Comprehensive safety and security measures
- **Multiple Models**: Qwen 2.5 1.5B as the primary generator, with distilgpt2 as a lightweight fallback
- **PDF Processing**: Intelligent document chunking and processing
- **Real-time Monitoring**: Performance metrics and system health checks
- **Docker Support**: Containerized deployment with Docker Compose
- **Hugging Face Spaces Ready**: Optimized for HF Spaces deployment
## πŸ—οΈ Architecture
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Streamlit UI   │───▢│   RAG System    │───▢│   Guard Rails   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚
                                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  PDF Processor  β”‚    β”‚   FAISS Index   β”‚    β”‚ Language Model  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
## πŸ› οΈ Technology Stack
### Core Technologies
- **πŸ” Vector Database**: FAISS for efficient similarity search
- **πŸ“ Sparse Retrieval**: BM25 for keyword-based search
- **🧠 Embedding Model**: all-MiniLM-L6-v2 for document embeddings
- **πŸ€– Generative Model**: Qwen 2.5 1.5B for answer generation
- **🌐 UI Framework**: Streamlit for interactive interface
- **🐳 Containerization**: Docker for deployment
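The sparse half of the retriever can be sketched as a minimal Okapi BM25 scorer. This is a simplified stand-in for the library-backed BM25 the project uses; the corpus and tokenization below are purely illustrative:

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(d) for d in corpus_tokens) / n_docs
    # Document frequency for each distinct query term
    df = {t: sum(1 for d in corpus_tokens if t in d) for t in set(query_tokens)}
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for t in query_tokens:
            if df[t] == 0:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1)
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * num / den
        scores.append(score)
    return scores

docs = [
    "faiss builds a dense vector index".split(),
    "bm25 ranks documents by keyword overlap".split(),
]
print(bm25_scores("bm25 keyword".split(), docs))
```

Only the second document contains the query terms, so it receives the only nonzero score.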
### Supporting Libraries
- **πŸ“Š Data Processing**: Pandas, NumPy for data manipulation
- **πŸ“„ PDF Handling**: PyPDF for document processing
- **πŸ”§ ML Utilities**: Scikit-learn for preprocessing
- **πŸ“ Logging**: Loguru for structured logging
- **⚑ Optimization**: Accelerate for model optimization
## πŸš€ Quick Start
### Local Development
1. **Clone and Setup**:
```bash
git clone <repository-url>
cd convAI
pip install -r requirements.txt
```
2. **Run the Application**:
```bash
streamlit run app.py
```
3. **Upload PDFs and Start Chatting**!
### Docker Deployment
1. **Build and Run**:
```bash
docker-compose up --build
```
2. **Access at**: http://localhost:8501
## 🌟 Hugging Face Spaces Deployment
This application is optimized for deployment on Hugging Face Spaces. The system automatically:
- Uses `/tmp` directories for cache storage (writable in HF Spaces)
- Configures environment variables for HF Spaces compatibility
- Handles permission issues automatically
- Optimizes model loading for HF Spaces environment
### HF Spaces Configuration
The application includes:
- **Cache Management**: All model caches stored in `/tmp` directories
- **Permission Handling**: Automatic fallback to writable directories
- **Environment Detection**: Adapts to HF Spaces runtime environment
- **Resource Optimization**: Efficient memory and CPU usage
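The "automatic fallback to writable directories" can be sketched as a probe loop over candidate paths. The candidate paths here are illustrative assumptions, not the app's actual list:

```python
import os
import tempfile

def pick_writable_dir(candidates):
    """Return the first directory we can create and write into."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
            probe = os.path.join(path, ".write_test")
            with open(probe, "w") as f:
                f.write("ok")
            os.remove(probe)
            return path
        except OSError:
            continue  # read-only filesystem or missing permissions
    # Last resort: a guaranteed-writable temp directory
    return tempfile.mkdtemp(prefix="convai_cache_")

cache_dir = pick_writable_dir(["/app/cache", "/tmp/huggingface"])
print(cache_dir)
```

On HF Spaces the container filesystem is largely read-only, so the probe typically lands on a `/tmp` path.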
### Deploy to HF Spaces
1. **Create a new Space** on Hugging Face
2. **Choose Docker** as the SDK
3. **Upload all files** from this repository
4. **The system will automatically**:
- Set up cache directories in `/tmp`
- Download and cache models
- Initialize the RAG system with guard rails
- Start the Streamlit interface
### HF Spaces Environment Variables
The system automatically configures:
```bash
HF_HOME=/tmp/huggingface
TRANSFORMERS_CACHE=/tmp/huggingface/transformers
TORCH_HOME=/tmp/torch
XDG_CACHE_HOME=/tmp
HF_HUB_CACHE=/tmp/huggingface/hub
```
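Applied in Python, these defaults must be set before `transformers` or `huggingface_hub` is imported, since those libraries capture their cache paths at import time. A sketch (the real app may set them in the Dockerfile instead):

```python
import os

# Set cache locations before any Hugging Face import.
CACHE_DEFAULTS = {
    "HF_HOME": "/tmp/huggingface",
    "TRANSFORMERS_CACHE": "/tmp/huggingface/transformers",
    "TORCH_HOME": "/tmp/torch",
    "XDG_CACHE_HOME": "/tmp",
    "HF_HUB_CACHE": "/tmp/huggingface/hub",
}
for key, value in CACHE_DEFAULTS.items():
    os.environ.setdefault(key, value)  # don't clobber user overrides
    os.makedirs(os.environ[key], exist_ok=True)
```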
## πŸ“– Usage Guide
### Document Upload
- **Automatic Loading**: PDF documents in the container are loaded automatically
- **Manual Upload**: Use the sidebar to upload additional PDF documents
- **Supported Formats**: PDF files with text content
### Search Methods
- **πŸ”€ Hybrid**: Combines vector similarity and keyword matching (recommended)
- **🎯 Dense**: Uses only vector similarity search
- **πŸ“ Sparse**: Uses only keyword-based BM25 search
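One common way to fuse the two result lists is a weighted sum of min-max-normalized scores. This is a sketch of the hybrid step; the blend weight `alpha` is an assumed parameter, not necessarily what the app uses:

```python
def hybrid_scores(dense, sparse, alpha=0.7):
    """Blend dense and sparse scores after min-max normalization."""
    def norm(xs):
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]
    d, s = norm(dense), norm(sparse)
    return [alpha * di + (1 - alpha) * si for di, si in zip(d, s)]

# Scores for the same three chunks from each retriever
dense = [0.82, 0.40, 0.15]   # cosine similarities (FAISS side)
sparse = [1.2, 4.8, 0.0]     # BM25 scores
fused = hybrid_scores(dense, sparse)
best = max(range(len(fused)), key=fused.__getitem__)
print(best, fused)
```

Normalization is needed because cosine similarities and BM25 scores live on incompatible scales; without it the BM25 side would dominate.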
### Query Interface
- **Natural Language**: Ask questions in plain English
- **Context Awareness**: System uses retrieved documents for context
- **Confidence Scores**: See how confident the system is in its answers
- **Source Citations**: View which documents were used for the answer
## βš™οΈ Configuration
### Environment Variables
```bash
# Model Configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct
# Chunk Sizes
CHUNK_SIZES=100,400
# Vector Store Path
VECTOR_STORE_PATH=./vector_store
# Streamlit Configuration
STREAMLIT_SERVER_PORT=8501
STREAMLIT_SERVER_ADDRESS=0.0.0.0
```
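Reading these variables with sensible defaults might look like the sketch below; the key names mirror the block above, but the `load_config` helper itself is illustrative:

```python
import os

def load_config():
    """Read configuration from the environment, with defaults."""
    return {
        "embedding_model": os.environ.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        "generative_model": os.environ.get(
            "GENERATIVE_MODEL", "Qwen/Qwen2.5-1.5B-Instruct"
        ),
        # "100,400" -> [100, 400]
        "chunk_sizes": [
            int(s) for s in os.environ.get("CHUNK_SIZES", "100,400").split(",")
        ],
        "vector_store_path": os.environ.get("VECTOR_STORE_PATH", "./vector_store"),
    }

cfg = load_config()
print(cfg["chunk_sizes"])
```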
### Performance Tuning
- **Chunk Sizes**: Adjust for different document types (smaller for technical docs, larger for narratives)
- **Top-k Results**: Increase for more comprehensive answers, decrease for faster responses
- **Model Selection**: Choose between Qwen 2.5 1.5B and distilgpt2 based on performance needs
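The effect of the chunk-size knob can be seen in a simple word-based sliding-window chunker. This is a sketch, not the project's actual `pdf_processor.py` logic; the overlap parameter is an assumption:

```python
def chunk_words(text, chunk_size=100, overlap=20):
    """Split text into word-based chunks with a sliding-window overlap."""
    words = text.split()
    if not words:
        return []
    step = max(chunk_size - overlap, 1)
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already covers the tail
    return chunks

doc = " ".join(f"w{i}" for i in range(250))
print([len(c.split()) for c in chunk_words(doc, chunk_size=100, overlap=20)])
```

Smaller `chunk_size` values yield more, finer-grained chunks (better for precise technical lookups); larger values keep more narrative context per chunk.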
## πŸ“Š Performance
### Optimization Features
- **Parallel Processing**: Documents are loaded concurrently for faster initialization
- **Optimized Search**: Hybrid retrieval combines the best of vector and keyword search
- **Memory Efficient**: Uses CPU-optimized models for deployment compatibility
- **Caching**: FAISS index and metadata are cached for faster subsequent queries
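The metadata side of that caching can be sketched as a save/load round-trip; pickle and the paths here are illustrative (the FAISS index itself would be persisted separately, e.g. with `faiss.write_index` / `faiss.read_index`):

```python
import os
import pickle

def save_metadata(chunks, path):
    """Persist chunk metadata next to the FAISS index."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(chunks, f)

def load_metadata(path):
    """Return cached metadata, or None to trigger a rebuild."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)

chunks = [{"doc": "report.pdf", "page": 1, "text": "..."}]
save_metadata(chunks, "/tmp/vector_store/metadata.pkl")
print(load_metadata("/tmp/vector_store/metadata.pkl"))
```

A `None` return signals the startup code to re-process the PDFs and rebuild the index.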
### Expected Performance
- **Document Loading**: ~2-5 seconds per PDF (depending on size)
- **Query Response**: ~1-3 seconds for typical questions
- **Memory Usage**: ~2-4GB RAM for typical document collections
- **Storage**: ~100MB per 1000 document chunks
## πŸ”§ Development
### Project Structure
```
convAI/
β”œβ”€β”€ app.py               # Main Streamlit application
β”œβ”€β”€ rag_system.py        # Core RAG system implementation
β”œβ”€β”€ pdf_processor.py     # PDF processing utilities
β”œβ”€β”€ requirements.txt     # Python dependencies
β”œβ”€β”€ Dockerfile           # Container configuration
β”œβ”€β”€ docker-compose.yml   # Multi-container setup
β”œβ”€β”€ README.md            # This file
β”œβ”€β”€ DEPLOYMENT_GUIDE.md  # Detailed deployment instructions
β”œβ”€β”€ test_deployment.py   # Deployment testing script
β”œβ”€β”€ test_docker.py       # Docker testing script
└── src/
    └── streamlit_app.py # Sample Streamlit app
```
### Testing
```bash
# Test deployment readiness
python test_deployment.py
# Test Docker configuration
python test_docker.py
# Run local tests
streamlit run app.py
```
## πŸ› Troubleshooting
### Common Issues
1. **Model Loading Errors**
- Check internet connectivity for model downloads
- Verify sufficient disk space
- Try the fallback model (distilgpt2)
2. **Memory Issues**
- Reduce chunk sizes
- Use smaller embedding models
- Limit the number of documents
3. **Performance Issues**
- Adjust top-k parameter
- Use sparse search for keyword-heavy queries
- Consider hardware upgrades
4. **Docker Issues**
- Check Docker installation
- Verify port availability
- Check container logs
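The fallback mentioned in item 1 follows a simple try-in-order pattern. The sketch below uses a stub loader in place of the real `transformers` loading call, so the failure path is illustrative:

```python
def load_with_fallback(model_names, load_fn):
    """Try each model in order; return the first that loads."""
    last_error = None
    for name in model_names:
        try:
            return name, load_fn(name)
        except Exception as err:  # download, memory, or permission failure
            last_error = err
    raise RuntimeError(f"All models failed; last error: {last_error}")

def fake_loader(name):
    # Stand-in for e.g. AutoModelForCausalLM.from_pretrained(name)
    if "Qwen" in name:
        raise MemoryError("not enough RAM for the primary model")
    return object()

name, model = load_with_fallback(
    ["Qwen/Qwen2.5-1.5B-Instruct", "distilgpt2"], fake_loader
)
print(name)
```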
### Getting Help
- Check the logs in your Space's "Logs" tab
- Review the deployment guide for common solutions
- Create an issue in the project repository
## 🀝 Contributing
We welcome contributions! Please see our contributing guidelines for:
- Code style and standards
- Testing requirements
- Documentation updates
- Feature requests and bug reports
## πŸ“„ License
This project is licensed under the MIT License - see the LICENSE file for details.
## πŸ™ Acknowledgments
- **Hugging Face** for providing the platform and models
- **FAISS** team for the efficient vector search library
- **Streamlit** team for the excellent web framework
- **OpenAI** for inspiring the RAG architecture
---
*Built with ❀️ for efficient document question-answering*
**Ready to explore your documents? Start asking questions! πŸš€**