---
title: RAG System with PDF Documents
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false
app_port: 8501
---

# πŸ€– Conversational AI RAG System

A comprehensive Retrieval-Augmented Generation (RAG) system with advanced guard rails, built with Streamlit, FAISS, and Hugging Face models.

## πŸš€ Features

- **Hybrid Search**: Combines dense (FAISS) and sparse (BM25) retrieval so semantic and keyword matches complement each other
- **Advanced Guard Rails**: Comprehensive safety and security measures
- **Multiple Models**: Support for Qwen 2.5 1.5B and distilgpt2 fallback
- **PDF Processing**: Intelligent document chunking and processing
- **Real-time Monitoring**: Performance metrics and system health checks
- **Docker Support**: Containerized deployment with Docker Compose
- **Hugging Face Spaces Ready**: Optimized for HF Spaces deployment

## πŸ—οΈ Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Streamlit UI  │───▢│   RAG System    │───▢│  Guard Rails    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  PDF Processor  β”‚    β”‚   FAISS Index   β”‚    β”‚  Language Model β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## πŸ› οΈ Technology Stack

### Core Technologies
- **πŸ” Vector Database**: FAISS for efficient similarity search
- **πŸ“ Sparse Retrieval**: BM25 for keyword-based search
- **🧠 Embedding Model**: all-MiniLM-L6-v2 for document embeddings
- **πŸ€– Generative Model**: Qwen 2.5 1.5B for answer generation
- **🌐 UI Framework**: Streamlit for interactive interface
- **🐳 Containerization**: Docker for deployment

### Supporting Libraries
- **πŸ“Š Data Processing**: Pandas, NumPy for data manipulation
- **πŸ“„ PDF Handling**: PyPDF for document processing
- **πŸ”§ ML Utilities**: Scikit-learn for preprocessing
- **πŸ“ Logging**: Loguru for structured logging
- **⚑ Optimization**: Accelerate for model optimization

## πŸš€ Quick Start

### Local Development

1. **Clone and Setup**:
```bash
git clone <repository-url>
cd convAI
pip install -r requirements.txt
```

2. **Run the Application**:
```bash
streamlit run app.py
```

3. **Upload PDFs and Start Chatting**!

### Docker Deployment

1. **Build and Run**:
```bash
docker-compose up --build
```

2. **Access at**: http://localhost:8501

## 🌟 Hugging Face Spaces Deployment

This application is optimized for deployment on Hugging Face Spaces. The system automatically:

- Uses `/tmp` directories for cache storage (writable in HF Spaces)
- Configures environment variables for HF Spaces compatibility
- Handles permission issues automatically
- Optimizes model loading for HF Spaces environment

### HF Spaces Configuration

The application includes:
- **Cache Management**: All model caches stored in `/tmp` directories
- **Permission Handling**: Automatic fallback to writable directories
- **Environment Detection**: Adapts to HF Spaces runtime environment
- **Resource Optimization**: Efficient memory and CPU usage

### Deploy to HF Spaces

1. **Create a new Space** on Hugging Face
2. **Choose Docker** as the SDK
3. **Upload all files** from this repository
4. **The system will automatically**:
   - Set up cache directories in `/tmp`
   - Download and cache models
   - Initialize the RAG system with guard rails
   - Start the Streamlit interface

### HF Spaces Environment Variables

The system automatically configures:
```bash
HF_HOME=/tmp/huggingface
TRANSFORMERS_CACHE=/tmp/huggingface/transformers
TORCH_HOME=/tmp/torch
XDG_CACHE_HOME=/tmp
HF_HUB_CACHE=/tmp/huggingface/hub
```
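Because libraries like `transformers` read these variables at import time, they have to be set before any model imports happen. A minimal sketch of how such a setup step could look (the variable names come from the list above; the `setdefault` behavior and directory creation are illustrative assumptions, not the app's exact code):

```python
import os

# Cache locations must be configured before importing transformers or
# sentence-transformers, which read these variables at import time.
CACHE_VARS = {
    "HF_HOME": "/tmp/huggingface",
    "TRANSFORMERS_CACHE": "/tmp/huggingface/transformers",
    "TORCH_HOME": "/tmp/torch",
    "XDG_CACHE_HOME": "/tmp",
    "HF_HUB_CACHE": "/tmp/huggingface/hub",
}

for name, path in CACHE_VARS.items():
    os.environ.setdefault(name, path)          # keep any value the platform already set
    os.makedirs(os.environ[name], exist_ok=True)  # ensure the directory is writable/present
```

Using `setdefault` means a value injected by the runtime (or by `docker-compose`) wins over the `/tmp` defaults.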

## πŸ“– Usage Guide

### Document Upload
- **Automatic Loading**: PDF documents in the container are loaded automatically
- **Manual Upload**: Use the sidebar to upload additional PDF documents
- **Supported Formats**: PDF files with text content

### Search Methods
- **πŸ”€ Hybrid**: Combines vector similarity and keyword matching (recommended)
- **🎯 Dense**: Uses only vector similarity search
- **πŸ“ Sparse**: Uses only keyword-based BM25 search

### Query Interface
- **Natural Language**: Ask questions in plain English
- **Context Awareness**: System uses retrieved documents for context
- **Confidence Scores**: See how confident the system is in its answers
- **Source Citations**: View which documents were used for the answer

## βš™οΈ Configuration

### Environment Variables
```bash
# Model Configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct

# Chunk Sizes
CHUNK_SIZES=100,400

# Vector Store Path
VECTOR_STORE_PATH=./vector_store

# Streamlit Configuration
STREAMLIT_SERVER_PORT=8501
STREAMLIT_SERVER_ADDRESS=0.0.0.0
```
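In Python, these variables would typically be read once at startup with defaults matching the values above. A sketch of that pattern (the constant names mirror the variables documented here; the exact parsing is an assumption about how the app might do it):

```python
import os

# Defaults mirror the documented values; environment variables override them.
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
GENERATIVE_MODEL = os.getenv("GENERATIVE_MODEL", "Qwen/Qwen2.5-1.5B-Instruct")
# CHUNK_SIZES is a comma-separated list of word counts, e.g. "100,400".
CHUNK_SIZES = [int(s) for s in os.getenv("CHUNK_SIZES", "100,400").split(",")]
VECTOR_STORE_PATH = os.getenv("VECTOR_STORE_PATH", "./vector_store")
```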

### Performance Tuning
- **Chunk Sizes**: Adjust for different document types (smaller for technical docs, larger for narratives)
- **Top-k Results**: Increase for more comprehensive answers, decrease for faster responses
- **Model Selection**: Choose between Qwen 2.5 1.5B and distilgpt2 based on performance needs
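To make the chunk-size trade-off concrete, here is a minimal word-based chunker with overlap. It is an illustrative simplification — `pdf_processor.py` may use a different strategy — but it shows why indexing the same document at both sizes in `CHUNK_SIZES=100,400` is useful: small chunks pinpoint technical facts, large chunks keep narrative context intact.

```python
def chunk_text(text, chunk_size, overlap=20):
    """Split text into chunks of roughly chunk_size words, overlapping by
    `overlap` words so sentences at chunk boundaries are not lost.
    (Word-count chunking is a simplification for illustration.)"""
    words = text.split()
    step = max(chunk_size - overlap, 1)
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), step)]

# Tiny example: size 3 with overlap 1 repeats one word across boundaries.
chunks = chunk_text("one two three four five six", chunk_size=3, overlap=1)
# -> ["one two three", "three four five", "five six"]
```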

## πŸ“Š Performance

### Optimization Features
- **Parallel Processing**: Documents are loaded concurrently for faster initialization
- **Optimized Search**: Hybrid retrieval combines the best of vector and keyword search
- **Memory Efficient**: Uses CPU-optimized models for deployment compatibility
- **Caching**: FAISS index and metadata are cached for faster subsequent queries
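The caching behavior above follows a standard load-or-build pattern. The sketch below shows that pattern in isolation; the real app persists a FAISS index plus metadata, but a JSON-serializable stand-in keeps the example runnable without `faiss` installed (the file name `index.json` is an assumption):

```python
import json
import os

def load_or_build_index(store_path, build_fn):
    """Return the cached index under store_path if one exists; otherwise
    build it with build_fn and persist it for subsequent runs."""
    cache_path = os.path.join(store_path, "index.json")
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)      # cache hit: skip the expensive build
    index = build_fn()               # cache miss: build once...
    os.makedirs(store_path, exist_ok=True)
    with open(cache_path, "w") as f:
        json.dump(index, f)          # ...and persist for next startup
    return index
```

With FAISS itself, the same shape applies with `faiss.write_index` / `faiss.read_index` in place of the JSON calls.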

### Expected Performance
- **Document Loading**: ~2-5 seconds per PDF (depending on size)
- **Query Response**: ~1-3 seconds for typical questions
- **Memory Usage**: ~2-4GB RAM for typical document collections
- **Storage**: ~100MB per 1000 document chunks

## πŸ”§ Development

### Project Structure
```
convAI/
β”œβ”€β”€ app.py                 # Main Streamlit application
β”œβ”€β”€ rag_system.py          # Core RAG system implementation
β”œβ”€β”€ pdf_processor.py       # PDF processing utilities
β”œβ”€β”€ requirements.txt       # Python dependencies
β”œβ”€β”€ Dockerfile            # Container configuration
β”œβ”€β”€ docker-compose.yml    # Multi-container setup
β”œβ”€β”€ README.md             # This file
β”œβ”€β”€ DEPLOYMENT_GUIDE.md   # Detailed deployment instructions
β”œβ”€β”€ test_deployment.py    # Deployment testing script
β”œβ”€β”€ test_docker.py        # Docker testing script
└── src/
    └── streamlit_app.py  # Sample Streamlit app
```

### Testing
```bash
# Test deployment readiness
python test_deployment.py

# Test Docker configuration
python test_docker.py

# Run the app locally for a manual check
streamlit run app.py
```

## πŸ› Troubleshooting

### Common Issues

1. **Model Loading Errors**
   - Check internet connectivity for model downloads
   - Verify sufficient disk space
   - Try the fallback model (distilgpt2)

2. **Memory Issues**
   - Reduce chunk sizes
   - Use smaller embedding models
   - Limit the number of documents

3. **Performance Issues**
   - Adjust top-k parameter
   - Use sparse search for keyword-heavy queries
   - Consider hardware upgrades

4. **Docker Issues**
   - Check Docker installation
   - Verify port availability
   - Check container logs
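The "try the fallback model" advice in item 1 corresponds to a try-in-order loading pattern. A hedged sketch of that pattern (the model names match this README; in the real app the callables would wrap `from_pretrained` calls, which this example fakes to stay runnable offline):

```python
def load_with_fallback(loaders):
    """Try candidate model loaders in priority order; return the first
    (name, model) pair that succeeds, raising only if every one fails.
    `loaders` maps a model name to a zero-argument loader callable."""
    errors = {}
    for name, load in loaders.items():
        try:
            return name, load()
        except Exception as exc:     # any load failure triggers the next fallback
            errors[name] = exc
    raise RuntimeError(f"All candidate models failed to load: {errors}")

def load_qwen():
    # Stand-in for a transformers from_pretrained call that fails offline.
    raise OSError("download failed")

name, model = load_with_fallback({
    "Qwen/Qwen2.5-1.5B-Instruct": load_qwen,
    "distilgpt2": lambda: "tiny-model",
})
```

Here the failed Qwen download falls through to distilgpt2, matching the troubleshooting advice above.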

### Getting Help
- Check the logs in your Space's "Logs" tab
- Review the deployment guide for common solutions
- Create an issue in the project repository

## 🀝 Contributing

We welcome contributions! Please see our contributing guidelines for:
- Code style and standards
- Testing requirements
- Documentation updates
- Feature requests and bug reports

## πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

## πŸ™ Acknowledgments

- **Hugging Face** for providing the platform and models
- **FAISS** team for the efficient vector search library
- **Streamlit** team for the excellent web framework
- **Meta AI** researchers for the original RAG architecture

---

*Built with ❀️ for efficient document question-answering*

**Ready to explore your documents? Start asking questions! πŸš€**