sinhapiyush86 commited on
Commit
552c957
Β·
verified Β·
1 Parent(s): a943b87

Upload README.md

Browse files
Files changed (1) hide show
  1. README.md +216 -20
README.md CHANGED
@@ -7,47 +7,243 @@ sdk: docker
7
  sdk_version: latest
8
  app_file: app.py
9
  pinned: false
 
10
  ---
11
 
12
- # RAG System - Hugging Face Spaces
13
 
14
- A comprehensive Retrieval-Augmented Generation (RAG) system that processes PDF documents and answers questions using advanced AI models.
15
 
16
- ## Features
17
 
18
- - **PDF Processing**: Automatically loads and processes PDF documents
19
- - **Hybrid Search**: Combines FAISS vector search with BM25 keyword search
20
- - **Multiple Retrieval Methods**: Hybrid, dense, and sparse retrieval options
21
- - **Advanced AI Models**: Uses Qwen 2.5 1.5B for response generation
22
- - **Real-time Chat Interface**: Interactive Streamlit-based UI
23
- - **Parallel Document Loading**: Fast document processing with concurrent loading
 
24
 
25
- ## How to Use
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
26
 
27
  1. **Wait for Initialization**: The system automatically loads pre-configured PDF documents
28
  2. **Ask Questions**: Use the chat interface to ask questions about the documents
29
  3. **Choose Method**: Select from hybrid, dense, or sparse retrieval methods
30
  4. **View Results**: See answers with confidence scores and search results
31
 
32
- ## Technology Stack
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
33
 
34
- - **Vector Database**: FAISS for efficient similarity search
35
- - **Sparse Retrieval**: BM25 for keyword-based search
36
- - **Embedding Model**: all-MiniLM-L6-v2 for document embeddings
37
- - **Generative Model**: Qwen 2.5 1.5B for answer generation
38
- - **UI Framework**: Streamlit for interactive interface
39
- - **Containerization**: Docker for deployment
40
 
41
- ## Configuration
 
 
 
42
 
43
- The system is pre-configured with RIL quarterly reports and automatically loads them on startup. Users can also upload additional PDF documents through the interface.
 
 
 
44
 
45
- ## Performance
46
 
 
47
  - **Parallel Processing**: Documents are loaded concurrently for faster initialization
48
  - **Optimized Search**: Hybrid retrieval combines the best of vector and keyword search
49
  - **Memory Efficient**: Uses CPU-optimized models for deployment compatibility
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
50
 
51
  ---
52
 
53
  *Built with ❀️ for efficient document question-answering*
 
 
 
7
  sdk_version: latest
8
  app_file: app.py
9
  pinned: false
10
+ app_port: 8501
11
  ---
12
 
13
+ # πŸ€– RAG System - Hugging Face Spaces
14
 
15
+ A comprehensive **Retrieval-Augmented Generation (RAG)** system that processes PDF documents and answers questions using advanced AI models. This system combines the power of vector search, keyword matching, and large language models to provide intelligent document question-answering capabilities.
16
 
17
+ ## πŸš€ Features
18
 
19
+ ### Core Functionality
20
+ - **πŸ“„ PDF Processing**: Automatically loads and processes PDF documents with intelligent text extraction
21
+ - **πŸ” Hybrid Search**: Combines FAISS vector search with BM25 keyword search for optimal retrieval
22
+ - **🎯 Multiple Retrieval Methods**: Choose from hybrid, dense, or sparse retrieval options
23
+ - **πŸ€– Advanced AI Models**: Uses Qwen 2.5 1.5B for intelligent response generation
24
+ - **πŸ’¬ Real-time Chat Interface**: Interactive Streamlit-based UI with conversation history
25
+ - **⚑ Parallel Document Loading**: Fast document processing with concurrent loading
26
 
27
+ ### Technical Features
28
+ - **πŸ”’ Thread Safety**: Safe concurrent document loading with proper locking
29
+ - **πŸ’Ύ Persistent Storage**: Automatic index saving and loading across sessions
30
+ - **🎯 Smart Fallbacks**: Graceful model loading with alternative options
31
+ - **πŸ“Š Performance Metrics**: Response times, confidence scores, and search result analysis
32
+ - **πŸ›‘οΈ Error Handling**: Robust error handling and user feedback
33
+
34
+ ## πŸ—οΈ Architecture
35
+
36
+ The RAG system follows a modular, scalable architecture:
37
+
38
+ ```
39
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
40
+ β”‚ PDF Documents β”‚ β”‚ User Interface β”‚ β”‚ Search Engine β”‚
41
+ β”‚ β”‚ β”‚ (Streamlit) β”‚ β”‚ β”‚
42
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
43
+ β”‚ β”‚ β”‚
44
+ β–Ό β–Ό β–Ό
45
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
46
+ β”‚ PDF Processor β”‚ β”‚ RAG System β”‚ β”‚ Vector Store β”‚
47
+ β”‚ - Text Extract β”‚ β”‚ - Orchestration β”‚ β”‚ (FAISS) β”‚
48
+ β”‚ - Cleaning β”‚ β”‚ - Response Gen β”‚ β”‚ β”‚
49
+ β”‚ - Chunking β”‚ β”‚ - Thread Safety β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
50
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
51
+ β”‚
52
+ β–Ό
53
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
54
+ β”‚ Language Model β”‚
55
+ β”‚ (Qwen 2.5 1.5B) β”‚
56
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
57
+ ```
58
+
59
+ ## πŸ› οΈ Technology Stack
60
+
61
+ ### Core Technologies
62
+ - **πŸ” Vector Database**: FAISS for efficient similarity search
63
+ - **πŸ“ Sparse Retrieval**: BM25 for keyword-based search
64
+ - **🧠 Embedding Model**: all-MiniLM-L6-v2 for document embeddings
65
+ - **πŸ€– Generative Model**: Qwen 2.5 1.5B for answer generation
66
+ - **🌐 UI Framework**: Streamlit for interactive interface
67
+ - **🐳 Containerization**: Docker for deployment
68
+
69
+ ### Supporting Libraries
70
+ - **πŸ“Š Data Processing**: Pandas, NumPy for data manipulation
71
+ - **πŸ“„ PDF Handling**: PyPDF for document processing
72
+ - **πŸ”§ ML Utilities**: Scikit-learn for preprocessing
73
+ - **πŸ“ Logging**: Loguru for structured logging
74
+ - **⚑ Optimization**: Accelerate for model optimization
75
+
76
+ ## πŸš€ Quick Start
77
+
78
+ ### 1. Using the Web Interface
79
 
80
  1. **Wait for Initialization**: The system automatically loads pre-configured PDF documents
81
  2. **Ask Questions**: Use the chat interface to ask questions about the documents
82
  3. **Choose Method**: Select from hybrid, dense, or sparse retrieval methods
83
  4. **View Results**: See answers with confidence scores and search results
84
 
85
+ ### 2. Local Development
86
+
87
+ ```bash
88
+ # Clone the repository
89
+ git clone <repository-url>
90
+ cd convAI
91
+
92
+ # Install dependencies
93
+ pip install -r requirements.txt
94
+
95
+ # Run the application
96
+ streamlit run app.py
97
+ ```
98
+
99
+ ### 3. Docker Deployment
100
+
101
+ ```bash
102
+ # Build and run with Docker Compose
103
+ docker-compose up --build
104
+
105
+ # Or build and run manually
106
+ docker build -t rag-system .
107
+ docker run -p 8501:8501 rag-system
108
+ ```
109
+
110
+ ## πŸ“– Usage Guide
111
+
112
+ ### Document Upload
113
+ - **Automatic Loading**: PDF documents in the container are loaded automatically
114
+ - **Manual Upload**: Use the sidebar to upload additional PDF documents
115
+ - **Supported Formats**: PDF files with text content
116
+
117
+ ### Search Methods
118
+ - **πŸ”€ Hybrid**: Combines vector similarity and keyword matching (recommended)
119
+ - **🎯 Dense**: Uses only vector similarity search
120
+ - **πŸ“ Sparse**: Uses only keyword-based BM25 search
121
+
122
+ ### Query Interface
123
+ - **Natural Language**: Ask questions in plain English
124
+ - **Context Awareness**: System uses retrieved documents for context
125
+ - **Confidence Scores**: See how confident the system is in its answers
126
+ - **Source Citations**: View which documents were used for the answer
127
+
128
+ ## βš™οΈ Configuration
129
+
130
+ ### Environment Variables
131
+ ```bash
132
+ # Model Configuration
133
+ EMBEDDING_MODEL=all-MiniLM-L6-v2
134
+ GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct
135
+
136
+ # Chunk Sizes
137
+ CHUNK_SIZES=100,400
138
 
139
+ # Vector Store Path
140
+ VECTOR_STORE_PATH=./vector_store
 
 
 
 
141
 
142
+ # Streamlit Configuration
143
+ STREAMLIT_SERVER_PORT=8501
144
+ STREAMLIT_SERVER_ADDRESS=0.0.0.0
145
+ ```
146
 
147
+ ### Performance Tuning
148
+ - **Chunk Sizes**: Adjust for different document types (smaller for technical docs, larger for narratives)
149
+ - **Top-k Results**: Increase for more comprehensive answers, decrease for faster responses
150
+ - **Model Selection**: Choose between Qwen 2.5 1.5B and distilgpt2 based on performance needs
151
 
152
+ ## πŸ“Š Performance
153
 
154
+ ### Optimization Features
155
  - **Parallel Processing**: Documents are loaded concurrently for faster initialization
156
  - **Optimized Search**: Hybrid retrieval combines the best of vector and keyword search
157
  - **Memory Efficient**: Uses CPU-optimized models for deployment compatibility
158
+ - **Caching**: FAISS index and metadata are cached for faster subsequent queries
159
+
160
+ ### Expected Performance
161
+ - **Document Loading**: ~2-5 seconds per PDF (depending on size)
162
+ - **Query Response**: ~1-3 seconds for typical questions
163
+ - **Memory Usage**: ~2-4GB RAM for typical document collections
164
+ - **Storage**: ~100MB per 1000 document chunks
165
+
166
+ ## πŸ”§ Development
167
+
168
+ ### Project Structure
169
+ ```
170
+ convAI/
171
+ β”œβ”€β”€ app.py # Main Streamlit application
172
+ β”œβ”€β”€ rag_system.py # Core RAG system implementation
173
+ β”œβ”€β”€ pdf_processor.py # PDF processing utilities
174
+ β”œβ”€β”€ requirements.txt # Python dependencies
175
+ β”œβ”€β”€ Dockerfile # Container configuration
176
+ β”œβ”€β”€ docker-compose.yml # Multi-container setup
177
+ β”œβ”€β”€ README.md # This file
178
+ β”œβ”€β”€ DEPLOYMENT_GUIDE.md # Detailed deployment instructions
179
+ β”œβ”€β”€ test_deployment.py # Deployment testing script
180
+ β”œβ”€β”€ test_docker.py # Docker testing script
181
+ └── src/
182
+ └── streamlit_app.py # Sample Streamlit app
183
+ ```
184
+
185
+ ### Testing
186
+ ```bash
187
+ # Test deployment readiness
188
+ python test_deployment.py
189
+
190
+ # Test Docker configuration
191
+ python test_docker.py
192
+
193
+ # Run local tests
194
+ streamlit run app.py
195
+ ```
196
+
197
+ ## πŸ› Troubleshooting
198
+
199
+ ### Common Issues
200
+
201
+ 1. **Model Loading Errors**
202
+ - Check internet connectivity for model downloads
203
+ - Verify sufficient disk space
204
+ - Try the fallback model (distilgpt2)
205
+
206
+ 2. **Memory Issues**
207
+ - Reduce chunk sizes
208
+ - Use smaller embedding models
209
+ - Limit the number of documents
210
+
211
+ 3. **Performance Issues**
212
+ - Adjust top-k parameter
213
+ - Use sparse search for keyword-heavy queries
214
+ - Consider hardware upgrades
215
+
216
+ 4. **Docker Issues**
217
+ - Check Docker installation
218
+ - Verify port availability
219
+ - Check container logs
220
+
221
+ ### Getting Help
222
+ - Check the logs in your Space's "Logs" tab
223
+ - Review the deployment guide for common solutions
224
+ - Create an issue in the project repository
225
+
226
+ ## 🀝 Contributing
227
+
228
+ We welcome contributions! Please see our contributing guidelines for:
229
+ - Code style and standards
230
+ - Testing requirements
231
+ - Documentation updates
232
+ - Feature requests and bug reports
233
+
234
+ ## πŸ“„ License
235
+
236
+ This project is licensed under the MIT License - see the LICENSE file for details.
237
+
238
+ ## πŸ™ Acknowledgments
239
+
240
+ - **Hugging Face** for providing the platform and models
241
+ - **FAISS** team for the efficient vector search library
242
+ - **Streamlit** team for the excellent web framework
243
+ - **OpenAI** for inspiring the RAG architecture
244
 
245
  ---
246
 
247
  *Built with ❀️ for efficient document question-answering*
248
+
249
+ **Ready to explore your documents? Start asking questions! πŸš€**