sinhapiyush86 committed on
Commit
a943b87
·
verified ·
1 Parent(s): 192b2d2

Upload README.md

  1. README.md +38 -230
README.md CHANGED
@@ -1,245 +1,53 @@
1
- # RAG System for Hugging Face Spaces
2
-
3
- A simplified Retrieval-Augmented Generation (RAG) system optimized for deployment on Hugging Face Spaces.
4
-
5
- ## 🚀 Features
6
-
7
- - **FAISS Vector Search**: Fast similarity search using FAISS
8
- - **BM25 Keyword Search**: Traditional keyword-based retrieval
9
- - **Hybrid Search**: Combines both dense and sparse retrieval
10
- - **Qwen 2.5 1.5B**: Advanced language model for answer generation
11
- - **Streamlit UI**: Clean, interactive web interface
12
- - **PDF Processing**: Extract and process PDF documents
13
- - **Persistent Storage**: Saves embeddings and metadata locally
14
-
15
- ## 📁 Project Structure
16
-
17
- ```
18
- huggingface_deploy/
19
- ├── app.py # Main Streamlit application
20
- ├── rag_system.py # Simplified RAG system
21
- ├── pdf_processor.py # PDF processing utilities
22
- ├── requirements.txt # Python dependencies
23
- ├── README.md # This file
24
- └── vector_store/ # FAISS index and metadata (created automatically)
25
- ```
26
-
27
- ## 🛠️ Technologies Used
28
-
29
- - **Streamlit**: Web interface
30
- - **FAISS**: Vector similarity search
31
- - **BM25**: Keyword-based retrieval
32
- - **Sentence Transformers**: Text embeddings
33
- - **Transformers**: Qwen 2.5 1.5B model
34
- - **PyPDF**: PDF text extraction
35
- - **PyTorch**: Deep learning framework
36
-
37
- ## 🚀 Quick Start
38
-
39
- ### Local Development
40
-
41
- 1. **Install dependencies:**
42
- ```bash
43
- pip install -r requirements.txt
44
- ```
45
-
46
- 2. **Run the application:**
47
- ```bash
48
- streamlit run app.py
49
- ```
50
-
51
- 3. **Open in browser:**
52
- Navigate to `http://localhost:8501`
53
-
54
- ### Hugging Face Spaces Deployment
55
-
56
- 1. **Create a new Space:**
57
- - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
58
- - Click "Create new Space"
59
- - Choose "Streamlit" as the SDK
60
- - Set visibility (public or private)
61
-
62
- 2. **Upload files:**
63
- - Upload all files from this directory to your Space
64
- - The Space will automatically install dependencies and run the app
65
-
66
- 3. **Access your app:**
67
- - Your RAG system will be available at your Space URL
68
-
69
- ## 📖 How to Use
70
-
71
- ### 1. Upload Documents
72
- - Use the sidebar to upload PDF documents
73
- - The system will automatically process and index the content
74
- - Multiple documents can be uploaded
75
-
76
- ### 2. Ask Questions
77
- - Type your question in the chat interface
78
- - Choose your preferred retrieval method:
79
- - **Hybrid**: Combines FAISS and BM25 (recommended)
80
- - **Dense**: Uses only FAISS vector similarity
81
- - **Sparse**: Uses only BM25 keyword matching
82
-
83
- ### 3. View Results
84
- - See the generated answer
85
- - View search results with confidence scores
86
- - Check response time and method used
87
-
88
- ## ⚙️ Configuration
89
-
90
- ### Environment Variables
91
-
92
- You can customize the system by setting these environment variables:
93
-
94
- ```bash
95
- # Model configuration
96
- EMBEDDING_MODEL=all-MiniLM-L6-v2
97
- GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct
98
-
99
- # Chunk sizes for document processing
100
- CHUNK_SIZES=100,400
101
-
102
- # Vector store path
103
- VECTOR_STORE_PATH=./vector_store
104
- ```
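A minimal sketch of how the environment variables listed above might be read at startup. `RagSettings` and `load_settings` are hypothetical names introduced for illustration, not taken from `rag_system.py`; the defaults simply mirror the values shown in the block above.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass
class RagSettings:
    """Hypothetical container for the four documented settings."""
    embedding_model: str
    generative_model: str
    chunk_sizes: List[int]
    vector_store_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> RagSettings:
    """Read settings from a mapping, falling back to the documented defaults."""
    # CHUNK_SIZES is a comma-separated list such as "100,400".
    raw_sizes = env.get("CHUNK_SIZES", "100,400")
    chunk_sizes = [int(part) for part in raw_sizes.split(",") if part.strip()]
    return RagSettings(
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        generative_model=env.get("GENERATIVE_MODEL", "Qwen/Qwen2.5-1.5B-Instruct"),
        chunk_sizes=chunk_sizes,
        vector_store_path=env.get("VECTOR_STORE_PATH", "./vector_store"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to check without touching the real environment.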
105
-
106
- ### Model Options
107
-
108
- **Embedding Models:**
109
- - `all-MiniLM-L6-v2` (default, 384 dimensions)
110
- - `all-mpnet-base-v2` (768 dimensions)
111
- - `multi-qa-MiniLM-L6-cos-v1` (384 dimensions)
112
-
113
- **Generative Models:**
114
- - `Qwen/Qwen2.5-1.5B-Instruct` (default)
115
- - `distilgpt2` (fallback)
116
- - `microsoft/DialoGPT-medium`
117
-
118
- ## 🔧 Customization
119
-
120
- ### Adding New Models
121
-
122
- To use different models, modify the `SimpleRAGSystem` initialization in `app.py`:
123
-
124
- ```python
125
- st.session_state.rag_system = SimpleRAGSystem(
126
-     embedding_model="your-embedding-model",
127
-     generative_model="your-generative-model"
128
- )
129
- ```
130
-
131
- ### Custom Chunk Sizes
132
-
133
- Modify the chunk sizes for different document types:
134
-
135
- ```python
136
- chunk_sizes = [50, 200, 800] # Smaller chunks for technical docs
137
- ```
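As a rough illustration of what multiple chunk sizes could mean in practice, the sketch below splits text into fixed-size word windows, once per configured size. It assumes sizes are counted in whitespace-separated words; the README does not state the unit (it could equally be tokens), and `chunk_words` and `multi_size_chunks` are hypothetical helpers rather than functions from this project.

```python
from typing import Dict, List


def chunk_words(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def multi_size_chunks(text: str, chunk_sizes: List[int]) -> Dict[int, List[str]]:
    """Chunk the same text once per size, keyed by chunk size."""
    return {size: chunk_words(text, size) for size in chunk_sizes}
```

Smaller sizes give more, narrower chunks for precise matching, while larger sizes keep more surrounding context in each chunk.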
138
-
139
- ### Custom Search Methods
140
-
141
- Add new search methods in `rag_system.py`:
142
-
143
- ```python
144
- def custom_search(self, query: str, top_k: int = 5):
145
-     # Your custom search implementation
146
-     pass
147
- ```
148
-
149
- ## 📊 Performance Optimization
150
-
151
- ### Memory Usage
152
- - Use smaller embedding models for limited memory
153
- - Reduce chunk sizes for large documents
154
- - Enable model quantization
155
-
156
- ### Speed Optimization
157
- - Use GPU acceleration when available
158
- - Optimize FAISS index parameters
159
- - Cache embeddings for repeated queries
160
-
161
- ### Storage
162
- - FAISS index and metadata are saved locally
163
- - Consider cloud storage for production deployments
164
-
165
- ## 🐛 Troubleshooting
166
-
167
- ### Common Issues
168
-
169
- 1. **Model Loading Errors**
170
- - Check internet connection for model downloads
171
- - Verify model names are correct
172
- - Ensure sufficient disk space
173
-
174
- 2. **Memory Issues**
175
- - Reduce batch sizes
176
- - Use smaller models
177
- - Enable gradient checkpointing
178
-
179
- 3. **PDF Processing Errors**
180
- - Verify PDF files are not corrupted
181
- - Check file permissions
182
- - Ensure PyPDF is properly installed
183
-
184
- ### Debug Mode
185
-
186
- Enable debug logging by adding to `app.py`:
187
-
188
- ```python
189
- import logging
190
- logging.basicConfig(level=logging.DEBUG)
191
- ```
192
-
193
- ## 🔒 Security Considerations
194
 
195
- - **File Upload**: Validate PDF files before processing
196
- - **Model Access**: Use appropriate model access tokens
197
- - **Data Privacy**: Consider data retention policies
198
- - **Rate Limiting**: Implement query rate limiting for production
199
 
200
- ## 📈 Monitoring
201
 
202
- ### System Metrics
203
- - Document count and chunk count
204
- - Response times
205
- - Search result quality
206
- - Model performance
207
 
208
- ### Logs
209
- - Application logs in Streamlit
210
- - Model loading and inference logs
211
- - Error tracking and debugging
 
 
212
 
213
- ## 🤝 Contributing
214
 
215
- 1. Fork the repository
216
- 2. Create a feature branch
217
- 3. Make your changes
218
- 4. Test thoroughly
219
- 5. Submit a pull request
220
 
221
- ## 📄 License
222
 
223
- This project is licensed under the MIT License - see the LICENSE file for details.
 
 
 
 
 
224
 
225
- ## 🆘 Support
226
 
227
- For issues and questions:
228
- 1. Check the troubleshooting section
229
- 2. Review the logs for error messages
230
- 3. Create an issue on GitHub
231
- 4. Contact the maintainers
232
 
233
- ## 🎯 Roadmap
234
 
235
- - [ ] Add support for more document formats
236
- - [ ] Implement advanced search algorithms
237
- - [ ] Add model fine-tuning capabilities
238
- - [ ] Improve UI/UX design
239
- - [ ] Add export/import functionality
240
- - [ ] Implement user authentication
241
- - [ ] Add analytics dashboard
242
 
243
  ---
244
 
245
- **Happy RAG-ing! 🚀**
 
1
+ ---
2
+ title: RAG System with PDF Documents
3
+ emoji: 🤖
4
+ colorFrom: blue
5
+ colorTo: purple
6
+ sdk: docker
7
+ sdk_version: latest
8
+ app_file: app.py
9
+ pinned: false
10
+ ---
11
 
12
+ # RAG System - Hugging Face Spaces
 
 
 
13
 
14
+ A comprehensive Retrieval-Augmented Generation (RAG) system that processes PDF documents and answers questions using advanced AI models.
15
 
16
+ ## Features
 
 
 
 
17
 
18
+ - **PDF Processing**: Automatically loads and processes PDF documents
19
+ - **Hybrid Search**: Combines FAISS vector search with BM25 keyword search
20
+ - **Multiple Retrieval Methods**: Hybrid, dense, and sparse retrieval options
21
+ - **Advanced AI Models**: Uses Qwen 2.5 1.5B for response generation
22
+ - **Real-time Chat Interface**: Interactive Streamlit-based UI
23
+ - **Parallel Document Loading**: Fast document processing with concurrent loading
24
 
25
+ ## How to Use
26
 
27
+ 1. **Wait for Initialization**: The system automatically loads pre-configured PDF documents
28
+ 2. **Ask Questions**: Use the chat interface to ask questions about the documents
29
+ 3. **Choose Method**: Select from hybrid, dense, or sparse retrieval methods
30
+ 4. **View Results**: See answers with confidence scores and search results
 
31
 
32
+ ## Technology Stack
33
 
34
+ - **Vector Database**: FAISS for efficient similarity search
35
+ - **Sparse Retrieval**: BM25 for keyword-based search
36
+ - **Embedding Model**: all-MiniLM-L6-v2 for document embeddings
37
+ - **Generative Model**: Qwen 2.5 1.5B for answer generation
38
+ - **UI Framework**: Streamlit for interactive interface
39
+ - **Containerization**: Docker for deployment
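The README does not say how the FAISS and BM25 results are combined. One common option is reciprocal rank fusion, sketched below as an assumption rather than as this project's actual method. The function name, the constant `k = 60`, and the idea of passing ranked lists of document ids are all illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60, top_k: int = 5
) -> List[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so items ranked highly by both retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_k]]


# Hypothetical ranked ids from a dense (vector) and a sparse (keyword) search.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Rank-based fusion avoids having to put cosine similarities and BM25 scores on a common scale, which is why it is a frequent default for hybrid retrieval.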
40
 
41
+ ## Configuration
42
 
43
+ The system is pre-configured with RIL quarterly reports and automatically loads them on startup. Users can also upload additional PDF documents through the interface.
 
 
 
 
44
 
45
+ ## Performance
46
 
47
+ - **Parallel Processing**: Documents are loaded concurrently for faster initialization
48
+ - **Optimized Search**: Hybrid retrieval combines the best of vector and keyword search
49
+ - **Memory Efficient**: Uses CPU-optimized models for deployment compatibility
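Concurrent document loading, as described in the list above, could look roughly like the following. This is a sketch using the standard library's thread pool; `load_documents_parallel` and the `load_one` callback are hypothetical names, and the project's real loader may be structured differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def load_documents_parallel(
    paths: List[str],
    load_one: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Apply load_one to every path concurrently and map path -> text."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input paths.
        texts = list(pool.map(load_one, paths))
    return dict(zip(paths, texts))
```

In practice `load_one` would wrap whatever PDF text extraction the application uses. Threads suit this kind of work when it is dominated by file I/O, while CPU-heavy parsing may benefit more from processes.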
 
 
 
 
50
 
51
  ---
52
 
53
+ *Built with ❤️ for efficient document question-answering*