Commit 21d5352 by minhvtt · verified · 1 Parent(s): a577410

Delete ADVANCED_RAG_GUIDE.md

Files changed (1)
  1. ADVANCED_RAG_GUIDE.md +0 -256
ADVANCED_RAG_GUIDE.md DELETED
@@ -1,256 +0,0 @@
- # Advanced RAG Chatbot - User Guide
-
- ## What's New?
-
- ### 1. Multiple Images & Texts Support in `/index` API
-
- The `/index` endpoint now supports indexing multiple texts and images in a single request (max 10 each).
-
- **Before:**
- ```python
- # Old: Only 1 text and 1 image
- data = {
-     'id': 'doc1',
-     'text': 'Single text',
- }
- files = {'image': open('image.jpg', 'rb')}
- ```
-
- **After:**
- ```python
- # New: Multiple texts and images (max 10 each)
- data = {
-     'id': 'doc1',
-     'texts': ['Text 1', 'Text 2', 'Text 3'],  # Up to 10
- }
- files = [
-     ('images', open('image1.jpg', 'rb')),
-     ('images', open('image2.jpg', 'rb')),
-     ('images', open('image3.jpg', 'rb')),  # Up to 10
- ]
- response = requests.post('http://localhost:8000/index', data=data, files=files)
- ```
-
- **Example with cURL:**
- ```bash
- curl -X POST "http://localhost:8000/index" \
-   -F "id=event123" \
-   -F "texts=Music event in Hanoi" \
-   -F "texts=Takes place on 20/10/2025" \
-   -F "texts=Venue: National Convention Center" \
-   -F "images=@poster1.jpg" \
-   -F "images=@poster2.jpg" \
-   -F "images=@poster3.jpg"
- ```
-
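The request shape described above can be wrapped in a small client helper. This is a minimal sketch, not part of the service: `build_index_fields` and `post_index` are hypothetical names, the limit of 10 is taken from the guide, and the client-side check only mirrors what the server is said to enforce. `ExitStack` closes every image file once the upload finishes, which the inline `open(...)` calls in the examples do not.

```python
from contextlib import ExitStack

MAX_ITEMS = 10  # limit stated in the guide: max 10 texts and max 10 images


def build_index_fields(doc_id, texts=(), image_paths=()):
    """Validate the documented limits and return (form_data, image_paths)."""
    texts = list(texts)
    image_paths = list(image_paths)
    if len(texts) > MAX_ITEMS:
        raise ValueError(f"at most {MAX_ITEMS} texts per request, got {len(texts)}")
    if len(image_paths) > MAX_ITEMS:
        raise ValueError(f"at most {MAX_ITEMS} images per request, got {len(image_paths)}")
    return {"id": doc_id, "texts": texts}, image_paths


def post_index(doc_id, texts=(), image_paths=(), url="http://localhost:8000/index"):
    """POST to /index and close every opened image file afterwards."""
    import requests  # imported here so the validation helper has no dependency

    data, paths = build_index_fields(doc_id, texts, image_paths)
    with ExitStack() as stack:
        # Repeating the 'images' field name sends each file as a list item.
        files = [("images", stack.enter_context(open(p, "rb"))) for p in paths]
        response = requests.post(url, data=data, files=files, timeout=30)
    response.raise_for_status()
    return response.json()
```

Validating before opening any file means an oversized request fails fast without touching the disk or the network.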
- ### 2. Advanced RAG Pipeline in `/chat` API
-
- The `/chat` endpoint now runs a multi-stage RAG pipeline to improve response quality:
-
- #### Key Improvements:
-
- 1. **Query Expansion**: Automatically rewrites your question into several variants
- 2. **Multi-Query Retrieval**: Searches the index with every query variant
- 3. **Reranking**: Re-scores the retrieved results for better relevance
- 4. **Contextual Compression**: Keeps only the most relevant parts of each result
- 5. **Better Prompt Engineering**: Uses prompts optimized for the LLM
-
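To make the first four stages concrete, here is a toy sketch that runs them in order on in-memory strings. It is an illustration only and makes no claim about how the service implements them: bag-of-words cosine similarity stands in for embedding search, a hand-written synonym table stands in for LLM-based query expansion, and every function name is hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def expand_query(query, synonyms):
    """Stage 1: add one variant per synonym substitution."""
    variants = [query]
    for word, alternatives in synonyms.items():
        if word in tokens(query):
            pattern = rf"\b{re.escape(word)}\b"
            variants += [re.sub(pattern, alt, query, flags=re.IGNORECASE) for alt in alternatives]
    return variants


def retrieve(variants, docs, top_k):
    """Stage 2: search with every variant and keep each document's best score."""
    best = {}
    for variant in variants:
        q = Counter(tokens(variant))
        for doc_id, text in docs.items():
            best[doc_id] = max(best.get(doc_id, 0.0), cosine(q, Counter(tokens(text))))
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, score in ranked[:top_k] if score > 0]


def rerank(query, doc_ids, docs, score_threshold):
    """Stage 3: re-score candidates against the original query, drop weak ones."""
    q = Counter(tokens(query))
    scored = [(cosine(q, Counter(tokens(docs[d]))), d) for d in doc_ids]
    return [d for score, d in sorted(scored, reverse=True) if score >= score_threshold]


def compress(query, text):
    """Stage 4: keep only the sentences that share a term with the query."""
    q = set(tokens(query))
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(s for s in sentences if q & set(tokens(s)))


docs = {
    "tickets": "Festival tickets cost 500000 VND. Gates open at noon.",
    "venue": "The festival venue is Thống Nhất Park.",
    "recipe": "Boil the noodles. Add broth.",
}
query = "festival tickets price"
variants = expand_query(query, {"price": ["cost"]})
candidates = retrieve(variants, docs, top_k=5)
kept = rerank(query, candidates, docs, score_threshold=0.3)
snippet = compress(query, docs[kept[0]])
print(variants, candidates, kept, snippet, sep="\n")
```

In this toy run the expanded variant lifts the ticket document's retrieval score, reranking drops the weaker venue document, and compression discards the sentence about gate times.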
- #### How to Use:
-
- **Basic Usage (Auto-enabled):**
- ```python
- import requests
-
- response = requests.post('http://localhost:8000/chat', json={
-     'message': 'Are knives dangerous?',
-     'use_rag': True,
-     'use_advanced_rag': True,  # Default: True
-     'hf_token': 'hf_xxxxx'
- })
-
- result = response.json()
- print("Response:", result['response'])
- print("RAG Stats:", result['rag_stats'])  # See pipeline statistics
- ```
-
- **Advanced Configuration:**
- ```python
- response = requests.post('http://localhost:8000/chat', json={
-     'message': 'How do I create a new event?',
-     'use_rag': True,
-     'use_advanced_rag': True,
-
-     # RAG Pipeline Options
-     'use_query_expansion': True,  # Expand query with variations
-     'use_reranking': True,        # Rerank results
-     'use_compression': True,      # Compress context
-     'score_threshold': 0.5,       # Min relevance score (0-1)
-     'top_k': 5,                   # Number of documents to retrieve
-
-     # LLM Options
-     'max_tokens': 512,
-     'temperature': 0.7,
-     'hf_token': 'hf_xxxxx'
- })
- ```
-
- **Disable Advanced RAG (Use Basic):**
- ```python
- response = requests.post('http://localhost:8000/chat', json={
-     'message': 'Your question',
-     'use_rag': True,
-     'use_advanced_rag': False,  # Use basic RAG
- })
- ```
-
- ## API Changes Summary
-
- ### `/index` Endpoint
-
- **Old Parameters:**
- - `id`: str (required)
- - `text`: str (required)
- - `image`: UploadFile (optional)
-
- **New Parameters:**
- - `id`: str (required)
- - `texts`: List[str] (optional, max 10)
- - `images`: List[UploadFile] (optional, max 10)
-
- **Response** (the `message` is returned in Vietnamese and means "Successfully indexed document doc123 with 3 texts and 2 images"):
- ```json
- {
-   "success": true,
-   "id": "doc123",
-   "message": "Đã index thành công document doc123 với 3 texts và 2 images"
- }
- ```
-
- ### `/chat` Endpoint
-
- **New Parameters:**
- - `use_advanced_rag`: bool (default: True) - Enable advanced RAG
- - `use_query_expansion`: bool (default: True) - Expand query
- - `use_reranking`: bool (default: True) - Rerank results
- - `use_compression`: bool (default: True) - Compress context
- - `score_threshold`: float (default: 0.5) - Min relevance score
-
- **Response (New):**
- ```json
- {
-   "response": "AI generated answer...",
-   "context_used": [...],
-   "timestamp": "2025-10-29T...",
-   "rag_stats": {
-     "original_query": "Your question",
-     "expanded_queries": ["Query variant 1", "Query variant 2"],
-     "initial_results": 10,
-     "after_rerank": 5,
-     "after_compression": 5
-   }
- }
- ```
-
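The `rag_stats` counters are easier to read as per-stage differences. A minimal sketch follows, assuming the field names shown in the response above; `summarize_rag_stats` is a hypothetical helper, and the sample values are copied from the documented example response rather than from a real run.

```python
def summarize_rag_stats(stats):
    """Report how many documents each pipeline stage removed."""
    initial = stats["initial_results"]
    reranked = stats["after_rerank"]
    compressed = stats["after_compression"]
    return {
        "query_variants": len(stats["expanded_queries"]),
        "dropped_by_rerank": initial - reranked,
        "dropped_by_compression": reranked - compressed,
        "final_documents": compressed,
    }


# Values taken from the example response in this guide.
sample_stats = {
    "original_query": "Your question",
    "expanded_queries": ["Query variant 1", "Query variant 2"],
    "initial_results": 10,
    "after_rerank": 5,
    "after_compression": 5,
}
summary = summarize_rag_stats(sample_stats)
print(summary)
```

A large `dropped_by_rerank` combined with `final_documents` of 0 usually points at the `score_threshold` setting rather than at missing data.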
- ## Complete Examples
-
- ### Example 1: Index Multiple Social Media Posts
-
- ```python
- import requests
-
- # Index a social media event with multiple posts and images
- data = {
-     'id': 'event_festival_2025',
-     'texts': [
-         'Hanoi International Music Festival 2025',
-         'November 15-17, 2025',
-         'Venue: Thống Nhất Park',
-         'Line-up: Sơn Tùng MTP, Đen Vâu, Hoàng Thùy Linh',
-         'Ticket prices: 500,000-2,000,000 VND'
-     ]
- }
-
- files = [
-     ('images', open('poster_festival.jpg', 'rb')),
-     ('images', open('lineup.jpg', 'rb')),
-     ('images', open('venue_map.jpg', 'rb'))
- ]
-
- response = requests.post('http://localhost:8000/index', data=data, files=files)
- print(response.json())
- ```
-
- ### Example 2: Advanced RAG Chat
-
- ```python
- import requests
-
- # Chat with advanced RAG
- chat_response = requests.post('http://localhost:8000/chat', json={
-     'message': 'When and where does the Hanoi music festival take place?',
-     'use_rag': True,
-     'use_advanced_rag': True,
-     'top_k': 3,
-     'score_threshold': 0.6,
-     'hf_token': 'your_hf_token_here'
- })
-
- result = chat_response.json()
- print("Answer:", result['response'])
- print("\nRetrieved Context:")
- for ctx in result['context_used']:
-     print(f"- [{ctx['id']}] Confidence: {ctx['confidence']:.2%}")
-
- print("\nRAG Pipeline Stats:")
- print(f"- Original query: {result['rag_stats']['original_query']}")
- print(f"- Query variants: {result['rag_stats']['expanded_queries']}")
- print(f"- Documents retrieved: {result['rag_stats']['initial_results']}")
- print(f"- After reranking: {result['rag_stats']['after_rerank']}")
- ```
-
- ## Performance Comparison
-
- | Feature | Basic RAG | Advanced RAG |
- |---------|-----------|--------------|
- | Query Understanding | Single query | Multiple query variants |
- | Retrieval Method | Direct vector search | Multi-query vector search |
- | Result Ranking | Score from DB | Reranked with semantic similarity |
- | Context Quality | Full text | Compressed, relevant parts only |
- | Response Accuracy | Baseline | Higher, especially for complex or ambiguous queries |
- | Response Time | Faster | Slower (extra expansion, reranking, and compression steps) |
-
- ## When to Use What?
-
- **Use Basic RAG when:**
- - You need fast response time
- - Queries are straightforward
- - Context is already well-structured
-
- **Use Advanced RAG when:**
- - You need higher accuracy
- - Queries are complex or ambiguous
- - Context documents are long
- - You want better relevance
-
- ## Troubleshooting
-
- ### Error: "Tối đa 10 texts" ("Maximum 10 texts")
- The request contains more than 10 texts. Send at most 10 per request.
-
- ### Error: "Tối đa 10 images" ("Maximum 10 images")
- The request contains more than 10 images. Send at most 10 per request.
-
- ### RAG stats show 0 results
- Your `score_threshold` might be too high. Try lowering it below the 0.5 default (e.g., 0.3).
-
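The zero-results case can also be handled on the client by retrying with a lower threshold. This is a sketch under assumptions: `chat_with_threshold_fallback` is a hypothetical helper, the threshold steps are arbitrary, and `send` is any callable that posts the payload to `/chat` and returns the parsed JSON (for example `lambda p: requests.post(url, json=p).json()`). A stub stands in for the server here so the example runs offline.

```python
def chat_with_threshold_fallback(send, payload, thresholds=(0.5, 0.4, 0.3)):
    """Retry with progressively lower score_threshold until context is returned."""
    result = None
    for threshold in thresholds:
        result = send({**payload, "score_threshold": threshold})
        if result.get("context_used"):
            break  # at least one document passed the threshold
    return result


# Stub server: pretends the best match scores 0.45, so 0.5 filters everything out.
attempted = []


def fake_send(payload):
    attempted.append(payload["score_threshold"])
    context = [{"id": "doc1", "confidence": 0.45}] if payload["score_threshold"] <= 0.45 else []
    return {"response": "stub answer", "context_used": context}


result = chat_with_threshold_fallback(fake_send, {"message": "Your question", "use_rag": True})
print(attempted, result["context_used"])
```

Each retry costs a full pipeline run, so a short list of thresholds keeps the worst-case latency bounded.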
- ## Next Steps
-
- To further improve RAG, consider:
-
- 1. **Add BM25 Hybrid Search**: Combine dense + sparse retrieval
- 2. **Use Cross-Encoder for Reranking**: Better than embedding similarity
- 3. **Implement Query Decomposition**: Break complex queries into sub-queries
- 4. **Add Citation/Source Tracking**: Show which document each fact comes from
- 5. **Integrate RAG-Anything**: For advanced multimodal document processing
-
- For RAG-Anything integration (more complex), see: https://github.com/HKUDS/RAG-Anything
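For the hybrid-search suggestion, one common way to combine a dense ranking with a sparse (BM25) ranking is reciprocal rank fusion. The guide does not say which fusion method it has in mind, so treat this as one possible approach: `reciprocal_rank_fusion` is a hypothetical helper, the two input rankings are made up, and `k=60` is the conventional smoothing constant for this technique.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids; each list adds 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up rankings: one from vector search, one from a keyword (BM25) index.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

Because the fusion uses only ranks, it needs no calibration between cosine scores and BM25 scores, which live on different scales.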