Spaces: minhvtt/ChatbotRAG

minhvtt committed on Nov 22, 2025

Commit 21d5352 (verified) · 1 parent: a577410

Delete ADVANCED_RAG_GUIDE.md

Files changed (1)

ADVANCED_RAG_GUIDE.md +0 -256

ADVANCED_RAG_GUIDE.md DELETED

@@ -1,256 +0,0 @@
-# Advanced RAG Chatbot - User Guide
-## What's New?
-### 1. Multiple Images & Texts Support in `/index` API
-The `/index` endpoint now supports indexing multiple texts and images in a single request (max 10 each).
-**Before:**
-```python
-# Old: Only 1 text and 1 image
-data = {
-    'id': 'doc1',
-    'text': 'Single text',
-}
-files = {'image': open('image.jpg', 'rb')}
-```
-**After:**
-```python
-import requests
-# New: Multiple texts and images (max 10 each)
-data = {
-    'id': 'doc1',
-    'texts': ['Text 1', 'Text 2', 'Text 3'],  # Up to 10
-}
-files = [
-    ('images', open('image1.jpg', 'rb')),
-    ('images', open('image2.jpg', 'rb')),
-    ('images', open('image3.jpg', 'rb')),  # Up to 10
-]
-response = requests.post('http://localhost:8000/index', data=data, files=files)
-```
-**Example with cURL:**
-```bash
-curl -X POST "http://localhost:8000/index" \
-  -F "id=event123" \
-  -F "texts=Music event in Hanoi" \
-  -F "texts=Taking place on 20/10/2025" \
-  -F "texts=Venue: National Convention Center" \
-  -F "images=@poster1.jpg" \
-  -F "images=@poster2.jpg" \
-  -F "images=@poster3.jpg"
-```
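
The 10-item limits can also be checked on the client before a request is sent. The block below is a minimal sketch, not the project's own code: `build_index_form` and `index_document` are hypothetical helper names, and only the `/index` fields shown above (`id`, `texts`, `images`) come from the guide.

```python
from contextlib import ExitStack

MAX_ITEMS = 10  # limit stated in the guide for both texts and images


def build_index_form(doc_id, texts=(), image_paths=()):
    """Validate the inputs and return (data, image_paths) for a multipart POST."""
    texts, image_paths = list(texts), list(image_paths)
    if not doc_id:
        raise ValueError("id is required")
    if len(texts) > MAX_ITEMS:
        raise ValueError(f"at most {MAX_ITEMS} texts per request, got {len(texts)}")
    if len(image_paths) > MAX_ITEMS:
        raise ValueError(f"at most {MAX_ITEMS} images per request, got {len(image_paths)}")
    return {"id": doc_id, "texts": texts}, image_paths


def index_document(base_url, doc_id, texts=(), image_paths=()):
    """Hypothetical wrapper: POST to /index and close every file afterwards."""
    import requests  # third-party; imported lazily so validation works without it

    data, paths = build_index_form(doc_id, texts, image_paths)
    with ExitStack() as stack:
        # ExitStack closes all opened image files when the block exits.
        files = [("images", stack.enter_context(open(p, "rb"))) for p in paths]
        response = requests.post(f"{base_url}/index", data=data, files=files or None)
    response.raise_for_status()
    return response.json()
```

Unlike the bare `open(...)` calls in the examples, this sketch closes the image files once the upload finishes.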
-### 2. Advanced RAG Pipeline in `/chat` API
-The chat endpoint now runs a multi-stage RAG pipeline to improve response quality:
-#### Key Improvements:
-1. **Query Expansion**: Automatically expands your question with variations
-2. **Multi-Query Retrieval**: Searches with multiple query variants
-3. **Reranking**: Re-scores results for better relevance
-4. **Contextual Compression**: Keeps only the most relevant parts
-5. **Better Prompt Engineering**: Optimized prompts for the LLM
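
The first four stages listed above can be illustrated with a toy, dependency-free sketch. It is not the server's implementation: token overlap stands in for embedding similarity, and every function name here (`expand_query`, `retrieve`, `rerank`, `compress`, `run_pipeline`) is hypothetical. Only the `rag_stats` key names are taken from the response format documented in this guide.

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    text: str
    score: float


def tokens(text):
    return set(re.findall(r"\w+", text.lower()))


def overlap(query, text):
    """Fraction of query tokens found in the text (stand-in for vector similarity)."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def expand_query(query):
    """Toy expansion: the original query plus a keyword-only variant."""
    stop = {"the", "a", "is", "are", "when", "where", "what", "how", "and", "of", "to"}
    keywords = " ".join(w for w in re.findall(r"\w+", query.lower()) if w not in stop)
    return list(dict.fromkeys(v for v in (query, keywords) if v))


def retrieve(queries, corpus, top_k):
    """Multi-query retrieval: keep each document's best score over all variants."""
    best = {}
    for q in queries:
        for doc_id, text in corpus.items():
            score = overlap(q, text)
            if score > best.get(doc_id, 0.0):
                best[doc_id] = score
    hits = [Hit(d, corpus[d], s) for d, s in best.items()]
    return sorted(hits, key=lambda h: h.score, reverse=True)[:top_k]


def rerank(query, hits, score_threshold):
    """Re-score against the original query and drop hits below the threshold."""
    rescored = [Hit(h.doc_id, h.text, overlap(query, h.text)) for h in hits]
    kept = [h for h in rescored if h.score >= score_threshold]
    return sorted(kept, key=lambda h: h.score, reverse=True)


def compress(query, hit):
    """Keep only the sentences that share at least one token with the query."""
    q = tokens(query)
    sentences = re.split(r"(?<=[.!?])\s+", hit.text)
    return Hit(hit.doc_id, " ".join(s for s in sentences if q & tokens(s)), hit.score)


def run_pipeline(query, corpus, top_k=5, score_threshold=0.5):
    queries = expand_query(query)
    initial = retrieve(queries, corpus, top_k)
    reranked = rerank(query, initial, score_threshold)
    compressed = [c for c in (compress(query, h) for h in reranked) if c.text]
    return {
        "context": compressed,
        "rag_stats": {
            "original_query": query,
            "expanded_queries": queries,
            "initial_results": len(initial),
            "after_rerank": len(reranked),
            "after_compression": len(compressed),
        },
    }
```

A real pipeline would replace `overlap` with embedding or cross-encoder scores. The point of the sketch is the order of the stages and where `top_k` and `score_threshold` take effect.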
-#### How to Use:
-**Basic Usage (Auto-enabled):**
-```python
-import requests
-response = requests.post('http://localhost:8000/chat', json={
-    'message': 'Are knives dangerous?',
-    'use_rag': True,
-    'use_advanced_rag': True,  # Default: True
-    'hf_token': 'hf_xxxxx'
-})
-result = response.json()
-print("Response:", result['response'])
-print("RAG Stats:", result['rag_stats'])  # See pipeline statistics
-```
-**Advanced Configuration:**
-```python
-import requests
-response = requests.post('http://localhost:8000/chat', json={
-    'message': 'How do I create a new event?',
-    'use_rag': True,
-    'use_advanced_rag': True,
-    # RAG Pipeline Options
-    'use_query_expansion': True,    # Expand query with variations
-    'use_reranking': True,          # Rerank results
-    'use_compression': True,        # Compress context
-    'score_threshold': 0.5,         # Min relevance score (0-1)
-    'top_k': 5,                     # Number of documents to retrieve
-    # LLM Options
-    'max_tokens': 512,
-    'temperature': 0.7,
-    'hf_token': 'hf_xxxxx'
-})
-```
-**Disable Advanced RAG (Use Basic):**
-```python
-import requests
-response = requests.post('http://localhost:8000/chat', json={
-    'message': 'Your question',
-    'use_rag': True,
-    'use_advanced_rag': False,  # Use basic RAG
-})
-```
-## API Changes Summary
-### `/index` Endpoint
-**Old Parameters:**
-- `id`: str (required)
-- `text`: str (required)
-- `image`: UploadFile (optional)
-**New Parameters:**
-- `id`: str (required)
-- `texts`: List[str] (optional, max 10)
-- `images`: List[UploadFile] (optional, max 10)
-**Response:**
-```json
-{
-  "success": true,
-  "id": "doc123",
-  "message": "Đã index thành công document doc123 với 3 texts và 2 images"
-}
-```
-The `message` string is returned in Vietnamese and reads: "Successfully indexed document doc123 with 3 texts and 2 images".
-### `/chat` Endpoint
-**New Parameters:**
-- `use_advanced_rag`: bool (default: True) - Enable advanced RAG
-- `use_query_expansion`: bool (default: True) - Expand query
-- `use_reranking`: bool (default: True) - Rerank results
-- `use_compression`: bool (default: True) - Compress context
-- `score_threshold`: float (default: 0.5) - Min relevance score
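
One way to keep these defaults in a single place on the client is a small dataclass. This is only a sketch: `ChatRequest` is a hypothetical name, the `use_rag` default is an assumption copied from the examples in this guide, and the 0-1 range check follows the comment on `score_threshold` in the configuration example.

```python
from dataclasses import asdict, dataclass


@dataclass
class ChatRequest:
    """Client-side mirror of the documented /chat parameters and defaults."""

    message: str
    use_rag: bool = True  # assumption: the guide's examples always send True
    use_advanced_rag: bool = True
    use_query_expansion: bool = True
    use_reranking: bool = True
    use_compression: bool = True
    score_threshold: float = 0.5

    def __post_init__(self):
        # The guide describes score_threshold as a relevance score in 0-1.
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")

    def to_payload(self) -> dict:
        """Return a plain dict suitable for the json= argument of requests.post."""
        return asdict(self)
```

For example, `ChatRequest("Your question", use_reranking=False).to_payload()` produces a JSON body with reranking switched off and every other option left at its documented default.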
-**Response (New):**
-```json
-{
-  "response": "AI generated answer...",
-  "context_used": [...],
-  "timestamp": "2025-10-29T...",
-  "rag_stats": {
-    "original_query": "Your question",
-    "expanded_queries": ["Query variant 1", "Query variant 2"],
-    "initial_results": 10,
-    "after_rerank": 5,
-    "after_compression": 5
-  }
-}
-```
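
The `rag_stats` counts can be turned into per-stage drop counts to see where documents are lost. The helper below is a hypothetical sketch that assumes the three counters are document counts taken in pipeline order (retrieval, then reranking, then compression). The guide implies this but does not state it.

```python
def stage_drop_counts(rag_stats: dict) -> dict:
    """Summarize how many documents each pipeline stage removed."""
    initial = rag_stats["initial_results"]
    reranked = rag_stats["after_rerank"]
    compressed = rag_stats["after_compression"]
    return {
        "query_variants": len(rag_stats["expanded_queries"]),
        "dropped_by_rerank": initial - reranked,
        "dropped_by_compression": reranked - compressed,
        "kept": compressed,
    }


# Values from the sample response above.
sample = {
    "original_query": "Your question",
    "expanded_queries": ["Query variant 1", "Query variant 2"],
    "initial_results": 10,
    "after_rerank": 5,
    "after_compression": 5,
}
summary = stage_drop_counts(sample)
# summary["dropped_by_rerank"] is 5 and summary["dropped_by_compression"] is 0
```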
-## Complete Examples
-### Example 1: Index Multiple Social Media Posts
-```python
-import requests
-# Index a social media event with multiple posts and images
-data = {
-    'id': 'event_festival_2025',
-    'texts': [
-        'Hanoi International Music Festival 2025',
-        '15-17 November 2025',
-        'Venue: Thống Nhất Park',
-        'Line-up: Sơn Tùng MTP, Đen Vâu, Hoàng Thùy Linh',
-        'Tickets from 500,000 VND to 2,000,000 VND'
-    ]
-}
-files = [
-    ('images', open('poster_festival.jpg', 'rb')),
-    ('images', open('lineup.jpg', 'rb')),
-    ('images', open('venue_map.jpg', 'rb'))
-]
-response = requests.post('http://localhost:8000/index', data=data, files=files)
-print(response.json())
-```
-### Example 2: Advanced RAG Chat
-```python
-import requests
-# Chat with advanced RAG
-chat_response = requests.post('http://localhost:8000/chat', json={
-    'message': 'When and where does the Hanoi music festival take place?',
-    'use_rag': True,
-    'use_advanced_rag': True,
-    'top_k': 3,
-    'score_threshold': 0.6,
-    'hf_token': 'your_hf_token_here'
-})
-result = chat_response.json()
-print("Answer:", result['response'])
-print("\nRetrieved Context:")
-for ctx in result['context_used']:
-    print(f"- [{ctx['id']}] Confidence: {ctx['confidence']:.2%}")
-print("\nRAG Pipeline Stats:")
-print(f"- Original query: {result['rag_stats']['original_query']}")
-print(f"- Query variants: {result['rag_stats']['expanded_queries']}")
-print(f"- Documents retrieved: {result['rag_stats']['initial_results']}")
-print(f"- After reranking: {result['rag_stats']['after_rerank']}")
-```
-## Performance Comparison
-| Feature | Basic RAG | Advanced RAG |
-|---------|-----------|--------------|
-| Query Understanding | Single query | Multiple query variants |
-| Retrieval Method | Direct vector search | Multi-query vector search |
-| Result Ranking | Score from DB | Reranked with semantic similarity |
-| Context Quality | Full text | Compressed, relevant parts only |
-| Response Accuracy | Baseline | Higher, mainly on complex or ambiguous queries |
-| Response Time | Faster | Slower (extra expansion, reranking, and compression steps) |
-## When to Use What?
-**Use Basic RAG when:**
-- You need fast response time
-- Queries are straightforward
-- Context is already well-structured
-**Use Advanced RAG when:**
-- You need higher accuracy
-- Queries are complex or ambiguous
-- Context documents are long
-- You want better relevance
-## Troubleshooting
-### Error: "Tối đa 10 texts"
-The request contains more than 10 texts (the message is Vietnamese for "Maximum 10 texts"). Send at most 10 per request.
-### Error: "Tối đa 10 images"
-The request contains more than 10 images (the message is Vietnamese for "Maximum 10 images"). Send at most 10 per request.
-### RAG stats show 0 results
-Your `score_threshold` is probably too high, so no retrieved document passes the relevance filter. Lower it in small steps, for example from the default 0.5 toward 0.3.
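
Lowering the threshold can be automated on the client. The following is a hedged sketch: `chat_with_fallback` is a hypothetical helper, `send` is any callable that posts the payload to `/chat` and returns the parsed JSON, and an empty `context_used` list is assumed to mean that nothing passed the threshold.

```python
def chat_with_fallback(send, payload, thresholds=(0.5, 0.4, 0.3)):
    """Retry with progressively lower score_threshold until context is returned.

    Returns (threshold_used, response_dict). If no threshold yields context,
    the response for the last threshold tried is returned.
    """
    if not thresholds:
        raise ValueError("thresholds must not be empty")
    result = {}
    for threshold in thresholds:
        result = send({**payload, "score_threshold": threshold})
        if result.get("context_used"):
            return threshold, result
    return thresholds[-1], result
```

With `requests`, `send` could be `lambda p: requests.post('http://localhost:8000/chat', json=p).json()`. Passing the sender in as an argument keeps the retry logic testable without a running server.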
-## Next Steps
-To further improve RAG, consider:
-1. **Add BM25 Hybrid Search**: Combine dense + sparse retrieval
-2. **Use Cross-Encoder for Reranking**: Better than embedding similarity
-3. **Implement Query Decomposition**: Break complex queries into sub-queries
-4. **Add Citation/Source Tracking**: Show which document each fact comes from
-5. **Integrate RAG-Anything**: For advanced multimodal document processing
-For RAG-Anything integration (more complex), see: https://github.com/HKUDS/RAG-Anything
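
For the hybrid search item above, one common way to combine a dense ranking with a sparse (for example BM25) ranking is reciprocal rank fusion. The guide does not say which fusion method it has in mind, so this is an illustrative sketch: the function name is hypothetical, and `k=60` is the constant conventionally used with this technique.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1; higher fused scores come first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


dense_ranking = ["a", "b", "c"]   # e.g. from vector search
sparse_ranking = ["b", "c", "a"]  # e.g. from BM25
fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
# fused == ["b", "a", "c"]
```

Fusing on ranks rather than raw scores avoids having to normalize dense similarity scores and BM25 scores onto a common scale.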