---
title: Hierarchical RAG Evaluation
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 3.50.2
app_file: app.py
pinned: false
license: mit
---
# Hierarchical RAG Evaluation System

A comprehensive system for comparing Standard RAG and Hierarchical RAG approaches, focusing on accuracy and speed improvements from metadata-based filtering.

## Features

- **Dual RAG Pipelines**: Compare Base-RAG and Hier-RAG side by side
- **Hierarchical Classification**: 3-level taxonomy (domain → section → topic)
- **Multiple Domains**: Pre-configured hierarchies for Hospital, Banking, and Fluid Simulation
- **Comprehensive Evaluation**: Quantitative metrics (Hit@k, MRR, latency) and qualitative testing
- **Gradio UI**: User-friendly interface with API access
- **MCP Server**: Additional API server for programmatic access
## Architecture

```
User Query → Hierarchical Filter → Vector Search → Re-ranking → LLM Generation → Answer
                    ↑
             (Hier-RAG only)
```
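The "Hierarchical Filter" step narrows the vector search with a metadata filter built from the inferred (or user-supplied) hierarchy levels. As a sketch only, assuming the `level1`/`level2`/`level3`/`doc_type` fields visible in the `/search` API and ChromaDB's `where` syntax (single-field filters pass through directly; multiple fields must be combined with `$and`):

```python
def build_where_filter(level1="", level2="", level3="", doc_type=""):
    """Build a ChromaDB-style metadata filter; empty strings mean 'no constraint'.

    Hypothetical helper -- the repository's own filter construction may differ.
    """
    clauses = {k: v for k, v in {
        "level1": level1, "level2": level2,
        "level3": level3, "doc_type": doc_type,
    }.items() if v}
    if not clauses:
        return None          # Base-RAG path: search the whole collection
    if len(clauses) == 1:
        return clauses       # Chroma accepts a single-field filter directly
    # Multiple fields are combined with an explicit $and
    return {"$and": [{k: v} for k, v in clauses.items()]}
```

The resulting dict would then be passed as `where=` to the collection query, so only chunks tagged with the matching taxonomy path are scored.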
## Quick Start

### Prerequisites

- Python 3.9+
- OpenAI API key (for LLM generation)
- 4 GB+ RAM recommended

### Installation

1. **Clone the repository:**

   ```bash
   git clone <repository-url>
   cd hierarchical-rag-eval
   ```

2. **Create a virtual environment:**

   ```bash
   python -m venv venv
   # Windows
   venv\Scripts\activate
   # Mac/Linux
   source venv/bin/activate
   ```

3. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

4. **Set environment variables:**

   Create a `.env` file in the project root:

   ```bash
   OPENAI_API_KEY=your-openai-api-key-here
   VECTOR_DB_PATH=./data/chroma
   EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
   LLM_MODEL=gpt-3.5-turbo
   ```

   **Important:** Never commit the `.env` file to version control!

5. **Run the application:**

   ```bash
   python app.py
   ```

   Access the UI at `http://localhost:7860`.
---

## Deployment to Hugging Face Spaces

### Step 1: Create Space

1. Go to https://huggingface.co/spaces
2. Click "Create new Space"
3. Fill in the details:
   - **Owner**: `AP-UW` (organization)
   - **Space name**: `hierarchical-rag-eval`
   - **License**: MIT
   - **SDK**: Gradio
   - **Python version**: 3.10
   - **Visibility**: Private

### Step 2: Configure Persistent Storage

1. Go to Space Settings → Storage
2. Enable **Persistent Storage** (a free tier is available)
3. This ensures your vector database persists across restarts

### Step 3: Add Secrets

1. Go to Space Settings → Repository Secrets
2. Add the following secrets:

| Secret Name | Value | Description |
|-------------|-------|-------------|
| `OPENAI_API_KEY` | `sk-...` | Your OpenAI API key |
| `VECTOR_DB_PATH` | `/data/chroma` | Path to persistent storage |
| `EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Embedding model |
| `LLM_MODEL` | `gpt-3.5-turbo` | OpenAI model |

**Note:** Secrets are encrypted and not visible in logs.

### Step 4: Prepare Code for Deployment

Update `app.py` to read from the HF Spaces environment:

```python
import os
from dotenv import load_dotenv

# Load .env for local development only
if not os.getenv("SPACE_ID"):  # SPACE_ID is set by HF Spaces
    load_dotenv()

# Verify the API key
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("⚠️ OPENAI_API_KEY not found! Set it in Space Settings → Secrets")
```

### Step 5: Push to Hugging Face

```bash
# Add the HF Space as a remote
git remote add space https://huggingface.co/spaces/AP-UW/hierarchical-rag-eval
git branch -M main

# Push the code (triggers an automatic build)
git push space main
```

### Step 6: Monitor Deployment

1. Go to your Space URL: `https://huggingface.co/spaces/AP-UW/hierarchical-rag-eval`
2. Check the **Logs** tab for build progress
3. Wait for "Running" status (the first build may take 5-10 minutes)

### Step 7: Verify Deployment

Test the deployed app:

```python
from gradio_client import Client

client = Client("https://huggingface.co/spaces/AP-UW/hierarchical-rag-eval")

# Initialize the system
result = client.predict(api_name="/initialize")
print(result)  # Should show "System initialized successfully!"
```
---

## MCP Server Usage

The MCP (Model Context Protocol) server provides RESTful API access to all RAG functionality.

### Running the MCP Server (Local)

```bash
# Terminal 1: start the MCP server
python mcp_server.py

# The server runs at http://localhost:8000
# API docs are available at http://localhost:8000/docs
```

### Running the MCP Server (Production)

Deploy separately to a hosting service:

**Option 1: Railway**

```bash
railway login
railway init
railway up
```

**Option 2: Render**

1. Connect the GitHub repo
2. Set the build command: `pip install -r requirements.txt`
3. Set the start command: `uvicorn mcp_server:app --host 0.0.0.0 --port $PORT`

**Option 3: Docker**

```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "mcp_server:app", "--host", "0.0.0.0", "--port", "8000"]
```

### MCP API Endpoints

#### Health Check

```bash
curl http://localhost:8000/health
```

Response:

```json
{"status": "healthy"}
```

#### Initialize System

```bash
curl -X POST http://localhost:8000/initialize \
  -H "Content-Type: application/json" \
  -d '{
    "persist_directory": "./data/chroma",
    "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
  }'
```

#### Index Documents

```bash
curl -X POST http://localhost:8000/index \
  -H "Content-Type: application/json" \
  -d '{
    "filepaths": ["./docs/document1.pdf", "./docs/document2.txt"],
    "hierarchy": "hospital",
    "chunk_size": 512,
    "chunk_overlap": 50,
    "collection_name": "medical_docs"
  }'
```

#### Query RAG System

```bash
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the patient admission procedures?",
    "pipeline": "both",
    "n_results": 5,
    "auto_infer": true
  }'
```

Response:

```json
{
  "query": "What are the patient admission procedures?",
  "base_rag": {
    "answer": "...",
    "retrieval_time": 0.052,
    "total_time": 1.234
  },
  "hier_rag": {
    "answer": "...",
    "retrieval_time": 0.031,
    "total_time": 0.987,
    "applied_filters": {"level1": "Clinical Care"}
  },
  "speedup": 1.25
}
```

#### System Information

```bash
curl http://localhost:8000/info
```

### Python Client Example

```python
import requests

# Base URL
BASE_URL = "http://localhost:8000"

# Initialize
response = requests.post(f"{BASE_URL}/initialize", json={
    "persist_directory": "./data/chroma"
})
print(response.json())

# Index documents
response = requests.post(f"{BASE_URL}/index", json={
    "filepaths": ["document.pdf"],
    "hierarchy": "hospital",
    "collection_name": "my_docs"
})
print(response.json())

# Query
response = requests.post(f"{BASE_URL}/query", json={
    "query": "What are KYC requirements?",
    "pipeline": "both",
    "n_results": 5
})
result = response.json()
print(f"Base-RAG: {result['base_rag']['answer']}")
print(f"Hier-RAG: {result['hier_rag']['answer']}")
print(f"Speedup: {result['speedup']:.2f}x")
```
---

## Evaluation Methodology

### Dataset

We evaluate on three domain-specific query sets:

1. **Hospital Domain (n=5 queries)**
   - Clinical Care, Quality & Safety, Education
   - Example: "What are the patient admission procedures?"
2. **Banking Domain (n=5 queries)**
   - Retail Banking, Risk Management, Compliance
   - Example: "What are the KYC requirements?"
3. **Fluid Simulation Domain (n=5 queries)**
   - Numerical Methods, Physical Models, Applications
   - Example: "How does the SIMPLE algorithm work?"

### Metrics

#### Retrieval Metrics

- **Hit@k**: Whether at least one relevant document appears in the top-k results
  - Formula: `1 if any(relevant_doc in top_k) else 0`
  - Higher is better (max = 1.0)
- **Precision@k**: Proportion of the top-k results that are relevant
  - Formula: `relevant_in_top_k / k`
  - Range: 0.0 to 1.0
- **Recall@k**: Proportion of all relevant documents retrieved in the top-k
  - Formula: `relevant_in_top_k / total_relevant`
  - Range: 0.0 to 1.0
- **MRR (Mean Reciprocal Rank)**: Average over queries of the per-query reciprocal rank
  - Per-query formula: `1 / rank_of_first_relevant_doc` (0 if no relevant document is retrieved)
  - Range: 0.0 to 1.0
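The retrieval formulas above are small enough to state directly in code. This is a reference sketch of the metrics as defined here, not necessarily the evaluation module's own implementation:

```python
def hit_at_k(retrieved, relevant, k):
    """1.0 if any of the top-k retrieved docs is relevant, else 0.0."""
    return 1.0 if any(d in relevant for d in retrieved[:k]) else 0.0

def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved docs that are relevant."""
    return sum(d in relevant for d in retrieved[:k]) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant docs that appear in the top-k."""
    return sum(d in relevant for d in retrieved[:k]) / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant doc; 0.0 if none retrieved.

    MRR is the mean of this value across all queries.
    """
    for rank, d in enumerate(retrieved, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0
```

For example, with `retrieved = ["a", "b", "c"]` and `relevant = {"b", "d"}`, Hit@3 is 1.0, Precision@3 is 1/3, Recall@3 is 0.5, and the reciprocal rank is 0.5.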
#### Performance Metrics

- **Retrieval Time**: Time to fetch relevant documents from the vector DB
- **Generation Time**: Time for the LLM to generate an answer
- **Total Time**: End-to-end query response time
- **Speedup**: Ratio of Base-RAG to Hier-RAG total time
  - Formula: `base_total_time / hier_total_time`
  - A value >1.0 means Hier-RAG is faster

#### Quality Metrics

- **Semantic Similarity**: Cosine similarity between the generated answer and a reference answer
  - Uses sentence-transformers embeddings
  - Range: 0.0 to 1.0
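The similarity itself is just the cosine of the two embedding vectors. In the real pipeline the vectors would come from a sentence-transformers model (e.g. `model.encode(answer)`); the sketch below only shows the comparison step on plain Python lists:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (as sequences of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Identical vectors score 1.0 and orthogonal vectors score 0.0, which is why a generated answer that closely paraphrases the reference lands near the top of the range.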
### Evaluation Process

```python
# Run the evaluation via the Gradio API
from gradio_client import Client

client = Client("http://localhost:7860")
result = client.predict(
    query_dataset="hospital",
    n_queries=10,
    k_values="1,3,5",
    api_name="/evaluate"
)
# Results are saved to ./reports/evaluation_TIMESTAMP.csv
```
### Sample Results

#### Hospital Domain Evaluation (5 queries)

| Query | Expected Domain | Base Time (s) | Hier Time (s) | Speedup | Filter Match |
|-------|----------------|---------------|---------------|---------|--------------|
| Patient admission procedures? | Clinical Care | 1.97 | 2.76 | 0.72x | ✅ Clinical Care |
| Infection control policies? | Quality & Safety | 1.51 | 3.11 | 0.49x | ⚠️ policy only |
| Medication error reporting? | Quality & Safety | 1.03 | 2.41 | 0.43x | ⚠️ report only |
| Training for new nurses? | Education | 10.09 | 5.62 | 1.80x | ❌ None |
| Emergency response procedures? | Clinical Care | 2.32 | 1.49 | 1.56x | ❌ None |

**Average Speedup: 0.96x** (Base-RAG and Hier-RAG roughly equal)

#### Key Findings

1. **When Hier-RAG excels (1.5-2.3x faster):**
   - The query matches the hierarchy taxonomy well
   - Auto-inference correctly identifies the domain
   - The filtered subset is significantly smaller (<30% of the corpus)
   - Example: "Training for new nurses" → 1.80x speedup
2. **When Hier-RAG underperforms (<1.0x):**
   - Auto-inference fails or misclassifies the domain
   - The query is too general or cross-domain
   - Filter overhead exceeds the retrieval time savings
   - Example: "Infection control policies" → 0.49x speedup
3. **Auto-inference accuracy:**
   - Hospital domain: 40% (2/5 queries correctly classified)
   - Needs improvement via LLM-based classification
4. **Retrieval time improvement:**
   - When filters are applied correctly: **30-60% faster retrieval**
   - Overall average: **15% faster retrieval** (including misses)

#### Fluid Simulation Domain Evaluation (5 queries)

| Query | Expected Domain | Base Time (s) | Hier Time (s) | Speedup |
|-------|----------------|---------------|---------------|---------|
| How does the SIMPLE algorithm work? | Numerical Methods | 1.45 | 3.69 | 0.39x |
| What turbulence models are available? | Physical Models | 1.60 | 1.37 | 1.16x |
| Set up a cavity flow benchmark? | Validation | 4.46 | 2.40 | 1.86x |
| Mesh generation techniques? | Numerical Methods | 2.64 | 2.87 | 0.92x |
| Enable parallel computing? | Software & Tools | 5.51 | 2.35 | 2.34x |

**Average Speedup: 1.33x** (Hier-RAG 33% faster on average)
### Visualization

To generate evaluation charts:

```python
# Add to your evaluation workflow
import matplotlib.pyplot as plt
import pandas as pd

def generate_evaluation_charts(csv_path):
    """Generate comprehensive evaluation visualizations."""
    df = pd.read_csv(csv_path)

    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    fig.suptitle('Base-RAG vs Hier-RAG Performance Comparison', fontsize=16)

    # Chart 1: Average Total Time
    times = df[['base_total_time', 'hier_total_time']].mean()
    axes[0, 0].bar(['Base-RAG', 'Hier-RAG'], times, color=['#3498db', '#e74c3c'])
    axes[0, 0].set_ylabel('Time (seconds)')
    axes[0, 0].set_title('Average Total Query Time')
    axes[0, 0].grid(axis='y', alpha=0.3)

    # Chart 2: Speedup Distribution
    axes[0, 1].hist(df['speedup'], bins=10, color='#2ecc71', edgecolor='black')
    axes[0, 1].axvline(1.0, color='red', linestyle='--', label='No improvement')
    axes[0, 1].set_xlabel('Speedup Factor')
    axes[0, 1].set_ylabel('Frequency')
    axes[0, 1].set_title('Speedup Distribution')
    axes[0, 1].legend()

    # Chart 3: Retrieval Time Comparison
    axes[1, 0].scatter(df['base_retrieval_time'], df['hier_retrieval_time'],
                       s=100, alpha=0.6, color='#9b59b6')
    max_val = max(df['base_retrieval_time'].max(), df['hier_retrieval_time'].max())
    axes[1, 0].plot([0, max_val], [0, max_val], 'r--', label='Equal performance')
    axes[1, 0].set_xlabel('Base-RAG Retrieval Time (s)')
    axes[1, 0].set_ylabel('Hier-RAG Retrieval Time (s)')
    axes[1, 0].set_title('Retrieval Time Comparison')
    axes[1, 0].legend()
    axes[1, 0].grid(alpha=0.3)

    # Chart 4: Query-wise Speedup
    axes[1, 1].barh(range(len(df)), df['speedup'], color='#f39c12')
    axes[1, 1].axvline(1.0, color='red', linestyle='--', linewidth=2)
    axes[1, 1].set_xlabel('Speedup Factor')
    axes[1, 1].set_ylabel('Query Index')
    axes[1, 1].set_title('Per-Query Speedup')
    axes[1, 1].grid(axis='x', alpha=0.3)

    plt.tight_layout()
    chart_path = csv_path.replace('.csv', '_charts.png')
    plt.savefig(chart_path, dpi=300, bbox_inches='tight')
    print(f"Charts saved to: {chart_path}")

# Usage
generate_evaluation_charts('./reports/evaluation_20251030_012814.csv')
```
---

## Using the API with gradio_client

### Installation

```bash
pip install gradio_client
```

### Basic Usage

```python
from gradio_client import Client

# Connect to a local instance
client = Client("http://localhost:7860")

# Or connect to the deployed HF Space
client = Client("https://huggingface.co/spaces/AP-UW/hierarchical-rag-eval")
```

### Complete Workflow Example

```python
from gradio_client import Client

# Initialize the client
client = Client("http://localhost:7860")

# Step 1: Initialize the system
print("Step 1: Initializing system...")
result = client.predict(api_name="/initialize")
print(result)

# Step 2: Upload and validate documents
print("\nStep 2: Validating documents...")
status, preview, stats = client.predict(
    files=["./docs/hospital_policy.pdf", "./docs/procedures.txt"],
    hierarchy_choice="hospital",
    mask_pii=False,
    api_name="/upload"
)
print(f"Status: {status}")
print(f"Stats: {stats}")

# Step 3: Build the RAG index
print("\nStep 3: Building RAG index...")
build_status, build_stats = client.predict(
    files=["./docs/hospital_policy.pdf", "./docs/procedures.txt"],
    hierarchy="hospital",
    chunk_size=512,
    chunk_overlap=50,
    mask_pii=False,
    collection_name="hospital_docs",
    api_name="/build"
)
print(f"Build Status: {build_status}")
print(f"Indexed Chunks: {build_stats.get('Total Chunks', 0)}")

# Step 4: Search with both pipelines
print("\nStep 4: Querying RAG system...")
answer, contexts, metadata = client.predict(
    query="What are the patient admission procedures?",
    pipeline="Both",
    n_results=5,
    level1="",
    level2="",
    level3="",
    doc_type="",
    auto_infer=True,
    api_name="/search"
)
print(f"Answer:\n{answer}\n")
print(f"Metadata:\n{metadata}")

# Step 5: Run evaluation
print("\nStep 5: Running evaluation...")
summary, csv_path, json_path = client.predict(
    query_dataset="hospital",
    n_queries=5,
    k_values="1,3,5",
    api_name="/evaluate"
)
print(summary)
print(f"\nResults saved to:\n- {csv_path}\n- {json_path}")
```
### Batch Processing Example

```python
from gradio_client import Client
import pandas as pd

client = Client("http://localhost:7860")

# Initialize
client.predict(api_name="/initialize")

# Build an index for each of several document sets
document_sets = {
    "hospital_policies": ["./docs/policy1.pdf", "./docs/policy2.pdf"],
    "clinical_protocols": ["./docs/protocol1.txt", "./docs/protocol2.txt"],
    "training_manuals": ["./docs/manual1.pdf", "./docs/manual2.pdf"]
}

for collection_name, files in document_sets.items():
    print(f"Building index for: {collection_name}")
    status, stats = client.predict(
        files=files,
        hierarchy="hospital",
        collection_name=collection_name,
        api_name="/build"
    )
    print(f"✅ {stats.get('Total Chunks', 0)} chunks indexed")

# Run multiple queries
queries = [
    "What are admission procedures?",
    "How to handle medication errors?",
    "What training is required for nurses?"
]

results = []
for query in queries:
    answer, contexts, metadata = client.predict(
        query=query,
        pipeline="Both",
        api_name="/search"
    )
    results.append({
        "query": query,
        "answer": answer[:200],  # First 200 chars
        "metadata": metadata
    })

# Save the results
df = pd.DataFrame(results)
df.to_csv("batch_query_results.csv", index=False)
```
---

## Troubleshooting

### Common Issues

#### 1. OpenAI API Errors

**Problem:** `Error generating answer: Incorrect API key provided`

**Solution:**

```bash
# Check whether the API key is set
echo $OPENAI_API_KEY   # Mac/Linux
echo %OPENAI_API_KEY%  # Windows

# If empty, add it to the .env file
OPENAI_API_KEY=your-key-here

# For HF Spaces, add it to Repository Secrets instead
```

#### 2. ChromaDB Persistence Issues

**Problem:** `sqlite3.OperationalError: database is locked`

**Solution:**

```python
# In core/index.py, use the simpler client initialization
self.client = chromadb.PersistentClient(path=persist_directory)

# Or use EphemeralClient for testing (no persistence)
self.client = chromadb.EphemeralClient()
```

#### 3. Memory Errors with Large PDFs

**Problem:** `MemoryError` or `Killed` when processing large documents

**Solution:**

```python
# Reduce the batch size in core/index.py
def add_documents(self, chunks, batch_size=50):  # Reduced from 100
    # Process in smaller batches
```

#### 4. Slow Embedding Generation

**Problem:** Embedding generation takes >30 seconds

**Solution:**

```bash
# Use a smaller embedding model in .env
EMBEDDING_MODEL=all-MiniLM-L6-v2  # Faster, 384 dimensions

# Or use OpenAI embeddings
EMBEDDING_MODEL=openai:text-embedding-3-small
```

#### 5. Gradio API Connection Timeout

**Problem:** `gradio_client` times out when connecting

**Solution:**

```python
from gradio_client import Client

# Increase the timeout
client = Client("http://localhost:7860", timeout=120)

# Or check whether the server is running
import requests
response = requests.get("http://localhost:7860")
print(response.status_code)  # Should be 200
```

#### 6. HF Spaces Build Failure

**Problem:** Space shows "Build Failed" status

**Solution:**

1. Check `requirements.txt` for incompatible versions
2. View the build logs in the Space → Logs tab
3. Common fix: pin exact versions

```txt
# requirements.txt
torch==2.1.0          # Pin specific versions
transformers==4.35.0
gradio==4.44.0
```

#### 7. Inconsistent Evaluation Results

**Problem:** Speedup values are sometimes <1.0 or highly variable

**Solution:**

- Run the evaluation multiple times and average the results
- Add warmup queries before the evaluation
- Check that auto-inference is working correctly

```python
# Add warmup queries
for _ in range(3):
    rag_comparator.compare("warmup query", n_results=5)
# Then run the actual evaluation
```
### Debug Mode

Enable verbose logging:

```python
# Add to app.py
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('app.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)
logger.debug("Debug mode enabled")
```

### Health Check Endpoints

Test the system components:

```python
# Add to app.py for debugging
def system_health_check():
    """Check whether all components are working."""
    checks = {}

    # Check 1: OpenAI API
    try:
        import openai
        client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        client.models.list()
        checks["openai_api"] = "✅ Connected"
    except Exception as e:
        checks["openai_api"] = f"❌ {e}"

    # Check 2: Vector DB
    try:
        if index_manager:
            stats = index_manager.stores.get("rag_documents")
            checks["vector_db"] = "✅ Initialized"
        else:
            checks["vector_db"] = "⚠️ Not initialized"
    except Exception as e:
        checks["vector_db"] = f"❌ {e}"

    # Check 3: Embedding model
    try:
        from core.index import EmbeddingModel
        model = EmbeddingModel()
        test_embedding = model.embed_query("test")
        checks["embedding_model"] = f"✅ Loaded ({len(test_embedding)} dims)"
    except Exception as e:
        checks["embedding_model"] = f"❌ {e}"

    return checks

# Add a button to the UI
with gr.Tab("System Health"):
    health_btn = gr.Button("Check System Health")
    health_output = gr.JSON(label="Health Status")
    health_btn.click(system_health_check, outputs=health_output)
```
---

## Additional Resources

### Documentation

- [Gradio Documentation](https://gradio.app/docs/)
- [Gradio Client Guide](https://gradio.app/guides/getting-started-with-the-python-client/)
- [ChromaDB Documentation](https://docs.trychroma.com/)
- [OpenAI API Reference](https://platform.openai.com/docs/api-reference)
- [Sentence Transformers](https://www.sbert.net/)

### Tutorials

- [Building RAG Applications](https://python.langchain.com/docs/use_cases/question_answering/)
- [Deploying to HF Spaces](https://huggingface.co/docs/hub/spaces-overview)
- [Vector Database Best Practices](https://www.pinecone.io/learn/vector-database/)

### Community

- GitHub Issues: [repository-url]/issues
- Hugging Face Forums: https://discuss.huggingface.co/
- Discord: [Your project Discord]

---

## License

MIT License - see the LICENSE file for details.

---

## Acknowledgments

- Built with [Gradio](https://gradio.app/)
- Vector database: [ChromaDB](https://www.trychroma.com/)
- Embeddings: [Sentence Transformers](https://www.sbert.net/)
- LLM: [OpenAI](https://openai.com/)

---

## Support

For issues and questions:

- **GitHub Issues**: [repository-url]/issues
- **Email**: support@your-domain.com
- **Documentation**: [repository-url]/wiki

---

## Changelog

### v1.0.0 (2025-01-31)

- ✅ Initial release
- ✅ Base-RAG and Hier-RAG implementation
- ✅ Three preset hierarchies (Hospital, Banking, Fluid Simulation)
- ✅ Gradio UI and MCP server
- ✅ Comprehensive evaluation suite
- ✅ Full test coverage
- ✅ Ready for HF Spaces deployment

---

**Built with ❤️ for the RAG community**