# RAG Capstone Project

A comprehensive Retrieval-Augmented Generation (RAG) system with TRACE evaluation metrics for medical/clinical domains.

## Features

- **Multiple RAG Bench Datasets**: HotpotQA, 2WikiMultihopQA, MuSiQue, Natural Questions, TriviaQA
- **Chunking Strategies**: Dense, Sparse, Hybrid, Re-ranking
- **Medical Embedding Models**:
  - sentence-transformers/embeddinggemma-300m-medical
  - emilyalsentzer/Bio_ClinicalBERT
  - Simonlee711/Clinical_ModernBERT
- **ChromaDB Vector Storage**: Persistent vector storage with efficient retrieval
- **Groq LLM Integration**: With rate limiting (30 RPM)
  - meta-llama/llama-4-maverick-17b-128e-instruct
  - llama-3.1-8b-instant
  - openai/gpt-oss-120b
- **TRACE Evaluation Metrics**:
  - **U**tilization: How well the system uses retrieved documents
  - **R**elevance: Relevance of retrieved documents to the query
  - **A**dherence: How well the response adheres to the retrieved context
  - **C**ompleteness: How complete the response is
- **Chat Interface**: Streamlit-based interactive chat with history
- **REST API**: FastAPI backend for integration
## Installation

### Prerequisites

- Python 3.8+
- pip
- Groq API key

### Setup

1. Clone the repository:
```bash
git clone <repository-url>
cd "RAG Capstone Project"
```
2. Create a virtual environment:
```bash
python -m venv venv
```
3. Activate the virtual environment:

**Windows:**
```bash
.\venv\Scripts\activate
```
**Linux/Mac:**
```bash
source venv/bin/activate
```
4. Install dependencies:
```bash
pip install -r requirements.txt
```
5. Create a `.env` file from the example:

**Windows:**
```bash
copy .env.example .env
```
**Linux/Mac:**
```bash
cp .env.example .env
```
6. Edit `.env` and add your Groq API key:
```
GROQ_API_KEY=your_groq_api_key_here
```
## Usage

### Streamlit Application

Run the interactive Streamlit interface:
```bash
streamlit run streamlit_app.py
```
Then open your browser to `http://localhost:8501`.

**Workflow:**
1. Enter your Groq API key in the sidebar
2. Select a dataset from RAG Bench
3. Choose a chunking strategy
4. Select an embedding model
5. Choose an LLM model
6. Click "Load Data & Create Collection"
7. Start chatting!
8. View retrieved documents
9. Run TRACE evaluation
10. Export chat history
### FastAPI Backend

Run the REST API server:
```bash
python api.py
```
Or with uvicorn:
```bash
uvicorn api:app --reload --host 0.0.0.0 --port 8000
```
API documentation is available at `http://localhost:8000/docs`.

#### API Endpoints

- `GET /` - Root endpoint
- `GET /health` - Health check
- `GET /datasets` - List available datasets
- `GET /models/embedding` - List embedding models
- `GET /models/llm` - List LLM models
- `GET /chunking-strategies` - List chunking strategies
- `GET /collections` - List all collections
- `GET /collections/{name}` - Get collection info
- `POST /load-dataset` - Load dataset and create collection
- `POST /query` - Query the RAG system
- `GET /chat-history` - Get chat history
- `DELETE /chat-history` - Clear chat history
- `POST /evaluate` - Run TRACE evaluation
- `DELETE /collections/{name}` - Delete collection
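As an illustration, a `/query` call from a Python client might look like the sketch below. The request field names (`question`, `top_k`) are assumptions; check the interactive docs at `/docs` for the actual schema defined in `api.py`.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical request body; field names are assumptions, not the confirmed schema.
payload = {"question": "What is the capital of France?", "top_k": 5}

def query_rag(base_url="http://localhost:8000"):
    """POST the payload to the /query endpoint and return the parsed JSON body."""
    req = Request(
        f"{base_url}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```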
### Python API

Use the components programmatically:
```python
from config import settings
from dataset_loader import RAGBenchLoader
from vector_store import ChromaDBManager
from llm_client import GroqLLMClient, RAGPipeline
from trace_evaluator import TRACEEvaluator

# Load dataset
loader = RAGBenchLoader()
dataset = loader.load_dataset("hotpotqa", max_samples=100)

# Create vector store
vector_store = ChromaDBManager()
vector_store.load_dataset_into_collection(
    collection_name="my_collection",
    embedding_model_name="emilyalsentzer/Bio_ClinicalBERT",
    chunking_strategy="hybrid",
    dataset_data=dataset
)

# Initialize LLM
llm = GroqLLMClient(
    api_key="your_api_key",
    model_name="llama-3.1-8b-instant"
)

# Create RAG pipeline
rag = RAGPipeline(llm, vector_store)

# Query
result = rag.query("What is the capital of France?")
print(result["response"])

# Evaluate
evaluator = TRACEEvaluator()
test_cases = [...]  # Your test cases
results = evaluator.evaluate_batch(test_cases)
print(results)
```
## Project Structure

```
RAG Capstone Project/
├── __init__.py              # Package initialization
├── config.py                # Configuration management
├── dataset_loader.py        # RAG Bench dataset loader
├── chunking_strategies.py   # Document chunking strategies
├── embedding_models.py      # Embedding model implementations
├── vector_store.py          # ChromaDB integration
├── llm_client.py            # Groq LLM client with rate limiting
├── trace_evaluator.py       # TRACE evaluation metrics
├── streamlit_app.py         # Streamlit chat interface
├── api.py                   # FastAPI REST API
├── requirements.txt         # Python dependencies
├── .env.example             # Environment variables template
├── .gitignore               # Git ignore file
└── README.md                # This file
```
## TRACE Metrics Explained

### Utilization (U)

Measures how well the system uses the retrieved documents in generating the response. Higher scores indicate that the system effectively incorporates information from multiple retrieved documents.

### Relevance (R)

Evaluates the relevance of retrieved documents to the user's query. Uses lexical overlap and keyword matching to determine whether the right documents were retrieved.

### Adherence (A)

Assesses how well the generated response adheres to the retrieved context. Ensures the response is grounded in the provided documents rather than hallucinated.

### Completeness (C)

Evaluates how complete the response is in answering the query. Considers response length, question type, and comparison with ground truth if available.
## Deployment Options

### Heroku

1. Create `Procfile`:
```
web: streamlit run streamlit_app.py --server.port=$PORT --server.address=0.0.0.0
api: uvicorn api:app --host=0.0.0.0 --port=$PORT
```
2. Deploy:
```bash
heroku create your-app-name
git push heroku main
```

### Docker

Create `Dockerfile`:
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8501 8000
CMD ["streamlit", "run", "streamlit_app.py"]
```
Build and run:
```bash
docker build -t rag-capstone .
docker run -p 8501:8501 -p 8000:8000 rag-capstone
```

### Cloud Run / AWS / Azure

The application can be deployed to any cloud platform that supports Python applications. See the respective platform documentation for deployment instructions.
## Configuration

Edit `config.py` or set environment variables in `.env`:
```env
GROQ_API_KEY=your_api_key
CHROMA_PERSIST_DIRECTORY=./chroma_db
GROQ_RPM_LIMIT=30
RATE_LIMIT_DELAY=2.0
LOG_LEVEL=INFO
```
## Rate Limiting

The application implements rate limiting for Groq API calls:
- Maximum 30 requests per minute (configurable)
- Automatic delay of 2 seconds between requests
- Smart waiting when the rate limit is reached
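The limiter in `llm_client.py` is the authoritative implementation; a minimal sketch of the sliding-window approach described above (class and method names here are illustrative, not the project's API):

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most `rpm` calls per rolling 60-second window, plus a fixed delay."""

    def __init__(self, rpm: int = 30, delay: float = 2.0):
        self.rpm = rpm
        self.delay = delay
        self.calls = deque()  # timestamps of recent calls

    def wait(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have left the 60-second window.
        while self.calls and now - self.calls[0] > 60:
            self.calls.popleft()
        if len(self.calls) >= self.rpm:
            # Sleep until the oldest call exits the window.
            time.sleep(60 - (now - self.calls[0]))
        self.calls.append(time.monotonic())
        time.sleep(self.delay)  # fixed inter-request delay
```

Call `limiter.wait()` before each API request; with the defaults this enforces both the 30 RPM cap and the 2-second spacing.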
## Troubleshooting

### ChromaDB Issues

If you encounter ChromaDB errors, try deleting the `chroma_db` directory and recreating collections.

### Embedding Model Loading

Medical embedding models may require significant memory. If you encounter out-of-memory errors, try:
- Using a smaller model
- Reducing batch size
- Using CPU instead of GPU

### API Key Errors

Ensure your Groq API key is correctly set in the `.env` file or passed to the application.

## License

MIT License

## Contributors

RAG Capstone Team

## Support

For issues and questions, please open an issue on the GitHub repository.