Cheh Kit Hong
---
title: RAG Agent
emoji: 🕵🏻‍♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 6.0.1
app_file: main.py
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480
---
# 📁 Project Structure
```
mai-rag-agent/
│
├── 📂 agent/                    # Core agent logic
│   ├── graph.py                 # LangGraph workflow definition
│   ├── nodes.py                 # Agent nodes (router, vectordb, web_search, generate)
│   ├── prompts.py               # System prompts and templates
│   ├── state.py                 # Agent state management (AgentState, RAG_method)
│   └── tools.py                 # Tool definitions (Tavily, Wikipedia, ArXiv, ChromaDB)
│
├── 📂 core/                     # Business logic layer
│   ├── llm.py                   # LLM initialization (Anthropic Claude)
│   └── rag_agent.py             # Main RAGAgent class with graph orchestration
│
├── 📂 ui/                       # User interface
│   └── gradio_components.py     # Gradio web interface components
│
├── 📂 knowledge_base/           # Scripts for setting up Chroma
│
├── 📂 chroma_data/              # Artifacts for Chroma
│
├── 📂 docs/                     # Source documents (PDFs, text files)
│
├── 📄 main.py                   # Application entry point
├── 📄 config.py                 # Configuration settings
├── 📄 test_scripts.py           # Agent testing script
│
├── 📄 .env                      # Environment variables (API keys)
├── 📄 .gitignore                # Git ignore rules
│
├── 📄 requirements.txt          # Python dependencies
├── 📄 pyproject.toml            # Project metadata (if using uv)
│
└── 📄 README.md                 # Project documentation (this file)
```
## 📋 Key Components
### 🤖 Agent Module (`agent/`)
- **`graph.py`**: Defines the LangGraph workflow with conditional routing
- **`nodes.py`**: Implements agent nodes:
  - `router_node`: Classifies queries (RAG/WEBSEARCH/GENERAL)
  - `vectordb_node`: Retrieves from local ChromaDB
  - `web_search_agent_node`: Executes web searches
  - `generate_node`: Generates final responses
- **`state.py`**: Defines `AgentState` with message history, routing method, and context
- **`tools.py`**: Tool implementations for Tavily, Wikipedia, ArXiv, and ChromaDB
- **`prompts.py`**: System prompts for routing and generation
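
As a rough illustration of what the state definition and the handling of the router's output could look like, here is a dependency-free sketch. The field names (`messages`, `rag_method`, `context`), the class name `RAGMethod`, and the helper `parse_route` are assumptions made for illustration and are not taken from the project's code; only the three route labels come from the description above.

```python
from enum import Enum
from typing import List, TypedDict


class RAGMethod(str, Enum):
    """The three routes named in this README (class name is hypothetical)."""
    RAG = "RAG"
    WEBSEARCH = "WEBSEARCH"
    GENERAL = "GENERAL"


class AgentState(TypedDict):
    """Assumed shape of the shared state: history, chosen route, retrieved context."""
    messages: List[str]
    rag_method: RAGMethod
    context: str


def parse_route(label: str) -> RAGMethod:
    """Map raw router output onto a route, falling back to GENERAL if unrecognized."""
    cleaned = label.strip().upper()
    try:
        return RAGMethod(cleaned)
    except ValueError:
        return RAGMethod.GENERAL


# Example: build an initial state from a user query and a router label.
state: AgentState = {
    "messages": ["What does the SAM 3 paper propose?"],
    "rag_method": parse_route(" rag\n"),
    "context": "",
}
```

Falling back to `GENERAL` on an unrecognized label is one possible design choice; it keeps the graph from failing when the router model returns something unexpected.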
### 🎯 Core Module (`core/`)
- **`llm.py`**: Initializes the LLM (Anthropic Claude Sonnet 4.5)
- **`rag_agent.py`**: Main `RAGAgent` class that orchestrates the graph
### πŸ–₯️ UI Module (`ui/`)
- **`gradio_components.py`**: Gradio web interface with chat functionality
### πŸ“Š Data Module (`data/`)
- **`documents/`**: Raw source documents for ingestion
- **`chroma_db/`**: Persisted vector embeddings
### βš™οΈ Configuration
- **`config.py`**: Centralized configuration (model names, paths, API settings)
- **`.env`**: API keys (ANTHROPIC_API_KEY, TAVILY_API_KEY)
### πŸš€ Entry Points
- **`main.py`**: Launches the Gradio UI
- **`test_scripts.py`**: Runs agent tests
## πŸ”„ Data Flow
```
User Query
    ↓
[Router Node] → Classifies intent (RAG/WEBSEARCH/GENERAL)
    ↓
    ├─→ [VectorDB Node] → Retrieves from ChromaDB → [Generate Node]
    ├─→ [Web Search Agent] → Calls Tavily/Wikipedia → [Generate Node]
    └─→ [Generate Node] → Uses LLM knowledge only
    ↓
Response to User
```
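
The branching in the diagram above can be sketched in plain Python. This is only a simplified stand-in: in the project the same flow is expressed as a LangGraph workflow with conditional routing in `agent/graph.py`, and the node bodies here are hypothetical stubs that return placeholder strings instead of calling ChromaDB, a search tool, or the LLM. The `RETRIEVERS` table and `run_flow` function are names invented for this example.

```python
from typing import Callable, Dict


def vectordb_node(query: str) -> str:
    """Stub for retrieval from the local ChromaDB collection."""
    return f"[chroma] passages for: {query}"


def web_search_agent_node(query: str) -> str:
    """Stub for the web search agent (Tavily/Wikipedia in the real project)."""
    return f"[web] results for: {query}"


def generate_node(query: str, context: str) -> str:
    """Stub for final generation; with no context it relies on the LLM alone."""
    source = context if context else "LLM knowledge only"
    return f"Answer to {query!r} using {source}"


# Routes that fetch context before generation; GENERAL is deliberately absent.
RETRIEVERS: Dict[str, Callable[[str], str]] = {
    "RAG": vectordb_node,
    "WEBSEARCH": web_search_agent_node,
}


def run_flow(query: str, route: str) -> str:
    """Follow one branch of the diagram: optional retrieval, then generation."""
    retriever = RETRIEVERS.get(route)
    context = retriever(query) if retriever else ""
    return generate_node(query, context)


if __name__ == "__main__":
    for route in ("RAG", "WEBSEARCH", "GENERAL"):
        print(route, "->", run_flow("What is SAM 3?", route))
```

Every branch ends in `generate_node`, which mirrors the diagram: retrieval only changes the context handed to generation.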
## 🛠️ Technology Stack
- **LangChain**: Framework for LLM applications
- **LangGraph**: Workflow orchestration
- **Anthropic Claude**: LLM (Sonnet 4.5)
- **ChromaDB**: Vector database
- **Gradio**: Web UI framework
- **HuggingFace**: Embeddings model
- **Tavily**: Web search API
- **UV**: Python package manager
## 🚀 Quick Start with UV
### Prerequisites
- Python 3.10+
- UV package manager ([Install UV](https://github.com/astral-sh/uv))
- API Keys: Anthropic, Tavily
### 1️⃣ Clone the Repository
```bash
git clone https://github.com/yourusername/mai-rag-agent.git
cd mai-rag-agent
```
### 2️⃣ Create Virtual Environment with UV
```bash
# Create a new virtual environment
uv venv
# Activate the environment
source .venv/bin/activate # Linux/macOS
# or
.venv\Scripts\activate # Windows
```
### 3️⃣ Install Dependencies
```bash
# Install all dependencies from requirements.txt
uv pip install -r requirements.txt
# Or install directly from pyproject.toml (if available)
uv pip install -e .
```
### 4️⃣ Set Up Environment Variables
```bash
# Copy example environment file
cp .env.example .env
# Edit .env and add your API keys
nano .env # or use your preferred editor
```
**Required environment variables:**
```bash
ANTHROPIC_API_KEY=xxxxxxxxxxxxx # Anthropic Claude API key
TAVILY_API_KEY=tvly-xxxxxxxxxxxxx # Enable web search
```
### 5️⃣ Prepare Data
```bash
# Create necessary directories
mkdir -p data/documents data/chroma_db
# Add your documents to data/documents/
# Then run ingestion (if you have an ingestion script)
# python ingest_data.py
```
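
This README does not describe the ingestion script itself, so the following is only a sketch of one common preprocessing step: splitting each document into overlapping chunks before embedding. The function names, the chunk size of 500 characters, the overlap of 50, and the restriction to `.txt` files are all assumptions for illustration. Writing the chunks into ChromaDB is left out.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def iter_chunks(root: str = "data/documents") -> Iterator[Tuple[str, str]]:
    """Yield (file name, chunk) pairs for every .txt file under `root`."""
    for path in sorted(Path(root).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for chunk in chunk_text(text):
            yield path.name, chunk


if __name__ == "__main__":
    for name, chunk in iter_chunks():
        print(name, len(chunk))
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side, at the cost of storing some text twice.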
### 6️⃣ Run the Application
```bash
# Launch the Gradio UI
python main.py
```
### 7️⃣ Run Tests (Optional)
```bash
# Test the agent functionality
python test_scripts.py
```
---
## 🐳 Quick Start with Dev Container (Alternative)
If you're using VS Code with Dev Containers:
```bash
# 1. Open in VS Code
code .
# 2. Reopen in Container
# Command Palette (Ctrl+Shift+P) → "Dev Containers: Reopen in Container"
# 3. Inside container, install dependencies
uv pip install -r requirements.txt
# 4. Set up .env file
cp .env.example .env
# Edit .env with your API keys
# 5. Run the app
python main.py
```
---
## 📦 UV-Specific Commands
```bash
# Update all dependencies
uv pip install --upgrade -r requirements.txt
# List installed packages
uv pip list
# Freeze current environment
uv pip freeze > requirements.txt
# Install a new package
uv pip install package-name
# Uninstall a package
uv pip uninstall package-name
# Sync environment (removes unused packages)
uv pip sync requirements.txt
```
---
## 🔧 Troubleshooting
### Issue: `uv` command not found
```bash
# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
# Add to PATH (if needed); recent installers use ~/.local/bin, older ones used ~/.cargo/bin
export PATH="$HOME/.local/bin:$PATH"
```
### Issue: API key not loading
```bash
# Check that .env exists and list the key names it defines (values stay hidden)
grep -i 'api_key' .env | cut -d= -f1
# Ensure no typos in variable names
# Should be: ANTHROPIC_API_KEY and TAVILY_API_KEY
```
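
The same check can be done from Python at startup. This sketch only inspects the process environment, so it assumes the `.env` file has already been loaded by whatever mechanism the project uses; `REQUIRED_KEYS` and `missing_keys` are hypothetical names, not part of the project.

```python
import os
from typing import List, Mapping

# Variable names as listed in this README.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "TAVILY_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```

Reporting only the variable names avoids printing the secret values to the terminal or to logs.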
### Issue: ChromaDB not found
```bash
# Ensure data directories exist
mkdir -p data/chroma_db
# Check permissions
chmod -R 755 data/
```
### Issue: Port 7860 (Gradio's default) already in use
```bash
# Find and kill the process
lsof -ti:7860 | xargs kill -9
# Or use a different port in main.py
# demo.launch(server_port=7861)
```
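
Instead of hard-coding a fallback port, the app could probe for a free one before launching. This is a sketch using only the standard library; `first_free_port` is a hypothetical helper, and the scan range of 20 ports is an arbitrary choice.

```python
import socket


def first_free_port(start: int = 7860, attempts: int = 20, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, start + attempts) that can be bound on `host`."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken or not permitted, try the next one
            return port
    raise RuntimeError(f"no free port found in {start}-{start + attempts - 1}")


if __name__ == "__main__":
    print("Free port:", first_free_port())
```

The result could then be passed to the launch call shown above, for example `demo.launch(server_port=first_free_port())`. Another process can still claim the port between the probe and the launch, so this narrows the problem without eliminating it.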
---
## 🎯 Next Steps
1. βœ… Add your documents to `data/documents/`
2. βœ… Configure embeddings model in `config.py`
3. βœ… Customize prompts in `agent/prompts.py`
4. βœ… Test with sample queries in the Gradio UI
5. βœ… Deploy to production (see deployment docs)
---
## 📚 Additional Resources
- [UV Documentation](https://github.com/astral-sh/uv)
- [LangGraph Docs](https://langchain-ai.github.io/langgraph/)
- [Anthropic API](https://docs.anthropic.com/)
- [Tavily API](https://docs.tavily.com/)
- [ChromaDB Docs](https://docs.trychroma.com/)
## 📚 References (used as demo documents in the vector store)
1. Wei, H., Sun, Y., & Li, Y. (2025). DeepSeek-OCR: Contexts Optical Compression. arXiv preprint arXiv:2510.18234.
2. Chen, X., Chu, F. J., Gleize, P., Liang, K. J., Sax, A., Tang, H., ... & SAM 3D Team. (2025). SAM 3D: 3Dfy Anything in Images. arXiv preprint arXiv:2511.16624.
3. Carion, N., Gustafson, L., Hu, Y. T., Debnath, S., Hu, R., Suris, D., ... & Feichtenhofer, C. (2025). SAM 3: Segment Anything with Concepts. arXiv preprint arXiv:2511.16719.
4. Yan, B. Y., Li, C., Qian, H., Lu, S., & Liu, Z. (2025). General Agentic Memory Via Deep Research. arXiv preprint arXiv:2511.18423.
5. Zhang, S., Fan, J., Fan, M., Li, G., & Du, X. (2025). DeepAnalyze: Agentic Large Language Models for Autonomous Data Science. arXiv preprint arXiv:2510.16872.