Cheh Kit Hong committed on
Commit 04d4d26 · 1 Parent(s): c92c6b2

cleaning unnecessary files
README.md CHANGED
@@ -1,20 +1,288 @@
- rag_agent/
- ├── app.py                 # Main Gradio application entry point
- ├── config.py              # Configuration hub (models, chunk sizes, providers)
- ├── util.py                # PDF to markdown conversion
- ├── core/                  # Core RAG components orchestration
- │   ├── chat_interface.py
- │   ├── document_manager.py
- │   └── rag_system.py
- ├── knowledge_base/        # for create chromadb
- ├── chroma_data/           # chroma vectorstore data
- ├── agent_logic/           # LangGraph agent workflow
- │   ├── edges.py           # Conditional routing logic
- │   ├── graph.py           # Graph construction and compilation
- │   ├── graph_state.py     # State definitions
- │   ├── nodes.py           # Processing nodes (summarize, rewrite, agent)
- │   ├── prompts.py         # System prompts
- │   ├── schemas.py         # Pydantic data models
- │   └── tools.py           # Retrieval tools
- └── ui/                    # User interface
-     └── gradio_app.py      # Gradio interface components
+ ---
+ title: RAG Agent
+ emoji: 🕵🏻‍♂️
+ colorFrom: indigo
+ colorTo: indigo
+ sdk: gradio
+ sdk_version: 6.0.1
+ app_file: main.py
+ pinned: false
+ hf_oauth: true
+ hf_oauth_expiration_minutes: 480
+ ---
+
+ # 📁 Project Structure
+
+ ```
+ mai-rag-agent/
+ │
+ ├── 📂 agent/                  # Core agent logic
+ │   ├── graph.py               # LangGraph workflow definition
+ │   ├── nodes.py               # Agent nodes (router, vectordb, web_search, generate)
+ │   ├── prompts.py             # System prompts and templates
+ │   ├── state.py               # Agent state management (AgentState, RAG_method)
+ │   └── tools.py               # Tool definitions (Tavily, Wikipedia, ArXiv, ChromaDB)
+ │
+ ├── 📂 core/                   # Business logic layer
+ │   ├── llm.py                 # LLM initialization (Anthropic Claude)
+ │   └── rag_agent.py           # Main RAGAgent class with graph orchestration
+ │
+ ├── 📂 ui/                     # User interface
+ │   └── gradio_components.py   # Gradio web interface components
+ │
+ ├── 📂 knowledge_base/         # Scripts for setting up Chroma
+ │
+ ├── 📂 chroma_data/            # Artifacts for Chroma
+ │
+ ├── 📂 docs/                   # Source documents (PDFs, text files)
+ │
+ ├── 📄 main.py                 # Application entry point
+ ├── 📄 config.py               # Configuration settings
+ ├── 📄 test_scripts.py         # Agent testing script
+ │
+ ├── 📄 .env                    # Environment variables (API keys)
+ ├── 📄 .gitignore              # Git ignore rules
+ │
+ ├── 📄 requirements.txt        # Python dependencies
+ ├── 📄 pyproject.toml          # Project metadata (if using uv)
+ │
+ └── 📄 README.md               # Project documentation (this file)
+ ```
+
+ ## 📋 Key Components
+
+ ### 🤖 Agent Module (`agent/`)
+ - **`graph.py`**: Defines the LangGraph workflow with conditional routing
+ - **`nodes.py`**: Implements agent nodes:
+   - `router_node`: Classifies queries (RAG/WEBSEARCH/GENERAL)
+   - `vectordb_node`: Retrieves from local ChromaDB
+   - `web_search_agent_node`: Executes web searches
+   - `generate_node`: Generates final responses
+ - **`state.py`**: Defines `AgentState` with message history, routing method, and context
+ - **`tools.py`**: Tool implementations for Tavily, Wikipedia, ArXiv, and ChromaDB
+ - **`prompts.py`**: System prompts for routing and generation
+
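The classification step in `router_node` can be sketched in plain Python. This is an illustrative stand-in only: the real node asks the LLM to pick a route, while here a keyword heuristic stands in so the control flow is visible, and the function name is an assumption.

```python
# Sketch of the router_node classification step (heuristic stand-in
# for the LLM call; labels match the graph's three routes).

def route_query(query: str) -> str:
    """Return one of the three route labels used by the graph."""
    q = query.lower()
    if any(k in q for k in ("knowledge base", "in the document", "according to the docs")):
        return "RAG"
    if any(k in q for k in ("latest", "today", "news", "current")):
        return "WEBSEARCH"
    return "GENERAL"

print(route_query("What is the latest news on SAM3?"))    # WEBSEARCH
print(route_query("Summarize the knowledge base entry"))  # RAG
print(route_query("Explain attention in transformers"))   # GENERAL
```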
+ ### 🎯 Core Module (`core/`)
+ - **`llm.py`**: Initializes the LLM (Anthropic Claude Sonnet 4.5)
+ - **`rag_agent.py`**: Main `RAGAgent` class that orchestrates the graph
+
+ ### 🖥️ UI Module (`ui/`)
+ - **`gradio_components.py`**: Gradio web interface with chat functionality
+
+ ### 📊 Data Module (`data/`)
+ - **`documents/`**: Raw source documents for ingestion
+ - **`chroma_db/`**: Persisted vector embeddings
+
+ ### ⚙️ Configuration
+ - **`config.py`**: Centralized configuration (model names, paths, API settings)
+ - **`.env`**: API keys (ANTHROPIC_API_KEY, TAVILY_API_KEY)
+
+ ### 🚀 Entry Points
+ - **`main.py`**: Launches the Gradio UI
+ - **`test_scripts.py`**: Runs agent tests
+
+ ## 🔄 Data Flow
+
+ ```
+ User Query
+     ↓
+ [Router Node] → Classifies intent (RAG/WEBSEARCH/GENERAL)
+     ↓
+     ├─→ [VectorDB Node] → Retrieves from ChromaDB → [Generate Node]
+     ├─→ [Web Search Agent] → Calls Tavily/Wikipedia → [Generate Node]
+     └─→ [Generate Node] → Uses LLM knowledge only
+     ↓
+ Response to User
+ ```
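The branch in the diagram amounts to a small dispatch step: RAG and WEBSEARCH gather context first, GENERAL goes straight to generation. A minimal sketch — the handler bodies are placeholders (the real nodes call ChromaDB, Tavily/Wikipedia, and the LLM):

```python
# Dispatch sketch of the data flow above; handlers are placeholders.

def vectordb_node(query):
    return f"[context from ChromaDB for: {query}]"

def web_search_node(query):
    return f"[web results for: {query}]"

def generate_node(query, context=None):
    return f"answer({query!r}, context={context!r})"

CONTEXT_NODES = {"RAG": vectordb_node, "WEBSEARCH": web_search_node}

def run(query, route):
    # Context-gathering routes feed the generate node; GENERAL skips it.
    context = CONTEXT_NODES[route](query) if route in CONTEXT_NODES else None
    return generate_node(query, context)

print(run("What is SAM3?", "RAG"))
```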
+
98
+ ## πŸ› οΈ Technology Stack
99
+
100
+ - **LangChain**: Framework for LLM applications
101
+ - **LangGraph**: Workflow orchestration
102
+ - **Anthropic Claude**: LLM (Sonnet 4.5)
103
+ - **ChromaDB**: Vector database
104
+ - **Gradio**: Web UI framework
105
+ - **HuggingFace**: Embeddings model
106
+ - **Tavily**: Web search API
107
+ - **UV**: Python package manager
108
+
109
+
110
+ ## πŸš€ Quick Start with UV
111
+
112
+ ### Prerequisites
113
+ - Python 3.10+
114
+ - UV package manager ([Install UV](https://github.com/astral-sh/uv))
115
+ - API Keys: Anthropic, Tavily
116
+
117
+ ### 1️⃣ Clone the Repository
118
+ ```bash
119
+ git clone https://github.com/yourusername/mai-rag-agent.git
120
+ cd mai-rag-agent
121
+ ```
122
+
123
+ ### 2️⃣ Create Virtual Environment with UV
124
+ ```bash
125
+ # Create a new virtual environment
126
+ uv venv
127
+
128
+ # Activate the environment
129
+ source .venv/bin/activate # Linux/macOS
130
+ # or
131
+ .venv\Scripts\activate # Windows
132
+ ```
133
+
134
+ ### 3️⃣ Install Dependencies
135
+ ```bash
136
+ # Install all dependencies from requirements.txt
137
+ uv pip install -r requirements.txt
138
+
139
+ # Or install directly from pyproject.toml (if available)
140
+ uv pip install -e .
141
+ ```
142
+
143
+ ### 4️⃣ Set Up Environment Variables
144
+ ```bash
145
+ # Copy example environment file
146
+ cp .env.example .env
147
+
148
+ # Edit .env and add your API keys
149
+ nano .env # or use your preferred editor
150
+ ```
151
+
152
+ **Required environment variables:**
153
+ ```bash
154
+ ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxx
155
+ TAVILY_API_KEY=tvly-xxxxxxxxxxxxx
156
+ ```
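At startup the keys are read from the environment (python-dotenv loads `.env` into it). A stdlib-only sketch of the validation step — the helper name is an assumption, not code from this repo:

```python
import os

REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "TAVILY_API_KEY")

def missing_keys(env=os.environ):
    """Return the required API keys that are unset or empty."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# Demonstrated with a fake environment dict instead of real keys:
fake_env = {"ANTHROPIC_API_KEY": "sk-ant-xxxxxxxxxxxxx"}
print(missing_keys(fake_env))  # ['TAVILY_API_KEY']
```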
+
+ ### 5️⃣ Prepare Data
+ ```bash
+ # Create necessary directories
+ mkdir -p data/documents data/chroma_db
+
+ # Add your documents to data/documents/
+ # Then run ingestion (if you have an ingestion script)
+ # python ingest_data.py
+ ```
+
+ ### 6️⃣ Run the Application
+ ```bash
+ # Launch the Gradio UI
+ python main.py
+ ```
+
+ The app will be available at: **http://127.0.0.1:7860**
+
+ ### 7️⃣ Run Tests (Optional)
+ ```bash
+ # Test the agent functionality
+ python test_scripts.py
+ ```
+
+ ---
+
+ ## 🐳 Quick Start with Dev Container (Alternative)
+
+ If you're using VS Code with Dev Containers:
+
+ ```bash
+ # 1. Open in VS Code
+ code .
+
+ # 2. Reopen in Container
+ # Command Palette (Ctrl+Shift+P) → "Dev Containers: Reopen in Container"
+
+ # 3. Inside container, install dependencies
+ uv pip install -r requirements.txt
+
+ # 4. Set up .env file
+ cp .env.example .env
+ # Edit .env with your API keys
+
+ # 5. Run the app
+ python main.py
+ ```
+
+ ---
+
+ ## 📦 UV-Specific Commands
+
+ ```bash
+ # Update all dependencies
+ uv pip install --upgrade -r requirements.txt
+
+ # List installed packages
+ uv pip list
+
+ # Freeze current environment
+ uv pip freeze > requirements.txt
+
+ # Install a new package
+ uv pip install package-name
+
+ # Uninstall a package
+ uv pip uninstall package-name
+
+ # Sync environment (removes unused packages)
+ uv pip sync requirements.txt
+ ```
+
+ ---
+
+ ## 🔧 Troubleshooting
+
+ ### Issue: `uv` command not found
+ ```bash
+ # Install UV
+ curl -LsSf https://astral.sh/uv/install.sh | sh
+
+ # Add to PATH (if needed)
+ export PATH="$HOME/.cargo/bin:$PATH"
+ ```
+
+ ### Issue: API key not loading
+ ```bash
+ # Check if .env exists
+ cat .env | grep -i api
+
+ # Ensure no typos in variable names
+ # Should be: ANTHROPIC_API_KEY and TAVILY_API_KEY
+ ```
+
+ ### Issue: ChromaDB not found
+ ```bash
+ # Ensure data directories exist
+ mkdir -p data/chroma_db
+
+ # Check permissions
+ chmod -R 755 data/
+ ```
+
+ ### Issue: Port 7860 already in use
+ ```bash
+ # Find and kill the process
+ lsof -ti:7860 | xargs kill -9
+
+ # Or use a different port in main.py
+ # demo.launch(server_port=7861)
+ ```
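Rather than hard-coding an alternative port, a free one can be picked at runtime; a small stdlib sketch (the helper name is an assumption — Gradio accepts the result via `server_port`):

```python
import socket

def find_free_port() -> int:
    """Ask the OS for an unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 = let the OS choose
        return s.getsockname()[1]

port = find_free_port()
print(port)
# demo.launch(server_port=port)  # then hand it to Gradio
```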
+
+ ---
+
+ ## 🎯 Next Steps
+
+ 1. ✅ Add your documents to `data/documents/`
+ 2. ✅ Configure embeddings model in `config.py`
+ 3. ✅ Customize prompts in `agent/prompts.py`
+ 4. ✅ Test with sample queries in the Gradio UI
+ 5. ✅ Deploy to production (see deployment docs)
+
+ ---
+
+ ## 📚 Additional Resources
+
+ - [UV Documentation](https://github.com/astral-sh/uv)
+ - [LangGraph Docs](https://langchain-ai.github.io/langgraph/)
+ - [Anthropic API](https://docs.anthropic.com/)
+ - [Tavily API](https://docs.tavily.com/)
+ - [ChromaDB Docs](https://docs.trychroma.com/)
core/chat_interface.py DELETED
@@ -1,196 +0,0 @@
- import gradio as gr
- from core.rag_agent import RAGAgent
- from core.document_manager import DocumentManager
- import os
-
- # Initialize components
- doc_manager = DocumentManager()
- rag_agent = None
-
- def initialize_agent():
-     """Initialize RAG agent lazily"""
-     global rag_agent
-     if rag_agent is None:
-         rag_agent = RAGAgent()
-     return rag_agent
-
- def upload_files(files):
-     """Handle file uploads"""
-     if not files:
-         return "No files selected", get_file_list()
-
-     results = []
-     for file in files:
-         try:
-             result = doc_manager.add_document(file.name)
-             results.append(result)
-         except Exception as e:
-             results.append(f"Error processing {os.path.basename(file.name)}: {str(e)}")
-
-     return "\n".join(results), get_file_list()
-
- def get_file_list():
-     """Get list of documents in the knowledge base"""
-     try:
-         files = doc_manager.list_documents()
-         if not files:
-             return "No documents in knowledge base"
-         return "\n".join([f"• {f}" for f in files])
-     except Exception as e:
-         return f"Error listing files: {str(e)}"
-
- def clear_database():
-     """Clear all documents from the knowledge base"""
-     try:
-         result = doc_manager.clear_all()
-         return result, get_file_list()
-     except Exception as e:
-         return f"Error clearing database: {str(e)}", get_file_list()
-
- def chat_with_agent(message, history):
-     """Handle chat interactions with the RAG agent"""
-     if not message.strip():
-         return history
-
-     try:
-         agent = initialize_agent()
-
-         # Stream the agent's response
-         response_text = ""
-         for event in agent.agent_graph.stream(
-             {"messages": [("user", message)]},
-             agent.get_config(),
-             stream_mode="values"
-         ):
-             if "messages" in event and len(event["messages"]) > 0:
-                 last_message = event["messages"][-1]
-                 if hasattr(last_message, "content"):
-                     response_text = last_message.content
-
-         if not response_text:
-             response_text = "I apologize, but I couldn't generate a response. Please try again."
-
-         return response_text
-
-     except Exception as e:
-         return f"Error: {str(e)}"
-
- def reset_conversation():
-     """Reset the conversation thread"""
-     global rag_agent
-     if rag_agent:
-         rag_agent.reset_thread()
-     return None  # Clear chat history
-
- def create_gradio_ui():
-     """Create the complete Gradio interface"""
-
-     with gr.Blocks(title="RAG Agent with Agentic Memory", theme=gr.themes.Soft()) as demo:
-         gr.Markdown("""
-         # 🤖 RAG Agent with Agentic Memory
-
-         Upload documents and chat with an intelligent agent that uses:
-         - 📚 **Local Knowledge Base** (ChromaDB)
-         - 🔍 **Web Search** (Tavily)
-         - 📖 **Wikipedia**
-         - 🎓 **ArXiv** (Academic Papers)
-         """)
-
-         with gr.Tabs():
-             # Documents Tab
-             with gr.Tab("📄 Documents"):
-                 gr.Markdown("### Upload and Manage Documents")
-                 gr.Markdown("Upload PDF or Markdown files to add them to the knowledge base.")
-
-                 with gr.Row():
-                     with gr.Column(scale=2):
-                         file_upload = gr.File(
-                             label="Upload Documents",
-                             file_count="multiple",
-                             file_types=[".pdf", ".md"]
-                         )
-                         upload_btn = gr.Button("📤 Add to Knowledge Base", variant="primary")
-                         upload_status = gr.Textbox(label="Upload Status", lines=3)
-
-                     with gr.Column(scale=1):
-                         file_list = gr.Textbox(
-                             label="Documents in Knowledge Base",
-                             lines=10,
-                             value=get_file_list()
-                         )
-                         refresh_btn = gr.Button("🔄 Refresh List")
-                         clear_btn = gr.Button("🗑️ Clear All Documents", variant="stop")
-
-                 # Connect document management buttons
-                 upload_btn.click(
-                     fn=upload_files,
-                     inputs=[file_upload],
-                     outputs=[upload_status, file_list]
-                 )
-
-                 refresh_btn.click(
-                     fn=get_file_list,
-                     outputs=[file_list]
-                 )
-
-                 clear_btn.click(
-                     fn=clear_database,
-                     outputs=[upload_status, file_list]
-                 )
-
-             # Chat Tab
-             with gr.Tab("💬 Chat"):
-                 gr.Markdown("### Chat with Your Documents")
-                 gr.Markdown("Ask questions about your documents or any topic. The agent will search multiple sources.")
-
-                 chatbot = gr.Chatbot(
-                     label="Conversation",
-                     height=500,
-                     show_label=True,
-                     avatar_images=(None, "🤖")
-                 )
-
-                 with gr.Row():
-                     msg = gr.Textbox(
-                         label="Your Message",
-                         placeholder="Ask me anything about your documents or general knowledge...",
-                         scale=4
-                     )
-                     submit_btn = gr.Button("Send", variant="primary", scale=1)
-
-                 with gr.Row():
-                     clear_chat_btn = gr.Button("🔄 Reset Conversation")
-                     gr.Markdown("*Note: Resetting clears the conversation history*")
-
-                 # Chat interface
-                 chat_interface = gr.ChatInterface(
-                     fn=chat_with_agent,
-                     chatbot=chatbot,
-                     textbox=msg,
-                     submit_btn=submit_btn,
-                     retry_btn=None,
-                     undo_btn=None,
-                     clear_btn=None
-                 )
-
-                 clear_chat_btn.click(
-                     fn=reset_conversation,
-                     outputs=[chatbot]
-                 )
-
-         gr.Markdown("""
-         ---
-         ### 🔧 How it works:
-         1. **Upload documents** in the Documents tab
-         2. **Ask questions** in the Chat tab
-         3. The agent will:
-            - Analyze your query
-            - Search relevant sources
-            - Provide comprehensive answers with citations
-         """)
-
-     return demo
-
- if __name__ == "__main__":
-     demo = create_gradio_ui()
-     demo.launch(share=False, server_name="127.0.0.1", server_port=7860)
knowledge_base/chroma.py CHANGED
@@ -9,13 +9,12 @@ from langchain_chroma import Chroma
  from config import configs

  if __name__ == "__main__":
-     # --- 1. Load Documents ---
      print("Loading documents from directory...")
      loader = DirectoryLoader(
          path=configs["DATA_PATH"],
          glob="*.md",
          loader_cls=TextLoader,
-         silent_errors=True  # Set to False if you want to see loader errors
+         silent_errors=True
      )

      raw_documents = loader.load()
@@ -23,25 +22,22 @@ if __name__ == "__main__":
      print(f"Error: No documents found in {configs['DATA_PATH']}. Check your path and file types.")
      exit()

-     # --- 2. Split Documents into Chunks ---
+     # Split Documents into Chunks
      print(f"Loaded {len(raw_documents)} raw documents. Splitting into chunks...")
-     # Recursive splitting is better than simple splitting, preserving context.
      text_splitter = RecursiveCharacterTextSplitter(
          chunk_size=1000,
          chunk_overlap=200,
-         separators=["\n\n", "\n", " ", ""]  # Optimal separators for markdown/text
+         separators=["\n\n", "\n", " ", ""]
      )

      documents_to_embed = text_splitter.split_documents(raw_documents)
      print(f"Split into {len(documents_to_embed)} chunks.")

-     # --- 3. Define Custom Embedding Model ---
      print(f"Initializing custom embedding model: {configs['EMBEDDING_MODEL_NAME']}...")
      dense_embeddings = HuggingFaceEmbeddings(
          model_name=configs["EMBEDDING_MODEL_NAME"]
      )

-     # --- 4. Create and Persist the Vector Store ---
      print(f"Creating Chroma vector store and persisting data to {configs['PERSIST_PATH']}...")
      vectorstore = Chroma.from_documents(
          documents=documents_to_embed,  # The prepared Document chunks
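The effect of `chunk_size=1000` / `chunk_overlap=200` above can be illustrated with a plain fixed-window splitter — a deliberate simplification of `RecursiveCharacterTextSplitter`, which additionally prefers splitting at the listed separators:

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200):
    """Naive fixed-window splitter: each chunk begins
    chunk_size - chunk_overlap characters after the previous one."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("x" * 2500)
print(len(chunks))     # 4
print(len(chunks[0]))  # 1000
```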
knowledge_base/test_retrieval.py CHANGED
@@ -1,17 +1,14 @@
  from langchain_community.embeddings import HuggingFaceEmbeddings
  from langchain_chroma import Chroma

- # Configuration must match the creation step
  PERSIST_PATH = "./knowledge_base/chroma_data"
  EMBEDDING_MODEL_NAME = "sentence-transformers/all-mpnet-base-v2"
  COLLECTION_NAME = "langchain_mpnet_collection"

- # 1. Define the custom embedding object (Crucial for query vectorization)
  dense_embeddings = HuggingFaceEmbeddings(
      model_name=EMBEDDING_MODEL_NAME
  )

- # 2. Load the existing vector store from disk
  try:
      vectorstore = Chroma(
          persist_directory=PERSIST_PATH,
@@ -25,8 +22,7 @@ except Exception as e:

  query = "Tell me about SAM3 general architecture."

- # Perform the search
- # k=3 means it will return the top 3 most similar document chunks
+
  retrieved_docs = vectorstore.similarity_search(query, k=3)

  print(f"\n--- Search Results for: '{query}' ---")
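`similarity_search(query, k=3)` returns the three chunks whose embeddings score highest against the query embedding. The core ranking step, sketched over toy vectors with stdlib math — cosine similarity is used here for illustration, while Chroma's actual metric depends on the collection settings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def similarity_search(query_vec, doc_vecs, k=3):
    """Return the ids of the k document vectors most similar to the query."""
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

docs = {
    "sam3_arch": [0.9, 0.1, 0.0],
    "training":  [0.2, 0.9, 0.1],
    "eval":      [0.1, 0.2, 0.9],
    "intro":     [0.7, 0.3, 0.2],
}
print(similarity_search([1.0, 0.0, 0.0], docs, k=3))
# ['sam3_arch', 'intro', 'training']
```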
notebook.ipynb DELETED
@@ -1,37 +0,0 @@
- {
-  "cells": [
-   {
-    "cell_type": "code",
-    "execution_count": null,
-    "id": "524b8568",
-    "metadata": {},
-    "outputs": [],
-    "source": [
-     "from langchain_community.document_loaders.text import DirectoryLoader, TextLoader"
-    ]
-   },
-   {
-    "cell_type": "code",
-    "execution_count": null,
-    "id": "babc2558",
-    "metadata": {},
-    "outputs": [],
-    "source": [
-     "print(\"Y\")"
-    ]
-   }
-  ],
-  "metadata": {
-   "kernelspec": {
-    "display_name": "rag_agent",
-    "language": "python",
-    "name": "python3"
-   },
-   "language_info": {
-    "name": "python",
-    "version": "3.10.17"
-   }
-  },
-  "nbformat": 4,
-  "nbformat_minor": 5
- }
quick_check.py DELETED
@@ -1,16 +0,0 @@
- import gradio as gr
-
- print("Gradio Version:", gr.__version__)
-
- import gradio as gr
-
- history = [
-     gr.ChatMessage(role="assistant", content="How can I help you?"),
-     gr.ChatMessage(role="user", content="Can you make me a plot of quarterly sales?"),
-     gr.ChatMessage(role="assistant", content="I am happy to provide you that report and plot.")
- ]
-
- with gr.Blocks() as demo:
-     gr.Chatbot(history)
-
- demo.launch()
requirements.txt CHANGED
@@ -1,20 +1,184 @@
- langchain
- langgraph
- langchain-huggingface
- langchain-google-genai
- langchain-chroma
- fastapi
- uvicorn
- pydantic
- chromadb
- pymupdf
- pymupdf4llm
- langchain-community
- langchain_text_splitters
- pymupdf-layout
- sentence_transformers
- gradio
- python-dotenv
- langchain-tavily
- arxiv
- wikipedia
+ aiofiles==24.1.0
+ aiohappyeyeballs==2.6.1
+ aiohttp==3.13.2
+ aiosignal==1.4.0
+ annotated-doc==0.0.4
+ annotated-types==0.7.0
+ anyio==4.12.0
+ arxiv==2.3.1
+ async-timeout==5.0.1
+ attrs==25.4.0
+ backoff==2.2.1
+ bcrypt==5.0.0
+ beautifulsoup4==4.14.2
+ brotli==1.2.0
+ build==1.3.0
+ cachetools==6.2.2
+ certifi==2025.11.12
+ charset-normalizer==3.4.4
+ chromadb==1.3.5
+ click==8.3.1
+ coloredlogs==15.0.1
+ dataclasses-json==0.6.7
+ distro==1.9.0
+ durationpy==0.10
+ exceptiongroup==1.3.1
+ fastapi==0.122.0
+ feedparser==6.0.12
+ ffmpy==1.0.0
+ filelock==3.20.0
+ filetype==1.2.0
+ flatbuffers==25.9.23
+ frozenlist==1.8.0
+ fsspec==2025.10.0
+ google-ai-generativelanguage==0.9.0
+ google-api-core==2.28.1
+ google-auth==2.43.0
+ googleapis-common-protos==1.72.0
+ gradio==6.0.1
+ gradio-client==2.0.0
+ greenlet==3.2.4
+ groovy==0.1.2
+ grpcio==1.76.0
+ grpcio-status==1.76.0
+ h11==0.16.0
+ hf-xet==1.2.0
+ httpcore==1.0.9
+ httptools==0.7.1
+ httpx==0.28.1
+ httpx-sse==0.4.3
+ huggingface-hub==0.36.0
+ humanfriendly==10.0
+ idna==3.11
+ importlib-metadata==8.7.0
+ importlib-resources==6.5.2
+ jinja2==3.1.6
+ joblib==1.5.2
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jsonschema==4.25.1
+ jsonschema-specifications==2025.9.1
+ kubernetes==34.1.0
+ langchain==1.1.0
+ langchain-chroma==1.0.0
+ langchain-classic==1.0.0
+ langchain-community==0.4.1
+ langchain-core==1.1.0
+ langchain-google-genai==3.2.0
+ langchain-huggingface==1.1.0
+ langchain-tavily==0.2.13
+ langchain-text-splitters==1.0.0
+ langgraph==1.0.4
+ langgraph-checkpoint==3.0.1
+ langgraph-prebuilt==1.0.5
+ langgraph-sdk==0.2.10
+ langsmith==0.4.49
+ markdown-it-py==4.0.0
+ markupsafe==3.0.3
+ marshmallow==3.26.1
+ mdurl==0.1.2
+ mmh3==5.2.0
+ mpmath==1.3.0
+ multidict==6.7.0
+ mypy-extensions==1.1.0
+ networkx==3.4.2
+ numpy==2.2.6
+ nvidia-cublas-cu12==12.8.4.1
+ nvidia-cuda-cupti-cu12==12.8.90
+ nvidia-cuda-nvrtc-cu12==12.8.93
+ nvidia-cuda-runtime-cu12==12.8.90
+ nvidia-cudnn-cu12==9.10.2.21
+ nvidia-cufft-cu12==11.3.3.83
+ nvidia-cufile-cu12==1.13.1.3
+ nvidia-curand-cu12==10.3.9.90
+ nvidia-cusolver-cu12==11.7.3.90
+ nvidia-cusparse-cu12==12.5.8.93
+ nvidia-cusparselt-cu12==0.7.1
+ nvidia-nccl-cu12==2.27.5
+ nvidia-nvjitlink-cu12==12.8.93
+ nvidia-nvshmem-cu12==3.3.20
+ nvidia-nvtx-cu12==12.8.90
+ oauthlib==3.3.1
+ onnxruntime==1.23.2
+ opentelemetry-api==1.38.0
+ opentelemetry-exporter-otlp-proto-common==1.38.0
+ opentelemetry-exporter-otlp-proto-grpc==1.38.0
+ opentelemetry-proto==1.38.0
+ opentelemetry-sdk==1.38.0
+ opentelemetry-semantic-conventions==0.59b0
+ orjson==3.11.4
+ ormsgpack==1.12.0
+ overrides==7.7.0
+ packaging==25.0
+ pandas==2.3.3
+ pillow==12.0.0
+ posthog==5.4.0
+ propcache==0.4.1
+ proto-plus==1.26.1
+ protobuf==6.33.1
+ pyasn1==0.6.1
+ pyasn1-modules==0.4.2
+ pybase64==1.4.2
+ pydantic==2.12.5
+ pydantic-core==2.41.5
+ pydantic-settings==2.12.0
+ pydub==0.25.1
+ pygments==2.19.2
+ pymupdf==1.26.6
+ pymupdf-layout==1.26.6
+ pymupdf4llm==0.2.4
+ pypika==0.48.9
+ pyproject-hooks==1.2.0
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.2.1
+ python-multipart==0.0.20
+ pytz==2025.2
+ pyyaml==6.0.3
+ referencing==0.37.0
+ regex==2025.11.3
+ requests==2.32.5
+ requests-oauthlib==2.0.0
+ requests-toolbelt==1.0.0
+ rich==14.2.0
+ rpds-py==0.29.0
+ rsa==4.9.1
+ safehttpx==0.1.7
+ safetensors==0.7.0
+ scikit-learn==1.7.2
+ scipy==1.15.3
+ semantic-version==2.10.0
+ sentence-transformers==5.1.2
+ sgmllib3k==1.0.0
+ shellingham==1.5.4
+ six==1.17.0
+ sniffio==1.3.1
+ soupsieve==2.8
+ sqlalchemy==2.0.44
+ starlette==0.50.0
+ sympy==1.14.0
+ tabulate==0.9.0
+ tenacity==9.1.2
+ threadpoolctl==3.6.0
+ tokenizers==0.22.1
+ tomli==2.3.0
+ tomlkit==0.13.3
+ torch==2.9.1
+ tqdm==4.67.1
+ transformers==4.57.3
+ triton==3.5.1
+ typer==0.20.0
+ typing-extensions==4.15.0
+ typing-inspect==0.9.0
+ typing-inspection==0.4.2
+ tzdata==2025.2
+ urllib3==2.5.0
+ uvicorn==0.38.0
+ uvloop==0.22.1
+ watchfiles==1.1.1
+ websocket-client==1.9.0
+ websockets==15.0.1
+ wikipedia==1.4.0
+ xxhash==3.6.0
+ yarl==1.22.0
+ zipp==3.23.0
+ zstandard==0.25.0
testing_main.py DELETED
@@ -1,11 -0,0 @@
- from config import configs
- from knowledge_base.test_retrieval import PERSIST_PATH, EMBEDDING_MODEL_NAME, COLLECTION_NAME
-
- if __name__ == "__main__":
-     print("Testing configuration values...")
-     for key, value in configs.items():
-         print(f"{key}: {value}")
-     print("✅ Configuration test completed successfully.")
-     print(f"PERSIST_PATH: {PERSIST_PATH}")
-     print(f"EMBEDDING_MODEL_NAME: {EMBEDDING_MODEL_NAME}")
-     print(f"COLLECTION_NAME: {COLLECTION_NAME}")