atara57769/RAG_Knowledge_Assistant

atara57769 committed 27 days ago

Commit 491caa2, 1 parent: af355c6

feat: add conversation history support to chat service and update Gradio UI for interactive sessions

Files changed (5)

README.md +16 -10
app.py +79 -23
services/chat_service.py +8 -4
services/ui_handlers.py +72 -0
tests/test_chat_service.py +41 -1

README.md CHANGED

@@ -13,25 +13,26 @@ license: mit
 # 🤖 RAG Knowledge Assistant
-Welcome to the **RAG Knowledge Assistant**—a high-performance, modular Retrieval-Augmented Generation (RAG) system tailored for the Playmobil Toy Shop. It utilizes vector search in **Qdrant** and inference with the **Groq LLM API** to answer user queries with precise store policies and product descriptions.
 ---
 ## 🌟 Key Features
 1. **Modular Clean Architecture**: Complete separation of the presentation layer (Gradio UI), data orchestration, vector databases, and modular RAG services.
-2. **Dynamic AI Query Routing**: Automatically routes questions to the correct context category (`policy`, `product`, or `none`) using high-speed classification.
-3. **Optimized RAG Pipeline**: Skips expensive database query retrievals for general conversational questions (`none` classification) to improve latency and reduce cost.
-4. **Metadata Filtered Retrieval**: Restricts vector search using Qdrant index payload checks for precise target matching.
-5. **Semantic Document Chunking**: Implements a robust text splitting algorithm using recursive character chunking with overlap to preserve semantic context across chunk lines.
-6. **100% Offline Testing Suite**: Includes 27 fast unit tests that fully mock external database connections and model downloads.
 ---
 ## 📂 Project Structure
 ```bash
-├── app.py                      # Main entrypoint launching the Gradio interface
 ├── config.py                   # Central configuration & env variable loader
 ├── requirements.txt            # Python package dependencies
 ├── data/                       # Raw dataset files
@@ -48,10 +49,11 @@ Welcome to the **RAG Knowledge Assistant**—a high-performance, modular Retriev
 ├── services/                   # High-level business logic coordinators
 │   ├── chat_service.py         # Coordinating Chat RAG pipeline flow
 │   ├── ingestion_service.py    # Document loading, tagging, chunking, and ingesting
 │   └── llm.py                  # Groq API client provisioning
 └── tests/                      # Automated test suite
     ├── conftest.py             # 100% Offline mocking layer intercepting API clients
-    ├── test_chat_service.py    # Unit tests for the main Chat service
     ├── test_ingestion_service.py # Unit tests for parsing, chunking, and db loader
     ├── test_retriever.py       # Unit tests for vector similarity search
     └── test_router.py          # Unit tests for query routing classifications
@@ -94,13 +96,17 @@ python app.py
 ```
 This will:
 1. Bootstrap the Qdrant collections and payload keyword indices.
-2. Launch the **Gradio UI** on `http://localhost:7860`.
 ---
 ## 🧪 Testing Suite
-The testing suite contains **27 unit tests** covering every functional capability of the assistant.
 ### Offline Testing Architecture
 To guarantee fast, deterministic execution and eliminate API cost:

 # 🤖 RAG Knowledge Assistant
+Welcome to the **RAG Knowledge Assistant**—a high-performance, modular Retrieval-Augmented Generation (RAG) system tailored for the Playmobil Toy Shop. It utilizes vector search in **Qdrant** and inference with the **Groq LLM API** to answer user queries with precise store policies and product descriptions, now enhanced with full conversation history and interactive session state.
 ---
 ## 🌟 Key Features
 1. **Modular Clean Architecture**: Complete separation of the presentation layer (Gradio UI), data orchestration, vector databases, and modular RAG services.
+2. **Conversational Session State & History**: Automatically retains preceding user and assistant messages so the virtual toy assistant (Maya) can understand and answer context-aware follow-up queries.
+3. **Dynamic AI Query Routing**: Automatically routes questions to the correct context category (`policy`, `product`, or `none`) using high-speed classification.
+4. **Optimized RAG Pipeline**: Skips expensive database query retrievals for general conversational questions (`none` classification) to improve latency and reduce cost.
+5. **Metadata Filtered Retrieval**: Restricts vector search using Qdrant index payload checks for precise target matching.
+6. **Semantic Document Chunking**: Implements a robust text splitting algorithm using recursive character chunking with overlap to preserve semantic context across chunk lines.
+7. **100% Offline Testing Suite**: Includes **29 fast unit tests** that fully mock external database connections and model downloads.
 ---
 ## 📂 Project Structure
 ```bash
+├── app.py                      # Declarative entrypoint establishing the Gradio interface
 ├── config.py                   # Central configuration & env variable loader
 ├── requirements.txt            # Python package dependencies
 ├── data/                       # Raw dataset files
 ├── services/                   # High-level business logic coordinators
 │   ├── chat_service.py         # Coordinating Chat RAG pipeline flow
 │   ├── ingestion_service.py    # Document loading, tagging, chunking, and ingesting
+│   ├── ui_handlers.py          # Gradio callback handlers keeping app.py completely declarative
 │   └── llm.py                  # Groq API client provisioning
 └── tests/                      # Automated test suite
     ├── conftest.py             # 100% Offline mocking layer intercepting API clients
+    ├── test_chat_service.py    # Unit tests for the main Chat service & history flow
     ├── test_ingestion_service.py # Unit tests for parsing, chunking, and db loader
     ├── test_retriever.py       # Unit tests for vector similarity search
     └── test_router.py          # Unit tests for query routing classifications
 ```
 This will:
 1. Bootstrap the Qdrant collections and payload keyword indices.
+2. Launch the **Gradio Chat UI** on `http://localhost:7860`.
+### Redesigned Chat Interface
+* **Interactive Chat Column**: Embeds a native `gr.Chatbot` backed by the session's message history. On send, the input field clears and your message appears in the chat immediately. Submit by pressing **Enter** or clicking **Send 🚀**.
+* **Sidebar Controls & Utilities**: Click **🔄 Re-Ingest Data** to re-index documents and read the result in the **Ingestion Status** box, or click **🗑️ Clear History** to start a new chat session. The **Filter Mode** dropdown still exists but is hidden (`visible=False`) in this commit.
 ---
 ## 🧪 Testing Suite
+The testing suite contains **29 unit tests** covering every functional capability of the assistant.
 ### Offline Testing Architecture
 To guarantee fast, deterministic execution and eliminate API cost:

app.py CHANGED

@@ -11,12 +11,16 @@ logger.info("Initializing RAG Knowledge Assistant application components...")
 import gradio as gr
 from db.bootstrap import init_database
-from services.chat_service import chat
-from services.ingestion_service import ingest
 # Initialize database schema/collections if not already done
 init_database()
 # Gradio Interface Construction
 logger.info("Creating Gradio UI structure...")
@@ -31,37 +35,89 @@ with gr.Blocks(title="Shop Knowledge Assistant") as demo:
     gr.Markdown(
         """
         # 🤖 Shop RAG Knowledge Assistant
-        *Ask questions about store policies and toy products with high-performance vector search.*
         """
     )
     with gr.Row():
-        with gr.Column(scale=3):
-            q = gr.Textbox(
-                label="Your Question",
-                placeholder="e.g., What is the return policy for toy products?",
-                lines=2
-            )
         with gr.Column(scale=1):
             doc_type = gr.Dropdown(
                 choices=["Auto (AI Router)", "Policy", "Product", "No Filter"],
                 value="Auto (AI Router)",
-                label="Filter Mode"
             )
-    a = gr.Textbox(
-        label="Response from Knowledge Assistant",
-        placeholder="Answer will appear here...",
-        interactive=False,
-        lines=6
-    )
-    with gr.Row():
-        btn = gr.Button("🔍 Ask Assistant", variant="primary")
-        ingest_btn = gr.Button("🔄 Re-Ingest Data", variant="secondary")
-    btn.click(chat, inputs=[q, doc_type], outputs=a)
-    ingest_btn.click(ingest, outputs=a)
 if __name__ == "__main__":
     logger.info("Launching Gradio UI server...")

 import gradio as gr
 from db.bootstrap import init_database
 # Initialize database schema/collections if not already done
 init_database()
+from services.ui_handlers import (
+    add_user_message,
+    bot_response,
+    clear_session,
+    run_ingestion
+)
 # Gradio Interface Construction
 logger.info("Creating Gradio UI structure...")
     gr.Markdown(
         """
         # 🤖 Shop RAG Knowledge Assistant
+        *Ask questions about store policies and toy products with high-performance vector search and conversation history.*
         """
     )
     with gr.Row():
+        # Left Sidebar Column: Configuration & Controls
         with gr.Column(scale=1):
+            gr.Markdown("### ⚙️ Settings & Actions")
             doc_type = gr.Dropdown(
                 choices=["Auto (AI Router)", "Policy", "Product", "No Filter"],
                 value="Auto (AI Router)",
+                label="Filter Mode",
+                visible=False
             )
+            ingest_btn = gr.Button("🔄 Re-Ingest Data", variant="secondary")
+            ingest_status = gr.Textbox(
+                label="Ingestion Status",
+                value="Ready",
+                interactive=False
+            )
+            clear_btn = gr.Button("🗑️ Clear History", variant="stop")
+            gr.Markdown(
+                """
+                ### 💡 Try Asking
+                - *What is the return policy for toy products?*
+                - *Do you have any outdoor adventure sets?*
+                - *Compare the Camping Adventure and the Horse Stable.*
+                """
+            )
+        # Right Chat Area Column: Full Interactive Chatbot
+        with gr.Column(scale=3):
+            chatbot = gr.Chatbot(
+                height=520,
+                label="Maya - Personal Play Consultant",
+                placeholder="Hi, I'm Maya! Ask me anything about our toys or store policies. I will remember our chat!"
+            )
+            with gr.Row():
+                q = gr.Textbox(
+                    placeholder="Ask Maya a question...",
+                    show_label=False,
+                    scale=4,
+                    container=False
+                )
+                btn = gr.Button("Send 🚀", variant="primary", scale=1)
+    # Event handlers binding
+    btn.click(
+        add_user_message,
+        inputs=[q, chatbot],
+        outputs=[q, chatbot],
+        queue=False
+    ).then(
+        bot_response,
+        inputs=[doc_type, chatbot],
+        outputs=[chatbot]
+    )
+    q.submit(
+        add_user_message,
+        inputs=[q, chatbot],
+        outputs=[q, chatbot],
+        queue=False
+    ).then(
+        bot_response,
+        inputs=[doc_type, chatbot],
+        outputs=[chatbot]
+    )
+    clear_btn.click(
+        clear_session,
+        outputs=[chatbot],
+        queue=False
+    )
+    ingest_btn.click(
+        run_ingestion,
+        outputs=[ingest_status]
+    )
 if __name__ == "__main__":
     logger.info("Launching Gradio UI server...")
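
Note on the event wiring above: both `btn.click` and `q.submit` run `add_user_message` first and then chain `bot_response` with `.then(...)`. The two-step flow can be sketched in plain Python without Gradio. This is a simplified illustration, not the code from this commit: `fake_chat` and `simulate_turn` are hypothetical names, the chat function is passed in as a parameter so the example needs no network access, and the history list is copied rather than mutated in place.

```python
def add_user_message(user_message, history):
    """Step 1: append the user turn; the returned "" stands for the cleared textbox."""
    if not user_message.strip():
        return "", history
    history = list(history or [])
    history.append({"role": "user", "content": user_message})
    return "", history


def bot_response(doc_type, history, chat_fn):
    """Step 2: answer the last user turn, passing the earlier turns as history."""
    if not history or history[-1]["role"] != "user":
        return history
    reply = chat_fn(history[-1]["content"], doc_type, history[:-1])
    return history + [{"role": "assistant", "content": reply}]


def fake_chat(question, doc_type, history):
    """Hypothetical stand-in for services.chat_service.chat (no API calls)."""
    return f"echo: {question} (prior turns: {len(history)})"


def simulate_turn(text, history):
    """Mimic click(add_user_message).then(bot_response) for one submitted message."""
    _, history = add_user_message(text, history)
    return bot_response("Auto (AI Router)", history, fake_chat)


state = simulate_turn("Do you sell camping sets?", [])
state = simulate_turn("How much is it?", state)
blank = simulate_turn("   ", state)  # whitespace-only input adds no turn

print(len(state), state[-1]["content"])
print(blank == state)
```

The whitespace-only case shows why `bot_response` checks the role of the last message: when step 1 adds nothing, step 2 returns the history unchanged instead of answering the previous assistant turn.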

services/chat_service.py CHANGED

@@ -28,7 +28,7 @@ def get_retrieved_context(question: str, resolved_type: str) -> str:
     docs = retrieve_all(question, resolved_type)
     return "\n\n".join(d.page_content for d in docs)
-def build_llm_messages(question: str, context: str) -> list:
     """Construct prompt message structure for LLM."""
     messages = [
         {"role": "system", "content": system_prompt}
@@ -41,6 +41,10 @@ def build_llm_messages(question: str, context: str) -> list:
     else:
         logger.warning("No context found. Proceeding with system prompt only.")
     messages.append({"role": "user", "content": question})
     return messages
@@ -57,13 +61,13 @@ def generate_llm_response(messages: list) -> str:
     logger.info("Chat completion completed successfully.")
     return answer
-def chat(question: str, doc_type: str = "Auto (AI Router)") -> str:
     """Coordinating function to run the full RAG chat workflow."""
-    logger.info(f"Initiating chat logic for question: '{question}' (selected mode: {doc_type})")
     try:
         resolved_type = resolve_category(question, doc_type)
         context = get_retrieved_context(question, resolved_type)
-        messages = build_llm_messages(question, context)
         return generate_llm_response(messages)
     except Exception as e:
         logger.error(f"Error in modular chat workflow: {e}", exc_info=True)

     docs = retrieve_all(question, resolved_type)
     return "\n\n".join(d.page_content for d in docs)
+def build_llm_messages(question: str, context: str, history: list = None) -> list:
     """Construct prompt message structure for LLM."""
     messages = [
         {"role": "system", "content": system_prompt}
     else:
         logger.warning("No context found. Proceeding with system prompt only.")
+    if history:
+        for msg in history:
+            messages.append({"role": msg["role"], "content": msg["content"]})
     messages.append({"role": "user", "content": question})
     return messages
     logger.info("Chat completion completed successfully.")
     return answer
+def chat(question: str, doc_type: str = "Auto (AI Router)", history: list = None) -> str:
     """Coordinating function to run the full RAG chat workflow."""
+    logger.info(f"Initiating chat logic for question: '{question}' (selected mode: {doc_type}, history length: {len(history) if history else 0})")
     try:
         resolved_type = resolve_category(question, doc_type)
         context = get_retrieved_context(question, resolved_type)
+        messages = build_llm_messages(question, context, history)
         return generate_llm_response(messages)
     except Exception as e:
         logger.error(f"Error in modular chat workflow: {e}", exc_info=True)
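
Note on the message order: with this change, prior turns are appended after the retrieved-context message and before the current question, which is the layout asserted in `test_build_llm_messages_with_history`. A minimal, self-contained sketch of that ordering follows. The `SYSTEM_PROMPT` text and the wording of the context message are placeholders, since the diff does not show them, and `build_llm_messages_sketch` is a hypothetical name rather than the function from the commit.

```python
from typing import Optional

# Placeholder: the real system prompt is not visible in this diff.
SYSTEM_PROMPT = "You are Maya, a helpful toy shop assistant."


def build_llm_messages_sketch(
    question: str, context: str, history: Optional[list] = None
) -> list:
    """Assemble: system prompt, context message, prior turns, current question."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    if context:
        # Assumed wording for the retrieved-context message.
        messages.append({"role": "system", "content": f"Context:\n{context}"})
    # `history or []` handles both None and an empty list.
    for msg in history or []:
        messages.append({"role": msg["role"], "content": msg["content"]})
    messages.append({"role": "user", "content": question})
    return messages


history = [
    {"role": "user", "content": "hi Maya"},
    {"role": "assistant", "content": "Hello! How can I help you?"},
]
with_history = build_llm_messages_sketch(
    "my second question", "some retrieved facts", history
)
without_history = build_llm_messages_sketch("my first question", "some retrieved facts")

print([m["role"] for m in with_history])
print([m["role"] for m in without_history])
```

Keeping the current question last means the model always sees it as the most recent user turn, regardless of how many earlier turns are included.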

services/ui_handlers.py ADDED

@@ -0,0 +1,72 @@

+import logging
+import gradio as gr
+from services.chat_service import chat
+from services.ingestion_service import ingest
+logger = logging.getLogger(__name__)
+def add_user_message(user_message, history):
+    """Instantly add the user question to the chatbot history list and clear the input field."""
+    if not user_message.strip():
+        return "", history
+    if history is None:
+        history = []
+    # In Gradio 6, gr.Chatbot history format is a list of dicts
+    history.append({"role": "user", "content": user_message})
+    return "", history
+def extract_text(content):
+    """Safely extract plain text from Gradio 6 structured MessageDict format."""
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        parts = []
+        for item in content:
+            if isinstance(item, dict) and "text" in item:
+                parts.append(item["text"])
+            elif isinstance(item, str):
+                parts.append(item)
+        return " ".join(parts)
+    if isinstance(content, dict):
+        return content.get("text", str(content))
+    return str(content)
+def bot_response(doc_type, history):
+    """Retrieve the chatbot's response from the chat service, injecting prior conversation context."""
+    if not history or history[-1]["role"] != "user":
+        return history
+    raw_user_message = history[-1]["content"]
+    user_message = extract_text(raw_user_message)
+    # Clean up preceding history for downstream compatibility
+    history_before = []
+    for msg in history[:-1]:
+        history_before.append({
+            "role": msg.get("role"),
+            "content": extract_text(msg.get("content", ""))
+        })
+    logger.info(f"Generating bot response for: '{user_message}' with {len(history_before)} historical turns.")
+    # Generate bot reply using chat service
+    bot_message = chat(user_message, doc_type, history_before)
+    history.append({"role": "assistant", "content": bot_message})
+    return history
+def clear_session():
+    """Clear conversation history."""
+    logger.info("Clearing chat session history.")
+    return []
+def run_ingestion():
+    """Execute document chunking and vector storage ingestion, providing visual alerts."""
+    logger.info("Triggering manual re-ingestion of knowledge data...")
+    try:
+        result = ingest()
+        gr.Info("Data ingestion completed successfully!")
+        return f"Success: {result}"
+    except Exception as e:
+        gr.Warning(f"Data ingestion failed: {e}")  # gr.Error only displays when raised; gr.Warning shows a toast and lets the status string return
+        return f"Error: {e}"
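
Note on `extract_text`: the helper flattens message content to a plain string before it reaches the chat service. The sketch below copies its logic so it runs on its own and feeds it a few content shapes. The example payloads are assumptions chosen for illustration, not values captured from a live Gradio session.

```python
def extract_text(content):
    """Same logic as the handler in this diff, copied so the example is standalone."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for item in content:
            if isinstance(item, dict) and "text" in item:
                parts.append(item["text"])
            elif isinstance(item, str):
                parts.append(item)
        return " ".join(parts)
    if isinstance(content, dict):
        return content.get("text", str(content))
    return str(content)


# Illustrative inputs (assumed shapes).
plain = extract_text("What is the return policy?")
structured = extract_text(
    [{"type": "text", "text": "Compare"}, {"type": "text", "text": "these sets"}]
)
# List items without a "text" key (for example a file reference) are skipped.
mixed = extract_text([{"type": "text", "text": "Look at"}, {"path": "photo.png"}, "this"])
single = extract_text({"text": "hello"})

print(plain, structured, mixed, single, sep=" | ")
```

One consequence of this design is that non-text parts of a message are dropped silently, so the LLM only ever receives the textual portion of each turn.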

tests/test_chat_service.py CHANGED

@@ -103,7 +103,7 @@ def test_chat_success(mock_generate, mock_build, mock_get, mock_resolve):
     assert response == "LLM answer"
     mock_resolve.assert_called_once_with("What is toy X?", "Auto (AI Router)")
     mock_get.assert_called_once_with("What is toy X?", "product")
-    mock_build.assert_called_once_with("What is toy X?", "Toy facts")
     mock_generate.assert_called_once()
 def test_chat_exception_handling():
@@ -112,3 +112,43 @@ def test_chat_exception_handling():
         response = chat("any question", "Auto (AI Router)")
         assert "An error occurred while formulating a response:" in response
         assert "Database crash" in response

     assert response == "LLM answer"
     mock_resolve.assert_called_once_with("What is toy X?", "Auto (AI Router)")
     mock_get.assert_called_once_with("What is toy X?", "product")
+    mock_build.assert_called_once_with("What is toy X?", "Toy facts", None)
     mock_generate.assert_called_once()
 def test_chat_exception_handling():
         response = chat("any question", "Auto (AI Router)")
         assert "An error occurred while formulating a response:" in response
         assert "Database crash" in response
+def test_build_llm_messages_with_history():
+    """Verify message context structure layout when conversation history is provided."""
+    history = [
+        {"role": "user", "content": "hi Maya"},
+        {"role": "assistant", "content": "Hello! How can I help you?"}
+    ]
+    messages = build_llm_messages("my second question", "some retrieved facts", history)
+    # Standard format: system prompt, context prompt, history user, history assistant, current user question
+    assert len(messages) == 5
+    assert messages[0]["role"] == "system"
+    assert messages[1]["role"] == "system"
+    assert "some retrieved facts" in messages[1]["content"]
+    assert messages[2]["role"] == "user"
+    assert messages[2]["content"] == "hi Maya"
+    assert messages[3]["role"] == "assistant"
+    assert messages[3]["content"] == "Hello! How can I help you?"
+    assert messages[4]["role"] == "user"
+    assert messages[4]["content"] == "my second question"
+@patch("services.chat_service.resolve_category")
+@patch("services.chat_service.get_retrieved_context")
+@patch("services.chat_service.build_llm_messages")
+@patch("services.chat_service.generate_llm_response")
+def test_chat_with_history_success(mock_generate, mock_build, mock_get, mock_resolve):
+    """Verify the coordinated end-to-end chat workflow with history works on success."""
+    mock_resolve.return_value = "product"
+    mock_get.return_value = "Toy facts"
+    mock_build.return_value = [{"role": "user", "content": "Query"}]
+    mock_generate.return_value = "LLM answer"
+    history = [{"role": "user", "content": "hi"}]
+    response = chat("What is toy X?", "Auto (AI Router)", history)
+    assert response == "LLM answer"
+    mock_resolve.assert_called_once_with("What is toy X?", "Auto (AI Router)")
+    mock_get.assert_called_once_with("What is toy X?", "product")
+    mock_build.assert_called_once_with("What is toy X?", "Toy facts", history)
+    mock_generate.assert_called_once()