atara57769 committed on
Commit
491caa2
·
1 Parent(s): af355c6

feat: add conversation history support to chat service and update Gradio UI for interactive sessions

README.md CHANGED
@@ -13,25 +13,26 @@ license: mit
13
 
14
  # 🤖 RAG Knowledge Assistant
15
 
16
- Welcome to the **RAG Knowledge Assistant**—a high-performance, modular Retrieval-Augmented Generation (RAG) system tailored for the Playmobil Toy Shop. It utilizes vector search in **Qdrant** and inference with the **Groq LLM API** to answer user queries with precise store policies and product descriptions.
17
 
18
  ---
19
 
20
  ## 🌟 Key Features
21
 
22
  1. **Modular Clean Architecture**: Complete separation of the presentation layer (Gradio UI), data orchestration, vector databases, and modular RAG services.
23
- 2. **Dynamic AI Query Routing**: Automatically routes questions to the correct context category (`policy`, `product`, or `none`) using high-speed classification.
24
- 3. **Optimized RAG Pipeline**: Skips expensive database query retrievals for general conversational questions (`none` classification) to improve latency and reduce cost.
25
- 4. **Metadata Filtered Retrieval**: Restricts vector search using Qdrant index payload checks for precise target matching.
26
- 5. **Semantic Document Chunking**: Implements a robust text splitting algorithm using recursive character chunking with overlap to preserve semantic context across chunk lines.
27
- 6. **100% Offline Testing Suite**: Includes 27 fast unit tests that fully mock external database connections and model downloads.
 
28
 
29
  ---
30
 
31
  ## 📂 Project Structure
32
 
33
  ```bash
34
- ├── app.py # Main entrypoint launching the Gradio interface
35
  ├── config.py # Central configuration & env variable loader
36
  ├── requirements.txt # Python package dependencies
37
  ├── data/ # Raw dataset files
@@ -48,10 +49,11 @@ Welcome to the **RAG Knowledge Assistant**—a high-performance, modular Retriev
48
  ├── services/ # High-level business logic coordinators
49
  │ ├── chat_service.py # Coordinating Chat RAG pipeline flow
50
  │ ├── ingestion_service.py # Document loading, tagging, chunking, and ingesting
 
51
  │ └── llm.py # Groq API client provisioning
52
  └── tests/ # Automated test suite
53
  ├── conftest.py # 100% Offline mocking layer intercepting API clients
54
- ├── test_chat_service.py # Unit tests for the main Chat service
55
  ├── test_ingestion_service.py # Unit tests for parsing, chunking, and db loader
56
  ├── test_retriever.py # Unit tests for vector similarities search
57
  └── test_router.py # Unit tests for query routing classifications
@@ -94,13 +96,17 @@ python app.py
94
  ```
95
  This will:
96
  1. Bootstrap the Qdrant collections and payload keyword indices.
97
- 2. Launch the **Gradio UI** on `http://localhost:7860`.
 
 
 
 
98
 
99
  ---
100
 
101
  ## 🧪 Testing Suite
102
 
103
- The testing suite contains **27 unit tests** covering every functional capability of the assistant.
104
 
105
  ### Offline Testing Architecture
106
  To guarantee fast, deterministic execution and eliminate API cost:
 
13
 
14
  # 🤖 RAG Knowledge Assistant
15
 
16
+ Welcome to the **RAG Knowledge Assistant**—a high-performance, modular Retrieval-Augmented Generation (RAG) system tailored for the Playmobil Toy Shop. It utilizes vector search in **Qdrant** and inference with the **Groq LLM API** to answer user queries with precise store policies and product descriptions, now enhanced with full conversation history and interactive session state.
17
 
18
  ---
19
 
20
  ## 🌟 Key Features
21
 
22
  1. **Modular Clean Architecture**: Complete separation of the presentation layer (Gradio UI), data orchestration, vector databases, and modular RAG services.
23
+ 2. **Conversational Session State & History**: Automatically retains preceding user and assistant messages so the virtual toy assistant (Maya) can understand and answer context-aware follow-up queries.
24
+ 3. **Dynamic AI Query Routing**: Automatically routes questions to the correct context category (`policy`, `product`, or `none`) using high-speed classification.
25
+ 4. **Optimized RAG Pipeline**: Skips expensive database query retrievals for general conversational questions (`none` classification) to improve latency and reduce cost.
26
+ 5. **Metadata Filtered Retrieval**: Restricts vector search using Qdrant index payload checks for precise target matching.
27
+ 6. **Semantic Document Chunking**: Implements a robust text splitting algorithm using recursive character chunking with overlap to preserve semantic context across chunk lines.
28
+ 7. **100% Offline Testing Suite**: Includes **29 fast unit tests** that fully mock external database connections and model downloads.
29
 
30
  ---
31
 
32
  ## 📂 Project Structure
33
 
34
  ```bash
35
+ ├── app.py # Declarative entrypoint establishing the Gradio interface
36
  ├── config.py # Central configuration & env variable loader
37
  ├── requirements.txt # Python package dependencies
38
  ├── data/ # Raw dataset files
 
49
  ├── services/ # High-level business logic coordinators
50
  │ ├── chat_service.py # Coordinating Chat RAG pipeline flow
51
  │ ├── ingestion_service.py # Document loading, tagging, chunking, and ingesting
52
+ │ ├── ui_handlers.py # Gradio callback handlers keeping app.py completely declarative
53
  │ └── llm.py # Groq API client provisioning
54
  └── tests/ # Automated test suite
55
  ├── conftest.py # 100% Offline mocking layer intercepting API clients
56
+ ├── test_chat_service.py # Unit tests for the main Chat service & history flow
57
  ├── test_ingestion_service.py # Unit tests for parsing, chunking, and db loader
58
  ├── test_retriever.py # Unit tests for vector similarities search
59
  └── test_router.py # Unit tests for query routing classifications
 
96
  ```
97
  This will:
98
  1. Bootstrap the Qdrant collections and payload keyword indices.
99
+ 2. Launch the **Gradio Chat UI** on `http://localhost:7860`.
100
+
101
+ ### Redesigned Chat Interface
102
+ * **Interactive Chat Column**: Embeds a native `gr.Chatbot` backed by session state. The input field clears on submit and the message is posted immediately. Submit by pressing **Enter** or clicking **Send 🚀**.
103
+ * **Sidebar Controls & Utilities**: Allows toggling filter modes, checking background document indexing triggers via real-time status updates, or clicking **🗑️ Clear History** to start a new chat session.
104
 
105
  ---
106
 
107
  ## 🧪 Testing Suite
108
 
109
+ The testing suite contains **29 unit tests** covering every functional capability of the assistant.
110
 
111
  ### Offline Testing Architecture
112
  To guarantee fast, deterministic execution and eliminate API cost:
app.py CHANGED
@@ -11,12 +11,16 @@ logger.info("Initializing RAG Knowledge Assistant application components...")
11
 
12
  import gradio as gr
13
  from db.bootstrap import init_database
14
- from services.chat_service import chat
15
- from services.ingestion_service import ingest
16
-
17
  # Initialize database schema/collections if not already done
18
  init_database()
19
 
 
 
 
 
 
 
 
20
  # Gradio Interface Construction
21
  logger.info("Creating Gradio UI structure...")
22
 
@@ -31,37 +35,89 @@ with gr.Blocks(title="Shop Knowledge Assistant") as demo:
31
  gr.Markdown(
32
  """
33
  # 🤖 Shop RAG Knowledge Assistant
34
- *Ask questions about store policies and toy products with high-performance vector search.*
35
  """
36
  )
37
 
38
  with gr.Row():
39
- with gr.Column(scale=3):
40
- q = gr.Textbox(
41
- label="Your Question",
42
- placeholder="e.g., What is the return policy for toy products?",
43
- lines=2
44
- )
45
  with gr.Column(scale=1):
 
46
  doc_type = gr.Dropdown(
47
  choices=["Auto (AI Router)", "Policy", "Product", "No Filter"],
48
  value="Auto (AI Router)",
49
- label="Filter Mode"
 
50
  )
51
 
52
- a = gr.Textbox(
53
- label="Response from Knowledge Assistant",
54
- placeholder="Answer will appear here...",
55
- interactive=False,
56
- lines=6
57
- )
58
-
59
- with gr.Row():
60
- btn = gr.Button("🔍 Ask Assistant", variant="primary")
61
- ingest_btn = gr.Button("🔄 Re-Ingest Data", variant="secondary")
62
 
63
- btn.click(chat, inputs=[q, doc_type], outputs=a)
64
- ingest_btn.click(ingest, outputs=a)
65
 
66
  if __name__ == "__main__":
67
  logger.info("Launching Gradio UI server...")
 
11
 
12
  import gradio as gr
13
  from db.bootstrap import init_database
 
 
 
14
  # Initialize database schema/collections if not already done
15
  init_database()
16
 
17
+ from services.ui_handlers import (
18
+ add_user_message,
19
+ bot_response,
20
+ clear_session,
21
+ run_ingestion
22
+ )
23
+
24
  # Gradio Interface Construction
25
  logger.info("Creating Gradio UI structure...")
26
 
 
35
  gr.Markdown(
36
  """
37
  # 🤖 Shop RAG Knowledge Assistant
38
+ *Ask questions about store policies and toy products with high-performance vector search and conversation history.*
39
  """
40
  )
41
 
42
  with gr.Row():
43
+ # Left Sidebar Column: Configuration & Controls
 
 
 
 
 
44
  with gr.Column(scale=1):
45
+ gr.Markdown("### ⚙️ Settings & Actions")
46
  doc_type = gr.Dropdown(
47
  choices=["Auto (AI Router)", "Policy", "Product", "No Filter"],
48
  value="Auto (AI Router)",
49
+ label="Filter Mode",
50
+ visible=False
51
  )
52
 
53
+ ingest_btn = gr.Button("🔄 Re-Ingest Data", variant="secondary")
54
+ ingest_status = gr.Textbox(
55
+ label="Ingestion Status",
56
+ value="Ready",
57
+ interactive=False
58
+ )
59
+
60
+ clear_btn = gr.Button("🗑️ Clear History", variant="stop")
61
+
62
+ gr.Markdown(
63
+ """
64
+ ### 💡 Try Asking
65
+ - *What is the return policy for toy products?*
66
+ - *Do you have any outdoor adventure sets?*
67
+ - *Compare the Camping Adventure and the Horse Stable.*
68
+ """
69
+ )
70
+
71
+ # Right Chat Area Column: Full Interactive Chatbot
72
+ with gr.Column(scale=3):
73
+ chatbot = gr.Chatbot(
74
+ height=520,
75
+ label="Maya - Personal Play Consultant",
76
+ placeholder="Hi, I'm Maya! Ask me anything about our toys or store policies. I will remember our chat!"
77
+ )
78
+
79
+ with gr.Row():
80
+ q = gr.Textbox(
81
+ placeholder="Ask Maya a question...",
82
+ show_label=False,
83
+ scale=4,
84
+ container=False
85
+ )
86
+ btn = gr.Button("Send 🚀", variant="primary", scale=1)
87
 
88
+ # Event handlers binding
89
+ btn.click(
90
+ add_user_message,
91
+ inputs=[q, chatbot],
92
+ outputs=[q, chatbot],
93
+ queue=False
94
+ ).then(
95
+ bot_response,
96
+ inputs=[doc_type, chatbot],
97
+ outputs=[chatbot]
98
+ )
99
+
100
+ q.submit(
101
+ add_user_message,
102
+ inputs=[q, chatbot],
103
+ outputs=[q, chatbot],
104
+ queue=False
105
+ ).then(
106
+ bot_response,
107
+ inputs=[doc_type, chatbot],
108
+ outputs=[chatbot]
109
+ )
110
+
111
+ clear_btn.click(
112
+ clear_session,
113
+ outputs=[chatbot],
114
+ queue=False
115
+ )
116
+
117
+ ingest_btn.click(
118
+ run_ingestion,
119
+ outputs=[ingest_status]
120
+ )
121
 
122
  if __name__ == "__main__":
123
  logger.info("Launching Gradio UI server...")
services/chat_service.py CHANGED
@@ -28,7 +28,7 @@ def get_retrieved_context(question: str, resolved_type: str) -> str:
28
  docs = retrieve_all(question, resolved_type)
29
  return "\n\n".join(d.page_content for d in docs)
30
 
31
- def build_llm_messages(question: str, context: str) -> list:
32
  """Construct prompt message structure for LLM."""
33
  messages = [
34
  {"role": "system", "content": system_prompt}
@@ -41,6 +41,10 @@ def build_llm_messages(question: str, context: str) -> list:
41
  else:
42
  logger.warning("No context found. Proceeding with system prompt only.")
43
 
 
 
 
 
44
  messages.append({"role": "user", "content": question})
45
  return messages
46
 
@@ -57,13 +61,13 @@ def generate_llm_response(messages: list) -> str:
57
  logger.info("Chat completion completed successfully.")
58
  return answer
59
 
60
- def chat(question: str, doc_type: str = "Auto (AI Router)") -> str:
61
  """Coordinating function to run the full RAG chat workflow."""
62
- logger.info(f"Initiating chat logic for question: '{question}' (selected mode: {doc_type})")
63
  try:
64
  resolved_type = resolve_category(question, doc_type)
65
  context = get_retrieved_context(question, resolved_type)
66
- messages = build_llm_messages(question, context)
67
  return generate_llm_response(messages)
68
  except Exception as e:
69
  logger.error(f"Error in modular chat workflow: {e}", exc_info=True)
 
28
  docs = retrieve_all(question, resolved_type)
29
  return "\n\n".join(d.page_content for d in docs)
30
 
31
+ def build_llm_messages(question: str, context: str, history: list = None) -> list:
32
  """Construct prompt message structure for LLM."""
33
  messages = [
34
  {"role": "system", "content": system_prompt}
 
41
  else:
42
  logger.warning("No context found. Proceeding with system prompt only.")
43
 
44
+ if history:
45
+ for msg in history:
46
+ messages.append({"role": msg["role"], "content": msg["content"]})
47
+
48
  messages.append({"role": "user", "content": question})
49
  return messages
50
 
 
61
  logger.info("Chat completion completed successfully.")
62
  return answer
63
 
64
+ def chat(question: str, doc_type: str = "Auto (AI Router)", history: list = None) -> str:
65
  """Coordinating function to run the full RAG chat workflow."""
66
+ logger.info(f"Initiating chat logic for question: '{question}' (selected mode: {doc_type}, history length: {len(history) if history else 0})")
67
  try:
68
  resolved_type = resolve_category(question, doc_type)
69
  context = get_retrieved_context(question, resolved_type)
70
+ messages = build_llm_messages(question, context, history)
71
  return generate_llm_response(messages)
72
  except Exception as e:
73
  logger.error(f"Error in modular chat workflow: {e}", exc_info=True)
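Review note: the message order that `build_llm_messages` produces after this change can be reproduced in isolation. The snippet below is a minimal approximation, not the project's code. `SYSTEM_PROMPT`, the `Context:` prefix, and the name `build_messages_sketch` are placeholders, because the hunk elides the real prompt and the context formatting. It assumes the retrieved context is sent as a second system message, which is what the new test in this commit asserts.

```python
from typing import Optional

# Placeholder: the real system prompt is not shown in this diff.
SYSTEM_PROMPT = "You are Maya, a toy shop assistant."


def build_messages_sketch(question: str, context: str, history: Optional[list] = None) -> list:
    """Assemble messages as: system prompt, context, prior turns, current question."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    if context:
        # Assumed formatting; the actual context message text is elided in the hunk.
        messages.append({"role": "system", "content": f"Context:\n{context}"})
    # Prior turns go between the context and the new question, oldest first.
    for msg in history or []:
        messages.append({"role": msg["role"], "content": msg["content"]})
    messages.append({"role": "user", "content": question})
    return messages


example = build_messages_sketch(
    "my second question",
    "some retrieved facts",
    [
        {"role": "user", "content": "hi Maya"},
        {"role": "assistant", "content": "Hello! How can I help you?"},
    ],
)
print([m["role"] for m in example])
```

Placing history after the context keeps the turns in chronological order, so the current question is always the last message the model sees.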
services/ui_handlers.py ADDED
@@ -0,0 +1,72 @@
+import logging
+import gradio as gr
+from services.chat_service import chat
+from services.ingestion_service import ingest
+
+logger = logging.getLogger(__name__)
+
+def add_user_message(user_message, history):
+    """Instantly add the user question to the chatbot history list and clear the input field."""
+    if not user_message.strip():
+        return "", history
+    if history is None:
+        history = []
+    # In Gradio 6, gr.Chatbot history format is a list of dicts
+    history.append({"role": "user", "content": user_message})
+    return "", history
+
+def extract_text(content):
+    """Safely extract plain text from Gradio 6 structured MessageDict format."""
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        parts = []
+        for item in content:
+            if isinstance(item, dict) and "text" in item:
+                parts.append(item["text"])
+            elif isinstance(item, str):
+                parts.append(item)
+        return " ".join(parts)
+    if isinstance(content, dict):
+        return content.get("text", str(content))
+    return str(content)
+
+def bot_response(doc_type, history):
+    """Retrieve the chatbot's response from the chat service, injecting prior conversation context."""
+    if not history or history[-1]["role"] != "user":
+        return history
+
+    raw_user_message = history[-1]["content"]
+    user_message = extract_text(raw_user_message)
+
+    # Clean up preceding history for downstream compatibility
+    history_before = []
+    for msg in history[:-1]:
+        history_before.append({
+            "role": msg.get("role"),
+            "content": extract_text(msg.get("content", ""))
+        })
+
+    logger.info(f"Generating bot response for: '{user_message}' with {len(history_before)} historical turns.")
+
+    # Generate bot reply using chat service
+    bot_message = chat(user_message, doc_type, history_before)
+
+    history.append({"role": "assistant", "content": bot_message})
+    return history
+
+def clear_session():
+    """Clear conversation history."""
+    logger.info("Clearing chat session history.")
+    return []
+
+def run_ingestion():
+    """Execute document chunking and vector storage ingestion, providing visual alerts."""
+    logger.info("Triggering manual re-ingestion of knowledge data...")
+    try:
+        result = ingest()
+        gr.Info("Data ingestion completed successfully!")
+        return f"Success: {result}"
+    except Exception as e:
+        gr.Warning(f"Data ingestion failed: {e}")  # gr.Error is an exception and only shows when raised
+        return f"Error: {e}"
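Review note: the two-step event chain wired up in `app.py` (`add_user_message`, then `bot_response`) can be walked through without Gradio. The sketch below is a simplified stand-in rather than the committed code: it copies lists instead of mutating them, takes the chat function as a `chat_fn` parameter, and uses a hypothetical `fake_chat` in place of `services.chat_service.chat`.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]


def add_user_message(user_message: str, history: Optional[List[Message]]) -> Tuple[str, List[Message]]:
    """Step 1: append the user turn; the empty string clears the textbox."""
    history = list(history or [])
    if not user_message.strip():
        return "", history
    history.append({"role": "user", "content": user_message})
    return "", history


def bot_response(doc_type: str, history: List[Message], chat_fn: Callable[..., str]) -> List[Message]:
    """Step 2: answer the last user turn, passing the earlier turns as history."""
    if not history or history[-1]["role"] != "user":
        return history
    *earlier, last = history
    reply = chat_fn(last["content"], doc_type, earlier)
    return history + [{"role": "assistant", "content": reply}]


def fake_chat(question: str, doc_type: str, history: List[Message]) -> str:
    # Hypothetical stand-in for the real chat service; no retrieval or LLM call.
    return f"echo: {question} ({len(history)} earlier turns)"


# Simulate two consecutive submissions in one session.
textbox, session = add_user_message("Do you sell castles?", [])
session = bot_response("Auto (AI Router)", session, fake_chat)
textbox, session = add_user_message("How much is it?", session)
session = bot_response("Auto (AI Router)", session, fake_chat)

for turn in session:
    print(f'{turn["role"]}: {turn["content"]}')
```

Splitting the work into two callbacks lets the user's message appear in the chat window before the slower model call finishes, which is why the first step in `app.py` runs with `queue=False`.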
tests/test_chat_service.py CHANGED
@@ -103,7 +103,7 @@ def test_chat_success(mock_generate, mock_build, mock_get, mock_resolve):
103
  assert response == "LLM answer"
104
  mock_resolve.assert_called_once_with("What is toy X?", "Auto (AI Router)")
105
  mock_get.assert_called_once_with("What is toy X?", "product")
106
- mock_build.assert_called_once_with("What is toy X?", "Toy facts")
107
  mock_generate.assert_called_once()
108
 
109
  def test_chat_exception_handling():
@@ -112,3 +112,43 @@ def test_chat_exception_handling():
112
  response = chat("any question", "Auto (AI Router)")
113
  assert "An error occurred while formulating a response:" in response
114
  assert "Database crash" in response

103
  assert response == "LLM answer"
104
  mock_resolve.assert_called_once_with("What is toy X?", "Auto (AI Router)")
105
  mock_get.assert_called_once_with("What is toy X?", "product")
106
+ mock_build.assert_called_once_with("What is toy X?", "Toy facts", None)
107
  mock_generate.assert_called_once()
108
 
109
  def test_chat_exception_handling():
 
112
  response = chat("any question", "Auto (AI Router)")
113
  assert "An error occurred while formulating a response:" in response
114
  assert "Database crash" in response
+
+def test_build_llm_messages_with_history():
+    """Verify message context structure layout when conversation history is provided."""
+    history = [
+        {"role": "user", "content": "hi Maya"},
+        {"role": "assistant", "content": "Hello! How can I help you?"}
+    ]
+    messages = build_llm_messages("my second question", "some retrieved facts", history)
+
+    # Standard format: system prompt, context prompt, history user, history assistant, current user question
+    assert len(messages) == 5
+    assert messages[0]["role"] == "system"
+    assert messages[1]["role"] == "system"
+    assert "some retrieved facts" in messages[1]["content"]
+    assert messages[2]["role"] == "user"
+    assert messages[2]["content"] == "hi Maya"
+    assert messages[3]["role"] == "assistant"
+    assert messages[3]["content"] == "Hello! How can I help you?"
+    assert messages[4]["role"] == "user"
+    assert messages[4]["content"] == "my second question"
+
+@patch("services.chat_service.resolve_category")
+@patch("services.chat_service.get_retrieved_context")
+@patch("services.chat_service.build_llm_messages")
+@patch("services.chat_service.generate_llm_response")
+def test_chat_with_history_success(mock_generate, mock_build, mock_get, mock_resolve):
+    """Verify the coordinated end-to-end chat workflow with history works on success."""
+    mock_resolve.return_value = "product"
+    mock_get.return_value = "Toy facts"
+    mock_build.return_value = [{"role": "user", "content": "Query"}]
+    mock_generate.return_value = "LLM answer"
+
+    history = [{"role": "user", "content": "hi"}]
+    response = chat("What is toy X?", "Auto (AI Router)", history)
+
+    assert response == "LLM answer"
+    mock_resolve.assert_called_once_with("What is toy X?", "Auto (AI Router)")
+    mock_get.assert_called_once_with("What is toy X?", "product")
+    mock_build.assert_called_once_with("What is toy X?", "Toy facts", history)
+    mock_generate.assert_called_once()
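Review note: the diff shown here adds no tests for `services/ui_handlers.py`. A possible starting point is sketched below for `extract_text`. It uses a local copy of the helper so the snippet runs standalone; in the repository the function would presumably be imported from `services.ui_handlers` instead. The `{"type": "text", ...}` item shape is an illustrative assumption about structured chat content, not something this commit documents.

```python
def extract_text(content):
    """Local copy of the helper from services/ui_handlers.py, for a standalone run."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for item in content:
            if isinstance(item, dict) and "text" in item:
                parts.append(item["text"])
            elif isinstance(item, str):
                parts.append(item)
        return " ".join(parts)
    if isinstance(content, dict):
        return content.get("text", str(content))
    return str(content)


def test_extract_text_handles_known_shapes():
    # Plain strings pass through unchanged.
    assert extract_text("plain") == "plain"
    # Lists mix structured items and bare strings; parts are joined with a space.
    assert extract_text([{"type": "text", "text": "hi"}, "there"]) == "hi there"
    # A single dict yields its "text" value.
    assert extract_text({"text": "solo"}) == "solo"
    # Anything else falls back to str().
    assert extract_text(42) == "42"
    # List items without a "text" key are skipped.
    assert extract_text([{"type": "image"}]) == ""


test_extract_text_handles_known_shapes()
```

Under pytest the final direct call would be dropped, since the runner collects the `test_` function on its own.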