robertokostov-ej committed
Commit 1060b65 · 1 Parent(s): 193b5a1

Update space

Files changed (4):
  1. README.md +4 -3
  2. app.py +871 -59
  3. evaluation.py +333 -0
  4. requirements.txt +29 -1
README.md CHANGED
@@ -1,12 +1,13 @@
 ---
-title: Gbd
+title: Recipe Thesis Agent
 emoji: 💬
 colorFrom: yellow
 colorTo: purple
 sdk: gradio
-sdk_version: 5.0.1
+sdk_version: 5.23.2
 app_file: app.py
 pinned: false
+python_version: 3.12
 ---

 An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
app.py CHANGED
@@ -1,64 +1,876 @@
 import gradio as gr
-from huggingface_hub import InferenceClient
-
-"""
-For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
-"""
-client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")
-
-
-def respond(
-    message,
-    history: list[tuple[str, str]],
-    system_message,
-    max_tokens,
-    temperature,
-    top_p,
-):
-    messages = [{"role": "system", "content": system_message}]
-
-    for val in history:
-        if val[0]:
-            messages.append({"role": "user", "content": val[0]})
-        if val[1]:
-            messages.append({"role": "assistant", "content": val[1]})
-
-    messages.append({"role": "user", "content": message})
-
-    response = ""
-
-    for message in client.chat_completion(
-        messages,
-        max_tokens=max_tokens,
-        stream=True,
-        temperature=temperature,
-        top_p=top_p,
-    ):
-        token = message.choices[0].delta.content
-
-        response += token
-        yield response
-
-
-"""
-For information on how to customize the ChatInterface, peruse the gradio docs: https://www.gradio.app/docs/chatinterface
-"""
-demo = gr.ChatInterface(
-    respond,
-    additional_inputs=[
-        gr.Textbox(value="You are a friendly Chatbot.", label="System message"),
-        gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens"),
-        gr.Slider(minimum=0.1, maximum=4.0, value=0.7, step=0.1, label="Temperature"),
-        gr.Slider(
-            minimum=0.1,
-            maximum=1.0,
-            value=0.95,
-            step=0.05,
-            label="Top-p (nucleus sampling)",
-        ),
-    ],
-)
-
-
 if __name__ == "__main__":
-    demo.launch()
+# -*- coding: utf-8 -*-
+import os
+import pandas as pd
+import time
+import logging
+from typing import Optional, List, Dict  # Keep typing
+# from functools import lru_cache  # Keep commented out
+import random
+import shutil
+import re  # Used for parsing recipe directions
+
+# --- LangChain Imports ---
+# Core
+from langchain_core.documents import Document
+from langchain_core.prompts import PromptTemplate
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.runnables import RunnablePassthrough
+# LLMs (using Google GenAI wrapper)
+from langchain_google_genai import ChatGoogleGenerativeAI
+# Vector Stores / Embeddings
+from langchain_huggingface import HuggingFaceEmbeddings
+from langchain_community.vectorstores import Chroma
+# --- Other Imports ---
+from datasets import load_dataset
+import pyarrow  # Keep explicit import
+
+# Attempt to load python-dotenv for easier local API key management (optional)
+try:
+    from dotenv import load_dotenv
+    load_dotenv()  # Load variables from .env file if it exists
+    DOTENV_AVAILABLE = True
+except ImportError:
+    DOTENV_AVAILABLE = False
+
+# ==============================================================================
+# Logging Configuration
+# ==============================================================================
+logging.basicConfig(
+    level=logging.INFO,  # INFO level is usually sufficient for running
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger('recipe_system')
+
+# ==============================================================================
+# Conditional Imports & Feature Flags
+# ==============================================================================
+# --- Vector Search Imports Check ---
+VECTOR_IMPORTS_AVAILABLE = False
+try:
+    if HuggingFaceEmbeddings and Chroma and Document and load_dataset and pyarrow:
+        VECTOR_IMPORTS_AVAILABLE = True
+        logger.info("Vector search dependencies check: OK.")
+except NameError:
+    logger.error("Import check failed for vector search dependencies.")
+    VECTOR_IMPORTS_AVAILABLE = False
+# --- LLM (LangChain Google GenAI) Imports Check ---
+LANGCHAIN_LLM_AVAILABLE = False
+GOOGLE_API_KEY = None
+try:
+    if ChatGoogleGenerativeAI and PromptTemplate and StrOutputParser:
+        GOOGLE_API_KEY = os.environ.get('GOOGLE_API_KEY')
+        if not GOOGLE_API_KEY:
+            logger.warning("GOOGLE_API_KEY environment variable not found.")
+            if DOTENV_AVAILABLE: logger.info("Checked environment and .env file (if present).")
+            else: logger.info("Checked environment variables.")
+            LANGCHAIN_LLM_AVAILABLE = False
+        else:
+            logger.info("GOOGLE_API_KEY found. LangChain LLM dependencies appear available.")
+            LANGCHAIN_LLM_AVAILABLE = True
+except NameError:
+    logger.error("Import check failed for LangChain LLM (Gemini) components.")
+    logger.error("<<<<< Please ensure 'langchain-google-genai' is installed (in requirements.txt) >>>>>")
+    LANGCHAIN_LLM_AVAILABLE = False
+
+if not VECTOR_IMPORTS_AVAILABLE: logger.warning("Vector database imports failed - vector search disabled.")
+if not LANGCHAIN_LLM_AVAILABLE: logger.warning("LangChain LLM setup incomplete - LLM features disabled.")
+# --- End Import Check ---
+
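The feature-flag pattern above gates vector search and LLM features behind import checks, so the Space degrades gracefully when a dependency is missing. A minimal standalone sketch of the same idea, using `importlib.util.find_spec` instead of `try`/`except NameError` (the helper name `deps_available` and the module names are illustrative, not from the commit):

```python
import importlib.util

def deps_available(*module_names: str) -> bool:
    """Return True only if every named module can be found for import."""
    return all(importlib.util.find_spec(name) is not None for name in module_names)

# Flags mirror the diff's VECTOR_IMPORTS_AVAILABLE / LANGCHAIN_LLM_AVAILABLE idea,
# with stdlib modules standing in for chromadb / datasets / langchain-google-genai.
VECTOR_OK = deps_available("json", "re")
LLM_OK = deps_available("no_such_module_abc123")  # missing dep -> feature disabled
```

Checking `find_spec` avoids actually importing heavy packages just to set a flag, which the `try`/`except` version in the diff necessarily does.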
+# ==============================================================================
+# Constants
+# ==============================================================================
+VECTOR_DB_PATH = "./recipe_vectordb"  # Example path for persistence (not implemented yet)
+DATASET_NAME = "corbt/all-recipes"
+RECIPES_CSV_PATH = "recipes_data.csv"
+GEMINI_MODEL_NAME = "models/gemini-1.5-flash-latest"
+
+# ==============================================================================
+# Recipe Recommendation System Class (Includes Agentic Routing)
+# ==============================================================================
+class RecipeRecommendationSystem:
+    """
+    Manages recipe data loading, parsing, indexing (vector or text), searching,
+    and optional LLM query expansion & RAG using LangChain. Includes enhanced logging
+    and minimal agentic routing.
+    """
+    def __init__(self):
+        self.is_initialized = False
+        self.initialization_error = None
+        self.embeddings = None
+        self.vector_db = None
+        self.recipes_df = None
+        self.sample_size = 1000
+        self.backup_recipes = self._get_backup_recipes()
+        self.lc_llm: Optional[ChatGoogleGenerativeAI] = None
+        self.use_vector_search = VECTOR_IMPORTS_AVAILABLE
+        self.use_llm = LANGCHAIN_LLM_AVAILABLE
+        logger.info(f"System instance created. Vector search: {self.use_vector_search}, LLM (LangChain Gemini): {self.use_llm}")
+
+    def _load_llm(self):
+        if not self.use_llm:
+            logger.info("LLM features disabled or dependencies missing.")
+            return False
+        if self.lc_llm:
+            logger.info("LangChain LLM wrapper already configured.")
+            return True
+        try:
+            logger.info(f"Configuring LangChain Gemini LLM wrapper for model: {GEMINI_MODEL_NAME}...")
+            self.lc_llm = ChatGoogleGenerativeAI(
+                model=GEMINI_MODEL_NAME, google_api_key=GOOGLE_API_KEY, temperature=0.7
+            )
+            logger.info("LangChain Gemini LLM wrapper configured successfully.")
+            return True
+        except Exception as e:
+            logger.exception(f"Error configuring LangChain Gemini LLM wrapper: {e}")
+            self.lc_llm = None
+            self.use_llm = False
+            self.initialization_error = (self.initialization_error or "") + f" | LangChain LLM Config Failed: {e}"
+            return False
+
+    def initialize(self, force_reload=False, sample_size=1000):
+        start_time = time.time()
+        logger.info(f"Initialize called. Force reload: {force_reload}, Sample size: {sample_size}")
+        llm_ready = not self.use_llm or (self.lc_llm is not None)
+
+        if (self.is_initialized and not force_reload and self.sample_size == sample_size and
+                self.recipes_df is not None and not self.recipes_df.empty and llm_ready):
+            search_mode_ok = (self.use_vector_search and self.vector_db is not None) or \
+                             (not self.use_vector_search and self.vector_db is None)
+            if search_mode_ok:
+                logger.info(f"System already initialized ({'Vector' if self.use_vector_search else 'Text'} Search, LLM: {llm_ready}). Skipping.")
+                return True
+
+        self.sample_size = sample_size
+        logger.info(f"{'Reloading' if self.is_initialized or force_reload else 'Initializing'} system...")
+        self.is_initialized = False
+        self.initialization_error = None
+        self.vector_db = None  # Reset DB on initialize/reload
+        self.recipes_df = None  # Reset DF
+        if force_reload: self.lc_llm = None  # Reset LLM wrapper too if forcing
+
+        llm_load_success = self._load_llm()
+        if not llm_load_success: logger.warning("LLM configuration failed. LLM features will be disabled.")
+
+        should_attempt_vector = VECTOR_IMPORTS_AVAILABLE
+        init_success = False
+        if should_attempt_vector:
+            logger.info("Attempting vector search initialization...")
+            # Note: Persistence logic would go here - check if VECTOR_DB_PATH exists and load if !force_reload
+            create_success = self._create_new_db()  # Currently always creates new
+            if create_success:
+                logger.info("Vector DB creation successful.")
+                self.use_vector_search = True
+                init_success = True
+            else:
+                error_msg = self.initialization_error or "DB creation failed"
+                logger.error(f"{error_msg}. Falling back to text search.")
+                self.recipes_df = pd.DataFrame(self.backup_recipes).reset_index()
+                self.use_vector_search = False; self.vector_db = None
+                if self.recipes_df is not None and not self.recipes_df.empty:
+                    logger.info(f"Loaded {len(self.recipes_df)} backup recipes for fallback.")
+                    init_success = True
+                else: logger.error("Failed to load backup recipes during fallback.")
+        else:  # Fallback if vector imports missing
+            logger.info("Vector dependencies unavailable. Initializing with text search fallback.")
+            self.recipes_df = pd.DataFrame(self.backup_recipes).reset_index()
+            self.use_vector_search = False; self.vector_db = None
+            if self.recipes_df is not None and not self.recipes_df.empty:
+                logger.info(f"Loaded {len(self.recipes_df)} backup recipes.")
+                init_success = True
+            else: logger.error("Failed to load backup recipes.")
+
+        elapsed = time.time() - start_time
+        if init_success and self.recipes_df is not None and not self.recipes_df.empty:
+            self.is_initialized = True
+            search_type = "vector" if self.use_vector_search else "text (fallback)"
+            llm_status = "active" if self.use_llm and self.lc_llm else "inactive"
+            logger.info(f"Init finished in {elapsed:.2f}s. Search: {search_type}. LLM: {llm_status}. Recipes: {len(self.recipes_df)}.")
+            return True
+        else:  # Handle overall init failure
+            if not self.initialization_error: self.initialization_error = "Init failed (unknown reason)"
+            logger.error(f"Initialization failed: {self.initialization_error}")
+            self.is_initialized = False
+            return False
+
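The branching in `initialize` above reduces to one decision: use vector search only when the dependencies are present and the DB was actually built, otherwise fall back to text search over backup recipes. A condensed, illustrative-only sketch of that decision (the function name is hypothetical, not part of the commit):

```python
def pick_search_mode(vector_deps_ok: bool, db_created: bool) -> str:
    """Condensed version of initialize()'s branching: try vector first,
    otherwise fall back to text search over the backup recipes."""
    if vector_deps_ok and db_created:
        return "vector"
    return "text (fallback)"
```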
+    def _create_new_db(self):
+        """Creates vector DB and populates self.recipes_df. Includes enhanced logging."""
+        try:
+            # --- 1. Load Raw Data ---
+            logger.info(f"Loading dataset '{DATASET_NAME}' from Hugging Face...")
+            try:
+                # Consider adding cache_dir argument if needed: cache_dir="./hf_cache"
+                dataset = load_dataset(DATASET_NAME, split='train')
+                recipes_raw_df = dataset.to_pandas()
+                logger.info(f"Loaded and converted {len(recipes_raw_df)} recipes.")
+                assert 'input' in recipes_raw_df.columns, "Missing 'input' column"
+            except Exception as e:
+                logger.exception(f"Dataset load failed: {e}")
+                self.initialization_error = f"Dataset load failed: {e}"
+                return False
+
+            # --- 2. Sample Data ---
+            logger.debug("Checking sample size...")
+            if 0 < self.sample_size < len(recipes_raw_df):  # Ensure sample_size is positive
+                logger.info(f"Sampling {self.sample_size} recipes...")
+                recipes_sampled_df = recipes_raw_df.sample(
+                    self.sample_size, random_state=42
+                ).reset_index(drop=True).copy()
+            else:
+                logger.info(f"Using all {len(recipes_raw_df)} loaded recipes (or invalid sample size).")
+                recipes_sampled_df = recipes_raw_df.reset_index(drop=True).copy()
+            logger.debug(f"DataFrame shape for processing: {recipes_sampled_df.shape}")
+
+            # --- 3. Initialize Embeddings ---
+            if not self.embeddings:
+                logger.info("Initializing embeddings model (sentence-transformers/all-MiniLM-L6-v2)...")
+                # Consider adding cache_folder argument if needed
+                self.embeddings = HuggingFaceEmbeddings(
+                    model_name="sentence-transformers/all-MiniLM-L6-v2"
+                )
+                logger.info("Embeddings model initialized.")
+            else:
+                logger.info("Embeddings model already initialized.")
+
+            # --- 4. Parse 'input' Column & Create LangChain Documents ---
+            logger.info(f"Starting parsing loop for {len(recipes_sampled_df)} recipes...")
+            documents: List[Document] = []
+            processed_data = []
+            skipped = 0
+            log_interval = max(1, len(recipes_sampled_df) // 10)  # Log more frequently if needed
+
+            for idx, row in recipes_sampled_df.iterrows():
+                if (idx + 1) % log_interval == 0:
+                    logger.debug(f"Parsing progress: {idx + 1}/{len(recipes_sampled_df)}")
+                try:
+                    inp = row.get('input', '')
+                    lines = [ln.strip() for ln in inp.splitlines()] if isinstance(inp, str) else []
+                    if not lines: skipped += 1; continue
+                    title = lines[0] if lines else f'Untitled Recipe {idx}'
+                    ingreds = []; directs = []; in_i = False; in_d = False  # Reset flags for each recipe
+
+                    for line in lines[1:]:
+                        line_strip = line.strip()
+                        line_lower = line_strip.lower()
+                        # State machine for parsing sections
+                        if line_lower == 'ingredients:': in_i = True; in_d = False; continue
+                        elif line_lower == 'directions:': in_d = True; in_i = False; continue
+                        # If inside a section, append
+                        if in_i: ingreds.append(line_strip.lstrip('- '))
+                        elif in_d: directs.append(re.sub(r"^\s*[\d\W]+\.?\s*", "", line_strip))  # Clean step numbers/bullets
+                        # Don't reset flags on empty lines within sections
+
+                    i_str = "\n".join(ingreds).strip()
+                    d_str = "\n".join(directs).strip()
+
+                    if not title or not i_str or not d_str: skipped += 1; continue  # Skip if essential parts missing
+
+                    processed_data.append({
+                        'title': title, 'ingredients': i_str, 'instructions': d_str,
+                        'description': '', 'rating': None  # Add placeholders
+                    })
+                    meta = {"doc_id": int(idx), "title": title, "ingredients": i_str, "instructions": d_str}
+                    # Create document content combining key fields
+                    doc_content = f"Title: {title}\n\nIngredients:\n{i_str}\n\nInstructions:\n{d_str}"
+                    documents.append(Document(page_content=doc_content, metadata=meta))
+                except Exception as e:
+                    logger.warning(f"Error parsing row index {idx}: {e}. Title: '{title if 'title' in locals() else 'N/A'}'. Skipping.", exc_info=False)
+                    skipped += 1
+
+            logger.info(f"Parsing complete. Docs created: {len(documents)}, Data rows: {len(processed_data)}, Skipped: {skipped}")
+            if not documents:
+                self.initialization_error = "No valid documents were created after parsing."
+                return False
+
+            # --- 5. Store Parsed DataFrame & Save CSV ---
+            self.recipes_df = pd.DataFrame(processed_data)
+            if self.recipes_df.empty:
+                self.initialization_error = "Parsed DataFrame is empty after processing."
+                return False
+            try:
+                logger.info(f"Saving {len(self.recipes_df)} parsed recipes to CSV: {RECIPES_CSV_PATH}...")
+                self.recipes_df.to_csv(RECIPES_CSV_PATH, index=False)
+                logger.info("CSV saved.")
+            except Exception as e:
+                logger.warning(f"Could not save parsed recipes CSV: {e}")
+
+            # --- 6. Create IN-MEMORY Chroma DB ---
+            logger.info(f"Creating Chroma DB with {len(documents)} documents...")
+            try:
+                # Persistence logic would involve using persist_directory and Chroma(persist_directory=...) on reload
+                self.vector_db = Chroma.from_documents(
+                    documents=documents,
+                    embedding=self.embeddings
+                )
+                logger.info("Chroma DB created successfully.")
+                if self.recipes_df is None or self.recipes_df.empty:  # Sanity check
+                    raise RuntimeError("Critical Error: recipes_df lost after DB creation")
+                return True
+            except Exception as e:
+                logger.exception(f"Chroma DB creation failed: {e}")
+                self.initialization_error = f"Chroma DB creation failed: {e}"
+                self.vector_db = None
+                return False
+        except Exception as e:  # Catch any other unexpected error
+            logger.exception(f"Outer error in _create_new_db: {e}")
+            self.initialization_error = f"Outer DB creation error: {str(e)}"
+            self.recipes_df = None; self.vector_db = None
+            return False
+
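The parsing loop in `_create_new_db` is a small state machine over the dataset's `Title / Ingredients: / Directions:` block format. A close (but not byte-identical) standalone sketch, using the same section markers and the same step-number cleanup regex; the helper name `parse_recipe_text` is illustrative, not from the commit:

```python
import re

def parse_recipe_text(raw: str):
    """Parse a 'Title / Ingredients: / Directions:' block into a dict,
    mirroring the state machine in _create_new_db."""
    lines = [ln.strip() for ln in raw.splitlines()]
    if not lines:
        return None
    title, ingredients, directions = lines[0], [], []
    in_i = in_d = False
    for line in lines[1:]:
        low = line.lower()
        if low == 'ingredients:':
            in_i, in_d = True, False
            continue
        if low == 'directions:':
            in_d, in_i = True, False
            continue
        if in_i and line:
            ingredients.append(line.lstrip('- '))       # drop leading bullet
        elif in_d and line:
            # Same cleanup the diff applies: strip leading step numbers/bullets
            directions.append(re.sub(r"^\s*[\d\W]+\.?\s*", "", line))
    if not (title and ingredients and directions):
        return None  # skip recipes missing an essential section
    return {'title': title, 'ingredients': ingredients, 'instructions': directions}

sample = "Pancakes\nIngredients:\n- 2 eggs\n- 1 cup flour\nDirections:\n1. Mix.\n2. Fry."
parsed = parse_recipe_text(sample)
```

Note the regex `^\s*[\d\W]+\.?\s*` is aggressive: it strips any leading run of digits and non-word characters, so it removes `1. `, `- `, and `* ` prefixes alike.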
+    def _expand_query_with_llm(self, query: str) -> Optional[str]:
+        """Uses LCEL chain with Gemini to expand search query."""
+        if not self.use_llm or not self.lc_llm: return None
+        start_time = time.time(); logger.info(f"LCEL Chain: Expanding query: '{query}'")
+        try:
+            template = "Expand this recipe search query with related terms: {query}"
+            prompt = PromptTemplate.from_template(template)
+            output_parser = StrOutputParser()
+            expansion_chain = prompt | self.lc_llm | output_parser
+            expanded_query = expansion_chain.invoke({"query": query})
+            elapsed = time.time() - start_time
+            logger.info(f"LCEL Chain: Original: '{query}' -> Expanded: '{expanded_query}' ({elapsed:.2f}s)")
+            if not expanded_query or expanded_query.lower().strip() == query.lower().strip():
+                logger.info("LCEL expansion resulted in empty or identical query."); return None
+            return expanded_query.strip()
+        except Exception as e: logger.exception(f"LCEL expansion error: {e}"); return None
+
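`_expand_query_with_llm` only accepts an expansion when it is non-empty and differs (case-insensitively) from the original query; otherwise the caller keeps the original. That guard is pure string logic and can be sketched without any LLM (the helper name `accept_expansion` is hypothetical):

```python
from typing import Optional

def accept_expansion(original: str, expanded: Optional[str]) -> Optional[str]:
    """Mirror of the guard in _expand_query_with_llm: reject empty or
    case-insensitively identical expansions; otherwise return it stripped."""
    if not expanded:
        return None
    if expanded.lower().strip() == original.lower().strip():
        return None
    return expanded.strip()
```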
+    def _get_routing_decision(self, query: str) -> str:
+        """Uses the LLM to decide whether a query is better for RAG or Text Search."""
+        if not self.use_llm or not self.lc_llm:
+            logger.warning("Router: LLM off. Defaulting to RAG.")
+            return "RAG"
+        logger.info(f"Router: Getting decision for query: '{query}'")
+        start_time = time.time()
+        routing_template = """You are a request router for a recipe system. Determine the best approach:
+1. 'RAG': For specific questions about recipes (ingredients, instructions, properties like "is it vegetarian?").
+2. 'TEXT_SEARCH': For general searches by name or keywords (e.g., "chocolate chip cookies", "tomato soup").
+Respond ONLY 'RAG' or 'TEXT_SEARCH'. Query: {query} Approach:"""
+        routing_prompt = PromptTemplate.from_template(routing_template)
+        output_parser = StrOutputParser()
+        try:
+            routing_chain = routing_prompt | self.lc_llm | output_parser
+            decision = routing_chain.invoke({"query": query}).strip().upper()
+            elapsed = time.time() - start_time
+            if decision in ["RAG", "TEXT_SEARCH"]: logger.info(f"Router: Decision '{decision}' ({elapsed:.2f}s)."); return decision
+            else: logger.warning(f"Router: Bad response '{decision}'. Defaulting RAG."); return "RAG"
+        except Exception as e: logger.exception(f"Router error: {e}. Defaulting RAG."); return "RAG"
+
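Because the router's output comes from an LLM, `_get_routing_decision` normalizes it (strip, uppercase) and falls back to `"RAG"` on anything unexpected. That validation step can be isolated as a small pure function (hypothetical helper name, same semantics):

```python
def normalize_routing_decision(raw: str, default: str = "RAG") -> str:
    """Mirror of _get_routing_decision's validation: strip and uppercase the
    LLM reply, and fall back to the default on anything unexpected."""
    decision = raw.strip().upper()
    return decision if decision in ("RAG", "TEXT_SEARCH") else default
```

Defaulting to RAG is a deliberate choice here: the RAG path in `search_recipes` already falls back to text search when retrieval fails, so a wrong default still produces an answer.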
+    def search_recipes(self, query, num_results=3):
+        """Searches recipes using LLM-routed approach."""
+        log_prefix = f"Search(Q='{query}', N={num_results})"
+        logger.info(f"{log_prefix}: Called. Init: {self.is_initialized}...")
+        if not self.is_initialized: return "System not initialized."
+        if self.recipes_df is None or self.recipes_df.empty: return "No recipe data."
+
+        original_query = query; search_query = query
+        expanded_query_used = False; llm_expansion_note = ""
+
+        # Optional Expansion
+        if self.use_llm:
+            expanded_query = self._expand_query_with_llm(original_query)
+            if expanded_query:
+                search_query = expanded_query; expanded_query_used = True
+                llm_expansion_note = f" (LLM expanded to: \"{search_query}\")"
+                logger.info(f"{log_prefix}: Using expanded query '{search_query}'")
+            else: logger.info(f"{log_prefix}: Using original query '{original_query}'")
+        else: logger.info(f"{log_prefix}: LLM expansion off. Using original query.")
+
+        search_start = time.time(); final_result = ""; search_method_used = "unknown"
+
+        # Routing
+        routing_decision = self._get_routing_decision(original_query)
+        logger.info(f"{log_prefix}: Router path: {routing_decision}")
+
+        try:
+            # --- RAG Path ---
+            if routing_decision == "RAG":
+                search_method_used = "vector (RAG chosen)"
+                if self.use_vector_search and self.vector_db is not None:
+                    try:  # Attempt RAG
+                        logger.info(f"{log_prefix}: Retrieving docs (Q: '{search_query}')")
+                        retriever = self.vector_db.as_retriever(search_kwargs={'k': num_results})
+                        retrieved_docs: List[Document] = retriever.invoke(search_query)
+                        logger.info(f"{log_prefix}: Found {len(retrieved_docs)} docs.")
+                        if retrieved_docs and self.lc_llm:
+                            logger.info(f"{log_prefix}: Running RAG chain.")
+                            def format_docs(docs): return "\n\n---\n\n".join([f"Doc {i+1} (Title: {doc.metadata.get('title', 'N/A')}):\n{doc.page_content}" for i, doc in enumerate(docs)])
+                            context_string = format_docs(retrieved_docs)
+                            # Refined RAG prompt for better instructions
+                            rag_template_qa = """You are a helpful Recipe Assistant. Your goal is to answer the user's query based *only* on the provided recipe Context. Be factual and concise. Follow these specific instructions:
+1. **Analyze the Query:** Is it a specific question about a recipe (e.g., "how long to bake", "ingredients for X", "is Y vegetarian?") or a general search term (e.g., "chicken soup", "easy dessert")?
+2. **Answer Based ONLY on Context:**
+    * If the query is a specific question AND the Context contains a clear answer, provide that answer directly.
+    * If the query is a specific question BUT the Context contains relevant recipes but NOT the specific answer, state what information IS available in the context related to the question (e.g., "The context includes a recipe for Chocolate Chip Cookies, but doesn't specify the exact baking temperature needed."). DO NOT GUESS or add external knowledge.
+    * If the query is a specific question BUT the retrieved Context seems completely irrelevant, state that you couldn't find relevant information *in the provided documents* to answer the question.
+    * If the query seems like a general search term AND the Context contains relevant recipes, present the recipes found clearly. For each recipe, include: Title, Ingredients, and Instructions. Format them nicely using Markdown.
+    * If the query is a general search term BUT no relevant recipes are found in the Context, state that no matching recipes were found in the provided documents.
+3. **Formatting:** Use Markdown for readability (like bullet points for ingredients, numbered steps for instructions).
+
+Context:
+{context}
+
+Query: {query}
+Answer:"""
+                            rag_prompt = PromptTemplate.from_template(rag_template_qa)
+                            # Setup RAG chain
+                            rag_chain = (
+                                {"context": lambda x: context_string, "query": RunnablePassthrough()}
+                                | rag_prompt
+                                | self.lc_llm
+                                | StrOutputParser()
+                            )
+                            logger.info(f"{log_prefix}: Invoking RAG chain with original query: '{original_query}'")
+                            final_result = rag_chain.invoke(original_query)  # Use original query as the question for the LLM
+                            search_method_used = "vector (RAG executed)"
+                        elif not retrieved_docs:
+                            logger.info(f"{log_prefix}: 0 docs found for RAG. Falling back to text search.")
+                            final_result = ""  # Trigger fallback
+                        else:  # Docs found, but LLM is inactive
+                            logger.warning(f"{log_prefix}: Docs found, but LLM inactive. Cannot RAG. Falling back to text search.")
+                            final_result = ""  # Trigger fallback
+                    except Exception as rag_error:
+                        logger.exception(f"{log_prefix}: Vector retrieval or RAG chain error: {rag_error}")
+                        final_result = ""  # Trigger fallback on error
+                else:  # RAG path chosen, but vector search is disabled or DB not available
+                    logger.warning(f"{log_prefix}: RAG path chosen, but vector search is disabled or DB failed. Falling back to text search.")
+                    final_result = ""  # Trigger fallback
+
+                # Fallback within RAG path if RAG failed or produced no result
+                if not final_result:
+                    logger.info(f"{log_prefix}: Falling back to text search (RAG path failed or yielded no result).")
+                    search_method_used = "text (RAG fallback)"
+                    final_result = self._execute_text_search_and_format(original_query, search_query, num_results, llm_expansion_note, is_fallback=True)
+
+            # --- Text Search Path (Chosen by Router) ---
+            elif routing_decision == "TEXT_SEARCH":
+                search_method_used = "text (router chosen)"
+                logger.info(f"{log_prefix}: Executing text search directly based on router decision.")
+                final_result = self._execute_text_search_and_format(original_query, search_query, num_results, llm_expansion_note, is_fallback=False)
+
+            # --- Handle unexpected router decision ---
+            else:
+                logger.error(f"{log_prefix}: Invalid router decision '{routing_decision}'. Critical error.")
+                final_result = f"❌ Internal Error: Invalid routing decision '{routing_decision}'."
+
+            # --- Final Logging and Return ---
+            search_elapsed = time.time() - search_start
+            logger.info(f"{log_prefix}: Completed via '{search_method_used}' path in {search_elapsed:.2f}s.")
+
+            # --- MODIFICATION START ---
+            # Prepare the main response string
+            final_output_string = final_result if final_result else f"😕 No results found for \"{original_query}\"."
+
+            # Create the debug string (add extra newlines for separation)
+            # Use markdown code block for clarity
+            debug_info = f"\n\n---\n`DEBUG: Router={routing_decision}, Method={search_method_used}`"
+
+            # Append debug info to the main response
+            return final_output_string + debug_info
+            # --- MODIFICATION END ---
+
+        except Exception as e:  # Catch unexpected outer errors
+            logger.exception(f"{log_prefix}: Unexpected outer error: {e}")
+            # Also add debug info to error messages if possible (or default)
+            error_debug_info = f"\n\n---\n`DEBUG: Router={routing_decision}, Method=ErrorBeforeCompletion`"
+            return f"❌ An unexpected critical error occurred: {str(e)}" + error_debug_info
+
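Every return path in `search_recipes` appends the same inline-code DEBUG footer naming the router decision and the execution method. A minimal sketch of that footer helper (the function name `with_debug_footer` is hypothetical; the footer format matches the diff):

```python
def with_debug_footer(body: str, router: str, method: str) -> str:
    """Append the DEBUG footer the way search_recipes does, so the UI
    always shows which path produced the answer."""
    return body + f"\n\n---\n`DEBUG: Router={router}, Method={method}`"

reply = with_debug_footer("Found 2 recipe(s).", "TEXT_SEARCH", "text (router chosen)")
```

Centralizing this in one helper would also remove the near-duplicate footer strings scattered across `search_recipes` and `_execute_text_search_and_format`.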
+    # --- Helper for Text Search Execution and Formatting ---
+    def _execute_text_search_and_format(self, original_query, search_query, num_results, llm_expansion_note, is_fallback=False):
+        """
+        Helper to run text search and format results for display.
+        Includes debug info about the execution method in the returned string.
+        """
+        log_prefix = f"Search(Q='{original_query}', N={num_results})"  # Re-establish prefix for logging clarity
+        logger.info(f"{log_prefix}: Executing text search logic (Fallback={is_fallback}). Query='{search_query}'")
+        if self.recipes_df is None or self.recipes_df.empty:
+            logger.error(f"{log_prefix}: Text search error: recipes_df missing.")
+            # Add debug info even to error messages if possible
+            method = "text (RAG fallback)" if is_fallback else "text (router chosen)"
+            debug_info = f"\n\n---\n`DEBUG: Method={method}`"
+            return "❌ Error: Recipe data frame is missing." + debug_info
+
+        text_indices = self._text_search(search_query, num_results)  # Use potentially expanded query
+        logger.info(f"{log_prefix}: Text search found indices: {text_indices}")
+        text_results_data = []
+        processed_indices = set()
+        for recipe_id in text_indices:
+            # Validate index before attempting iloc
+            if isinstance(recipe_id, int) and 0 <= recipe_id < len(self.recipes_df) and recipe_id not in processed_indices:
+                try:
+                    recipe_data = self.recipes_df.iloc[recipe_id]
+                    # Ensure necessary keys exist, provide defaults if not
+                    title = recipe_data.get('title', f'Recipe {recipe_id}')
+                    ingredients = str(recipe_data.get('ingredients', 'N/A'))
+                    instructions = str(recipe_data.get('instructions', 'N/A'))
+                    text_results_data.append({'title': title, 'ingredients': ingredients, 'instructions': instructions})
+                    processed_indices.add(recipe_id)
+                except Exception as df_error:
+                    logger.warning(f"Text search DF access error for index {recipe_id}: {df_error}")
+            else:
+                logger.warning(f"Invalid or already processed text index skipped: {recipe_id}")
+
+        # Determine the method string for notes and debug info
+        method = "text (RAG fallback)" if is_fallback else "text (router chosen)"
+        search_note = "(using _text search fallback_)" if is_fallback else "(using _text search_)"
+        debug_info = f"\n\n---\n`DEBUG: Method={method}`"  # Debug info based on how this function was called
+
+        if text_results_data:
+            logger.info(f"{log_prefix}: Formatting {len(text_results_data)} text results.")
+            # Start formatted output
+            formatted_output = f"Found {len(text_results_data)} recipe(s) for \"**{original_query}**\"{llm_expansion_note} {search_note}:\n\n---\n\n"
+            # Loop through collected data
+            for i, recipe in enumerate(text_results_data):
+                try:
+                    title = recipe.get('title', 'Untitled Recipe')  # Use data from list
+                    formatted_output += f"### {i+1}. {title}\n\n"
+                    ing = recipe.get('ingredients')
+                    inst = recipe.get('instructions')
+                    # Format ingredients if present
+                    if ing and ing != 'N/A':
+                        ing_list = [f"- {line.strip()}" for line in ing.strip().split('\n') if line.strip()]
+                        if ing_list: formatted_output += "**Ingredients:**\n" + "\n".join(ing_list) + "\n\n"
+                    # Format instructions if present
+                    if inst and inst != 'N/A':
+                        inst_list = [f"{num}. {line.strip()}" for num, line in enumerate(inst.strip().split('\n'), 1) if line.strip()]
+                        if inst_list: formatted_output += "**Instructions:**\n" + "\n".join(inst_list) + "\n\n"
+                except Exception as fmt_e:
+                    logger.warning(f"Error formatting text result #{i+1} (Title: '{recipe.get('title', 'N/A')}'): {fmt_e}")
+                    formatted_output += f"*Error formatting recipe {i+1}*\n\n"  # Add error note in output
+                # Add separator between recipes
+                if i < len(text_results_data) - 1:
+                    formatted_output += "---\n\n"
+            # Append debug info before returning
+            return formatted_output.strip() + debug_info
+        else:
+            # Handle case where text search yields 0 results
+            logger.info(f"{log_prefix}: Text search (Fallback={is_fallback}) found 0 results after index processing.")
+            # Append debug info before returning
548
+ return f"😕 No recipes found matching: \"{original_query}\"." + debug_info
549
+
550
+ def _text_search(self, query, num_results=3):
551
+ """Performs keyword search on self.recipes_df."""
552
+ if self.recipes_df is None or self.recipes_df.empty: return []
553
+ try:
554
+ query_lower = query.lower()
555
+ # Improved keyword extraction (handles more cases)
556
+ query_words = set(re.findall(r'\b\w{3,}\b', query_lower))
557
+ if not query_words: logger.warning(f"Text Search: No valid keywords found in '{query}'."); return []
558
+
559
+ scored_recipes = []
560
+ # Ensure columns exist and handle potential NaN before string operations
561
+ titles = self.recipes_df.get('title', pd.Series(dtype=str)).fillna('').str.lower()
562
+ ingredients_col = self.recipes_df.get('ingredients', pd.Series(dtype=str)).fillna('').astype(str).str.lower()
563
+ # Consider adding instructions to search space? instructions_col = self.recipes_df.get('instructions', pd.Series(dtype=str)).fillna('').astype(str).str.lower()
564
+ search_texts = titles + " " + ingredients_col # Combine relevant text fields
565
+
566
+ for idx, text_content in search_texts.items():
567
+ score = 0
568
+ try:
569
+ # Basic scoring logic
570
+ if query_lower in text_content: score += 20 # Boost exact phrase match
571
+ # Word overlap scoring
572
+ text_words = set(word for word in re.findall(r'\b\w{3,}\b', text_content))
573
+ score += len(query_words.intersection(text_words)) * 5 # Keyword overlap
574
+ # Title overlap boost
575
+ title_words = set(word for word in re.findall(r'\b\w{3,}\b', titles.get(idx, '')))
576
+ score += len(query_words.intersection(title_words)) * 10 # Title keyword overlap boost
577
+ except Exception as score_err:
578
+ # Log scoring errors but continue
579
+ logger.warning(f"Scoring error for index {idx}: {score_err}", exc_info=False)
580
+ if score > 0: scored_recipes.append((idx, score))
581
+
582
+ # Sort by score descending
583
+ scored_recipes.sort(key=lambda x: x[1], reverse=True)
584
+ # Return top N indices
585
+ return [idx for idx, score in scored_recipes[:num_results]]
586
+ except Exception as e:
587
+ # Log unexpected errors during the search process
588
+ logger.exception(f"Unexpected error during text search for '{query}': {e}")
589
+ return []
590
+
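The keyword-overlap scoring inside `_text_search` can be sketched as a standalone function. This is a simplified re-implementation for illustration only (no pandas plumbing, names are illustrative), but it mirrors the same three signals: exact phrase boost, general keyword overlap, and a heavier weight for title hits.

```python
import re

def score_recipe(query: str, title: str, ingredients: str) -> int:
    """Score a recipe against a query via simple keyword overlap."""
    def words(s):
        # Keep only words of 3+ characters, matching the regex used above
        return set(re.findall(r'\b\w{3,}\b', s))
    query_lower = query.lower()
    text = f"{title} {ingredients}".lower()
    score = 0
    if query_lower in text:
        score += 20  # exact phrase match boost
    score += len(words(query_lower) & words(text)) * 5            # title + ingredient overlap
    score += len(words(query_lower) & words(title.lower())) * 10  # extra weight for title hits
    return score
```

Recipes scoring above zero would then be sorted descending and truncated to `num_results`, as in the method above.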
591
+ @staticmethod
592
+ def _get_backup_recipes():
593
+ """ Provides a small, hardcoded list of recipes as a fallback. """
594
+ return [
595
+ {"title": "Spaghetti Carbonara", "description": "", "ingredients": "Spaghetti\nEggs\nPancetta or Guanciale\nPecorino Romano cheese\nBlack pepper", "instructions": "Cook spaghetti.\nFry pancetta.\nWhisk eggs and cheese.\nCombine pasta, pancetta fat, egg mixture off heat.\nAdd pasta water if needed.\nServe with pepper.", "rating": None},
596
+ {"title": "Chocolate Chip Cookies", "description": "", "ingredients": "Butter\nSugar\nBrown Sugar\nEggs\nVanilla Extract\nFlour\nBaking Soda\nSalt\nChocolate Chips", "instructions": "Cream butter and sugars.\nBeat in eggs and vanilla.\nCombine dry ingredients.\nMix wet and dry.\nStir in chocolate chips.\nDrop onto baking sheets.\nBake until golden brown.", "rating": None},
597
+ {"title": "Chicken Stir Fry", "description": "", "ingredients": "Chicken breast\nBroccoli\nBell peppers\nCarrots\nSoy sauce\nGinger\nGarlic\nSesame oil\nRice", "instructions": "Cut chicken and vegetables.\nStir-fry chicken until cooked.\nAdd vegetables and stir-fry until tender-crisp.\nMix sauce ingredients.\nPour sauce over stir-fry.\nServe with rice.", "rating": None},
598
+ {"title": "Greek Salad", "description": "", "ingredients": "Cucumber\nTomatoes\nRed onion\nKalamata olives\nFeta cheese\nOlive oil\nRed wine vinegar\nOregano", "instructions": "Chop vegetables.\nCombine vegetables and olives in a bowl.\nCrumble feta cheese over salad.\nWhisk olive oil, vinegar, and oregano for dressing.\nDrizzle dressing over salad.", "rating": None},
599
+ {"title": "Easy Banana Bread", "description": "", "ingredients": "Ripe bananas\nButter\nSugar\nEgg\nVanilla extract\nFlour\nBaking soda\nSalt", "instructions": "Mash bananas.\nMelt butter.\nMix melted butter, sugar, egg, and vanilla.\nCombine dry ingredients.\nMix wet and dry ingredients until just combined.\nPour into loaf pan.\nBake until a toothpick comes out clean.", "rating": None}
600
+ ]
601
+
602
+ # ==============================================================================
603
+ # Gradio Interface Creation (Stateful Chatbot UI - Corrected Outputs/Yields)
604
+ # ==============================================================================
605
+ def create_interface():
606
+ """Sets up and defines the Gradio web interface using a stateful gr.Chatbot."""
607
+ recipe_system = RecipeRecommendationSystem()
608
+ logger.info("Creating Gradio interface with Stateful Chatbot...")
609
+
610
+ # --- UI Helper Functions (Corrected outputs for ALL buttons/inputs) ---
611
+ def ui_init_system(sample_size_value, progress=gr.Progress(track_tqdm=True)):
612
+ logger.info(f"UI: Init clicked. Sample size: {sample_size_value}")
613
+ status_msg = "Initializing..."
614
+ # Outputs: Status, Init Btn, Reload Btn, Send Btn, Msg Input
615
+ # Yield status + 4 updates (for the 4 components in outputs list below)
616
+ yield status_msg, gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False)
617
+ try:
618
+ success = recipe_system.initialize(force_reload=False, sample_size=int(sample_size_value))
619
+ if success and recipe_system.is_initialized:
620
+ num = len(recipe_system.recipes_df) if recipe_system.recipes_df is not None else 0
+ db = "vector" if recipe_system.use_vector_search else "text"
+ llm = "active" if recipe_system.use_llm and recipe_system.lc_llm else "inactive"
+ status_msg = f"✅ Initialized ({num} recipes, {db} search, LLM {llm}). Ready."
621
+ # Enable all relevant controls -> Yield Status + 4 True updates
622
+ yield status_msg, gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True)
623
+ else:
624
+ status_msg = f"❌ Init failed: {recipe_system.initialization_error}. May use backups."
625
+ ok = recipe_system.recipes_df is not None and not recipe_system.recipes_df.empty
626
+ # Enable Init Btn, enable others based on fallback 'ok' -> Yield Status + 1 True + 3 'ok' updates
627
+ yield status_msg, gr.update(interactive=True), gr.update(interactive=ok), gr.update(interactive=ok), gr.update(interactive=ok)
628
+ except Exception as e:
629
+ logger.exception(f"UI initialization error: {e}")
630
+ # Enable all controls on error to allow retry -> Yield Status + 4 True updates
631
+ yield f"❌ UI Error: {e}", gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True)
632
+
633
+ def ui_reload_system(sample_size_value, progress=gr.Progress(track_tqdm=True)):
634
+ logger.info(f"UI: Reload clicked. Sample size: {sample_size_value}")
635
+ status_msg = "Reloading..."
636
+ # Outputs: Status, Init Btn, Reload Btn, Send Btn, Msg Input
637
+ # Yield status + 4 updates
638
+ yield status_msg, gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False)
639
+ try:
640
+ success = recipe_system.initialize(force_reload=True, sample_size=int(sample_size_value))
641
+ if success and recipe_system.is_initialized:
642
+ num = len(recipe_system.recipes_df) if recipe_system.recipes_df is not None else 0
+ db = "vector" if recipe_system.use_vector_search else "text"
+ llm = "active" if recipe_system.use_llm and recipe_system.lc_llm else "inactive"
+ status_msg = f"✅ Reloaded ({num} recipes, {db} search, LLM {llm}). Ready."
643
+ # Enable all -> Yield Status + 4 True updates
644
+ yield status_msg, gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True)
645
+ else:
646
+ status_msg = f"❌ Reload failed: {recipe_system.initialization_error}. May use backups."
647
+ ok = recipe_system.recipes_df is not None and not recipe_system.recipes_df.empty
648
+ # Enable Init Btn, enable others based on fallback 'ok' -> Yield Status + 1 True + 3 'ok' updates
649
+ yield status_msg, gr.update(interactive=True), gr.update(interactive=ok), gr.update(interactive=ok), gr.update(interactive=ok)
650
+ except Exception as e:
651
+ logger.exception(f"UI reload error: {e}")
652
+ # Enable all controls on error -> Yield Status + 4 True updates
653
+ yield f"❌ UI Error: {e}", gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True), gr.update(interactive=True)
654
+
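Both UI helpers follow the same generator pattern: immediately yield a "busy" status with every control disabled, run the long initialization, then yield a final status plus one interactivity flag per control. Stripped of Gradio, the control flow looks like this (hypothetical sketch; tuples stand in for the `status, gr.update(...)` yields):

```python
def init_flow(initialize):
    """Yield (status, *control_enabled_flags) snapshots around a long-running init."""
    yield ("Initializing...", False, False, False, False)  # lock all controls first
    try:
        ok = initialize()
        if ok:
            yield ("✅ Ready.", True, True, True, True)     # unlock everything
        else:
            yield ("❌ Init failed.", True, False, False, False)  # allow retry only
    except Exception as e:
        yield (f"❌ UI Error: {e}", True, True, True, True)  # unlock so the user can retry

states = list(init_flow(lambda: True))
```

The number of flags in each yield must match the number of components in the listener's `outputs` list, which is the invariant the "CORRECTED" comments below are enforcing.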
655
+ # --- Stateful Chat Interaction Function (Includes fix for ValidationError) ---
656
+ def respond(message, chat_history_list, num_results_value):
657
+ """
658
+ Handles user message, appends to history, calls backend, updates history.
659
+ Uses 'messages' format (list of dicts with 'role' and 'content').
660
+ Uses '...' as placeholder instead of None for content.
661
+ """
662
+ logger.info(f"UI Chat: Msg='{message}', History Len={len(chat_history_list)}, N={num_results_value}")
663
+
664
+ # Input Validation & Initialization Check
665
+ if not message or not message.strip():
666
+ logger.warning("Respond function called with empty message.")
667
+ chat_history_list.append({"role": "user", "content": message})
668
+ chat_history_list.append({"role": "assistant", "content": "⚠️ Please enter a message."})
669
+ return chat_history_list, gr.update(value="") # Return history and clear update
670
+
671
+ if not recipe_system.is_initialized and (recipe_system.recipes_df is None or recipe_system.recipes_df.empty):
672
+ logger.warning("Respond function called but system not initialized.")
673
+ chat_history_list.append({"role": "user", "content": message})
674
+ chat_history_list.append({"role": "assistant", "content": "⚠️ System not initialized or no data loaded. Please Initialize/Reload."})
675
+ return chat_history_list, gr.update(value="") # Return history and clear update
676
+
677
+ # Append user message and placeholder for bot - yield for immediate display
678
+ chat_history_list.append({"role": "user", "content": message})
679
+ chat_history_list.append({"role": "assistant", "content": "..."}) # Placeholder
680
+ # Yield history to display user message & placeholder, yield empty string "" to clear input
681
+ yield chat_history_list, ""
682
+
683
+ # Call Backend
684
+ bot_response_content = "Error generating response." # Default
685
+ try:
686
+ logger.info("Calling recipe_system.search_recipes...")
687
+ bot_response_content = recipe_system.search_recipes(message, int(num_results_value))
688
+ if not bot_response_content: # Handle empty returns
689
+ bot_response_content = "😕 No specific information found."
690
+ logger.info("Backend search successful.")
691
+ except Exception as e:
692
+ logger.exception(f"Error during backend search call from chat: {e}")
693
+ bot_response_content = f"❌ Error calling backend: {e}"
694
+
695
+ # Update the placeholder in history with the actual response
696
+ chat_history_list[-1]["content"] = bot_response_content
697
+
698
+ # Yield final history state (input box already cleared)
699
+ yield chat_history_list, ""
700
+
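The history structure `respond` maintains is plain data, so the placeholder-then-overwrite pattern can be shown without Gradio at all (illustrative helper, not part of the app):

```python
def append_turn(history, user_msg, bot_msg):
    """Record one exchange in Gradio's 'messages' chat format (list of role/content dicts)."""
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": "..."})  # placeholder shown while the backend runs
    history[-1]["content"] = bot_msg                         # overwritten with the real answer
    return history

history = append_turn([], "easy weeknight dinner", "Found 3 recipe(s)...")
```

In the real `respond`, a `yield` happens between appending the placeholder and overwriting it, which is what makes the "..." visible in the chat while the search runs.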
701
+ # --- UI Layout ---
784
+ with gr.Blocks(
785
+ title="Recipe Chat Agent",
786
+ theme=gr.themes.Soft(primary_hue=gr.themes.colors.amber, secondary_hue=gr.themes.colors.lime),
787
+ css=".gradio-container {max-width: 800px !important}"
788
+ ) as demo:
789
+ gr.Markdown("# 🍲 Recipe Chat Agent 🎉")
790
+ gr.Markdown("### Ask questions or search for recipes conversationally!")
791
+
792
+ # Define ALL UI Components FIRST
793
+ with gr.Row():
794
+ with gr.Column(scale=1):
795
+ status_display = gr.Textbox("Status: Not initialized.", label="System Status", interactive=False, lines=2)
796
+ with gr.Column(scale=2):
797
+ with gr.Accordion("⚙️ Settings & Initialization", open=False):
798
+ sample_slider = gr.Slider(minimum=100, maximum=5000, value=1000, step=100, label="Recipes to Load/Sample", info="Affects init time/memory.")
799
+ results_slider = gr.Slider(minimum=1, maximum=5, value=3, step=1, label="# Results/Context Docs", info="For RAG context or # Text Results")
800
+ with gr.Row():
801
+ init_button = gr.Button("🚀 Initialize System", variant="secondary", size="sm") # Interactive state set by load
802
+ reload_button = gr.Button("🔄 Reload Data", variant="stop", size="sm") # Interactive state set by load
803
+
804
+ with gr.Group(visible=True) as chat_interface_group: # Keep visible
805
+ chatbot = gr.Chatbot(label="Conversation", height=500, type='messages') # Use 'messages' type
806
+ chat_history = gr.State([]) # Initialize state for history list
807
+ with gr.Row():
808
+ msg_input = gr.Textbox(label="Your Message:", placeholder="Type your message here...", lines=1, scale=4, container=False) # Interactive state set by load
809
+ send_button = gr.Button("✉️ Send", variant="primary", scale=1, min_width=100) # Interactive state set by load
810
+ gr.Examples(
811
+ examples=[
812
+ ["easy weeknight dinner"], ["healthy vegetarian soup"],
813
+ ["how long does the banana bread take to bake?"],
814
+ ["does the carbonara recipe use cream?"], ["супа со печурки"],
815
+ ["find recipes with feta and olives"]
816
+ ],
817
+ inputs=msg_input, label="Example Messages"
818
+ )
819
+
820
+ # --- Define ALL Event Listeners AFTER components ---
821
+ init_button.click(
822
+ fn=ui_init_system,
823
+ inputs=[sample_slider],
824
+ outputs=[status_display, init_button, reload_button, send_button, msg_input]
825
+ )
826
+ reload_button.click(
827
+ fn=ui_reload_system,
828
+ inputs=[sample_slider],
829
+ outputs=[status_display, init_button, reload_button, send_button, msg_input]
830
+ )
831
+
832
+ # Connect chat interactions
833
+ # Submitting the textbox sends the message to respond, which
834
+ # updates the chatbot and clears the input textbox
835
+ msg_input.submit(
836
+ fn=respond,
837
+ inputs=[msg_input, chat_history, results_slider],
838
+ outputs=[chatbot, msg_input] # Respond updates chatbot and clears input
839
+ )
840
+ # Send button also uses respond and clears input
841
+ send_button.click(
842
+ fn=respond,
843
+ inputs=[msg_input, chat_history, results_slider],
844
+ outputs=[chatbot, msg_input] # Respond updates chatbot and clears input
845
+ )
846
+
847
+
848
+ # Initial setup on load: Enable ONLY init_button
849
+ def setup_load_state():
850
+ # Return updates for: Init, Reload, Send, MsgInput
851
+ return gr.update(interactive=True), gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False)
852
+ demo.load(
853
+ fn=setup_load_state, inputs=None,
854
+ outputs=[init_button, reload_button, send_button, msg_input]
855
+ )
856
+
857
+ logger.info("Gradio Interface definition complete.")
858
+ return demo
859
+
860
+ # ==============================================================================
861
+ # Main Execution Block
862
+ # ==============================================================================
863
  if __name__ == "__main__":
864
+ logger.info("Application starting...")
865
+ if not LANGCHAIN_LLM_AVAILABLE: logger.warning("!"*20 + "\nLangChain LLM (Gemini) setup INCOMPLETE...\n" + "!"*20)
866
+ else: logger.info("LangChain LLM dependencies and API key found.")
867
+ if not VECTOR_IMPORTS_AVAILABLE: logger.warning("!"*20 + "\nVector search dependencies NOT FOUND...\n" + "!"*20)
868
+ else: logger.info("Vector search dependencies found.")
869
+
870
+ logger.info("Creating Gradio interface...")
871
+ interface = create_interface()
872
+
873
+ logger.info("Launching Gradio interface...")
874
+ interface.launch(share=False) # Share=False for local testing
875
+
876
+ logger.info("Gradio interface closed.")
evaluation.py ADDED
@@ -0,0 +1,333 @@
1
+ # -*- coding: utf-8 -*-
2
+ import os
3
+ import re
4
+ import time
5
+ import logging
6
+ from gradio_client import Client
7
+ from sklearn.metrics import (
8
+ accuracy_score,
9
+ confusion_matrix,
10
+ classification_report,
11
+ precision_recall_fscore_support
12
+ )
13
+ import pandas as pd # Optional: if loading data from file
14
+
15
+ # --- Configuration ---
16
+ # Option 1: Hardcode your Space ID/URL
17
+ # SPACE_ID = "your-username/your-space-name"
18
+ # Option 2: Get from environment variable (useful when running this script elsewhere with SPACE_ID set)
19
+ SPACE_ID = os.environ.get("SPACE_ID", "rkostov/thesis-agent")  # Falls back to this Space's ID
20
+ # Option 3: Use full URL if needed
21
+ # SPACE_URL = "https://your-username-your-space-name.hf.space"
22
+
23
+ API_NAME = "/respond" # From view_api output
24
+ NUM_RESULTS = 3 # Default value for the slider input
25
+ SLEEP_BETWEEN_CALLS = 1 # Seconds to wait to avoid rate limiting
26
+
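These settings feed a throttled query loop further down the script. A minimal sketch of that pattern (the `call` argument stands in for something like `client.predict(..., api_name=API_NAME)`; names are illustrative):

```python
import time

def run_benchmark(queries, call, sleep_s=1.0):
    """Call the endpoint once per query, pausing between calls to avoid rate limits."""
    results = []
    for q in queries:
        try:
            results.append((q, call(q)))
        except Exception as e:
            results.append((q, f"ERROR: {e}"))  # record per-query failures and keep going
        time.sleep(sleep_s)  # be polite to the hosted Space
    return results

out = run_benchmark(["soup", "pasta"], lambda q: q.upper(), sleep_s=0)
```

Catching per-query exceptions matters here: one transient API error should not abort a ~100-query benchmark run.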
27
+ # --- Logging Setup ---
28
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
29
+ logger = logging.getLogger('evaluation_script')
30
+
31
+ # --- Benchmark Dataset ---
32
+ # Full list of ~100 queries with intended routing targets
33
+ # You can modify/expand this list or load from a file (CSV/JSON)
34
+ benchmark_data = [
35
+ # === RAG - Specific Questions ===
36
+ # Ingredients & Quantities
37
+ {'query': 'Does spaghetti carbonara use cream?', 'intended_target': 'RAG'},
38
+ {'query': 'What kind of cheese is in the Greek Salad recipe?', 'intended_target': 'RAG'},
39
+ {'query': 'Is there onion in the banana bread?', 'intended_target': 'RAG'},
40
+ {'query': 'List ingredients for chocolate chip cookies.', 'intended_target': 'RAG'},
41
+ {'query': 'How much butter for the chocolate chip cookies?', 'intended_target': 'RAG'},
42
+ {'query': 'Does the stir fry recipe contain peanuts?', 'intended_target': 'RAG'},
43
+ {'query': 'What oil is recommended for the stir fry?', 'intended_target': 'RAG'},
44
+ {'query': 'Are eggs required for the carbonara?', 'intended_target': 'RAG'},
45
+ {'query': 'Tell me the spices in the default chicken recipe.', 'intended_target': 'RAG'},
46
+ {'query': 'Any garlic in the greek salad?', 'intended_target': 'RAG'},
47
+ {'query': 'What type of flour is used in the banana bread?', 'intended_target': 'RAG'},
48
+ {'query': 'How many eggs in the carbonara?', 'intended_target': 'RAG'},
49
+ {'query': 'Does the banana bread use baking soda or baking powder?', 'intended_target': 'RAG'},
50
+ {'query': 'Are fresh tomatoes needed for the greek salad?', 'intended_target': 'RAG'},
51
+ {'query': 'What cut of chicken for the stir fry?', 'intended_target': 'RAG'},
52
+ # Instructions & Timing
53
+ {'query': 'How long do I bake the chocolate chip cookies?', 'intended_target': 'RAG'},
54
+ {'query': 'What temperature to bake cookies?', 'intended_target': 'RAG'},
55
+ {'query': 'What is the first step for the chicken stir fry?', 'intended_target': 'RAG'},
56
+ {'query': 'How do you make the dressing for the Greek Salad?', 'intended_target': 'RAG'},
57
+ {'query': 'Tell me how to cook spaghetti carbonara.', 'intended_target': 'RAG'},
58
+ {'query': 'Summarize the banana bread instructions.', 'intended_target': 'RAG'},
59
+ {'query': 'How many steps are there to make the cookies?', 'intended_target': 'RAG'},
60
+ {'query': 'What do I do after frying the pancetta in carbonara?', 'intended_target': 'RAG'},
61
+ {'query': 'How should I prepare the vegetables for the stir fry?', 'intended_target': 'RAG'},
62
+ {'query': "What's the final step for the Greek salad?", 'intended_target': 'RAG'},
63
+ {'query': 'How long does the banana bread need to cool?', 'intended_target': 'RAG'},
64
+ {'query': 'At what point are the chocolate chips added?', 'intended_target': 'RAG'},
65
+ {'query': 'How long to cook the chicken in the stir fry?', 'intended_target': 'RAG'},
66
+ {'query': 'When is the pasta water used in carbonara?', 'intended_target': 'RAG'},
67
+ {'query': 'Should the feta be crumbled or cubed for the salad?', 'intended_target': 'RAG'},
68
+ # Properties/Suitability
69
+ {'query': 'Is the Greek Salad vegetarian?', 'intended_target': 'RAG'},
70
+ {'query': 'Are the chocolate chip cookies gluten-free?', 'intended_target': 'RAG'},
71
+ {'query': 'Is the banana bread recipe vegan?', 'intended_target': 'RAG'},
72
+ {'query': 'Can the carbonara be made ahead of time?', 'intended_target': 'RAG'},
73
+ {'query': 'Is the chicken stir fry spicy?', 'intended_target': 'RAG'},
74
+ {'query': 'Approximate prep time for banana bread?', 'intended_target': 'RAG'},
75
+ {'query': 'Which of the backup recipes are vegetarian?', 'intended_target': 'RAG'},
76
+ {'query': 'Difficulty level of the carbonara?', 'intended_target': 'RAG'},
77
+ {'query': 'Does the cookie recipe yield many cookies?', 'intended_target': 'RAG'},
78
+ {'query': 'Is the stir fry low-carb?', 'intended_target': 'RAG'},
79
+ # Technique/Tools
80
+ {'query': 'How do I cream butter and sugar?', 'intended_target': 'RAG'},
81
+ {'query': "What does 'fold in' mean for the banana bread?", 'intended_target': 'RAG'},
82
+ {'query': 'What pan size for the banana bread?', 'intended_target': 'RAG'},
83
+ {'query': 'Do I need a whisk for the carbonara?', 'intended_target': 'RAG'},
84
+ {'query': "What does 'tender-crisp' mean for stir fry vegetables?", 'intended_target': 'RAG'},
85
+ {'query': 'How to mash bananas properly?', 'intended_target': 'RAG'},
86
+ {'query': 'What kind of pan for stir fry?', 'intended_target': 'RAG'},
87
+ {'query': 'How to chop an onion for the salad?', 'intended_target': 'RAG'},
88
+ {'query': "What does 'al dente' mean for spaghetti?", 'intended_target': 'RAG'},
89
+ {'query': 'Why mix wet and dry ingredients separately for cookies?', 'intended_target': 'RAG'},
90
+
91
+ # === Text Search - General Queries ===
92
+ # Recipe Name
93
+ {'query': 'Spaghetti Carbonara', 'intended_target': 'TEXT_SEARCH'},
94
+ {'query': 'Easy Banana Bread', 'intended_target': 'TEXT_SEARCH'},
95
+ {'query': 'Chicken Stir Fry', 'intended_target': 'TEXT_SEARCH'},
96
+ {'query': 'Greek Salad', 'intended_target': 'TEXT_SEARCH'},
97
+ {'query': 'Chocolate Chip Cookies', 'intended_target': 'TEXT_SEARCH'},
98
+ {'query': 'Recipe for carbonara', 'intended_target': 'TEXT_SEARCH'},
99
+ {'query': 'Show me banana bread', 'intended_target': 'TEXT_SEARCH'},
100
+ {'query': 'cookies', 'intended_target': 'TEXT_SEARCH'},
101
+ {'query': 'salad', 'intended_target': 'TEXT_SEARCH'},
102
+ {'query': 'pasta', 'intended_target': 'TEXT_SEARCH'},
103
+ # Main Ingredient(s)
104
+ {'query': 'recipes with chicken breast', 'intended_target': 'TEXT_SEARCH'},
105
+ {'query': 'broccoli soup', 'intended_target': 'TEXT_SEARCH'},
106
+ {'query': 'something with eggs and pancetta', 'intended_target': 'TEXT_SEARCH'},
107
+ {'query': 'Find recipes using feta cheese.', 'intended_target': 'TEXT_SEARCH'},
108
+ {'query': 'pasta with eggs', 'intended_target': 'TEXT_SEARCH'},
109
+ {'query': 'banana recipes', 'intended_target': 'TEXT_SEARCH'},
110
+ {'query': 'cookies with chocolate', 'intended_target': 'TEXT_SEARCH'},
111
+ {'query': 'salad with olives', 'intended_target': 'TEXT_SEARCH'},
112
+ {'query': 'dinner with chicken', 'intended_target': 'TEXT_SEARCH'},
113
+ {'query': 'recipes using ripe bananas', 'intended_target': 'TEXT_SEARCH'},
114
+ {'query': 'find recipes with bell peppers', 'intended_target': 'TEXT_SEARCH'},
115
+ {'query': 'Pecorino Romano recipes', 'intended_target': 'TEXT_SEARCH'},
116
+ {'query': 'What can I make with butter and sugar?', 'intended_target': 'TEXT_SEARCH'},
117
+ {'query': 'Search for recipes with cucumber', 'intended_target': 'TEXT_SEARCH'},
118
+ {'query': 'Got extra eggs, what can I make?', 'intended_target': 'TEXT_SEARCH'},
119
+ # Meal Type/Descriptor
120
+ {'query': 'quick weeknight dinner', 'intended_target': 'TEXT_SEARCH'},
121
+ {'query': 'healthy dessert', 'intended_target': 'TEXT_SEARCH'},
122
+ {'query': 'vegetarian main course', 'intended_target': 'TEXT_SEARCH'},
123
+ {'query': 'party appetizer', 'intended_target': 'TEXT_SEARCH'},
124
+ {'query': 'easy baking recipes', 'intended_target': 'TEXT_SEARCH'},
125
+ {'query': 'low carb meals', 'intended_target': 'TEXT_SEARCH'},
126
+ {'query': 'comfort food', 'intended_target': 'TEXT_SEARCH'},
127
+ {'query': 'salad recipes', 'intended_target': 'TEXT_SEARCH'},
128
+ {'query': 'budget friendly ideas', 'intended_target': 'TEXT_SEARCH'},
129
+ {'query': 'simple lunch', 'intended_target': 'TEXT_SEARCH'},
130
+ {'query': 'italian pasta', 'intended_target': 'TEXT_SEARCH'},
131
+ {'query': 'something sweet', 'intended_target': 'TEXT_SEARCH'},
132
+ {'query': 'savory dishes', 'intended_target': 'TEXT_SEARCH'},
133
+ {'query': 'recipes for beginners', 'intended_target': 'TEXT_SEARCH'},
134
+ {'query': '30 minute meals', 'intended_target': 'TEXT_SEARCH'},
135
+
136
+     # === Ambiguous Queries (Assigning a default target for evaluation) ===
+     {'query': 'ingredients for healthy vegetarian soup', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'how to make vegetarian lasagna', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'best chocolate chip cookie recipe', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'carbonara no cream', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'information about banana bread', 'intended_target': 'RAG'},
+     {'query': 'Greek salad dressing instructions', 'intended_target': 'RAG'},
+     {'query': 'quick vegetarian pasta', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'tell me about stir fry', 'intended_target': 'RAG'},
+     {'query': 'carbonara recipe details', 'intended_target': 'RAG'},
+     {'query': 'cookie variations', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'Can you find a low-sugar banana bread?', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'What are some salads with cucumber?', 'intended_target': 'TEXT_SEARCH'},
+     {'query': 'Talk me through the carbonara recipe', 'intended_target': 'RAG'},
+     {'query': 'Nutritional info for cookies', 'intended_target': 'RAG'},  # RAG likely to fail gracefully
+     {'query': 'Compare carbonara and stir fry', 'intended_target': 'RAG'},  # RAG likely to fail gracefully
+
+     # === Edge Cases ===
+     {'query': 'choclate chip cookis', 'intended_target': 'TEXT_SEARCH'},  # Misspelling
+     {'query': 'soup', 'intended_target': 'TEXT_SEARCH'},  # Broad
+     {'query': 'Does any recipe use saffron?', 'intended_target': 'RAG'},  # Likely out-of-scope ingredient
+     {'query': 'asdfghjkl', 'intended_target': 'TEXT_SEARCH'},  # Nonsense
+     {'query': 'tell me a joke about cooking', 'intended_target': 'RAG'}  # Out-of-scope topic
+ ]
+ # --- End Benchmark Dataset ---
+
+
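The class balance of this benchmark matters when reading the accuracy numbers computed later in the script. A minimal standalone sketch for tallying the `intended_target` distribution (using a two-item stand-in for the full `benchmark_data` list above):

```python
from collections import Counter

# Stand-in for the full benchmark_data list defined above
benchmark_data = [
    {'query': 'quick weeknight dinner', 'intended_target': 'TEXT_SEARCH'},
    {'query': 'tell me about stir fry', 'intended_target': 'RAG'},
]

# Count how many queries are labeled with each routing target
label_counts = Counter(item['intended_target'] for item in benchmark_data)
print(dict(label_counts))  # {'TEXT_SEARCH': 1, 'RAG': 1}
```

On the full list the counts would be heavily skewed toward TEXT_SEARCH, so per-class precision/recall from the classification report is more informative than overall accuracy alone.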
+ # --- Helper function to extract routing decision ---
+ def extract_routing_decision(response_content):
+     if not isinstance(response_content, str):
+         return "PARSE_ERROR"  # Handle non-string content
+     # Pattern to find Router=VALUE within the debug string `DEBUG: Router=VALUE,...`
+     pattern = r"Router=([^,`]+)"
+     match = re.search(pattern, response_content)
+     if match:
+         decision = match.group(1).strip()
+         if decision in ["RAG", "TEXT_SEARCH"]:
+             return decision
+         else:
+             logger.warning(f"Parsed unexpected decision value: {decision}")
+             return "PARSE_ERROR"  # Unexpected value
+     else:
+         # Check whether only Method= is present (e.g. from the text search helper)
+         method_pattern = r"Method=([^`]+)"
+         method_match = re.search(method_pattern, response_content)
+         if method_match:
+             method_used = method_match.group(1).strip()
+             # Infer the routing decision from the method if Router= is missing
+             if "text (router chosen)" in method_used:
+                 logger.warning("Router= missing, inferred TEXT_SEARCH from method.")
+                 return "TEXT_SEARCH"
+             elif "text (RAG fallback)" in method_used:
+                 logger.warning("Router= missing, inferred RAG (fallback) from method.")
+                 return "RAG"  # It was intended as RAG, even if it failed
+             elif "vector (RAG executed)" in method_used:
+                 logger.warning("Router= missing, inferred RAG from method.")
+                 return "RAG"
+         logger.warning("Could not parse routing decision or infer it from method in response.")
+         return "PARSE_ERROR"  # Pattern not found
+
+
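For reference, the `Router=` regex above behaves as follows on representative response strings (a standalone sketch; the sample DEBUG strings are hypothetical, mirroring the format the parser assumes):

```python
import re

def parse_router(content: str) -> str:
    # Same pattern as extract_routing_decision: capture up to a comma or backtick
    match = re.search(r"Router=([^,`]+)", content)
    if match:
        decision = match.group(1).strip()
        return decision if decision in ("RAG", "TEXT_SEARCH") else "PARSE_ERROR"
    return "PARSE_ERROR"

print(parse_router("`DEBUG: Router=RAG, Method=vector (RAG executed)`"))  # RAG
print(parse_router("`DEBUG: Router=TEXT_SEARCH, Results=5`"))             # TEXT_SEARCH
print(parse_router("no debug info here"))                                 # PARSE_ERROR
```

The character class `[^,`]+` stops the capture at the first comma or backtick, which is why the decision token must come immediately after `Router=` in the debug string.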
+ # --- Main Evaluation Logic ---
+ if not SPACE_ID:
+     logger.error("Error: SPACE_ID not configured. Set the environment variable or hardcode it.")
+     exit()
+
+ logger.info(f"Connecting to Gradio Space: {SPACE_ID}")
+ try:
+     # Increase the timeout if needed: client = Client(SPACE_ID, hf_token=...)
+     client = Client(SPACE_ID)
+     logger.info("Connection successful.")
+ except Exception as e:
+     logger.error(f"Failed to connect to Gradio Space: {e}")
+     exit()
+
+ # Lists to store labels and predictions
+ y_true = []  # Manual labels ('intended_target')
+ y_pred = []  # Agent's actual routing decisions
+
+ logger.info(f"Starting evaluation for {len(benchmark_data)} queries...")
+
+ for i, item in enumerate(benchmark_data):
+     query = item['query']
+     intended_target = item['intended_target']
+     logger.info(f"Processing query {i+1}/{len(benchmark_data)}: '{query}' (Expected: {intended_target})")
+
+     actual_decision = "API_ERROR"  # Default if the API call fails
+
+     try:
+         # Make the API call (stateless) - requires the API to accept only message & num_results
+         result = client.predict(
+             message=query,
+             num_results_value=NUM_RESULTS,
+             api_name=API_NAME  # Use "/respond"
+             # No chat_history argument here due to the API bug
+         )
+
+         # Process the result
+         # Expected result (based on the corrected respond): tuple (chat_history_list, "")
+         if isinstance(result, tuple) and len(result) == 2 and isinstance(result[0], list) and result[0]:
+             # Get the last message added (should be the assistant's response)
+             last_message = result[0][-1]
+             if isinstance(last_message, dict) and last_message.get("role") == "assistant":
+                 bot_content = last_message.get("content")
+                 actual_decision = extract_routing_decision(bot_content)  # Parse debug info
+             else:
+                 logger.warning(f"Unexpected structure in last message: {last_message}")
+                 actual_decision = "PARSE_ERROR"
+         elif result is None:
+             logger.error(f"API call for query '{query}' returned None.")
+             actual_decision = "API_NONE_RETURN"
+         else:
+             logger.warning(f"Unexpected API result structure: {type(result)} | Content: {result}")
+             actual_decision = "API_STRUCT_ERROR"
+
+     except Exception as e:
+         logger.error(f"API call failed for query '{query}': {e}")
+         actual_decision = "API_ERROR"
+
+     # Append results, keeping labels and predictions aligned
+     y_true.append(intended_target)
+     y_pred.append(actual_decision)
+
+     logger.info(f" -> Actual Decision: {actual_decision}")
+
+     # Wait briefly to avoid hitting potential rate limits on free Spaces
+     time.sleep(SLEEP_BETWEEN_CALLS)
+
+ logger.info("Evaluation loop finished.")
+
+ # --- Calculate and Print Metrics ---
+ logger.info("\n--- Evaluation Results ---")
+
+ # Define the valid labels we expect the parser to return
+ valid_labels = ['RAG', 'TEXT_SEARCH']
+ filtered_y_true = []
+ filtered_y_pred = []
+ # Tally errors
+ error_codes = ["API_ERROR", "PARSE_ERROR", "API_STRUCT_ERROR", "API_NONE_RETURN"]
+ error_counts = {code: 0 for code in error_codes}
+ unknown_preds = []
+
+ for true_label, pred_label in zip(y_true, y_pred):
+     if pred_label in valid_labels:
+         filtered_y_true.append(true_label)
+         filtered_y_pred.append(pred_label)
+     elif pred_label in error_counts:
+         error_counts[pred_label] += 1
+     else:  # Catch any unexpected prediction labels
+         logger.error(f"Encountered unexpected predicted label: {pred_label} for true label: {true_label}")
+         unknown_preds.append(pred_label)
+
+ total_processed = len(filtered_y_true)
+ total_errors = sum(error_counts.values())
+ logger.info(f"Total Queries Run: {len(benchmark_data)}")
+ logger.info(f"Successfully Parsed Predictions: {total_processed}")
+ logger.info(f"API/Parse Errors: {total_errors}")
+ for code, count in error_counts.items():
+     if count > 0:
+         logger.info(f" - {code}: {count}")
+ if unknown_preds:
+     logger.warning(f"Unknown predicted labels encountered: {set(unknown_preds)}")
+
+ if total_processed > 0:
+     # Overall accuracy
+     accuracy = accuracy_score(filtered_y_true, filtered_y_pred)
+     logger.info(f"\nOverall Routing Accuracy (on {total_processed} successful predictions): {accuracy:.2%}")
+
+     # Confusion matrix
+     logger.info("\nConfusion Matrix (Rows: Actual/Intended, Columns: Predicted by Agent):")
+     # Ensure consistent label ordering for the matrix
+     cm = confusion_matrix(filtered_y_true, filtered_y_pred, labels=valid_labels)
+     logger.info(f"Labels: {valid_labels}")
+     # Print the matrix with labels
+     cm_df = pd.DataFrame(cm, index=[f'Actual_{l}' for l in valid_labels], columns=[f'Predicted_{l}' for l in valid_labels])
+     logger.info(f"\n{cm_df}\n")
+     # With labels=['RAG', 'TEXT_SEARCH'], TEXT_SEARCH is treated as the positive class
+     logger.info(f"TN (Actual RAG, Predicted RAG): {cm[0][0]}")
+     logger.info(f"FP (Actual RAG, Predicted TEXT_SEARCH): {cm[0][1]}")
+     logger.info(f"FN (Actual TEXT_SEARCH, Predicted RAG): {cm[1][0]}")
+     logger.info(f"TP (Actual TEXT_SEARCH, Predicted TEXT_SEARCH): {cm[1][1]}")
+
+     # Classification report (precision, recall, F1 per class)
+     logger.info("\nClassification Report:")
+     # Dict output is available for structured logging; the default string report is fine here
+     report = classification_report(
+         filtered_y_true,
+         filtered_y_pred,
+         labels=valid_labels,
+         target_names=valid_labels,
+         zero_division=0  # Report 0 instead of warning for classes with no support/predictions
+     )
+     logger.info(f"\n{report}")
+ else:
+     logger.warning("No successful predictions were parsed, cannot calculate metrics.")
+
+ logger.info("--- Evaluation Complete ---")
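As a quick sanity check on the label ordering used above (rows and columns follow the `labels` argument, with `TEXT_SEARCH` read as the positive class), here is a toy run of the same scikit-learn calls on hand-made labels:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

valid_labels = ['RAG', 'TEXT_SEARCH']
y_true = ['RAG', 'RAG', 'TEXT_SEARCH', 'TEXT_SEARCH', 'TEXT_SEARCH']
y_pred = ['RAG', 'TEXT_SEARCH', 'TEXT_SEARCH', 'TEXT_SEARCH', 'RAG']

# Row 0 = actual RAG, row 1 = actual TEXT_SEARCH (order set by labels=)
cm = confusion_matrix(y_true, y_pred, labels=valid_labels)
print(cm.tolist())                     # [[1, 1], [1, 2]]
print(accuracy_score(y_true, y_pred))  # 0.6
```

So `cm[0][1]` counts RAG queries mis-routed to text search, and `cm[1][0]` counts text-search queries mis-routed to RAG, matching the FP/FN log lines in the script.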
requirements.txt CHANGED
@@ -1 +1,29 @@
- huggingface_hub==0.25.2
+ # requirements.txt - Reflecting original constraints before updates
+
+ langchain>=0.1.0,<0.4.0
+ langchain-community>=0.0.1,<0.4.0
+ langchain-huggingface>=0.0.1,<0.2.0
+ nltk>=3.8.1
+ pandas>=2.0.0,<3.0.0
+ requests>=2.31.0
+ gradio==5.23.2  # Pinned to match the Space's sdk_version (5.23.2)
+ chromadb>=0.4.17,<0.7.0
+ sentence-transformers>=2.2.2
+ transformers>=4.35.0
+ torch>=2.0.0
+ numpy>=1.24.0
+ scikit-learn>=1.3.0
+ tqdm>=4.66.1
+ pyyaml>=6.0.1
+ python-dotenv>=1.0.0
+ fastapi>=0.104.0
+ uvicorn[standard]>=0.23.2  # [standard] adds common extras
+ datasets>=2.14.0  # Hugging Face datasets library
+ huggingface_hub>=0.19.0
+ pyarrow  # No version constraint specified initially
+ sentencepiece  # No version constraint specified initially
+ google-generativeai  # No version constraint specified initially
+ langchain-google-genai>=0.0.9
+
+ # Note: pip will install the latest versions compatible with these constraints.
+ # This list may avoid some dependency conflicts compared to pinning the absolute latest versions.