restaurant-intelligence-agent / docs /Gradio_implementation.md
TushP's picture
Upload folder using huggingface_hub
bb9baa9 verified
# Gradio 6 Implementation Guide
## Restaurant Intelligence Agent UI
**Date:** November 24, 2025 (Day 15)
**Hackathon:** Anthropic MCP 1st Birthday - Track 2 (Productivity)
---
## πŸ“‹ Table of Contents
1. [Overview](#overview)
2. [Installation](#installation)
3. [Architecture](#architecture)
4. [Implementation Steps](#implementation-steps)
5. [Key Components](#key-components)
6. [Challenges & Solutions](#challenges--solutions)
7. [Testing](#testing)
8. [Next Steps](#next-steps)
---
## 🎯 Overview
Built a production-ready Gradio 6 web interface for the Restaurant Intelligence Agent that:
- Accepts OpenTable URLs for analysis
- Displays role-based insights (Chef vs Manager)
- Enables Q&A over customer reviews
- Provides interactive drill-down functionality
**Technology Stack:**
- **Framework:** Gradio 6.0.0
- **Backend:** Python 3.12
- **AI:** Claude Sonnet 4 (via Anthropic API)
- **Scraper:** Selenium + BeautifulSoup
- **Analysis:** Custom NLP pipeline
---
## πŸ“¦ Installation
### **Step 1: Install Gradio 6**
```bash
pip install gradio==6.0.0
```
### **Step 2: Verify Installation**
```python
import gradio as gr
print(gr.__version__) # Should show 6.0.0
```
### **Step 3: Install Project Dependencies**
```bash
pip install anthropic selenium beautifulsoup4 pandas python-dotenv fastmcp
```
---
## πŸ—οΈ Architecture
### **File Structure**
```
src/
β”œβ”€β”€ ui/
β”‚ β”œβ”€β”€ __init__.py
β”‚ └── gradio_app.py # Main Gradio interface
β”œβ”€β”€ scrapers/
β”‚ └── opentable_scraper.py # Web scraping
β”œβ”€β”€ data_processing/
β”‚ └── review_cleaner.py # Text preprocessing
β”œβ”€β”€ agent/
β”‚ β”œβ”€β”€ base_agent.py # Core analysis agent
β”‚ β”œβ”€β”€ unified_analyzer.py # Menu/aspect analysis
β”‚ └── insights_generator.py # Chef/Manager insights
└── mcp_integrations/
β”œβ”€β”€ generate_chart.py # Visualizations
└── query_reviews.py # Q&A system (RAG)
```
### **Data Flow**
```
User Input (URL + Review Count)
↓
[Gradio Interface]
↓
[OpenTable Scraper] β†’ Raw HTML
↓
[Review Processor] β†’ Cleaned Text
↓
[AI Agent] β†’ Unified Analysis
↓
[Insights Generator] β†’ Chef + Manager Insights
↓
[Visualization Generator] β†’ Charts
↓
[Gradio Display] β†’ Interactive Results
↓
[Q&A System] ← User Questions
```
---
## πŸ› οΈ Implementation Steps
### **Step 1: Create UI Directory Structure**
```bash
mkdir -p src/ui
touch src/ui/__init__.py
touch src/ui/gradio_app.py
```
### **Step 2: Build Basic Gradio Interface**
**Key Gradio 6 Change:** Theme moved from `Blocks()` to `.launch()`
```python
import gradio as gr
# ❌ OLD (Gradio 5)
with gr.Blocks(theme=gr.themes.Soft()) as demo:
pass
# βœ… NEW (Gradio 6)
with gr.Blocks() as demo:
pass
demo.launch(theme=gr.themes.Soft())
```
### **Step 3: Design Layout**
**Three-Tab Design:**
1. **Chef Insights** - Menu performance, food quality
2. **Manager Insights** - Service, operations, ambience
3. **Ask Questions** - RAG-powered Q&A
**Components Used:**
- `gr.Textbox()` - URL input, progress display
- `gr.Dropdown()` - Review count selection, drill-down menus
- `gr.Button()` - Analyze, Ask buttons
- `gr.Image()` - Charts display
- `gr.Markdown()` - Formatted insights
- `gr.State()` - Context persistence (critical!)
- `gr.Tabs()` + `gr.Tab()` - Tabbed navigation
### **Step 4: Implement Progress Tracking**
Used `gr.Progress()` with `yield` for real-time updates:
```python
def analyze_restaurant_interface(url, review_count, progress=gr.Progress()):
# Phase 1: Scraping
progress(0.1, desc="πŸ“₯ Scraping reviews...")
yield (..., "πŸ“₯ Scraping reviews...", ...)
# Phase 2: Processing
progress(0.3, desc="βš™οΈ Processing data...")
yield (..., "βš™οΈ Processing data...", ...)
# Phase 3: Analysis
progress(0.8, desc="πŸ€– Running AI analysis...")
yield (..., "πŸ€– Running AI analysis...", ...)
# Final
progress(1.0, desc="βœ… Complete!")
yield (..., "βœ… Complete!", ...)
```
### **Step 5: Connect Backend**
**Imports:**
```python
from src.scrapers.opentable_scraper import scrape_opentable
from src.data_processing import process_reviews, clean_reviews_for_ai
from src.agent.base_agent import RestaurantAnalysisAgent
from src.mcp_integrations.query_reviews import query_reviews_direct
```
**Integration:**
```python
# Scrape
result = scrape_opentable(url=url, max_reviews=review_count, headless=True)
# Process
df = process_reviews(result)
reviews = clean_reviews_for_ai(df['review_text'].tolist())
# Analyze
agent = RestaurantAnalysisAgent()
analysis = agent.analyze_restaurant(url, restaurant_name, reviews)
# Display
chef_insights = analysis['insights']['chef']
manager_insights = analysis['insights']['manager']
```
### **Step 6: Implement Drill-Down Functionality**
**Dynamic Dropdowns:**
```python
# Populate dropdowns after analysis
chef_dropdown_choices = [item['name'] for item in menu_items]
manager_dropdown_choices = [aspect['name'] for aspect in aspects]
# Connect change events
chef_dropdown.change(
fn=get_menu_item_summary,
inputs=chef_dropdown,
outputs=chef_summary
)
```
**Detail Functions:**
```python
def get_menu_item_summary(item_name: str) -> str:
# Load menu_analysis.json
# Find selected item
# Return formatted summary with sentiment, mentions, reviews
pass
```
### **Step 7: Build Q&A System**
**Architecture:**
1. Index reviews after analysis
2. Store in memory dictionary (keyed by restaurant name)
3. Use keyword search to find relevant reviews
4. Send top 50 to Claude for answer
**Key Code:**
```python
# In query_reviews.py
def find_relevant_reviews(reviews, question, max_reviews=50):
# Extract keywords from question
keywords = [k for k in question.lower().split() if k not in stop_words]
# Score reviews by keyword matches
scored = [(sum(1 for k in keywords if k in r.lower()), r) for r in reviews]
scored.sort(reverse=True)
# Return top matches
return [r for score, r in scored[:max_reviews]]
```
**Context Persistence (Critical!):**
```python
# ❌ WRONG - Context lost between interactions
restaurant_context = gr.Textbox(visible=False)
# βœ… CORRECT - Context persists
restaurant_context = gr.State("")
```
---
## πŸ”‘ Key Components
### **1. Main Interface (`create_interface()`)**
**Features:**
- Clean, professional design
- Mobile-responsive layout
- Real-time progress updates
- Error handling
**Code Structure:**
```python
def create_interface():
with gr.Blocks(title="Restaurant Intelligence Agent") as demo:
# Header
gr.Markdown("# 🍽️ Restaurant Intelligence Agent")
# Input Section
with gr.Row():
url_input = gr.Textbox(...)
review_count = gr.Dropdown(...)
analyze_btn = gr.Button(...)
# Progress
progress_box = gr.Textbox(...)
# Hidden state
restaurant_context = gr.State("")
# Results Tabs
with gr.Tabs():
with gr.Tab("🍳 Chef Insights"):
...
with gr.Tab("πŸ‘” Manager Insights"):
...
with gr.Tab("πŸ’¬ Ask Questions"):
...
# Event handlers
analyze_btn.click(fn=analyze_restaurant_interface, ...)
return demo
```
### **2. Analysis Function (`analyze_restaurant_interface()`)**
**Generator Pattern for Progress:**
```python
def analyze_restaurant_interface(url, review_count, progress=gr.Progress()):
try:
# Validate input
if not url or "opentable" not in url.lower():
return error_output
# Phase 1: Scrape
progress(0.1, desc="Scraping...")
yield intermediate_output
result = scrape_opentable(...)
# Phase 2: Process
progress(0.3, desc="Processing...")
yield intermediate_output
reviews = process_reviews(result)
# Phase 3: Analyze
progress(0.5, desc="Analyzing...")
yield intermediate_output
analysis = agent.analyze_restaurant(...)
# Phase 4: Format & Display
progress(1.0, desc="Complete!")
yield final_output
except Exception as e:
yield error_output
```
### **3. Insight Formatting (`clean_insight_text()`)**
**Problem:** Claude returns insights in various formats:
- Plain text
- Lists: `["item1", "item2"]`
- Dicts: `[{"priority": "high", "action": "..."}]`
- Mixed with quotes and brackets
**Solution:** Universal text cleaner
```python
def clean_insight_text(text):
if isinstance(text, list):
# Handle list of dicts (recommendations)
if text and isinstance(text[0], dict):
return '\n\n'.join(f"β€’ {item['action']}" for item in text)
# Handle simple list
return '\n\n'.join(f"β€’ {item}" for item in text)
elif isinstance(text, str):
# Parse string representations
if text.startswith('[{'):
parsed = ast.literal_eval(text)
return format_list(parsed)
if text.startswith('['):
parsed = ast.literal_eval(text)
return '\n\n'.join(f"β€’ {item}" for item in parsed)
# Clean quotes
return text.strip('"\'[]')
return str(text)
```
### **4. Q&A System (`query_reviews.py`)**
**Features:**
- Keyword-based relevance scoring
- Searches all indexed reviews
- Returns top 50 most relevant
- Context-aware answers
**Key Functions:**
```python
# Index reviews after analysis
def index_reviews_direct(restaurant_name, reviews):
REVIEW_INDEX[restaurant_name.lower()] = reviews
return f"Indexed {len(reviews)} reviews"
# Find relevant reviews
def find_relevant_reviews(reviews, question, max_reviews=50):
keywords = extract_keywords(question)
scored = score_by_keywords(reviews, keywords)
return top_n(scored, max_reviews)
# Answer question
def query_reviews_direct(restaurant_name, question):
reviews = REVIEW_INDEX.get(restaurant_name.lower())
relevant = find_relevant_reviews(reviews, question)
return ask_claude(relevant, question)
```
---
## πŸ› Challenges & Solutions
### **Challenge 1: Gradio 6 Breaking Changes**
**Problem:** `theme=` parameter in `Blocks()` causes error
```
TypeError: BlockContext.__init__() got an unexpected keyword argument 'theme'
```
**Solution:** Move theme to `.launch()`
```python
# Before
with gr.Blocks(theme=gr.themes.Soft()) as demo:
pass
demo.launch()
# After
with gr.Blocks() as demo:
pass
demo.launch(theme=gr.themes.Soft())
```
### **Challenge 2: Insights Formatting Issues**
**Problem:** Raw JSON in display
```
["Strength 1", "Strength 2"]
[{'priority': 'high', 'action': '...'}]
```
**Solution:** Created `clean_insight_text()` function
- Handles lists, dicts, strings
- Extracts 'action' from recommendation dicts
- Converts to bullet points
- Removes brackets/quotes
### **Challenge 3: Manager Insights Rate Limit**
**Problem:** API rate limit (30K tokens/min) hit when generating insights
```
Error 429: rate_limit_error
```
**Solution:** Added 15s delay between chef and manager insights
```python
# In base_agent.py
chef_insights = generate_insights(role='chef')
time.sleep(15) # Wait to avoid rate limit
manager_insights = generate_insights(role='manager')
```
### **Challenge 4: Q&A Context Not Persisting**
**Problem:** Restaurant context arrives as empty string `''`
```python
DEBUG: restaurant_context = ''
```
**Solution:** Use `gr.State()` instead of hidden `gr.Textbox()`
```python
# Before
restaurant_context = gr.Textbox(visible=False)
# After
restaurant_context = gr.State("")
```
**Why:** `gr.State()` is designed for persisting values between interactions, while hidden textboxes can lose state.
### **Challenge 5: Poor Q&A Quality**
**Problem:** Q&A using only first 10 reviews, missing relevant content
```
"Reviews don't mention Brussels sprouts" (but they do!)
```
**Solution:**
1. Increased to 50 reviews
2. Added keyword-based filtering
3. Improved Claude prompt
**Result:** Now finds relevant reviews from entire dataset
---
## πŸ§ͺ Testing
### **Test 1: Basic Functionality (20 reviews)**
- βœ… Scraping works
- βœ… Analysis completes
- βœ… Insights display
- βœ… Charts generate
- βœ… Q&A works
### **Test 2: Rate Limits (100 reviews)**
- βœ… Manager insights generate (with 15s delay)
- βœ… No rate limit errors
- ⏱️ Total time: ~5-6 minutes
### **Test 3: Q&A Quality**
- βœ… Keyword search finds relevant reviews
- βœ… Answers cite specific review numbers
- βœ… Handles topics not in reviews gracefully
### **Test 4: Edge Cases**
- βœ… Invalid URL β†’ Clear error message
- βœ… Empty reviews β†’ Fallback message
- βœ… No context β†’ "Analyze restaurant first" message
---
## πŸ“Š Performance Metrics
| Reviews | Scraping | Analysis | Insights | Total | Cost |
|---------|----------|----------|----------|-------|------|
| 20 | 30s | 1m | 30s | 2m | $0.20 |
| 100 | 2m | 3m | 1m | 6m | $1.20 |
| 500 | 8m | 12m | 2m | 22m* | $5.00* |
*Estimated based on scaling
---
## 🎨 UI/UX Design Decisions
### **1. Three-Tab Layout**
**Why:** Separates concerns by user role
- Chef tab β†’ Food/menu focused
- Manager tab β†’ Operations focused
- Q&A tab β†’ Ad-hoc questions
### **2. Drill-Down Dropdowns**
**Why:** Reduces cognitive load
- Overview first (charts + summaries)
- Details on demand (select item)
### **3. Progress Indicators**
**Why:** Long-running operations (5-20 minutes)
- Real-time updates every 30 seconds
- Phase descriptions (Scraping β†’ Processing β†’ Analyzing)
- Prevents user from thinking app is frozen
### **4. Error Handling**
**Why:** Graceful degradation
- Clear error messages
- Fallback insights if generation fails
- Validation before expensive operations
---
## πŸš€ Next Steps
### **Immediate (Day 16)**
1. Deploy backend to Modal
2. Create Modal API endpoints
3. Update Gradio to call Modal instead of local functions
### **Day 17**
1. Create HuggingFace Space
2. Deploy Gradio UI to HF Space
3. Connect UI to Modal backend
4. Add API key as HF Secret
### **Day 18-19**
1. Create demo video (1-5 mins)
2. Polish README
3. Social media post
4. Final testing
5. Submit before Nov 30, 11:59 PM UTC
---
## πŸ“ Code Summary
### **Files Created/Modified (Day 15)**
1. **src/ui/gradio_app.py** (NEW - 620 lines)
- Main Gradio interface
- Progress tracking
- Event handlers
- Insight formatting
2. **src/mcp_integrations/query_reviews.py** (UPDATED)
- Added keyword-based search
- Increased max_reviews to 50
- Better prompts for Claude
3. **src/agent/base_agent.py** (UPDATED)
- Added 15s delay between insights
- Fixed state clearing
4. **src/agent/insights_generator.py** (UPDATED)
- Better error handling
- Improved prompts
5. **src/data_processing/review_cleaner.py** (CREATED)
- Text sanitization
- Token reduction
---
## πŸŽ“ Key Learnings
### **Gradio 6 Best Practices**
1. **Use `gr.State()` for persistence**, not hidden textboxes
2. **Move theme to `.launch()`**, not `Blocks()`
3. **Use generators with `yield`** for progress updates
4. **Wrap long operations** in try-except with user-friendly errors
5. **Test with `share=False`** locally before deploying
### **AI Agent Integration**
1. **Add delays between API calls** to avoid rate limits
2. **Handle variable response formats** from LLMs
3. **Provide fallback responses** when generation fails
4. **Log extensively** for debugging
5. **Validate responses** before displaying
### **Q&A System Design**
1. **Simple keyword search** often beats complex embeddings for small datasets
2. **Normalize inputs** (lowercase, strip) to avoid mismatches
3. **Show what's available** when context missing
4. **Cite sources** in answers for credibility
5. **Filter first, then send to LLM** to reduce tokens
---
## πŸ“š References
- [Gradio 6 Documentation](https://www.gradio.app/docs)
- [Gradio 6 Migration Guide](https://www.gradio.app/main/guides/gradio-6-migration-guide)
- [Anthropic API Docs](https://docs.anthropic.com/)
- [MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)
---
## βœ… Day 15 Completion Checklist
- [x] Install Gradio 6
- [x] Create UI directory structure
- [x] Build basic interface
- [x] Implement progress tracking
- [x] Connect backend (scraper, agent, insights)
- [x] Add drill-down functionality
- [x] Build Q&A system with RAG
- [x] Fix insights formatting
- [x] Fix rate limit issues
- [x] Fix Q&A context persistence
- [x] Improve Q&A quality (keyword search)
- [x] Test with 20 reviews βœ…
- [x] Test with 100 reviews βœ…
- [x] Document implementation βœ…
---
**Status:** βœ… Day 15 Complete!
**Next:** Day 16 - Modal Backend Deployment
---
*Generated: November 24, 2025*
*Project: Restaurant Intelligence Agent*
*Hackathon: Anthropic MCP 1st Birthday - Track 2*