mikeboone committed
Commit b91d31b · 1 Parent(s): 35d3d13

Features: Add tag support, improve error handling, MCP improvements

- Add tag_name parameter to deployment flow
- Improve population retry logic with numbered options
- Add MCP package to requirements
- Update CLAUDE.md with dual method documentation
- File organization: move docs to dev_notes/, create scratch/ folder
- Add .cursorrules for file organization
- Add data adjuster modules for runtime data modification

.cursorrules ADDED
@@ -0,0 +1,214 @@
+ # Cursor Rules for Demo Wire Project
+
+ ## CRITICAL PROCESS MANAGEMENT RULE
+
+ **NEVER restart the application unless explicitly requested**
+
+ - DO NOT run commands like `kill`, `lsof -ti:7862 | xargs kill`, or restart scripts
+ - DO NOT execute `launch_chat.py`, `demo_prep.py` or similar startup commands
+ - When making code changes, say "Changes saved to [file] - restart when ready" and STOP
+ - ONLY restart if user explicitly says: "restart", "rerun the app", "relaunch", "kill and restart"
+ - If user says "don't stop the process" or "keep it running" - NEVER restart, no matter what
+
+ ## Contact & Communication
+
+ - User (boone) is a ThoughtSpot Architect with 20+ years experience
+ - You can challenge ideas, but respect deep domain knowledge
+ - When asked to STOP - stop immediately and wait for further instructions
+ - Start discussions with back-and-forth using numbered questions
+ - TS = ThoughtSpot (the company where boone works)
+ - Ask rather than assume when in doubt
+
+ ## Debugging and Problem-Solving Protocol
+
+ ### Confidence and Communication
+
+ 1. **Don't act over-confident unless you're extremely sure**
+    - Check documentation before claiming you know how something works
+    - Say "I don't know" or "Let me check" instead of guessing
+    - If you're uncertain, say so upfront
+
+ 2. **NEVER claim something is fixed until it's tested**
+    - ❌ WRONG: "I've fixed the tags issue" (without testing)
+    - ✅ RIGHT: "I've updated the code - let's test if tags work now"
+    - Show the test results, don't just assume it works
+
+ 3. **When debugging:**
+    - Check documentation FIRST before blaming external systems
+    - State what you're checking and why
+    - Share what you found (even if it proves you wrong)
+
+ 4. **It's OK to say:**
+    - "I don't know - should I research this?"
+    - "I'm not certain, but here are 2 possibilities..."
+    - "Let me verify this works before saying it's fixed"
+
+ ## File Organization - CRITICAL
+
+ ### NEVER create files in root without asking first
+
+ **tests/** - Real, reusable test cases only
+ - Unit tests for core functions
+ - Integration tests that could be automated
+ - Tests you'd run in CI/CD
+
+ **scratch/** - ALL temporary/experimental/debug files
+ - ALL experimental/debug/check/verify/analyze scripts
+ - One-off fixes (fix_*.py, adjust_*.py, emergency_*.py)
+ - Debug scripts (debug_*.py, check_*.py, verify_*.py)
+ - Analysis tools (analyze_*.py, get_*.py, show_*.py)
+ - Test files you're experimenting with
+ - Backup files (.bak, .bak2)
+ - Export/debug .yml/.json files
+ - **ANY script that's temporary or one-time use**
+ - DO NOT commit without cleanup
+
+ **dev_notes/** - All documentation
+ - All .md files except README.md and CLAUDE.md
+ - Presentation materials (.pptx, .html, .txt)
+ - Research documents
+ - Architecture notes
+
+ **Root directory** - ONLY essential files
+ - Main application files
+ - Core utilities
+ - Configuration (.env, requirements.txt)
+ - README.md and CLAUDE.md only
+ - DO NOT create random files here
+
+ ### Simple Decision Tree
+
+ Creating a new file? Ask yourself:
+
+ 1. **Is it a real test that should be automated?** → `tests/`
+ 2. **Is it documentation/presentation?** → `dev_notes/`
+ 3. **Is it core application code?** → Root (but ASK first!)
+ 4. **Everything else?** → `scratch/` (debug, check, verify, analyze, fix, backup, experimental)
+
+ ### When in doubt: PUT IT IN SCRATCH
+
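The decision tree above can be sketched as a small routing helper. This is purely illustrative — no such function exists in the project, and the "is it a real test / core app code" answers are judgment calls the rules say to ask about, modeled here as boolean flags:

```python
import re

def target_folder(filename: str, is_real_test: bool = False, is_core_app: bool = False) -> str:
    """Hypothetical sketch of the file-organization decision tree."""
    if is_real_test:
        return "tests/"                    # real, automatable tests only
    if filename in ("README.md", "CLAUDE.md"):
        return "."                         # the only docs allowed in root
    if re.search(r"\.(md|pptx|html|txt)$", filename):
        return "dev_notes/"                # docs and presentation materials
    if is_core_app:
        return "."                         # root - but ASK first!
    return "scratch/"                      # everything else: debug, backup, experimental

print(target_folder("debug_tags.py"))      # scratch/
print(target_folder("architecture.md"))    # dev_notes/
```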
+ ## Testing Existing Features
+
+ **NEVER create simplified versions of working code**
+
+ When testing:
+ - ❌ WRONG: Write new simplified code from scratch
+ - ✅ RIGHT: Call existing working functions in a test harness
+
+ Example:
+ - ❌ WRONG: Create new viz configs and call low-level functions
+ - ✅ RIGHT: Call `create_liveboard_from_model()` (the actual function used)
+
+ ## Environment & Setup
+
+ - Python virtual environment: `./demo_wire/bin/activate`
+ - NOT using conda
+ - Supabase IS installed and configured (.env works)
+ - ThoughtSpot auth works with demo_builder_user
+
+ ## Common Mistakes to AVOID
+
+ 1. DO NOT add unnecessary .env validation - variables are populated
+ 2. DO NOT try to install supabase/packages - already in venv
+ 3. DO NOT change defaults to "thoughtspot.com" - that's for customer URLs
+ 4. DO NOT assume worksheets needed - they're deprecated, use models
+ 5. ALWAYS use venv when running Python: `source ./demo_wire/bin/activate && python`
+
+ ## Before Making Changes
+
+ 1. Check sprint documentation: `dev_notes/sprint2_102025.md`
+ 2. Read "Known Issues" section
+ 3. Verify using venv, not system Python
+ 4. Don't add validation that blocks working code
+ 5. Check file organization rules before creating files
+
+ ## Frustration Points (AVOID)
+
+ User gets frustrated when you:
+ - Don't trust that .env variables are correct
+ - Try to reinstall already-installed packages
+ - Make changes without understanding context
+ - Break working code by "simplifying" it
+ - **RESTART THE APPLICATION WITHOUT PERMISSION**
+
+ ## Working Patterns
+
+ - Settings save/load through Supabase works
+ - ThoughtSpot TML IS YAML format (yaml.dump() required)
+ - Models replaced worksheets
+ - Liveboards match "golden demo" style
+
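The "TML IS YAML" pattern above can be illustrated with a PyYAML round trip. The dict below is a made-up minimal structure for illustration only, not the real TML schema:

```python
import yaml  # PyYAML; the working pattern above says yaml.dump() is required for TML

# Illustrative structure only - not the actual ThoughtSpot TML schema.
liveboard = {
    "liveboard": {
        "name": "Demo Liveboard",
        "visualizations": [
            {"answer": {"search_query": "[sales] [date].monthly"}}
        ],
    }
}

tml_text = yaml.dump(liveboard, sort_keys=False)  # dict -> TML (YAML) text
assert yaml.safe_load(tml_text) == liveboard      # parses back to the same structure
```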
+ ## Liveboard Creation - DUAL METHOD SYSTEM
+
+ **PRIMARY GOAL:** Both MCP and TML methods must work simultaneously with shared codebase
+
+ ### Method Selection (via environment variable)
+ - `USE_MCP_LIVEBOARD=true` → MCP method (default)
+ - `USE_MCP_LIVEBOARD=false` → TML method
+ - Entry point: `thoughtspot_deployer.py:1548`
+
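A minimal sketch of the switch described above, assuming only what the rules state (env var name and its default). The function and return values are stand-ins, not the real entry point in `thoughtspot_deployer.py`:

```python
import os

def create_liveboard(model_id: str) -> str:
    # USE_MCP_LIVEBOARD defaults to true per the rules above (MCP is the default method)
    use_mcp = os.getenv("USE_MCP_LIVEBOARD", "true").lower() == "true"
    if use_mcp:
        return f"mcp:{model_id}"   # stand-in for create_liveboard_from_model_mcp()
    return f"tml:{model_id}"       # stand-in for create_liveboard_from_model()

os.environ["USE_MCP_LIVEBOARD"] = "false"
print(create_liveboard("model-1"))  # tml:model-1
```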
+ ### MCP Method (AI-Driven)
+ - Uses Model Context Protocol with ThoughtSpot's agent.thoughtspot.app
+ - Natural language questions → ThoughtSpot creates visualizations
+ - OAuth authentication, requires npx/Node.js
+ - **Status:** Working well!
+ - **Main function:** `create_liveboard_from_model_mcp()` - line 2006
+
+ ### TML Method (Template-Based)
+ - Builds ThoughtSpot Modeling Language (YAML) structures
+ - Direct control over visualization types and layout
+ - REST API with token auth
+ - **Status:** Needs fixes for KPIs and search queries
+ - **Main function:** `create_liveboard_from_model()` - line 1779
+
+ ### Critical Shared Code (Changes affect BOTH methods)
+ - `_generate_smart_questions_with_ai()` - line 1863
+ - `_generate_fallback_visualizations()` - line 1442
+ - LLM Researcher instance
+ - Outlier conversion helpers
+
+ ### Method-Specific Code (Safe to change independently)
+ - **MCP only:**
+   - `_convert_outlier_to_mcp_question()` - line 1809
+   - `_create_kpi_question_from_outlier()` - line 1835
+   - MCP tool calls (getAnswer, createLiveboard)
+ - **TML only:**
+   - `LiveboardCreator` class - line 452
+   - `generate_search_query()` - line 539
+   - `create_visualization_tml()` - line 1196
+   - `deploy_liveboard()` - line 1670
+
+ ### KPI Requirements (BOTH methods need these)
+ - **For sparklines and percent change comparisons:**
+   - Must include time dimension (date column)
+   - Must specify granularity (daily, weekly, monthly, quarterly, yearly)
+   - Example: `[Total_revenue] [Order_date].monthly`
+ - **MCP:** Natural language includes time context
+ - **TML:** Search query must have `[measure] [date_column].granularity`
+
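The TML KPI requirement above can be checked mechanically. This is a hedged sketch of such a validator — the project's TML code path may not contain anything like it — using only the pattern and granularity list stated in the rules:

```python
import re

# Granularities named in the KPI rules above
GRANULARITIES = ("daily", "weekly", "monthly", "quarterly", "yearly")

def has_time_granularity(search_query: str) -> bool:
    """Return True if the query contains a bracketed column followed by .granularity,
    e.g. '[Total_revenue] [Order_date].monthly'. Illustrative helper, not project code."""
    pattern = r"\[[^\]]+\]\.(?:%s)\b" % "|".join(GRANULARITIES)
    return re.search(pattern, search_query) is not None

print(has_time_granularity("[Total_revenue] [Order_date].monthly"))  # True
print(has_time_granularity("[Total_revenue]"))                       # False
```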
+ ### Terminology Clarification
+ - **Outliers:** Interesting data points in existing data (both methods support)
+ - **Data Adjuster:** Modifying values for scenarios (NOT supported by MCP, Snowflake views needed)
+
+ ### Golden Demo Structure
+ - **Location:** `/Users/mike.boone/cursor_demowire/DemoPrep/liveboard_demogold2/🏬 Global Retail Apparel Sales (New).liveboard.tml`
+ - Uses GROUPS (tabs) not text tiles for organization
+ - KPI structure: `[sales] [date].weekly [date].'last 8 quarters'`
+ - Groups organize visualizations by theme
+ - Brand colors applied via style_properties
+
+ ### Testing Strategy
+ - Test BOTH methods when changing shared code
+ - Use separate test files: `tests_temp/test_mcp_*.py` vs `tests_temp/test_tml_*.py`
+ - Same dataset, same company, compare results
+
+ ## Project Context
+
+ - Read primary context: `dev_notes/sprint2_102025.md`
+ - Software stored in boone's repo, will be open sourced to TS repo
+ - This is a living project - update understanding as you learn
+
+ ---
+
+ *Derived from CLAUDE.md - Last Updated: November 18, 2025*
+
.gitignore CHANGED
@@ -217,3 +217,4 @@ dev/
  # Development notes and sensitive documentation
  dev_notes/
  *.tml
+ scratch/
CHAT_ARCHITECTURE_PLAN.md DELETED
@@ -1,1408 +0,0 @@
1
- # AI-Centric Chat Application Architecture Plan
2
- ## Transformation from Linear Workflow to Conversational Demo Builder
3
-
4
- **Date:** 2025-11-12
5
- **Purpose:** Transform the current button-based demo builder into a conversational AI agent that guides users through demo creation with approval gates and iterative refinement.
6
-
7
- ---
8
-
9
- ## Executive Summary
10
-
11
- Transform the existing linear workflow application into an **AI-powered conversational demo builder** where:
12
- - Users chat naturally with an AI agent to create demos
13
- - The AI interprets intent, executes actions, and asks for approval
14
- - Each stage requires explicit user approval before proceeding
15
- - Users can iterate and refine outputs at any stage
16
- - The system maintains context and state across the conversation
17
-
18
- ---
19
-
20
- ## Current Architecture Analysis
21
-
22
- ### Existing Components
23
- ```
24
- demo_prep.py (3581 lines)
25
- ├── UI Layer: Gradio interface with button-based workflow
26
- ├── Workflow Handler: progressive_workflow_handler()
27
- ├── Stage Execution: Linear progression through stages
28
- └── State Management: DemoBuilder class
29
-
30
- Supporting Modules:
31
- ├── demo_builder_class.py - State management
32
- ├── main_research.py - Company/industry research
33
- ├── schema_utils.py - DDL parsing & validation
34
- ├── cdw_connector.py - Snowflake deployment
35
- ├── thoughtspot_deployer.py - TS object creation
36
- ├── liveboard_creator.py - Visualization generation
37
- ├── model_semantic_updater.py - Model enhancement
38
- └── demo_personas.py - Business context
39
- ```
40
-
41
- ### Current Workflow Stages
42
- 1. **Research** → Company analysis + Industry research
43
- 2. **Create DDL** → Generate Snowflake schema
44
- 3. **Create Population Code** → Generate data scripts
45
- 4. **Deploy** → Push to Snowflake + Create TS connection/tables
46
- 5. **Create Model** → Generate semantic model
47
- 6. **Create Liveboard** → Generate visualizations
48
-
49
- ### Current Limitations
50
- - ❌ No approval gates - everything auto-advances
51
- - ❌ Limited iteration - can only "redo" entire stages
52
- - ❌ No conversational refinement
53
- - ❌ Button-based interaction (not chat)
54
- - ❌ No AI interpretation of user intent
55
- - ❌ Hard to make targeted changes (e.g., "change this one visualization")
56
-
57
- ---
58
-
59
- ## New Chat-Based Architecture
60
-
61
- ### Core Concept: AI Controller Pattern
62
-
63
- ```
64
- ┌─────────────────────────────────────────────────────┐
65
- │ USER INPUT │
66
- │ (Natural Language Chat) │
67
- └────────────────────┬────────────────────────────────┘
68
-
69
-
70
- ┌─────────────────────────────────────────────────────┐
71
- │ INTENT CLASSIFIER │
72
- │ (AI Router - determines what user wants) │
73
- │ │
74
- │ Categories: │
75
- │ • stage_advance (ready to move forward) │
76
- │ • stage_approval (approve current output) │
77
- │ • stage_rejection (redo current output) │
78
- │ • refinement_request (modify specific aspect) │
79
- │ • information_request (explain/show something) │
80
- │ • configuration_change (update parameters) │
81
- └────────────────────┬────────────────────────────────┘
82
-
83
-
84
- ┌─────────────────────────────────────────────────────┐
85
- │ CONVERSATION CONTROLLER │
86
- │ (Manages workflow state & execution) │
87
- │ │
88
- │ Responsibilities: │
89
- │ • Maintains conversation context │
90
- │ • Tracks current stage & substage │
91
- │ • Determines when stage is "complete" │
92
- │ • Executes appropriate actions │
93
- │ • Manages approval gates │
94
- │ • Handles error recovery │
95
- └────────────────────┬────────────────────────────────┘
96
-
97
-
98
- ┌─────────────────────────────────────────────────────┐
99
- │ STAGE EXECUTORS │
100
- │ (Specialized handlers for each workflow stage) │
101
- │ │
102
- │ • ResearchExecutor │
103
- │ • DDLExecutor │
104
- │ • PopulationExecutor │
105
- │ • DeploymentExecutor │
106
- │ • ModelExecutor │
107
- │ • LiveboardExecutor │
108
- │ • VisualizationRefiner (new!) │
109
- │ • SiteCreator (new!) │
110
- │ • BotCreator (new!) │
111
- └────────────────────┬────────────────────────────────┘
112
-
113
-
114
- ┌─────────────────────────────────────────────────────┐
115
- │ RESPONSE GENERATOR │
116
- │ (Formats output for user in chat) │
117
- │ │
118
- │ • Streaming responses │
119
- │ • Rich formatting (code blocks, tables) │
120
- │ • Action prompts ("What would you like to do?") │
121
- │ • Progress indicators │
122
- └─────────────────────────────────────────────────────┘
123
- ```
124
-
125
- ---
126
-
127
- ## Detailed Component Design
128
-
129
- ### 1. Intent Classifier (New Component)
130
-
131
- **Purpose:** Interpret user's natural language to determine their intent
132
-
133
- **Implementation:**
134
- ```python
135
- class IntentClassifier:
136
- """
137
- Uses LLM to classify user intent and extract parameters
138
- """
139
-
140
- def __init__(self, llm_provider="anthropic", model="claude-sonnet-4.5"):
141
- self.provider = llm_provider
142
- self.model = model
143
-
144
- def classify_intent(self, user_message: str, conversation_context: ConversationContext) -> Intent:
145
- """
146
- Returns Intent object with:
147
- - intent_type: enum (APPROVE, REJECT, REFINE, ADVANCE, INFO, CONFIGURE)
148
- - confidence: float
149
- - parameters: dict (extracted entities)
150
- - reasoning: str (why this classification)
151
- """
152
-
153
- system_prompt = f"""You are an intent classifier for a demo preparation workflow.
154
-
155
- Current Stage: {conversation_context.current_stage}
156
- Current Stage Status: {conversation_context.stage_status}
157
- Available Actions: {conversation_context.get_available_actions()}
158
-
159
- Classify the user's intent into one of these categories:
160
-
161
- 1. APPROVE - User approves current output and wants to proceed
162
- Examples: "looks good", "approve", "yes proceed", "let's move on"
163
-
164
- 2. REJECT - User wants to redo current stage with changes
165
- Examples: "no that's not right", "redo the DDL", "try again with..."
166
-
167
- 3. REFINE - User wants to modify specific aspect of current output
168
- Examples: "change the customer table", "make this visualization a bar chart"
169
-
170
- 4. ADVANCE - User wants to move to next stage (when no approval needed)
171
- Examples: "create the population code", "let's do the liveboard"
172
-
173
- 5. INFO - User wants information or explanation
174
- Examples: "show me the DDL", "what tables did you create", "explain this"
175
-
176
- 6. CONFIGURE - User wants to change settings/parameters
177
- Examples: "use GPT-4 instead", "increase data volume", "change company to Acme"
178
-
179
- Extract relevant parameters and return structured JSON.
180
- """
181
-
182
- # Make LLM call to classify intent
183
- response = self.llm_call(system_prompt, user_message)
184
- return Intent.from_json(response)
185
- ```
186
-
187
- **Key Features:**
188
- - Context-aware (knows current stage)
189
- - Extracts parameters (e.g., which table to modify, new company name)
190
- - Confidence scoring
191
- - Falls back to clarification questions if ambiguous
192
-
193
- ---
194
-
195
- ### 2. Conversation Controller (Enhanced DemoBuilder)
196
-
197
- **Purpose:** Orchestrates the entire demo creation conversation
198
-
199
- **Implementation:**
200
- ```python
201
- class ConversationController:
202
- """
203
- Main orchestrator for chat-based demo creation
204
- Inherits from DemoBuilder, adds conversational logic
205
- """
206
-
207
- def __init__(self, use_case: str = None, company_url: str = None):
208
- # State tracking
209
- self.conversation_history: List[Message] = []
210
- self.current_stage: Stage = Stage.INITIALIZATION
211
- self.stage_status: StageStatus = StageStatus.NOT_STARTED
212
-
213
- # Approval tracking
214
- self.pending_approval: Optional[ApprovalRequest] = None
215
- self.stage_outputs: Dict[Stage, Any] = {}
216
-
217
- # Context for AI decision making
218
- self.conversation_context = ConversationContext()
219
-
220
- # Executors for each stage
221
- self.executors = {
222
- Stage.RESEARCH: ResearchExecutor(),
223
- Stage.CREATE_DDL: DDLExecutor(),
224
- Stage.CREATE_POPULATION: PopulationExecutor(),
225
- Stage.DEPLOY: DeploymentExecutor(),
226
- Stage.CREATE_MODEL: ModelExecutor(),
227
- Stage.CREATE_LIVEBOARD: LiveboardExecutor(),
228
- Stage.REFINE_VIZS: VisualizationRefiner(),
229
- Stage.CREATE_SITE: SiteCreator(),
230
- Stage.CREATE_BOT: BotCreator()
231
- }
232
-
233
- async def process_message(self, user_message: str) -> AsyncGenerator[str, None]:
234
- """
235
- Main entry point for processing user messages
236
- Yields streaming responses
237
- """
238
- # 1. Add to conversation history
239
- self.conversation_history.append(Message(role="user", content=user_message))
240
-
241
- # 2. Classify intent
242
- intent = await self.intent_classifier.classify_intent(
243
- user_message,
244
- self.conversation_context
245
- )
246
-
247
- # 3. Update context
248
- self.conversation_context.update(intent, self.current_stage)
249
-
250
- # 4. Route to appropriate handler
251
- if intent.type == IntentType.APPROVE:
252
- async for response in self.handle_approval(intent):
253
- yield response
254
-
255
- elif intent.type == IntentType.REJECT:
256
- async for response in self.handle_rejection(intent):
257
- yield response
258
-
259
- elif intent.type == IntentType.REFINE:
260
- async for response in self.handle_refinement(intent):
261
- yield response
262
-
263
- elif intent.type == IntentType.ADVANCE:
264
- async for response in self.handle_stage_advance(intent):
265
- yield response
266
-
267
- elif intent.type == IntentType.INFO:
268
- async for response in self.handle_info_request(intent):
269
- yield response
270
-
271
- elif intent.type == IntentType.CONFIGURE:
272
- async for response in self.handle_configuration(intent):
273
- yield response
274
-
275
- # 5. Determine next action and prompt user
276
- next_prompt = self.get_next_action_prompt()
277
- yield f"\n\n{next_prompt}"
278
-
279
- def should_request_approval(self, stage: Stage) -> bool:
280
- """
281
- Determines if a stage requires user approval before proceeding
282
- """
283
- approval_required_stages = [
284
- Stage.RESEARCH,
285
- Stage.CREATE_DDL,
286
- Stage.CREATE_POPULATION,
287
- Stage.CREATE_MODEL,
288
- Stage.CREATE_LIVEBOARD
289
- ]
290
- return stage in approval_required_stages
291
-
292
- async def handle_approval(self, intent: Intent) -> AsyncGenerator[str, None]:
293
- """
294
- User approved current stage output
295
- """
296
- if not self.pending_approval:
297
- yield "There's nothing pending approval right now. What would you like to do?"
298
- return
299
-
300
- # Mark stage as approved
301
- self.stage_outputs[self.current_stage].status = OutputStatus.APPROVED
302
- yield f"✅ Great! {self.current_stage.display_name} approved.\n\n"
303
-
304
- # Advance to next stage
305
- next_stage = self.get_next_stage()
306
- if next_stage:
307
- yield f"Moving to: **{next_stage.display_name}**\n"
308
- self.current_stage = next_stage
309
-
310
- # Start next stage automatically or ask what to do
311
- if self.should_auto_start(next_stage):
312
- async for response in self.execute_stage(next_stage):
313
- yield response
314
- else:
315
- yield f"Ready to start {next_stage.display_name}. Just say 'go' or 'start' when ready!"
316
- else:
317
- yield "🎉 All stages complete! Your demo is ready."
318
-
319
- async def handle_rejection(self, intent: Intent) -> AsyncGenerator[str, None]:
320
- """
321
- User rejected current stage output and wants to redo
322
- """
323
- yield f"🔄 Got it, let me redo the {self.current_stage.display_name}.\n\n"
324
-
325
- # Extract what to change from intent parameters
326
- changes = intent.parameters.get('requested_changes', '')
327
- if changes:
328
- yield f"Incorporating your feedback: {changes}\n\n"
329
-
330
- # Re-execute current stage with modifications
331
- self.stage_status = StageStatus.IN_PROGRESS
332
- executor = self.executors[self.current_stage]
333
-
334
- async for response in executor.execute(
335
- context=self.get_execution_context(),
336
- modifications=intent.parameters
337
- ):
338
- yield response
339
-
340
- # Request approval again
341
- self.pending_approval = ApprovalRequest(stage=self.current_stage)
342
- yield "\n\n" + self.format_approval_request()
343
-
344
- async def handle_refinement(self, intent: Intent) -> AsyncGenerator[str, None]:
345
- """
346
- User wants to refine specific aspect of current output
347
- """
348
- target = intent.parameters.get('target') # e.g., "customer_table", "viz_3"
349
- modification = intent.parameters.get('modification') # e.g., "add email column"
350
-
351
- yield f"🎨 Refining {target}...\n\n"
352
-
353
- # Use specialized refiner based on what's being modified
354
- if self.current_stage == Stage.CREATE_DDL:
355
- async for response in self.refine_ddl(target, modification):
356
- yield response
357
-
358
- elif self.current_stage == Stage.CREATE_LIVEBOARD:
359
- async for response in self.refine_visualization(target, modification):
360
- yield response
361
-
362
- # Show updated output and ask if good now
363
- yield "\n\nHere's the updated version. How does this look?"
364
-
365
- def get_next_action_prompt(self) -> str:
366
- """
367
- Returns context-appropriate prompt for user's next action
368
- """
369
- if self.pending_approval:
370
- return "👉 **Please review and approve**, or tell me what to change."
371
-
372
- if self.stage_status == StageStatus.COMPLETE:
373
- next_stage = self.get_next_stage()
374
- if next_stage:
375
- return f"👉 **Ready for {next_stage.display_name}?** Say 'yes' to continue or ask me questions."
376
- else:
377
- return "🎉 **Demo complete!** What would you like to do next?"
378
-
379
- if self.stage_status == StageStatus.ERROR:
380
- return "❌ **There was an error.** Would you like me to try again?"
381
-
382
- return "💬 **What would you like to do?**"
383
- ```
384
-
385
- ---
386
-
387
- ### 3. Stage Executors (Specialized Handlers)
388
-
389
- **Purpose:** Each stage has a dedicated executor that knows how to perform that specific task
390
-
391
- **Base Executor Interface:**
392
- ```python
393
- from abc import ABC, abstractmethod
394
-
395
- class StageExecutor(ABC):
396
- """Base class for all stage executors"""
397
-
398
- @abstractmethod
399
- async def execute(
400
- self,
401
- context: ExecutionContext,
402
- modifications: Optional[Dict] = None
403
- ) -> AsyncGenerator[str, None]:
404
- """
405
- Execute this stage
406
- Yields streaming responses
407
- """
408
- pass
409
-
410
- @abstractmethod
411
- def can_refine(self, target: str) -> bool:
412
- """
413
- Returns True if this executor can refine the specified target
414
- """
415
- pass
416
-
417
- @abstractmethod
418
- async def refine(
419
- self,
420
- target: str,
421
- modification: str,
422
- context: ExecutionContext
423
- ) -> AsyncGenerator[str, None]:
424
- """
425
- Refine specific aspect of this stage's output
426
- """
427
- pass
428
- ```
429
-
430
- **Example: DDL Executor**
431
- ```python
432
- class DDLExecutor(StageExecutor):
433
- """
434
- Handles DDL generation with intelligent refinement
435
- """
436
-
437
- async def execute(
438
- self,
439
- context: ExecutionContext,
440
- modifications: Optional[Dict] = None
441
- ) -> AsyncGenerator[str, None]:
442
- """
443
- Generate DDL from research context
444
- """
445
- yield "## 🏗️ Creating Database Schema\n\n"
446
-
447
- # Build prompt with research context
448
- prompt = self.build_ddl_prompt(
449
- research=context.research_results,
450
- use_case=context.use_case,
451
- modifications=modifications
452
- )
453
-
454
- # Stream DDL generation
455
- yield "```sql\n"
456
-
457
- ddl_content = ""
458
- async for chunk in context.llm.stream(prompt):
459
- ddl_content += chunk
460
- yield chunk
461
-
462
- yield "\n```\n\n"
463
-
464
- # Validate DDL
465
- is_valid, validation_msg = validate_ddl_syntax(ddl_content)
466
- if is_valid:
467
- yield f"✅ DDL validation passed\n\n"
468
-
469
- # Parse and show table summary
470
- tables = parse_ddl_schema(ddl_content)
471
- yield f"**Generated {len(tables)} tables:**\n"
472
- for table_name, table_info in tables.items():
473
- yield f"- `{table_name}` ({len(table_info['columns'])} columns)\n"
474
- else:
475
- yield f"⚠️ Validation warning: {validation_msg}\n\n"
476
-
477
- # Store in context
478
- context.store_output(Stage.CREATE_DDL, {
479
- 'ddl': ddl_content,
480
- 'tables': tables,
481
- 'validation': validation_msg
482
- })
483
-
484
- def can_refine(self, target: str) -> bool:
485
- """Check if target is a table name or column"""
486
- tables = self.context.get_output(Stage.CREATE_DDL).get('tables', {})
487
-
488
- # Check if it's a table name
489
- if target.lower() in [t.lower() for t in tables.keys()]:
490
- return True
491
-
492
- # Check if it's a column in format "table.column"
493
- if '.' in target:
494
- table, column = target.split('.')
495
- if table in tables and column in tables[table]['columns']:
496
- return True
497
-
498
- return False
499
-
500
- async def refine(
501
- self,
502
- target: str,
503
- modification: str,
504
- context: ExecutionContext
505
- ) -> AsyncGenerator[str, None]:
506
- """
507
- Refine specific table or column
508
- """
509
- current_ddl = context.get_output(Stage.CREATE_DDL)['ddl']
510
-
511
- # Use LLM to modify just the target portion
512
- prompt = f"""You are modifying a SQL DDL schema.
513
-
514
- Current DDL:
515
- {current_ddl}
516
-
517
- User wants to modify: {target}
518
- Requested change: {modification}
519
-
520
- Return the COMPLETE updated DDL with the changes applied.
521
- Maintain all other tables unchanged.
522
- Only modify what was requested.
523
- """
524
-
525
- yield f"Updating {target}...\n\n"
526
- yield "```sql\n"
527
-
528
- updated_ddl = ""
529
- async for chunk in context.llm.stream(prompt):
530
- updated_ddl += chunk
531
- yield chunk
532
-
533
- yield "\n```\n\n"
534
-
535
- # Update stored output
536
- context.store_output(Stage.CREATE_DDL, {'ddl': updated_ddl})
537
-
538
- yield f"✅ Updated {target}\n"
539
- ```
540
-
541
- **Example: Visualization Refiner (New)**
542
- ```python
543
- class VisualizationRefiner(StageExecutor):
544
- """
545
- Specialized executor for refining visualizations
546
- """
547
-
548
- async def refine(
549
- self,
550
- target: str, # e.g., "viz_3" or "revenue over time chart"
551
- modification: str, # e.g., "change to bar chart" or "add region filter"
552
- context: ExecutionContext
553
- ) -> AsyncGenerator[str, None]:
554
- """
555
- Intelligently refine a specific visualization
556
- """
557
- liveboard = context.get_output(Stage.CREATE_LIVEBOARD)
558
-
559
- # Find the target visualization
560
- viz_index = self.find_visualization(target, liveboard)
561
- if viz_index is None:
562
- yield f"❌ Couldn't find visualization: {target}\n"
563
- return
564
-
565
- current_viz = liveboard['visualizations'][viz_index]
566
-
567
- yield f"🎨 Refining visualization: **{current_viz['title']}**\n\n"
568
-
569
- # Classify what kind of modification is needed
570
- modification_type = await self.classify_modification(modification)
571
-
572
- if modification_type == 'CHART_TYPE':
573
- # Change chart type
574
- new_chart_type = extract_chart_type(modification)
575
- yield f"Changing from {current_viz['chart_type']} to {new_chart_type}...\n"
576
-
577
- # Regenerate with new chart type
578
- updated_viz = await self.regenerate_viz_with_type(
579
- current_viz,
580
- new_chart_type,
581
- context
582
- )
583
-
584
- elif modification_type == 'DATA_FILTER':
585
- # Add/modify filter
586
- yield f"Adding filter: {modification}\n"
587
- updated_viz = await self.add_viz_filter(current_viz, modification, context)
588
-
589
- elif modification_type == 'MEASURE_CHANGE':
590
- # Change measure/dimension
591
- yield f"Updating data fields...\n"
592
- updated_viz = await self.update_viz_fields(current_viz, modification, context)
593
-
594
- # Update liveboard
595
- liveboard['visualizations'][viz_index] = updated_viz
596
- context.store_output(Stage.CREATE_LIVEBOARD, liveboard)
597
-
598
- # Show preview
599
- yield "\n**Updated Visualization:**\n"
600
- yield self.format_viz_preview(updated_viz)
601
- yield "\n"
602
- ```
603
-
604
- ---
605
-
606

### 4. Enhanced State Management

**New Data Models:**

```python
from enum import Enum
from dataclasses import dataclass
from typing import List, Dict, Optional, Any
from datetime import datetime

class Stage(Enum):
    """Workflow stages"""
    INITIALIZATION = "initialization"
    RESEARCH = "research"
    CREATE_DDL = "create_ddl"
    CREATE_POPULATION = "create_population"
    DEPLOY = "deploy"
    CREATE_MODEL = "create_model"
    CREATE_LIVEBOARD = "create_liveboard"
    REFINE_VIZS = "refine_visualizations"
    CREATE_SITE = "create_site"
    CREATE_BOT = "create_bot"
    COMPLETE = "complete"

    @property
    def display_name(self) -> str:
        names = {
            Stage.INITIALIZATION: "Setup",
            Stage.RESEARCH: "Research",
            Stage.CREATE_DDL: "DDL Creation",
            Stage.CREATE_POPULATION: "Population Code",
            Stage.DEPLOY: "Deployment",
            Stage.CREATE_MODEL: "Model Creation",
            Stage.CREATE_LIVEBOARD: "Liveboard Creation",
            Stage.REFINE_VIZS: "Visualization Refinement",
            Stage.CREATE_SITE: "Site Creation",
            Stage.CREATE_BOT: "Bot Creation",
            Stage.COMPLETE: "Complete",
        }
        return names.get(self, self.value)

class StageStatus(Enum):
    """Status of the current stage"""
    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"
    AWAITING_APPROVAL = "awaiting_approval"
    APPROVED = "approved"
    REJECTED = "rejected"
    COMPLETE = "complete"
    ERROR = "error"

class IntentType(Enum):
    """Types of user intent"""
    APPROVE = "approve"
    REJECT = "reject"
    REFINE = "refine"
    ADVANCE = "advance"
    INFO = "info"
    CONFIGURE = "configure"
    CLARIFICATION_NEEDED = "clarification_needed"

@dataclass
class Intent:
    """User intent classification result"""
    type: IntentType
    confidence: float
    parameters: Dict[str, Any]
    reasoning: str

    @classmethod
    def from_json(cls, json_data: Dict) -> 'Intent':
        return cls(
            type=IntentType(json_data['type']),
            confidence=json_data.get('confidence', 0.0),
            parameters=json_data.get('parameters', {}),
            reasoning=json_data.get('reasoning', ''),
        )

@dataclass
class Message:
    """Conversation message"""
    role: str  # "user" or "assistant"
    content: str
    timestamp: Optional[datetime] = None
    metadata: Optional[Dict] = None

    def __post_init__(self):
        if self.timestamp is None:
            self.timestamp = datetime.now()
        if self.metadata is None:
            self.metadata = {}

@dataclass
class ApprovalRequest:
    """Pending approval for stage output"""
    stage: Stage
    output: Any
    timestamp: Optional[datetime] = None

    def __post_init__(self):
        if self.timestamp is None:
            self.timestamp = datetime.now()

class ConversationContext:
    """Maintains context for the conversation controller"""

    def __init__(self):
        self.current_stage: Stage = Stage.INITIALIZATION
        self.stage_history: List[Stage] = []
        self.user_preferences: Dict = {}
        self.last_n_messages: List[Message] = []

    def update(self, intent: Intent, current_stage: Stage):
        """Update context based on a new intent"""
        self.last_n_messages.append(Message(
            role="system",
            content=f"Intent: {intent.type.value}",
            metadata={'intent': intent},
        ))

        # Keep only the most recent 20 messages
        if len(self.last_n_messages) > 20:
            self.last_n_messages = self.last_n_messages[-20:]

    def get_available_actions(self) -> List[str]:
        """Return the list of actions available in the current state"""
        actions = []

        if self.current_stage == Stage.INITIALIZATION:
            actions = ["configure", "start_research"]
        elif self.current_stage == Stage.RESEARCH:
            actions = ["approve", "reject", "show_details"]
        # ... etc. for each stage

        return actions
```
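To make the contract concrete, here is a trimmed, self-contained copy of `IntentType` and `Intent` exercising the `from_json` factory on the kind of payload the classifier is expected to return (the sample payload itself is illustrative):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict

class IntentType(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    REFINE = "refine"

@dataclass
class Intent:
    type: IntentType
    confidence: float
    parameters: Dict[str, Any]
    reasoning: str

    @classmethod
    def from_json(cls, json_data: Dict) -> "Intent":
        return cls(
            type=IntentType(json_data["type"]),
            confidence=json_data.get("confidence", 0.0),
            parameters=json_data.get("parameters", {}),
            reasoning=json_data.get("reasoning", ""),
        )

# Round-trip the kind of JSON an intent-classifier LLM would return
raw = {
    "type": "refine",
    "confidence": 0.92,
    "parameters": {"target": "dim_products"},
    "reasoning": "user asked for a column change",
}
intent = Intent.from_json(raw)
print(intent.type, intent.confidence)
```

Missing keys fall back to safe defaults, so a partially formed classifier response still produces a usable `Intent`.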

---

### 5. Chat UI Design (Gradio)

**New Interface Layout:**

```python
def create_chat_interface():
    """Create the conversational demo builder interface"""

    with gr.Blocks(title="ThoughtSpot Demo Builder - Chat", theme=gr.themes.Soft()) as interface:

        # Initialize conversation controller
        controller = gr.State(ConversationController())

        with gr.Row():
            with gr.Column(scale=2):
                # Main chat interface
                gr.Markdown("# 💬 ThoughtSpot Demo Builder")
                gr.Markdown("Let's create an amazing demo together! Tell me about your company...")

                chatbot = gr.Chatbot(
                    value=[],
                    height=600,
                    label="Demo Builder Assistant",
                    avatar_images=[None, "🤖"]
                )

                with gr.Row():
                    msg = gr.Textbox(
                        label="Your message",
                        placeholder="Type your message here... (e.g., 'Create a demo for Acme Corp in retail')",
                        lines=2,
                        scale=4
                    )
                    submit = gr.Button("Send", scale=1, variant="primary")

                # Quick action buttons (context-aware)
                with gr.Row(visible=True) as quick_actions:
                    approve_btn = gr.Button("✅ Approve", visible=False)
                    reject_btn = gr.Button("❌ Redo", visible=False)
                    next_btn = gr.Button("➡️ Next Stage", visible=False)

            with gr.Column(scale=1):
                # Progress and context sidebar
                gr.Markdown("## 📊 Progress")

                stage_display = gr.Markdown("**Current Stage:** Initialization")

                # Visual progress tracker
                progress_html = gr.HTML("""
                    <div style='padding: 10px;'>
                        <div class='stage-item'>⚪ Research</div>
                        <div class='stage-item'>⚪ Create DDL</div>
                        <div class='stage-item'>⚪ Create Population</div>
                        <div class='stage-item'>⚪ Deploy</div>
                        <div class='stage-item'>⚪ Create Model</div>
                        <div class='stage-item'>⚪ Create Liveboard</div>
                        <div class='stage-item'>⚪ Refine</div>
                    </div>
                """)

                gr.Markdown("## ⚙️ Current Settings")
                settings_display = gr.JSON(
                    value={
                        "company": "Not set",
                        "use_case": "Not set",
                        "llm": "claude-sonnet-4.5"
                    },
                    label="Configuration"
                )

                gr.Markdown("## 📁 Generated Assets")
                assets_list = gr.HTML("<i>No assets yet</i>")

        # Chat message handler (async so it can stream from the controller)
        async def respond(message, chat_history, controller_state):
            """Process a user message and stream the response"""

            # Add user message to chat
            chat_history.append((message, None))

            # Stream AI response, updating all UI elements on each chunk
            ai_response = ""
            async for chunk in controller_state.process_message(message):
                ai_response += chunk
                chat_history[-1] = (message, ai_response)
                yield (chat_history, controller_state, *update_ui_state(controller_state))

        def update_ui_state(controller_state):
            """
            Update all UI elements based on controller state.
            Returns: (stage_display, progress_html, settings, assets, quick-button visibilities)
            """
            # Update stage display
            stage_md = f"**Current Stage:** {controller_state.current_stage.display_name}"

            # Update progress HTML
            progress = generate_progress_html(controller_state)

            # Update settings
            settings = {
                "company": controller_state.company_url or "Not set",
                "use_case": controller_state.use_case or "Not set",
                "current_stage": controller_state.current_stage.value,
                "llm": controller_state.llm_provider
            }

            # Update assets
            assets = generate_assets_html(controller_state)

            # Update quick action button visibility
            show_approve = controller_state.stage_status == StageStatus.AWAITING_APPROVAL
            show_next = controller_state.stage_status == StageStatus.COMPLETE

            return (
                stage_md,
                progress,
                settings,
                assets,
                gr.update(visible=show_approve),  # approve_btn
                gr.update(visible=show_approve),  # reject_btn
                gr.update(visible=show_next)      # next_btn
            )

        # Wire up handlers (one shared outputs list keeps the signatures in sync)
        outputs = [chatbot, controller, stage_display, progress_html, settings_display,
                   assets_list, approve_btn, reject_btn, next_btn]

        msg.submit(fn=respond, inputs=[msg, chatbot, controller], outputs=outputs)
        submit.click(fn=respond, inputs=[msg, chatbot, controller], outputs=outputs)

        # Quick action buttons reuse the same streaming handler
        approve_btn.click(
            fn=lambda ch, ctrl: respond("I approve this", ch, ctrl),
            inputs=[chatbot, controller],
            outputs=outputs
        )

        reject_btn.click(
            fn=lambda ch, ctrl: respond("Please redo this", ch, ctrl),
            inputs=[chatbot, controller],
            outputs=outputs
        )

    return interface
```

---

## Example Conversation Flows

### Flow 1: Happy Path (All Approvals)

````
USER: Create a demo for Amazon.com focused on supply chain analytics

AI: Great! Let me start by researching Amazon and supply chain analytics.
[Streaming research results...]

✅ Research complete! I found:
- Amazon is a global e-commerce and cloud computing leader
- Key supply chain challenges: last-mile delivery, inventory optimization
- Focus areas: predictive demand, warehouse efficiency

👉 Please review the research. Reply "approve" to proceed or tell me what to adjust.

USER: looks good

AI: ✅ Research approved!

Moving to: **DDL Creation**

Creating database schema for supply chain analytics...
[Streaming SQL DDL...]

✅ Generated 5 tables:
- `dim_products` (12 columns)
- `dim_warehouses` (8 columns)
- `dim_suppliers` (10 columns)
- `dim_dates` (7 columns)
- `fact_inventory_movements` (15 columns)

👉 Please review the DDL. Approve or request changes.

USER: Can you add a "category" column to the products table?

AI: 🎨 Refining dim_products table...

```sql
CREATE TABLE dim_products (
    product_id INT IDENTITY(1,1) PRIMARY KEY,
    product_name VARCHAR(200) NOT NULL,
    sku VARCHAR(50) UNIQUE,
    category VARCHAR(100),  -- ✨ Added
    subcategory VARCHAR(100),
    ...
);
```

✅ Updated dim_products with category column

How does this look?

USER: perfect, approve

AI: ✅ DDL approved!

Moving to: **Population Code**

Generating realistic supply chain data with strategic outliers...
[Streaming Python code...]

📊 Population script ready:
- 5,000 products
- 20 warehouses
- 100 suppliers
- 50,000+ inventory movements
- 5 strategic outliers for demos

👉 Approve to proceed with deployment?

USER: yes

AI: ✅ Population code approved!

🚀 **Deploying to Snowflake...**
- Creating schema DM251112_143052_AMAZO_SUP...
- Executing DDL... ✅
- Populating data... ✅ (5,000 rows inserted)

🎯 **Creating ThoughtSpot objects...**
- Connection... ✅
- Tables imported... ✅
- Model generated... ✅

📊 **Creating Liveboard...**
Generated 6 visualizations:
1. Inventory Levels by Warehouse (KPI)
2. Stock Movement Trends (Line Chart)
3. Top Products by Turnover (Bar Chart)
4. Supplier Performance (Scatter)
5. Warehouse Capacity Utilization (Stacked Column)
6. Geographic Distribution (Geo Map)

🎉 **Demo complete!**

What would you like to do next?
- Refine visualizations
- Create demo site
- Create demo bot
- Export assets
````

---

### Flow 2: Refinement and Iteration

```
USER: The bar chart for top products doesn't look right. Can you make it show the top 10 instead of top 5?

AI: 🎨 Refining visualization: **Top Products by Turnover**

Updating to show top 10 products...

**Updated Visualization:**
📊 Top Products by Turnover (Bar Chart)
- Shows: Top 10 products
- Sorted by: Total turnover (descending)
- Time period: Last 90 days

Better?

USER: yes but can you also change it to a horizontal bar chart?

AI: 🎨 Refining visualization: **Top Products by Turnover**

Changing to horizontal bar chart...

**Updated Visualization:**
📊 Top Products by Turnover (Horizontal Bar)
- Shows: Top 10 products
- Orientation: Horizontal (easier to read long product names)
- Sorted by: Total turnover (descending)

How's this?

USER: perfect!

AI: ✅ Great! Visualization updated.

Your liveboard now has 6 visualizations, all looking good.

What's next?
- Export demo assets
- Create a demo site
- Create a demo bot
- Or make more refinements
```

---

### Flow 3: Error Recovery

```
USER: approve the DDL

AI: ✅ DDL approved!

Moving to: **Population Code**

Generating population scripts...
[Streaming...]

❌ Syntax validation failed:
- Line 42: Invalid Snowflake date format
- Line 156: Missing table reference

🔄 Let me fix these issues and regenerate...

[Streaming corrected code...]

✅ Validation passed!

📊 Population script ready (corrected):
- Fixed date formatting to Snowflake syntax
- Added missing foreign key references

👉 Approve to deploy?

USER: yes deploy

AI: 🚀 **Deploying to Snowflake...**
- Creating schema... ✅
- Executing DDL... ✅
- Populating data... ⚠️

❌ Error during population:
"Foreign key constraint violation on line 1,245"

This usually means dimension data wasn't inserted before the fact table.

🔄 Would you like me to:
1. Fix the insert order and retry
2. Show you the error details
3. Regenerate the population code

What would you prefer?

USER: fix and retry

AI: 🔧 Fixing insert order...

Problem identified: Fact table inserts attempted before dimension tables completed.

Solution: Reordering to insert dimensions first, then facts.

♻️ Retrying deployment...
- Dropping existing tables... ✅
- Recreating schema... ✅
- Inserting dimensions... ✅
- Inserting facts... ✅

✅ **Deployment successful!**

All 50,000+ rows inserted successfully.

Moving to ThoughtSpot object creation...
```

---

## Migration Strategy

### Phase 1: Foundation (Week 1-2)
**Goal:** Build core chat infrastructure without breaking existing functionality

1. **Create new components** (parallel to existing code):
   - `intent_classifier.py` - Intent classification
   - `conversation_controller.py` - Main orchestrator
   - `stage_executors/` - Directory with executor classes
   - `conversation_models.py` - New data models

2. **Add chat UI** (new tab in existing Gradio app):
   - Keep existing button UI
   - Add new "Chat Mode" tab
   - Wire up basic chat → existing workflow
   - No approval gates yet

3. **Test basic flow**:
   - User can chat to trigger stages
   - Stages execute using existing code
   - Responses formatted nicely in chat

### Phase 2: Intent Classification (Week 3)
**Goal:** AI interprets user intent accurately

1. **Implement IntentClassifier**:
   - Create prompt templates for classification
   - Test with various user inputs
   - Handle ambiguous cases

2. **Add context tracking**:
   - Track conversation history
   - Maintain current stage/status
   - Make intent classification context-aware

3. **Test intent accuracy**:
   - Unit tests for common intents
   - Edge cases (ambiguous, multi-intent)
   - Confidence thresholds

### Phase 3: Approval Gates (Week 4)
**Goal:** Users must approve before advancing

1. **Add approval workflow**:
   - After each stage, request approval
   - Block advancement until approved
   - Handle rejection → redo

2. **Implement approval UX**:
   - Clear approval requests in chat
   - Quick action buttons
   - Timeout handling (auto-approve after N minutes?)

3. **Test approval flows**:
   - Happy path (all approvals)
   - Rejection scenarios
   - Multiple iterations

### Phase 4: Refinement (Week 5-6)
**Goal:** Users can refine specific aspects

1. **Implement DDL refinement**:
   - Target table/column modifications
   - Schema constraint validation
   - Partial regeneration

2. **Implement viz refinement**:
   - Chart type changes
   - Data field modifications
   - Filter additions

3. **Implement population refinement**:
   - Outlier adjustments
   - Data volume changes
   - Scenario modifications

### Phase 5: New Stages (Week 7-8)
**Goal:** Add site and bot creation

1. **Create SiteCreator executor**:
   - Generate demo website HTML
   - Use company branding
   - Embed ThoughtSpot liveboards

2. **Create BotCreator executor**:
   - Generate chatbot config
   - Train on demo data
   - Integrate with ThoughtSpot API

3. **Test end-to-end**:
   - Full workflow from research to bot
   - All approval gates working
   - Refinement at each stage

### Phase 6: Polish & Optimize (Week 9-10)
**Goal:** Production-ready

1. **Error handling**:
   - Graceful failures
   - Automatic retries
   - User-friendly error messages

2. **Performance**:
   - Streaming optimization
   - Caching improvements
   - Parallel execution where possible

3. **UX enhancements**:
   - Better progress visualization
   - Asset preview in chat
   - Export/download in chat

4. **Documentation**:
   - User guide
   - Example conversations
   - Troubleshooting

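The Phase 2 "test intent accuracy" step can be sketched as a golden-dataset scorer; here `classify` is a keyword stand-in for the real LLM-backed classifier, and the golden examples are illustrative:

```python
# Score a classifier against a small golden dataset of (message, expected-intent) pairs.
def classify(message: str) -> str:
    """Stand-in classifier; the real one would call an LLM."""
    text = message.lower()
    if any(w in text for w in ("approve", "looks good", "perfect")):
        return "approve"
    if any(w in text for w in ("redo", "reject", "start over")):
        return "reject"
    return "refine"

GOLDEN = [
    ("looks good", "approve"),
    ("perfect, approve", "approve"),
    ("please redo this", "reject"),
    ("make the bar chart horizontal", "refine"),
]

accuracy = sum(classify(m) == label for m, label in GOLDEN) / len(GOLDEN)
print(f"intent accuracy: {accuracy:.0%}")
```

The same harness works unchanged once `classify` is swapped for the LLM-backed version, which is what makes a golden dataset useful for regression testing prompts.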
---

## Technical Considerations

### LLM Provider Strategy
- **Intent Classification**: Fast model (GPT-4o-mini or Claude Haiku)
- **Content Generation**: High-quality model (Claude Sonnet 4.5 or GPT-4o)
- **Refinement**: Mid-tier model (GPT-4o-mini for speed)

### State Persistence
- Save conversation state to database/file after each message
- Enable "resume" functionality
- Handle browser refresh gracefully

### Async/Streaming
- Use Python async/await throughout
- Stream all LLM responses
- Non-blocking stage execution

### Error Recovery
- Try/except around all stage executions
- Automatic retry logic (with exponential backoff)
- User-friendly error explanations
- Option to roll back to the previous stage

### Testing Strategy
1. **Unit tests** for each executor
2. **Integration tests** for full workflows
3. **Intent classification tests** with golden dataset
4. **End-to-end tests** with real LLM calls
5. **Performance tests** for streaming

---
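The retry-with-exponential-backoff idea above can be sketched as a small async helper (names and delay constants are illustrative, not from the codebase):

```python
import asyncio
import random

async def with_retries(coro_factory, attempts=3, base_delay=1.0):
    """Retry an async stage execution with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return await coro_factory()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the user
            delay = base_delay * (2 ** attempt) * (1 + random.random() * 0.1)
            await asyncio.sleep(delay)

# Demo: a flaky stage that succeeds on the third call
calls = {"n": 0}

async def flaky_stage():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = asyncio.run(with_retries(flaky_stage, base_delay=0.01))
print(result, calls["n"])
```

Wrapping each stage executor call in `with_retries` keeps the try/except and backoff policy in one place instead of scattered across executors.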

## Success Metrics

### User Experience
- ✅ Users can complete a demo without reading docs
- ✅ Natural language commands work 95%+ of the time
- ✅ Approval gates prevent bad outputs from advancing
- ✅ Refinements work without full regeneration

### Technical
- ✅ Intent classification accuracy > 90%
- ✅ Average workflow completion time < 15 minutes
- ✅ Zero data loss from browser refresh
- ✅ Error recovery success rate > 80%

### Business
- ✅ Reduction in demo prep time by 50%
- ✅ Increase in demo quality (measured by win rate)
- ✅ Users prefer chat mode over button mode (survey)

---

## File Structure (New)

```
DemoPrep/
├── demo_prep.py           (existing - add chat tab)
├── demo_builder_class.py  (existing - enhance)
│
├── chat/
│   ├── __init__.py
│   ├── intent_classifier.py
│   ├── conversation_controller.py
│   ├── conversation_models.py
│   ├── ui.py
│   └── prompts/
│       ├── intent_classification.py
│       ├── clarification.py
│       └── approval_requests.py
│
├── executors/
│   ├── __init__.py
│   ├── base.py
│   ├── research_executor.py
│   ├── ddl_executor.py
│   ├── population_executor.py
│   ├── deployment_executor.py
│   ├── model_executor.py
│   ├── liveboard_executor.py
│   ├── visualization_refiner.py
│   ├── site_creator.py
│   └── bot_creator.py
│
├── (existing files unchanged)
│   ├── main_research.py
│   ├── schema_utils.py
│   ├── cdw_connector.py
│   ├── thoughtspot_deployer.py
│   ├── liveboard_creator.py
│   └── ...
│
└── tests/
    ├── test_intent_classifier.py
    ├── test_conversation_controller.py
    ├── test_executors.py
    └── test_refinement.py
```

---

## Risk Mitigation

### Risk 1: Intent Classification Accuracy
**Mitigation:**
- Start with clear examples in prompts
- Build golden dataset for testing
- Add clarification questions when confidence < 0.7
- Fall back to the button UI if classification fails repeatedly

### Risk 2: User Confusion
**Mitigation:**
- Clear prompts about what to do next
- Quick action buttons as fallback
- Help command to explain current state
- Visual progress indicator

### Risk 3: Complex State Management
**Mitigation:**
- Use proven state management patterns
- Comprehensive logging
- Ability to export/import state
- Rollback functionality

### Risk 4: LLM Cost
**Mitigation:**
- Use cheaper models for classification
- Cache intent results when possible
- Optimize prompts for token efficiency
- Rate limiting and budgets

---
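The confidence gate from Risk 1 might look like this minimal sketch (the 0.7 threshold comes from the mitigation bullet above; the function name and clarification wording are illustrative):

```python
def route_intent(intent_type: str, confidence: float, threshold: float = 0.7):
    """Gate low-confidence classifications behind a clarification question."""
    if confidence < threshold:
        # Below threshold: ask instead of acting on a guess
        return ("clarify", f"Just to confirm - did you mean to {intent_type}?")
    return ("execute", intent_type)

print(route_intent("approve", 0.95))
print(route_intent("reject", 0.55))
```

High-confidence intents execute directly; anything below the threshold is turned into a question, which is cheaper than undoing a wrongly executed stage.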

## Conclusion

This architecture transforms your linear demo builder into an **intelligent conversational agent** that:

1. ✅ **Understands user intent** through natural language
2. ✅ **Guides users** through the workflow with clear prompts
3. ✅ **Requires approval** at key decision points
4. ✅ **Enables refinement** without full regeneration
5. ✅ **Handles errors** gracefully with recovery options
6. ✅ **Maintains context** across the conversation
7. ✅ **Streams responses** for better UX

The migration can be done **incrementally** without breaking existing functionality, and the modular design allows for **easy extension** to new stages (site, bot creation) in the future.

**Next Steps:**
1. Review this plan and prioritize features
2. Create detailed tickets for Phase 1
3. Set up development branch
4. Start with intent_classifier.py and basic chat UI
5. Iterate based on user feedback

---

**Questions for Discussion:**
1. Which stages should require approval vs auto-advance?
2. Should we use a separate LLM for intent vs content generation?
3. Do you want to keep the button UI as an option or fully migrate to chat?
4. What's the priority: refinement capability or new stages (site/bot)?
5. Any specific visualization refinement features you want?
CHAT_INTERFACE_GUIDE.md DELETED
@@ -1,321 +0,0 @@
# Chat Interface Quick Guide
## New Conversational Demo Builder

---

## 🚀 Quick Start

### Run the Chat Interface

```bash
cd /Users/mike.boone/cursor_demowire/DemoPrep
source venv/bin/activate
python launch_chat.py
```

The interface will open at: **http://localhost:7861**

---

## 🎯 Interface Overview

### Layout

```
┌─────────────────────────────────────────────────────────┐
│             ThoughtSpot Demo Builder - Chat             │
├──────────────────────────────────┬──────────────────────┤
│                                  │  📊 Current Status   │
│  💬 Chat Conversation            │  ┌─────────────────┐ │
│  ┌────────────────────────────┐  │  │ Stage: Init     │ │
│  │ 🤖: Welcome! I'm creating  │  │  │ (read-only)     │ │
│  │     a perfect demo for...  │  │  └─────────────────┘ │
│  │                            │  │  ┌─────────────────┐ │
│  │ You: Start research        │  │  │ Model: claude ▼ │ │
│  │                            │  │  │ (editable)      │ │
│  │ 🤖: Starting research...   │  │  └─────────────────┘ │
│  └────────────────────────────┘  │                      │
│                                  │  🎯 Demo Settings    │
│  ┌────────────────────────────┐  │  Company: Amazon     │
│  │ Type your message...       │  │  Use Case: Supply    │
│  └────────────────────────────┘  │                      │
│  [🔍 Start] [⚙️ Config] [💡 Help] │  📈 Progress         │
│                                  │  ⚪ Research          │
└──────────────────────────────────┴──────────────────────┘
```

---

## 💬 How to Use

### 1. Starting Message

When you open the interface, you'll see:

```
👋 Welcome to ThoughtSpot Demo Builder!

I am creating a perfect ThoughtSpot demo for [company]
using use case: [use case]

What would you like to do?
```

### 2. Override Settings with `/over`

Change company or use case on the fly:

**Change Company:**
```
/over company: Amazon.com
```

**Change Use Case:**
```
/over usecase: supply chain analytics
```

**Change Both:**
```
/over company: Nike.com usecase: retail analytics
```
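A minimal sketch of how the `/over` syntax above could be parsed (the real handler in the app may differ; this regex is an assumption for illustration):

```python
import re

# Matches "/over company: <value>" and/or "usecase: <value>" in one message.
OVER_RE = re.compile(
    r"/over\s+(?:company:\s*(?P<company>.+?))?\s*(?:usecase:\s*(?P<usecase>.+))?$"
)

def parse_over(message: str) -> dict:
    """Return the overridden settings, or {} if the message is not an /over command."""
    m = OVER_RE.match(message.strip())
    if not m:
        return {}
    return {k: v.strip() for k, v in m.groupdict().items() if v}

print(parse_over("/over company: Amazon.com"))
print(parse_over("/over company: Nike.com usecase: retail analytics"))
```

Both fields are optional, so each of the three documented forms parses to just the keys the user supplied.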
### 3. Natural Conversation

Just type naturally:

```
You: Start research on this company
AI: 🔍 Starting research...

You: What stage are we at?
AI: 📊 Current stage is Research...

You: Help
AI: 💡 Here's what you can do...
```

---

## 🎮 Quick Action Buttons

Three buttons for common actions:

- **🔍 Start Research** - Begin demo creation
- **⚙️ Configure** - Adjust settings
- **💡 Help** - Show available commands

---

## 🎨 Key Features

### ✅ Stage Display (Read-Only)
- Shows current stage in workflow
- Located in right panel
- Cannot be edited (controlled by workflow)
- Updates automatically as you progress

### ✅ Model Selector (Editable)
- Choose AI model from dropdown
- Options:
  - `claude-sonnet-4.5` (recommended)
  - `gpt-4o`
  - `gpt-4o-mini`
  - `gemini-1.5-pro`
- Changes take effect immediately

### ✅ Settings Display
- Shows current company
- Shows current use case
- Read-only (use `/over` to change)

### ✅ Progress Tracker
- Visual indicator of all stages
- Shows completed stages with ✅
- Current stage with 🔵
- Upcoming stages with ⚪

---
## 📝 Command Reference

### Special Commands

| Command | Description | Example |
|---------|-------------|---------|
| `/over company: [name]` | Change company | `/over company: Amazon` |
| `/over usecase: [case]` | Change use case | `/over usecase: supply chain` |
| `/over company: [name] usecase: [case]` | Change both | `/over company: Nike usecase: retail` |

### Natural Language

| What to Say | What It Does |
|-------------|--------------|
| "Start research" | Begin research phase |
| "Configure" | Show settings options |
| "Help" | Show available commands |
| "What stage?" | Show current progress |
| "Status" | Show full status |

---

## 🆚 Comparison: Chat vs Classic UI

| Feature | Chat Interface | Classic Interface |
|---------|----------------|-------------------|
| **Input Method** | Natural language | Buttons & forms |
| **Settings** | `/over` command | Input fields |
| **Stage Display** | Visible sidebar | Button text |
| **Model Selection** | Dropdown (always visible) | Dropdown in form |
| **Progress** | Visual tracker | Step-by-step |
| **Quick Actions** | Button shortcuts | N/A |
| **Learning Curve** | Low (conversational) | Medium (structured) |

---

## 🎯 Example Workflow

### Complete Demo Creation

```
1. Open interface
   → See welcome message with current settings

2. Override if needed
   You: /over company: Amazon.com usecase: supply chain
   AI: ✅ Settings updated!

3. Start research
   You: Start research
   AI: 🔍 Starting research phase...
   [Research results stream here]

4. Continue with conversation
   You: That looks good, continue
   AI: ✅ Research complete! Moving to DDL...

5. Keep conversing through each stage
   [Continue naturally through the workflow]
```

- ---
202
-
203
- ## 🔧 Configuration
204
-
205
- ### Default Settings
206
-
207
- The interface loads settings from:
208
- 1. **Supabase** (if configured)
209
- 2. **Environment variables** (fallback)
210
- 3. **Hard-coded defaults** (last resort)
211
-
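The fallback chain might look like the sketch below. The key names and the shape of the Supabase payload are assumptions for illustration:

```python
import os
from typing import Optional

# Illustrative fallback chain; real keys and the Supabase payload may differ.
HARD_CODED_DEFAULTS = {
    "company": "Amazon.com",
    "use_case": "supply chain analytics",
}

def load_setting(key: str, supabase_settings: Optional[dict] = None) -> Optional[str]:
    """Resolve a setting: Supabase first, then env vars, then hard-coded defaults."""
    if supabase_settings and key in supabase_settings:
        return supabase_settings[key]              # 1. Supabase (if configured)
    env_value = os.getenv("DEFAULT_" + key.upper())
    if env_value:
        return env_value                           # 2. Environment variables (fallback)
    return HARD_CODED_DEFAULTS.get(key)            # 3. Hard-coded defaults (last resort)
```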
212
- ### Setting Defaults
213
-
214
- Edit your `.env` file:
215
-
216
- ```env
217
- USER_EMAIL=your.email@example.com
218
- DEFAULT_COMPANY=Amazon.com
219
- DEFAULT_USE_CASE=supply chain analytics
220
- DEFAULT_AI_MODEL=claude-sonnet-4.5
221
- ```
222
-
223
- ---
224
-
225
- ## 🚨 Troubleshooting
226
-
227
- ### Interface Won't Load
228
-
229
- ```bash
230
- # Check if port 7861 is in use
231
- lsof -ti:7861
232
-
233
- # Kill existing process
234
- lsof -ti:7861 | xargs kill -9
235
-
236
- # Try again
237
- python launch_chat.py
238
- ```
239
-
240
- ### Settings Not Loading
241
-
242
- ```bash
243
- # Check .env file exists
244
- ls -la .env
245
-
246
- # Check environment variables
247
- python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.getenv('USER_EMAIL'))"
248
- ```
249
-
250
- ### Model Not Working
251
-
252
- - Check API keys in `.env`
253
- - Try a different model from dropdown
254
- - Check console for error messages
255
-
256
- ---
257
-
258
- ## 💡 Tips & Tricks
259
-
260
- ### 1. Use Quick Buttons
261
- Can't remember the command? Use the quick action buttons!
262
-
263
- ### 2. Watch the Stage
264
- Keep an eye on the stage indicator to know where you are
265
-
266
- ### 3. Change Model Anytime
267
- Dropdown is always accessible - change mid-workflow if needed
268
-
269
- ### 4. Natural Language Works
270
- Don't worry about exact commands - just chat naturally
271
-
272
- ### 5. Override is Powerful
273
- Use `/over` anytime to pivot to a different company or use case
274
-
275
- ---
276
-
277
- ## 🎓 Next Steps
278
-
279
- ### Phase 1 (Current)
280
- - ✅ Chat interface
281
- - ✅ `/over` command
282
- - ✅ Stage display
283
- - ✅ Model selector
284
- - ⬜ Connect to actual workflow execution
285
-
286
- ### Phase 2 (Coming Soon)
287
- - ⬜ Approval gates
288
- - ⬜ Real-time workflow execution
289
- - ⬜ Streaming research results
290
- - ⬜ Progress updates
291
-
292
- ### Phase 3 (Future)
293
- - ⬜ Refinement capability
294
- - ⬜ Undo/redo
295
- - ⬜ Save conversation
296
- - ⬜ Export demo
297
-
298
- ---
299
-
300
- ## 📞 Support
301
-
302
- **Issues?** Check the logs in the terminal where you ran `launch_chat.py`
303
-
304
- **Questions?** Refer to the main documentation:
305
- - `START_HERE.md` - Overview
306
- - `IMPLEMENTATION_ROADMAP.md` - Development guide
307
- - `CONVERSATION_PATTERNS.md` - UX patterns
308
-
309
- ---
310
-
311
- ## 🎉 That's It!
312
-
313
- You now have a clean, conversational interface for creating ThoughtSpot demos!
314
-
315
- **Try it out:**
316
- ```bash
317
- python launch_chat.py
318
- ```
319
-
320
- **Happy demo building! 🚀**
321
-
 
CHAT_TRANSFORMATION_README.md DELETED
@@ -1,558 +0,0 @@
1
- # Chat Transformation Documentation
2
- ## Complete Guide to AI-Centric Demo Builder
3
-
4
- **Last Updated:** November 12, 2025
5
-
6
- ---
7
-
8
- ## 🗺️ Documentation Map
9
-
10
- We've created a comprehensive transformation plan across 4 documents. **Start here to navigate:**
11
-
12
- ### 📘 For Executives & Product Managers
13
-
14
- **Start with:**
15
- 1. **[TRANSFORMATION_SUMMARY.md](./TRANSFORMATION_SUMMARY.md)** ⭐ START HERE
16
- - Big picture overview
17
- - Goals and benefits
18
- - Timeline and phases
19
- - Success metrics
20
-
21
- **Then review:**
22
- 2. **[CONVERSATION_PATTERNS.md](./CONVERSATION_PATTERNS.md)**
23
- - User experience examples
24
- - Conversation flows
25
- - UX patterns
26
-
27
- **Time investment:** 20 minutes
28
-
29
- ---
30
-
31
- ### 👨‍💻 For Developers & Engineers
32
-
33
- **Start with:**
34
- 1. **[IMPLEMENTATION_ROADMAP.md](./IMPLEMENTATION_ROADMAP.md)** ⭐ START HERE
35
- - Hands-on implementation guide
36
- - Code examples
37
- - Phase-by-phase tasks
38
- - Testing strategies
39
-
40
- **Then review:**
41
- 2. **[CHAT_ARCHITECTURE_PLAN.md](./CHAT_ARCHITECTURE_PLAN.md)**
42
- - Detailed technical design
43
- - Component specifications
44
- - Data models
45
- - Migration strategy
46
-
47
- **Reference as needed:**
48
- 3. **[CONVERSATION_PATTERNS.md](./CONVERSATION_PATTERNS.md)**
49
- - Intent classification examples
50
- - Response templates
51
-
52
- **Time investment:** 1-2 hours for thorough understanding
53
-
54
- ---
55
-
56
- ### 🎨 For UX Designers
57
-
58
- **Start with:**
59
- 1. **[CONVERSATION_PATTERNS.md](./CONVERSATION_PATTERNS.md)** ⭐ START HERE
60
- - All user interaction patterns
61
- - Conversation examples
62
- - Response formatting
63
-
64
- **Then review:**
65
- 2. **[TRANSFORMATION_SUMMARY.md](./TRANSFORMATION_SUMMARY.md)**
66
- - User experience transformation section
67
- - Before/after comparisons
68
-
69
- **Time investment:** 30 minutes
70
-
71
- ---
72
-
73
- ## 📚 Document Descriptions
74
-
75
- ### 1. TRANSFORMATION_SUMMARY.md
76
- **Purpose:** High-level overview and navigation hub
77
- **Length:** ~15 pages
78
- **Best for:** Understanding the big picture
79
-
80
- **Key Sections:**
81
- - Executive summary
82
- - Architecture overview
83
- - UX transformation
84
- - Implementation phases
85
- - Success metrics
86
- - Open questions
87
-
88
- **Read this if:** You need to understand what we're building and why
89
-
90
- ---
91
-
92
- ### 2. IMPLEMENTATION_ROADMAP.md
93
- **Purpose:** Practical development guide
94
- **Length:** ~20 pages
95
- **Best for:** Actually building the system
96
-
97
- **Key Sections:**
98
- - Quick win (Phase 1) with code
99
- - Phase-by-phase implementation
100
- - Code snippets library
101
- - Testing strategies
102
- - Troubleshooting
103
- - Common pitfalls
104
-
105
- **Read this if:** You're coding this transformation
106
-
107
- ---
108
-
109
- ### 3. CHAT_ARCHITECTURE_PLAN.md
110
- **Purpose:** Comprehensive technical specification
111
- **Length:** ~40 pages
112
- **Best for:** Understanding system design
113
-
114
- **Key Sections:**
115
- - Current architecture analysis
116
- - New chat-based architecture
117
- - Component design (6 major components)
118
- - Data models
119
- - Migration strategy (10 phases)
120
- - Risk mitigation
121
- - File structure
122
-
123
- **Read this if:** You need technical depth or are making architectural decisions
124
-
125
- ---
126
-
127
- ### 4. CONVERSATION_PATTERNS.md
128
- **Purpose:** User interaction catalog
129
- **Length:** ~25 pages
130
- **Best for:** Designing conversations and UX
131
-
132
- **Key Sections:**
133
- - 7 intent categories with examples
134
- - Approval/rejection patterns
135
- - Refinement patterns
136
- - Navigation patterns
137
- - Response templates
138
- - Quality checklist
139
-
140
- **Read this if:** You're designing the conversation flow or training the AI
141
-
142
- ---
143
-
144
- ## 🚀 Quick Start Guide
145
-
146
- ### I want to understand the vision
147
- → Read **TRANSFORMATION_SUMMARY.md** (15 min)
148
-
149
- ### I want to start coding
150
- → Read **IMPLEMENTATION_ROADMAP.md** Phase 1 (30 min)
151
- → Create `chat/` directory structure
152
- → Implement `intent_classifier.py`
153
- → Test basic chat flow
154
-
155
- ### I want to understand the architecture
156
- → Read **CHAT_ARCHITECTURE_PLAN.md** (1 hour)
157
- → Review component diagrams
158
- → Study data models
159
-
160
- ### I want to design conversations
161
- → Read **CONVERSATION_PATTERNS.md** (30 min)
162
- → Try example conversations
163
- → Design new patterns
164
-
165
- ---
166
-
167
- ## 🎯 Key Concepts (Glossary)
168
-
169
- **Chat Mode** - New conversational interface (vs. existing button mode)
170
-
171
- **Intent Classification** - AI determining what user wants from their message
172
-
173
- **Approval Gate** - Required checkpoint before advancing to next stage
174
-
175
- **Refinement** - Targeted modification without full regeneration
176
-
177
- **Stage Executor** - Specialized handler for each workflow stage (Research, DDL, etc.)
178
-
179
- **Conversation Controller** - Orchestrator managing workflow and state
180
-
181
- **Streaming** - Sending partial results as generated (not waiting for completion)
182
-
183
- ---
184
-
185
- ## 📊 At a Glance
186
-
187
- ### Current System
188
- ```
189
- Button → Research → Auto-advance → Button → DDL → Auto-advance → ...
190
- ```
191
- - 4 stages (Research, DDL, Population, Deploy)
192
- - Button-driven linear flow
193
- - No approval gates
194
- - Full regeneration only
195
-
196
- ### Future System
197
- ```
198
- User: "Create demo for Amazon"
199
- AI: [Research...] "Approve?"
200
- User: "Yes"
201
- AI: [DDL...] "Approve?"
202
- User: "Add email to customers"
203
- AI: [Refined...] "Better?"
204
- User: "Perfect"
205
- AI: [Population...] "Approve?"
206
- ...
207
- ```
208
- - 9 stages (+ Viz Refinement, Site Creation, Bot Creation)
209
- - Chat-driven conversational flow
210
- - Approval gates at major stages
211
- - Granular refinement capability
212
-
213
- ---
214
-
215
- ## 🏗️ Architecture in 30 Seconds
216
-
217
- ```
218
- User Input → Intent Classifier → Conversation Controller → Stage Executor → Response
219
- ```
220
-
221
- **Intent Classifier** - "What does user want?"
222
- **Conversation Controller** - "How do I orchestrate this?"
223
- **Stage Executor** - "How do I execute this stage?"
224
- **Response Formatter** - "How do I present this?"
225
-
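As the roadmap suggests, Phase 1 intent classification can start as simple keyword rules. The intent names and trigger phrases below are illustrative, not the final rule set:

```python
# Minimal rule-based intent classifier for Phase 1 - illustrative rules only.
INTENT_RULES = {
    "approve": ("approve", "looks good", "lgtm", "perfect"),
    "reject": ("redo", "try again", "reject", "not good"),
    "start_research": ("start research", "create a demo", "build a demo"),
    "help": ("help", "what can you do"),
}

def classify_intent(message: str) -> str:
    """Return the first intent whose trigger phrase appears in the message."""
    text = message.lower().strip()
    for intent, phrases in INTENT_RULES.items():
        if any(phrase in text for phrase in phrases):
            return intent
    return "unknown"
```

An LLM-based classifier can replace this later without changing the call site.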
226
- ---
227
-
228
- ## 📅 Timeline
229
-
230
- | Phase | Duration | Goal | Complexity |
231
- |-------|----------|------|------------|
232
- | **Phase 1** | 2 weeks | Basic chat foundation | 🟢 Low |
233
- | **Phase 2** | 2 weeks | Approval gates | 🟡 Medium |
234
- | **Phase 3** | 2 weeks | Refinement | 🟡 Medium |
235
- | **Phase 4** | 1 week | Viz refinement | 🟡 Medium |
236
- | **Phase 5** | 3 weeks | New stages (site/bot) | 🔴 High |
237
-
238
- **Total:** ~10 weeks to full implementation
239
-
240
- **Quick Win:** Phase 1 in 2 weeks shows working chat interface!
241
-
242
- ---
243
-
244
- ## 🎓 Learning Path
245
-
246
- ### Week 1: Foundation
247
- - [ ] Read TRANSFORMATION_SUMMARY.md
248
- - [ ] Review existing codebase (demo_prep.py, demo_builder_class.py)
249
- - [ ] Understand current workflow
250
-
251
- ### Week 2: Design
252
- - [ ] Read CHAT_ARCHITECTURE_PLAN.md
253
- - [ ] Review component designs
254
- - [ ] Sketch conversation flows
255
-
256
- ### Weeks 3-4: Phase 1 Implementation
257
- - [ ] Read IMPLEMENTATION_ROADMAP.md Phase 1
258
- - [ ] Create chat directory structure
259
- - [ ] Implement intent classifier (simple rules)
260
- - [ ] Create chat UI tab
261
- - [ ] Test basic flow
262
-
263
- ### Weeks 5-6: Phase 2 Implementation
264
- - [ ] Read IMPLEMENTATION_ROADMAP.md Phase 2
265
- - [ ] Implement approval gates
266
- - [ ] Add approval UI elements
267
- - [ ] Test approval/rejection flows
268
-
269
- ### Weeks 7-8: Phase 3 Implementation
270
- - [ ] Read IMPLEMENTATION_ROADMAP.md Phase 3
271
- - [ ] Implement DDL refinement
272
- - [ ] Implement population refinement
273
- - [ ] Test targeted modifications
274
-
275
- ### Weeks 9-12: Phases 4-5
276
- - [ ] Continue with remaining phases
277
- - [ ] Add new capabilities
278
- - [ ] Polish UX
279
- - [ ] Deploy
280
-
281
- ---
282
-
283
- ## 🧪 Testing Strategy
284
-
285
- ### Unit Tests
286
- Test individual components in isolation:
287
- - Intent classifier accuracy
288
- - Stage executor functionality
289
- - Refinement logic
290
-
291
- ### Integration Tests
292
- Test components working together:
293
- - Chat → Intent → Executor flow
294
- - State management across stages
295
- - Approval gate enforcement
296
-
297
- ### End-to-End Tests
298
- Test complete user workflows:
299
- - Full demo creation flow
300
- - Refinement iterations
301
- - Error recovery
302
-
303
- ### User Acceptance Testing
304
- Real users testing:
305
- - Natural language understanding
306
- - Conversation quality
307
- - Time to complete demo
308
-
309
- **See IMPLEMENTATION_ROADMAP.md for detailed testing scripts**
310
-
311
- ---
312
-
313
- ## 🚨 Common Pitfalls (Read This!)
314
-
315
- ### ❌ Don't rewrite existing code
316
- ✅ **DO:** Wrap existing functions
317
- ❌ **DON'T:** Duplicate functionality
318
-
319
- ### ❌ Don't over-engineer Phase 1
320
- ✅ **DO:** Use simple rules for intent classification
321
- ❌ **DON'T:** Build complex LLM system day 1
322
-
323
- ### ❌ Don't forget streaming
324
- ✅ **DO:** Use `yield` for all responses
325
- ❌ **DON'T:** Use `return` and make users wait
326
-
327
- ### ❌ Don't ignore errors
328
- ✅ **DO:** Try/except everywhere with user-friendly messages
329
- ❌ **DON'T:** Let exceptions crash the chat
330
-
331
- ### ❌ Don't break existing functionality
332
- ✅ **DO:** Keep chat in separate modules
333
- ❌ **DON'T:** Modify shared code that button UI uses
334
-
335
- **Full list in IMPLEMENTATION_ROADMAP.md**
336
-
337
- ---
338
-
339
- ## 📞 Getting Help
340
-
341
- ### Where to Look
342
-
343
- **"How do I implement X?"**
344
- → IMPLEMENTATION_ROADMAP.md
345
-
346
- **"What should the architecture look like?"**
347
- → CHAT_ARCHITECTURE_PLAN.md
348
-
349
- **"How should the user experience work?"**
350
- → CONVERSATION_PATTERNS.md
351
-
352
- **"What's the big picture?"**
353
- → TRANSFORMATION_SUMMARY.md
354
-
355
- **"What file should I read?"**
356
- → This document (CHAT_TRANSFORMATION_README.md)
357
-
358
- ---
359
-
360
- ## 🎯 Success Criteria
361
-
362
- You'll know the transformation is successful when:
363
-
364
- ### Phase 1 (Foundation)
365
- - [ ] Chat UI loads without errors
366
- - [ ] User can type natural language
367
- - [ ] System identifies basic intents
368
- - [ ] Existing workflow stages execute
369
- - [ ] Output formatted nicely in chat
370
-
371
- ### Phase 2 (Approval Gates)
372
- - [ ] User must approve before advancing
373
- - [ ] Reject/redo works
374
- - [ ] Approve/advance works
375
- - [ ] No auto-advancement
376
-
377
- ### Phase 3 (Refinement)
378
- - [ ] DDL changes without full regen
379
- - [ ] Population changes without full regen
380
- - [ ] Schema integrity maintained
381
- - [ ] Multiple refinements possible
382
-
383
- ### Phases 4-5 (Advanced)
384
- - [ ] Viz refinement works
385
- - [ ] Site generation works
386
- - [ ] Bot creation works
387
- - [ ] End-to-end flow completes
388
-
389
- ---
390
-
391
- ## 📊 Metrics to Track
392
-
393
- **Technical:**
394
- - Intent classification accuracy
395
- - Response time (first token)
396
- - Error rate
397
- - System uptime
398
-
399
- **User Experience:**
400
- - Demo completion time
401
- - Refinement iterations
402
- - Approval rate
403
- - User satisfaction score
404
-
405
- **Business:**
406
- - Adoption rate (chat vs button)
407
- - Time savings per demo
408
- - Demo quality (win rate)
409
- - Support ticket reduction
410
-
411
- **See TRANSFORMATION_SUMMARY.md for detailed metrics**
412
-
413
- ---
414
-
415
- ## 🔄 Document Update Process
416
-
417
- These documents are living and should be updated as we learn:
418
-
419
- 1. **After each phase:** Update with lessons learned
420
- 2. **When patterns emerge:** Add to CONVERSATION_PATTERNS.md
421
- 3. **When architecture changes:** Update CHAT_ARCHITECTURE_PLAN.md
422
- 4. **When timelines shift:** Update IMPLEMENTATION_ROADMAP.md
423
-
424
- **Document Owner:** [Assign owner here]
425
- **Last Review:** 2025-11-12
426
- **Next Review:** [Schedule first review]
427
-
428
- ---
429
-
430
- ## 📖 Appendix: File Structure Preview
431
-
432
- ```
433
- DemoPrep/
434
- ├── demo_prep.py # Main app (add chat tab here)
435
- ├── demo_builder_class.py # Existing state management
436
-
437
- ├── chat/ # NEW: Chat components
438
- │ ├── __init__.py
439
- │ ├── intent_classifier.py # Determine user intent
440
- │ ├── conversation_controller.py # Orchestrate workflow
441
- │ ├── ui.py # Chat UI
442
- │ └── prompts/ # Prompt templates
443
-
444
- ├── executors/ # NEW: Stage executors
445
- │ ├── __init__.py
446
- │ ├── base.py # Base executor class
447
- │ ├── research_executor.py
448
- │ ├── ddl_executor.py
449
- │ ├── population_executor.py
450
- │ ├── deployment_executor.py
451
- │ ├── model_executor.py
452
- │ ├── liveboard_executor.py
453
- │ ├── visualization_refiner.py # NEW capability
454
- │ ├── site_creator.py # NEW capability
455
- │ └── bot_creator.py # NEW capability
456
-
457
- ├── tests/
458
- │ ├── chat/ # NEW: Chat tests
459
- │ └── executors/ # NEW: Executor tests
460
-
461
- ├── TRANSFORMATION_SUMMARY.md # Overview (you are here)
462
- ├── IMPLEMENTATION_ROADMAP.md # Dev guide
463
- ├── CHAT_ARCHITECTURE_PLAN.md # Technical spec
464
- ├── CONVERSATION_PATTERNS.md # UX patterns
465
- └── CHAT_TRANSFORMATION_README.md # This navigation guide
466
- ```
467
-
468
- ---
469
-
470
- ## 🎬 Quick Command Reference
471
-
472
- ```bash
473
- # Navigate to project
474
- cd /Users/mike.boone/cursor_demowire/DemoPrep
475
-
476
- # Activate venv
477
- source venv/bin/activate
478
-
479
- # Create new directories
480
- mkdir -p chat executors tests/chat tests/executors
481
-
482
- # Create Phase 1 files
483
- touch chat/__init__.py chat/intent_classifier.py
484
- touch chat/conversation_controller.py chat/ui.py
485
-
486
- # Run with chat mode
487
- python demo_prep.py
488
-
489
- # Run tests
490
- pytest tests/chat/
491
-
492
- # Check code quality
493
- python -m pylint chat/
494
- ```
495
-
496
- ---
497
-
498
- ## ✅ Checklist: Before You Start Coding
499
-
500
- - [ ] Read TRANSFORMATION_SUMMARY.md (understand the vision)
501
- - [ ] Read IMPLEMENTATION_ROADMAP.md Phase 1 (know what to build)
502
- - [ ] Review existing demo_prep.py code (understand current system)
503
- - [ ] Set up development environment (venv, dependencies)
504
- - [ ] Create chat/ directory structure
505
- - [ ] Read relevant sections of CHAT_ARCHITECTURE_PLAN.md
506
- - [ ] Review CONVERSATION_PATTERNS.md examples
507
- - [ ] Understand approval gate concept
508
- - [ ] Understand refinement concept
509
- - [ ] Ready to code! 🚀
510
-
511
- ---
512
-
513
- ## 🌟 Vision Reminder
514
-
515
- We're building a system where:
516
-
517
- > **"A sales engineer can create a perfect ThoughtSpot demo by simply chatting with an AI assistant, receiving guidance at every step, approving outputs before they advance, and refining specific aspects without waiting for full regeneration."**
518
-
519
- **This transformation makes demos:**
520
- - ✅ Faster to create (50% time savings)
521
- - ✅ Higher quality (approval gates)
522
- - ✅ More iterative (granular refinement)
523
- - ✅ Easier to use (natural language)
524
- - ✅ More consistent (AI guidance)
525
-
526
- ---
527
-
528
- ## 📝 Quick Reference Card
529
-
530
- | I want to... | Read this... | Section... |
531
- |--------------|--------------|------------|
532
- | Understand the vision | TRANSFORMATION_SUMMARY.md | Executive Summary |
533
- | Start coding Phase 1 | IMPLEMENTATION_ROADMAP.md | Phase 1 Quick Win |
534
- | Design a conversation | CONVERSATION_PATTERNS.md | Any pattern section |
535
- | Understand architecture | CHAT_ARCHITECTURE_PLAN.md | Component Design |
536
- | See example conversation | CONVERSATION_PATTERNS.md | Example flows |
537
- | Get code snippets | IMPLEMENTATION_ROADMAP.md | Code Snippets Library |
538
- | Learn about approval gates | TRANSFORMATION_SUMMARY.md | Approval & Iteration |
539
- | Understand refinement | IMPLEMENTATION_ROADMAP.md | Phase 3 |
540
- | See success metrics | TRANSFORMATION_SUMMARY.md | Success Metrics |
541
- | Troubleshoot | IMPLEMENTATION_ROADMAP.md | Getting Unstuck |
542
-
543
- ---
544
-
545
- **Ready to transform the demo builder? Start with TRANSFORMATION_SUMMARY.md!**
546
-
547
- **Ready to code? Jump to IMPLEMENTATION_ROADMAP.md Phase 1!**
548
-
549
- **Questions? Refer back to this navigation guide!**
550
-
551
- ---
552
-
553
- *Let's build the future of demo creation! 🚀*
554
-
555
- **Last Updated:** November 12, 2025
556
- **Version:** 1.0
557
- **Status:** Ready for Implementation
558
-
 
CLAUDE.md CHANGED
@@ -46,10 +46,44 @@ The sprint document has 5 key sections you need to understand:
46
 
47
  ### Working Patterns
48
  - Settings save/load through Supabase - this WORKS when using venv
49
- - ThoughtSpot deployment uses TML format (not YAML)
50
  - Models have replaced worksheets in modern ThoughtSpot
51
  - Liveboards should match "golden demo" style (see sprint doc)
52
 
53
  ## Quick Commands
54
 
55
  ```bash
@@ -67,37 +101,66 @@ lsof -i :7860
67
 
68
  ### Where to Put Files
69
 
70
- **tests/** - Real test cases that verify functionality
71
  - Unit tests for core functions
72
- - Integration tests for workflows
73
- - Tests that run as part of CI/CD
74
- - Example: `test_connection.py`, `test_model_creation.py`
75
-
76
- **tests_temp/** - Temporary AI-generated test files
77
- - Experimental test scripts
78
- - One-off verification scripts
79
- - Files that might be deleted after testing
80
- - DO NOT commit these to git without asking
81
-
82
- **dev_notes/** - Documentation and analysis
83
- - All .md files go here (except README.md and CLAUDE.md)
84
  - Research documents
85
  - Architecture notes
86
  - Sprint planning documents
87
 
88
- **Root directory** - ONLY essential project files
89
- - Main application files (demo_prep.py, etc.)
90
- - Configuration files (.env, requirements.txt)
91
- - README.md and CLAUDE.md (documentation exceptions - must stay in root)
92
- - DO NOT create random .py, .yml, .md files in root
93
 
94
  ### Rules for Creating Files
95
 
96
  1. **NEVER create files in root directory without asking**
97
- 2. **Test files ALWAYS go in tests_temp/** unless they're real test cases
98
  3. **Documentation ALWAYS goes in dev_notes/**
99
- 4. **Export/debug .yml/.json files go in tests_temp/** (and should be gitignored)
100
- 5. **Ask before creating ANY new file** if unsure where it belongs
101
 
102
  ### When Testing Existing Features
103
 
@@ -149,5 +212,5 @@ This software is currently stored in my repo but will be open sourced and move t
149
 
150
  ---
151
 
152
- *Last Updated: October 23, 2025*
153
  *This is a living document - update as you learn more about the project*
 
46
 
47
  ### Working Patterns
48
  - Settings save/load through Supabase - this WORKS when using venv
49
+ - ThoughtSpot TML IS YAML format (use yaml.dump() not json.dumps())
50
  - Models have replaced worksheets in modern ThoughtSpot
51
  - Liveboards should match "golden demo" style (see sprint doc)
52
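A minimal illustration of the TML-is-YAML point (the liveboard structure here is simplified for the example, not a complete TML spec):

```python
import yaml  # PyYAML

# Simplified, illustrative liveboard structure - not a complete TML spec.
liveboard = {
    "liveboard": {
        "name": "Supply Chain Overview",
        "visualizations": [{"answer": {"name": "Total Revenue"}}],
    }
}

# TML is YAML, so serialize with yaml.dump(), not json.dumps().
tml_text = yaml.dump(liveboard, sort_keys=False)
print(tml_text)
```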
 
53
+ ### Liveboard Creation - Dual Method System
54
+
55
+ **PRIMARY GOAL: Both MCP and TML methods must work simultaneously**
56
+
57
+ Two methods for creating liveboards:
58
+ 1. **MCP (Model Context Protocol)** - AI-driven, natural language approach
59
+ - Default method (USE_MCP_LIVEBOARD=true)
60
+ - Working well! Fast and easy.
61
+ - Function: `create_liveboard_from_model_mcp()` line 2006
62
+
63
+ 2. **TML (ThoughtSpot Modeling Language)** - Template-based approach
64
+ - Backup method (USE_MCP_LIVEBOARD=false)
65
+ - Needs KPI fixes
66
+ - Function: `create_liveboard_from_model()` line 1779
67
+
68
+ **CRITICAL:** When changing shared code (like AI prompts), test BOTH methods!
69
+
70
+ See `.cursorrules` file for detailed dual-method documentation.
71
+
72
+ ### Terminology (Important!)
73
+ - **Outliers** = Interesting data points in existing data (works with both methods)
74
+ - **Data Adjuster** = Modifying data values (NOT possible with MCP, needs Snowflake views)
75
+
76
+ ### KPI Requirements for Sparklines
77
+ Both methods need:
78
+ - Time dimension (date column)
79
+ - Granularity (daily, weekly, monthly, etc.)
80
+ - Example: `[Total_revenue] [Order_date].monthly`
81
+
82
+ ### Golden Demo Structure
83
+ - Uses GROUPS (like tabs) NOT text tiles
84
+ - Groups organize visualizations by theme
85
+ - Brand colors via style_properties
86
+
87
  ## Quick Commands
88
 
89
  ```bash
 
101
 
102
  ### Where to Put Files
103
 
104
+ **tests/** - Real, reusable test cases only
105
  - Unit tests for core functions
106
+ - Integration tests that could be automated
107
+ - Tests you'd run as part of CI/CD
108
+ - Example: `test_connection.py`, `test_deployment_flow.py`
109
+
110
+ **scratch/** - ALL temporary/experimental/debug files
111
+ - ALL experimental/debug/check/verify/analyze scripts
112
+ - One-off fixes (fix_*.py, adjust_*.py, emergency_*.py)
113
+ - Debug scripts (debug_*.py, check_*.py, verify_*.py)
114
+ - Analysis tools (analyze_*.py, get_*.py, show_*.py)
115
+ - Test files you're experimenting with
116
+ - Backup files (.bak, .bak2)
117
+ - Export/debug .yml/.json files
118
+ - **ANY script that's temporary or one-time use**
119
+ - Think of this as your "junk drawer" for work-in-progress
120
+ - DO NOT commit without cleanup/review
121
+
122
+ **dev_notes/** - All documentation and presentations
123
+ - All .md files (except README.md and CLAUDE.md in root)
124
+ - Presentation materials (.pptx, .html, .txt slides)
125
  - Research documents
126
  - Architecture notes
127
  - Sprint planning documents
128
 
129
+ **Root directory** - ONLY essential core application files
130
+ - Main entry points (demo_prep.py, launch_chat.py, thoughtspot_deployer.py)
131
+ - Core interfaces (chat_interface.py, liveboard_creator.py)
132
+ - Utilities (supabase_client.py, schema_utils.py, demo_logger.py, etc.)
133
+ - Configuration (.env, requirements.txt)
134
+ - README.md and CLAUDE.md only
135
+ - DO NOT create random files here without asking
136
+
137
+ ### Simple Decision Tree for New Files
138
+
139
+ Creating a new file? Ask yourself:
140
+
141
+ 1. **Is it a real test that should be automated?** → `tests/`
142
+ 2. **Is it documentation or presentation material?** → `dev_notes/`
143
+ 3. **Is it core application code?** → Root (but **ASK FIRST!**)
144
+ 4. **Everything else?** → `scratch/`
145
+ - Debug scripts (check_*, verify_*, analyze_*)
146
+ - Temporary tests
147
+ - One-off fixes
148
+ - Get/show info scripts
149
+ - Backups
150
+ - Experiments
151
+
152
+ ### Golden Rule: **When in doubt, PUT IT IN SCRATCH**
153
+
154
+ It's easy to move a file from scratch → tests or scratch → root later.
155
+ It's annoying to clean up root when it's cluttered.
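The decision tree can even be encoded as a quick heuristic. This helper is hypothetical; the filename patterns come from the scratch/ description above:

```python
from fnmatch import fnmatch

# Prefixes taken from the scratch/ description; this helper is hypothetical.
SCRATCH_PATTERNS = ("check_*", "verify_*", "analyze_*", "fix_*", "debug_*",
                    "get_*", "show_*", "adjust_*", "emergency_*", "*.bak*")

def suggest_folder(filename: str) -> str:
    """Rough first guess at where a new file belongs."""
    if filename.endswith(".md") and filename not in ("README.md", "CLAUDE.md"):
        return "dev_notes/"
    if filename.startswith("test_"):
        return "tests/"  # only if it's a real, automatable test
    if any(fnmatch(filename, pattern) for pattern in SCRATCH_PATTERNS):
        return "scratch/"
    return "scratch/"  # golden rule: when in doubt, put it in scratch
```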
156
 
157
  ### Rules for Creating Files
158
 
159
  1. **NEVER create files in root directory without asking**
160
+ 2. **Experimental/debug scripts ALWAYS go in scratch/**
161
  3. **Documentation ALWAYS goes in dev_notes/**
162
+ 4. **If you're not sure it belongs in tests/, put it in scratch/**
163
+ 5. **Real tests only in tests/** - automated, reusable, part of CI/CD
164
 
165
  ### When Testing Existing Features
166
 
 
212
 
213
  ---
214
 
215
+ *Last Updated: December 10, 2025*
216
  *This is a living document - update as you learn more about the project*
CONVERSATION_PATTERNS.md DELETED
@@ -1,829 +0,0 @@
1
- # Conversation Patterns & Examples
2
- ## User Intent Recognition Guide
3
-
4
- This document catalogs common conversation patterns and how the system should respond.
5
-
6
- ---
7
-
8
- ## 🎯 Pattern Categories
9
-
10
- 1. **Initialization** - Starting a new demo
11
- 2. **Approval** - Approving stage outputs
12
- 3. **Rejection** - Requesting redo
13
- 4. **Refinement** - Modifying specific aspects
14
- 5. **Navigation** - Moving between stages
15
- 6. **Information** - Asking questions
16
- 7. **Configuration** - Changing settings
17
-
18
- ---
19
-
20
- ## 1. Initialization Patterns
21
-
22
- ### Starting a New Demo
23
-
24
- **User Says:**
25
- - "Create a demo for [company] in [industry/use case]"
26
- - "I want to build a demo for [company]"
27
- - "Let's create a [use case] demo for [company]"
28
- - "Build me a supply chain demo for Amazon"
29
- - "Demo for Walmart retail analytics"
30
-
31
- **System Response:**
32
- ```
33
- Great! Let me start researching [company] for [use case].
34
-
35
- 🔍 Analyzing company website...
36
- 📊 Researching [industry] best practices...
37
-
38
- [Streaming research results...]
39
-
40
- ✅ Research complete! I found:
41
- - [Key findings]
42
- - [Industry insights]
43
- - [Relevant data points]
44
-
45
- 👉 Please review the research. Reply "approve" to proceed or tell me what to adjust.
46
- ```
47
-
48
- **State Transition:** `initial` → `researching` → `awaiting_approval[research]`
49
-
50
- ---
51
-
52
- ### Providing Just Company
53
-
54
- **User Says:**
55
- - "Amazon"
56
- - "Create demo for Nike.com"
57
- - "https://walmart.com"
58
-
59
- **System Response:**
60
- ```
61
- I'll create a demo for [company]! What use case should I focus on?
62
-
63
- Common options:
64
- - Supply Chain Analytics
65
- - Sales Performance
66
- - Customer Analytics
67
- - Inventory Management
68
- - Financial Analytics
69
-
70
- Or tell me a custom use case!
71
- ```
72
-
73
- **State Transition:** `initial` → `awaiting_use_case`
74
-
75
- ---
76
-
77
- ### Providing Just Use Case
78
-
79
- **User Says:**
80
- - "Supply chain analytics"
81
- - "Create a sales demo"
82
- - "I need customer analytics"
83
-
84
- **System Response:**
85
- ```
86
- Great! [Use case] it is. What company should I create this demo for?
87
-
88
- Examples:
89
- - "Amazon.com"
90
- - "Nike"
91
- - "A fictional retail company"
92
-
93
- You can provide a URL or company name.
94
- ```
95
-
96
- **State Transition:** `initial` → `awaiting_company`
97
-
98
- ---
99
-
100
- ## 2. Approval Patterns
101
-
102
- ### Explicit Approval
103
-
104
- **User Says:**
105
- - "Approve"
106
- - "Approved"
107
- - "Yes, approve"
108
- - "This looks good, approve"
109
- - "👍"
110
- - "LGTM" (Looks Good To Me)
111
- - "Perfect, proceed"
112
-
113
- **System Response:**
114
- ```
115
- ✅ [Stage name] approved!
116
-
117
- Moving to: **[Next Stage]**
118
-
119
- [Start next stage or ask if ready to proceed]
120
- ```
121
-
122
- **State Transition:** `awaiting_approval[stage]` → `ready_for_next_stage` or `executing_next_stage`
123
-
124
- ---
125
-
126
- ### Implicit Approval
127
-
128
- **User Says:**
129
- - "Looks good"
130
- - "Great!"
131
- - "Perfect"
132
- - "Yes"
133
- - "Okay"
134
- - "Good to go"
135
- - "Let's move on"
136
- - "Continue"
137
-
138
- **System Response:**
139
- ```
140
- ✅ Great! I'll take that as an approval.
141
-
142
- Moving to: **[Next Stage]**
143
-
144
- [Start next stage]
145
- ```
146
-
147
- **State Transition:** Same as explicit approval
148
-
149
- ---
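The explicit and implicit approval phrases above can be matched with a small keyword check. A minimal sketch of the `is_approval`/`is_rejection` helpers the implementation roadmap refers to (the phrase lists here are illustrative, not exhaustive):

```python
import re

# Illustrative phrase lists drawn from the patterns documented above.
APPROVAL_PHRASES = ["approve", "approved", "lgtm", "looks good", "great",
                    "perfect", "yes", "okay", "good to go", "continue"]
REJECTION_PHRASES = ["no", "reject", "redo", "try again", "not good"]

def _has_phrase(text: str, phrases: list[str]) -> bool:
    # Word boundaries avoid false hits like "no" matching inside "now"
    return any(re.search(rf"\b{re.escape(p)}\b", text) for p in phrases)

def is_approval(message: str) -> bool:
    text = message.lower()
    return "👍" in text or _has_phrase(text, APPROVAL_PHRASES)

def is_rejection(message: str) -> bool:
    text = message.lower()
    return "👎" in text or _has_phrase(text, REJECTION_PHRASES)
```

A real classifier would also weigh conversation context (e.g. whether an approval is pending), but substring-with-boundaries covers the phrase tables above.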
150
-
151
- ### Conditional Approval
152
-
153
- **User Says:**
154
- - "Approve, but can we come back to this later?"
155
- - "Good enough for now, proceed"
156
- - "Approve with the understanding that we'll refine later"
157
-
158
- **System Response:**
159
- ```
160
- ✅ [Stage name] approved (marked for potential revision)!
161
-
162
- Don't worry, you can always ask me to go back and refine this later.
163
-
164
- Moving to: **[Next Stage]**
165
- ```
166
-
167
- **State Transition:** `awaiting_approval[stage]` → `next_stage` (+ mark stage as revisable)
168
-
169
- ---
170
-
171
- ## 3. Rejection Patterns
172
-
173
- ### Simple Rejection
174
-
175
- **User Says:**
176
- - "No"
177
- - "Reject"
178
- - "Redo"
179
- - "Try again"
180
- - "Not good"
181
- - "This isn't right"
182
- - "👎"
183
-
184
- **System Response:**
185
- ```
186
- 🔄 Got it, let me redo the [stage name].
187
-
188
- To make it better, can you tell me:
189
- - What specifically didn't you like?
190
- - What would you like to see instead?
191
-
192
- Or just say "redo" and I'll try a different approach.
193
- ```
194
-
195
- **State Transition:** `awaiting_approval[stage]` → `awaiting_rejection_details`
196
-
197
- ---
198
-
199
- ### Rejection with Reason
200
-
201
- **User Says:**
202
- - "No, this doesn't match our use case"
203
- - "Redo it with more focus on real-time data"
204
- - "Try again with simpler schema"
205
- - "This is too complex, simplify it"
206
-
207
- **System Response:**
208
- ```
209
- 🔄 Understood. Let me redo [stage name] with:
210
- - [Extracted requirement 1]
211
- - [Extracted requirement 2]
212
-
213
- [Start regenerating with modifications...]
214
- ```
215
-
216
- **State Transition:** `awaiting_approval[stage]` → `executing[stage]` (with modifications)
217
-
218
- ---
219
-
220
- ## 4. Refinement Patterns
221
-
222
- ### DDL Refinement
223
-
224
- **User Says:**
225
- - "Add a column [column_name] to [table_name]"
226
- - "Remove the [table_name] table"
227
- - "Change [column_name] data type to VARCHAR"
228
- - "Add an email field to customers"
229
- - "Make product_id the primary key"
230
-
231
- **System Response:**
232
- ```
233
- 🎨 Refining DDL...
234
-
235
- Updating: [table_name]
236
- Change: [description of change]
237
-
238
- \`\`\`sql
239
- [Updated table DDL]
240
- \`\`\`
241
-
242
- ✅ Updated [table_name]
243
-
244
- The full DDL has been updated. Approve?
245
- ```
246
-
247
- **State Transition:** `awaiting_approval[ddl]` → `refining[ddl]` → `awaiting_approval[ddl]`
248
-
249
- ---
250
-
251
- ### Visualization Refinement
252
-
253
- **User Says:**
254
- - "Change visualization 3 to a bar chart"
255
- - "Make the revenue chart show top 10 instead of top 5"
256
- - "Add a region filter to the sales viz"
257
- - "Change the KPI to show year-over-year growth"
258
- - "Make that chart horizontal"
259
-
260
- **System Response:**
261
- ```
262
- 🎨 Refining visualization: **[Viz Name]**
263
-
264
- Applying change: [description]
265
-
266
- **Updated Visualization:**
267
- 📊 [Viz Name] ([Chart Type])
268
- - [Key changes]
269
-
270
- Better? You can:
271
- - Approve to keep this change
272
- - Request more refinements
273
- - Revert to original
274
- ```
275
-
276
- **State Transition:** `awaiting_approval[liveboard]` → `refining[viz]` → `awaiting_approval[liveboard]`
277
-
278
- ---
279
-
280
- ### Population Code Refinement
281
-
282
- **User Says:**
283
- - "Increase data volume to 10,000 rows"
284
- - "Add more outliers for churn scenarios"
285
- - "Make the data more realistic"
286
- - "Change customer names to be more diverse"
287
-
288
- **System Response:**
289
- ```
290
- 🎨 Refining population code...
291
-
292
- Updating:
293
- - [Change 1]
294
- - [Change 2]
295
-
296
- \`\`\`python
297
- [Updated relevant section]
298
- \`\`\`
299
-
300
- ✅ Population code updated
301
-
302
- This will generate [new data characteristics]. Approve?
303
- ```
304
-
305
- **State Transition:** `awaiting_approval[population]` → `refining[population]` → `awaiting_approval[population]`
306
-
307
- ---
308
-
309
- ## 5. Navigation Patterns
310
-
311
- ### Skip Ahead
312
-
313
- **User Says:**
314
- - "Skip to deployment"
315
- - "I already have a DDL, let's deploy"
316
- - "Can we jump to visualization creation?"
317
-
318
- **System Response:**
319
- ```
320
- ⚠️ Jumping to [stage] requires completing:
321
- - [Missing stage 1]
322
- - [Missing stage 2]
323
-
324
- Would you like me to:
325
- 1. Auto-complete these stages with defaults
326
- 2. Go through each stage quickly
327
- 3. Import existing artifacts (if you have them)
328
-
329
- Which would you prefer?
330
- ```
331
-
332
- **State Transition:** `current_stage` → `awaiting_skip_confirmation`
333
-
334
- ---
335
-
336
- ### Go Back
337
-
338
- **User Says:**
339
- - "Go back to DDL"
340
- - "Let's revisit the research"
341
- - "Can we redo the schema?"
342
- - "I want to change the company"
343
-
344
- **System Response:**
345
- ```
346
- 📍 Going back to [stage]...
347
-
348
- ⚠️ Note: This will reset later stages:
349
- - [Stage that will be reset 1]
350
- - [Stage that will be reset 2]
351
-
352
- Continue? (yes/no)
353
- ```
354
-
355
- **State Transition:** `current_stage` → `confirming_rollback` → `previous_stage`
356
-
357
- ---
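The "later stages will be reset" warning above follows from a linear stage order. A minimal sketch, assuming the stage names used in the progress display elsewhere in this document:

```python
# Ordered pipeline stages; rolling back to one resets everything after it.
STAGES = ["research", "ddl", "population", "deployment",
          "model", "liveboard", "refinement"]

def stages_reset_by_rollback(target: str, completed: list[str]) -> list[str]:
    """Return the completed stages that must be reset when going back to target."""
    cutoff = STAGES.index(target)
    return [s for s in completed if STAGES.index(s) > cutoff]
```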
358
-
359
- ### Show Progress
360
-
361
- **User Says:**
362
- - "Where are we?"
363
- - "What's the status?"
364
- - "What have we completed?"
365
- - "Show progress"
366
-
367
- **System Response:**
368
- ```
369
- 📊 **Demo Progress for [Company] - [Use Case]**
370
-
371
- ✅ Completed:
372
- - Research (approved)
373
- - DDL Creation (approved)
374
-
375
- 🔵 Current Stage:
376
- - Population Code (awaiting approval)
377
-
378
- ⚪ Upcoming:
379
- - Deployment
380
- - Model Creation
381
- - Liveboard Creation
382
- - Refinement
383
-
384
- **Next Action:** Please approve the population code or request changes.
385
- ```
386
-
387
- **State Transition:** No change (informational)
388
-
389
- ---
390
-
391
- ## 6. Information Patterns
392
-
393
- ### View Current Output
394
-
395
- **User Says:**
396
- - "Show me the DDL"
397
- - "What did you create?"
398
- - "Let me see the schema"
399
- - "Display the current output"
400
- - "Show tables"
401
-
402
- **System Response:**
403
- ```
404
- Here's the current [artifact]:
405
-
406
- \`\`\`sql
407
- [DDL or other content]
408
- \`\`\`
409
-
410
- **Summary:**
411
- - [Key points]
412
- - [Statistics]
413
-
414
- Want to make changes or approve?
415
- ```
416
-
417
- **State Transition:** No change (informational)
418
-
419
- ---
420
-
421
- ### Explain Something
422
-
423
- **User Says:**
424
- - "Why did you create this table?"
425
- - "Explain the schema design"
426
- - "What are the outliers you created?"
427
- - "How does this work?"
428
-
429
- **System Response:**
430
- ```
431
- Let me explain [topic]:
432
-
433
- [Clear explanation with context]
434
-
435
- **Key Points:**
436
- - [Point 1]
437
- - [Point 2]
438
-
439
- **Rationale:**
440
- [Why this approach was chosen]
441
-
442
- Any other questions?
443
- ```
444
-
445
- **State Transition:** No change (informational)
446
-
447
- ---
448
-
449
- ### Ask for Help
450
-
451
- **User Says:**
452
- - "Help"
453
- - "What can I do?"
454
- - "What are my options?"
455
- - "Commands"
456
-
457
- **System Response:**
458
- ```
459
- ## 💡 Available Actions
460
-
461
- **Current Stage: [Stage Name]**
462
-
463
- You can:
464
- 1. ✅ **Approve** - Move to next stage
465
- 2. ❌ **Reject/Redo** - Regenerate this stage
466
- 3. 🎨 **Refine** - Make specific changes (e.g., "add email column")
467
- 4. 📊 **View** - Show current output
468
- 5. ❓ **Explain** - Ask me to explain something
469
- 6. ⚙️ **Configure** - Change settings
470
-
471
- **Example Commands:**
472
- - "Approve"
473
- - "Add category column to products table"
474
- - "Show me the DDL"
475
- - "Why did you create 5 tables?"
476
- - "Use GPT-4 instead"
477
-
478
- What would you like to do?
479
- ```
480
-
481
- **State Transition:** No change (informational)
482
-
483
- ---
484
-
485
- ## 7. Configuration Patterns
486
-
487
- ### Change LLM
488
-
489
- **User Says:**
490
- - "Use GPT-4"
491
- - "Switch to Claude"
492
- - "Change AI model to [model name]"
493
- - "Use a faster model"
494
-
495
- **System Response:**
496
- ```
497
- ⚙️ Switching to [new model]...
498
-
499
- ✅ Now using: **[Model Name]**
500
-
501
- Note: This will apply to future stages. Current stage won't be affected unless you redo it.
502
-
503
- Continue with current stage?
504
- ```
505
-
506
- **State Transition:** No change (update settings)
507
-
508
- ---
509
-
510
- ### Change Company/Use Case
511
-
512
- **User Says:**
513
- - "Actually, let's use Nike instead"
514
- - "Change use case to financial analytics"
515
- - "Switch company to Adidas"
516
-
517
- **System Response:**
518
- ```
519
- ⚠️ Changing [company/use case] will reset the demo.
520
-
521
- Current progress will be lost:
522
- - [Stage 1] ❌
523
- - [Stage 2] ❌
524
-
525
- Are you sure? (yes/no)
526
- ```
527
-
528
- **State Transition:** `current_stage` → `confirming_restart` → `initial`
529
-
530
- ---
531
-
532
- ### Change Data Volume
533
-
534
- **User Says:**
535
- - "Increase data to 50,000 rows"
536
- - "Make the demo data smaller"
537
- - "Use more realistic data volumes"
538
-
539
- **System Response:**
540
- ```
541
- ⚙️ Updating data volume setting...
542
-
543
- **New Configuration:**
544
- - Data volume: [new amount]
545
- - Expected rows: ~[estimate]
546
-
547
- This will apply when we generate/regenerate the population code.
548
-
549
- Current stage: [stage]. Want to apply this now?
550
- ```
551
-
552
- **State Transition:** No change (update settings for future use)
553
-
554
- ---
555
-
556
- ## 🎯 Intent Classification Rules
557
-
558
- ### Priority Order (when ambiguous)
559
-
560
- 1. **Approval/Rejection** (if pending approval) - highest priority
561
- 2. **Refinement** (if specific target mentioned)
562
- 3. **Navigation** (if moving to different stage)
563
- 4. **Information** (if question words present)
564
- 5. **Configuration** (if settings mentioned)
565
- 6. **Clarification** (if unclear) - lowest priority
566
-
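One way to apply this ordering is to pick the highest-priority intent among whatever candidates the classifier proposes. A sketch with illustrative intent names:

```python
# Priority order from the list above: approval/rejection wins, clarification loses.
PRIORITY = ["approval_rejection", "refinement", "navigation",
            "information", "configuration", "clarification"]

def resolve_ambiguous(candidates: set[str]) -> str:
    """Return the highest-priority candidate intent; fall back to clarification."""
    for intent in PRIORITY:
        if intent in candidates:
            return intent
    return "clarification"
```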
567
- ### Confidence Thresholds
568
-
569
- - **High (> 0.9)**: Execute immediately
570
- - **Medium (0.7 - 0.9)**: Execute with confirmation
571
- - **Low (< 0.7)**: Ask for clarification
572
-
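These thresholds can be sketched as a small dispatcher (the function name and return values are illustrative):

```python
def dispatch(intent: str, confidence: float) -> str:
    """Map classifier confidence to the documented handling tiers."""
    if confidence > 0.9:
        return f"execute:{intent}"      # High: execute immediately
    if confidence >= 0.7:
        return f"confirm:{intent}"      # Medium: execute with confirmation
    return "clarify"                    # Low: ask for clarification
```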
573
- ### Clarification Examples
574
-
575
- **When Intent Unclear:**
576
- ```
577
- 🤔 I'm not sure I understand. Did you want to:
578
-
579
- 1. Approve the current DDL
580
- 2. Make a specific change to the DDL
581
- 3. Redo the entire DDL
582
- 4. Move to a different stage
583
-
584
- Please clarify!
585
- ```
586
-
587
- **When Multiple Intents Detected:**
588
- ```
589
- I think you want to do multiple things:
590
-
591
- 1. [Intent 1]
592
- 2. [Intent 2]
593
-
594
- Which should I do first?
595
- ```
596
-
597
- ---
598
-
599
- ## 🔀 Complex Conversation Flows
600
-
601
- ### Multi-Step Refinement
602
-
603
- ```
604
- User: "Add email to customers"
605
- AI: ✅ Added email column. [Shows DDL]
606
-
607
- User: "Also add phone number"
608
- AI: ✅ Added phone number too. [Shows updated DDL]
609
-
610
- User: "Actually make phone number required"
611
- AI: ✅ Updated phone number to NOT NULL. [Shows DDL]
612
-
613
- User: "Perfect, approve"
614
- AI: ✅ DDL approved! Moving to Population Code...
615
- ```
616
-
617
- ### Refinement After Deployment
618
-
619
- ```
620
- [After deployment complete]
621
-
622
- User: "Wait, can we change visualization 2 to a line chart?"
623
- AI: 🎨 Sure! Updating viz 2...
624
- ⚠️ This requires redeploying to ThoughtSpot. Continue? (yes/no)
625
-
626
- User: "yes"
627
- AI: 🔄 Redeploying liveboard...
628
- ✅ Updated! Visualization 2 is now a line chart.
629
- ```
630
-
631
- ### Iterative Approval
632
-
633
- ```
634
- User: "Create demo for Amazon"
635
- AI: [Starts research...] "Use case?"
636
-
637
- User: "Supply chain"
638
- AI: [Completes research...] "Approve?"
639
-
640
- User: "Actually, can you focus more on last-mile delivery?"
641
- AI: 🔄 Redoing research with focus on last-mile...
642
- [Updated research...] "Better?"
643
-
644
- User: "Yes, much better"
645
- AI: ✅ Approved! Moving to DDL...
646
- ```
647
-
648
- ---
649
-
650
- ## 📝 Response Templates
651
-
652
- ### Generic Success
653
- ```
654
- ✅ [Action] completed successfully!
655
-
656
- [Details or summary]
657
-
658
- 👉 [Next action prompt]
659
- ```
660
-
661
- ### Generic Error
662
- ```
663
- ❌ [Action] failed: [error message]
664
-
665
- **Possible causes:**
666
- - [Cause 1]
667
- - [Cause 2]
668
-
669
- **What to try:**
670
- 1. [Solution 1]
671
- 2. [Solution 2]
672
-
673
- Would you like me to try again?
674
- ```
675
-
676
- ### Generic Clarification
677
- ```
678
- 🤔 I'm not sure I understood correctly.
679
-
680
- Did you mean:
681
- - [Option A]
682
- - [Option B]
683
- - Something else (please clarify)
684
- ```
685
-
686
- ### Generic Confirmation
687
- ```
688
- ⚠️ This action will [consequence].
689
-
690
- Are you sure? (yes/no)
691
- ```
692
-
693
- ---
694
-
695
- ## 🎨 Formatting Guide
696
-
697
- ### Code Blocks
698
- - SQL: \`\`\`sql ... \`\`\`
699
- - Python: \`\`\`python ... \`\`\`
700
- - JSON: \`\`\`json ... \`\`\`
701
-
702
- ### Emphasis
703
- - **Bold** for important actions or names
704
- - *Italic* for notes or asides
705
- - `Code font` for technical terms
706
-
707
- ### Emojis (consistent usage)
708
- - ✅ Success/approved
709
- - ❌ Error/rejected
710
- - ⚠️ Warning
711
- - 🔄 Redoing/retrying
712
- - 🎨 Refining
713
- - 📊 Data/statistics
714
- - 🔍 Research/analyzing
715
- - 🏗️ Creating/building
716
- - 🚀 Deploying
717
- - 💬 Information/help
718
- - ⚙️ Configuration
719
- - 🤔 Clarification needed
720
- - 👉 Next action prompt
721
- - 📁 Assets/files
722
-
723
- ---
724
-
725
- ## 🧪 Testing Conversation Patterns
726
-
727
- ### Test Script
728
-
729
- ```python
730
- def test_conversation_patterns():
731
- """Test various conversation patterns"""
732
-
733
- # assert_response_contains: assumed test helper that checks substrings in a response
- controller = ConversationController()
734
-
735
- # Test initialization
736
- assert_response_contains(
737
- controller.process_message("Create demo for Amazon in supply chain"),
738
- ["research", "Amazon", "supply chain"]
739
- )
740
-
741
- # Test approval
742
- assert_response_contains(
743
- controller.process_message("approve"),
744
- ["approved", "DDL"]
745
- )
746
-
747
- # Test refinement
748
- assert_response_contains(
749
- controller.process_message("add email column to customers"),
750
- ["refining", "email", "customers"]
751
- )
752
-
753
- # Test rejection
754
- assert_response_contains(
755
- controller.process_message("no, redo this"),
756
- ["redo", "again"]
757
- )
758
- ```
759
-
760
- ---
761
-
762
- ## 🎯 Conversation Quality Checklist
763
-
764
- For each system response, ensure:
765
-
766
- - [ ] **Clear**: User knows what happened
767
- - [ ] **Actionable**: User knows what to do next
768
- - [ ] **Concise**: Not too verbose
769
- - [ ] **Formatted**: Uses proper markdown/code blocks
770
- - [ ] **Contextual**: References previous conversation
771
- - [ ] **Helpful**: Offers guidance if user seems stuck
772
- - [ ] **Consistent**: Uses same emojis/format throughout
773
-
774
- ---
775
-
776
- ## 📊 Analytics & Metrics
777
-
778
- Track these conversation metrics:
779
-
780
- 1. **Intent Classification Accuracy**
781
- - % of messages correctly classified
782
- - % requiring clarification
783
-
784
- 2. **User Satisfaction Indicators**
785
- - Approval rate (approvals / total stages)
786
- - Refinement frequency (refinements / total stages)
787
- - Rejection rate (rejections / total stages)
788
-
789
- 3. **Efficiency Metrics**
790
- - Time to first approval
791
- - Number of messages per stage
792
- - Stages completed per session
793
-
794
- 4. **Common Patterns**
795
- - Most frequent refinement requests
796
- - Most common points of confusion
797
- - Most popular conversation flows
798
-
799
- ---
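The satisfaction indicators above are simple per-stage ratios; a sketch (function and key names are illustrative):

```python
def satisfaction_metrics(approvals: int, refinements: int,
                         rejections: int, total_stages: int) -> dict[str, float]:
    """Compute the documented per-stage satisfaction ratios."""
    if total_stages == 0:
        return {"approval_rate": 0.0, "refinement_rate": 0.0, "rejection_rate": 0.0}
    return {
        "approval_rate": approvals / total_stages,
        "refinement_rate": refinements / total_stages,
        "rejection_rate": rejections / total_stages,
    }
```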
800
-
801
- ## 🔮 Future Conversation Patterns
802
-
803
- ### Voice Commands (Future)
804
- ```
805
- User: [Voice] "Add email to customers"
806
- AI: 🎤 Voice command recognized: "Add email column to customers table"
807
- Proceeding...
808
- ```
809
-
810
- ### Multi-Modal (Future)
811
- ```
812
- User: [Uploads CSV] "Use this data structure"
813
- AI: 📁 Analyzing uploaded file...
814
- Detected: 5 tables, 43 columns
815
- Generating DDL from your schema...
816
- ```
817
-
818
- ### Proactive Suggestions (Future)
819
- ```
820
- AI: 💡 I notice you're creating a sales demo. Would you like me to:
821
- - Add common sales KPIs automatically?
822
- - Include typical sales outliers?
823
- - Use industry-standard table names?
824
- ```
825
-
826
- ---
827
-
828
- **This is a living document. Update as we discover new patterns!**
829
-
DEVELOPMENT_NOTES.md DELETED
@@ -1,73 +0,0 @@
1
- # Development Notes & Future Ideas
2
- ## Chat-Based Demo Builder
3
-
4
- **Last Updated:** November 12, 2025
5
-
6
- ---
7
-
8
- ## 🔮 Future Enhancements
9
-
10
- ### Workflow Order: DDL-First Approach
11
-
12
- **Date:** 2025-11-12
13
- **Status:** Under Consideration
14
-
15
- **Idea:**
16
- Instead of Research → DDL → Population, consider:
17
- DDL → Research → Population
18
-
19
- **Rationale:**
20
- - Start with DDL schema definition
21
- - Once we know the tables/structure, we can do more targeted research
22
- - Research can then focus on finding data patterns that fit the schema
23
- - May lead to better alignment between research and actual demo structure
24
-
25
- **Questions to Explore:**
26
- - Does this work for all use cases?
27
- - How do we guide DDL creation without research context?
28
- - Could we do: Basic Research → DDL → Detailed Research → Population?
29
-
30
- **Next Steps:**
31
- - Prototype this workflow
32
- - Test with a few demos
33
- - Compare quality of output vs current approach
34
-
35
- ---
36
-
37
- ## 📋 Backlog
38
-
39
- ### High Priority
40
- - [ ] Connect chat interface to actual workflow execution
41
- - [ ] Implement approval gates
42
- - [ ] Add streaming for real-time feedback
43
-
44
- ### Medium Priority
45
- - [ ] Settings page with modern UI
46
- - [ ] Save/load conversation state
47
- - [ ] Export demo artifacts
48
-
49
- ### Low Priority
50
- - [ ] Voice commands
51
- - [ ] Multi-modal inputs (upload CSV for schema)
52
- - [ ] Proactive AI suggestions
53
-
54
- ---
55
-
56
- ## 🐛 Known Issues
57
-
58
- - Supabase integration optional (shows warning)
59
- - Settings currently load from .env as fallback
60
-
61
- ---
62
-
63
- ## 💡 Design Decisions Log
64
-
65
- ### 2025-11-12: Chat Interface Foundation
66
- - Used Gradio for consistency with existing app
67
- - Port 7862 to avoid conflicts
68
- - Kept settings loading simple for MVP
69
-
70
- ---
71
-
72
- **Add your notes here as we discover new ideas!**
73
-
IMPLEMENTATION_ROADMAP.md DELETED
@@ -1,773 +0,0 @@
1
- # Implementation Roadmap: Chat-Based Demo Builder
2
- ## Quick Start Guide for Development
3
-
4
- ---
5
-
6
- ## 🎯 Vision Summary
7
-
8
- Transform this:
9
- ```
10
- [Button: Start Research] → Auto → [Button: Create DDL] → Auto → [Button: Deploy]
11
- ```
12
-
13
- Into this:
14
- ```
15
- User: "Create a demo for Amazon in supply chain"
16
- AI: [Research...] "Here's what I found. Approve?"
17
- User: "Looks good"
18
- AI: [DDL...] "Created 5 tables. Approve?"
19
- User: "Add a category column to products"
20
- AI: "Done! Approve now?"
21
- User: "Yes"
22
- AI: [Deploy...] "Demo live! Create visualizations?"
23
- ```
24
-
25
- ---
26
-
27
- ## 🚀 Quick Win: Phase 1 (Week 1-2)
28
-
29
- ### Goal
30
- Basic chat interface that triggers existing workflow (no approval gates yet)
31
-
32
- ### What to Build
33
-
34
- 1. **chat/intent_classifier.py** (~200 lines)
35
- ```python
36
- """Simple intent classifier to route user messages"""
37
-
38
- class SimpleIntentClassifier:
39
- """
40
- V1: Rule-based classifier (no LLM needed for MVP)
41
- """
42
-
43
- def classify(self, message: str, context: dict) -> str:
44
- """
45
- Returns one of: 'start_research', 'create_ddl', 'create_population',
46
- 'deploy', 'show_status', 'help'
47
- """
48
- message_lower = message.lower()
49
-
50
- # Simple keyword matching for MVP
51
- if any(word in message_lower for word in ['research', 'analyze', 'study']):
52
- return 'start_research'
53
-
54
- if any(word in message_lower for word in ['ddl', 'schema', 'tables']):
55
- return 'create_ddl'
56
-
57
- if any(word in message_lower for word in ['populate', 'data', 'generate']):
58
- return 'create_population'
59
-
60
- if any(word in message_lower for word in ['deploy', 'push', 'create']):  # 'create' alone is broad; the 'ddl'/'populate' checks above take precedence
61
- return 'deploy'
62
-
63
- if any(word in message_lower for word in ['status', 'progress', 'where']):
64
- return 'show_status'
65
-
66
- return 'help'
67
- ```
68
-
69
- 2. **chat/conversation_controller.py** (~300 lines)
70
- ```python
71
- """Simple controller that bridges chat to existing workflow"""
72
-
73
- class ConversationControllerV1:
74
- """
75
- V1: Simple wrapper around existing DemoBuilder
76
- Just adds chat formatting, no approval gates
77
- """
78
-
79
- def __init__(self):
80
- self.demo_builder = None
81
- self.classifier = SimpleIntentClassifier()
82
- self.state = 'initial' # 'initial', 'researching', 'creating_ddl', etc.
83
-
84
- async def process_message(self, message: str):
85
- """
86
- Process user message and yield chat responses
87
- """
88
- intent = self.classifier.classify(message, {'state': self.state})
89
-
90
- if intent == 'start_research':
91
- # Extract company URL from message
92
- url = extract_url(message)
93
- use_case = extract_use_case(message) or "analytics"
94
-
95
- yield "🔍 Starting research...\n\n"
96
-
97
- # Call existing research code
98
- self.demo_builder = DemoBuilder(use_case, url)
99
- async for chunk in self.run_research_stage():
100
- yield chunk
101
-
102
- yield "\n\n✅ Research complete! Type 'create ddl' to continue."
103
-
104
- elif intent == 'create_ddl':
105
- yield "🏗️ Creating DDL...\n\n"
106
- async for chunk in self.run_ddl_stage():
107
- yield chunk
108
- yield "\n\n✅ DDL created! Type 'create population' to continue."
109
-
110
- # ... etc for other intents
111
-
112
- async def run_research_stage(self):
113
- """
114
- Wrap existing research code in streaming format
115
- """
116
- # Call existing progressive_workflow_handler for research stage
117
- # Format output for chat
118
- pass
119
- ```
120
-
121
- 3. **chat/ui.py** (~150 lines)
122
- ```python
123
- """Gradio chat interface"""
124
-
125
- def create_chat_tab():
126
- """
127
- Add chat tab to existing Gradio app
128
- """
129
- with gr.Tab("💬 Chat Mode"):
130
- gr.Markdown("## AI-Powered Demo Builder")
131
- gr.Markdown("Just tell me what you want to create!")
132
-
133
- chatbot = gr.Chatbot(height=600)
134
- msg = gr.Textbox(
135
- placeholder="E.g., 'Create a supply chain demo for Amazon.com'",
136
- lines=2
137
- )
138
-
139
- controller = gr.State(ConversationControllerV1())
140
-
141
- async def respond(message, history, ctrl):
142
- history.append((message, ""))
143
-
144
- response = ""
145
- async for chunk in ctrl.process_message(message):
146
- response += chunk
147
- history[-1] = (message, response)
148
- yield history, ctrl
149
-
150
- msg.submit(respond, [msg, chatbot, controller], [chatbot, controller])
151
-
152
- return chatbot, msg, controller
153
- ```
154
-
155
- 4. **Update demo_prep.py** (10 lines added)
156
- ```python
157
- # Add at the end of create_demo_prep_interface()
158
-
159
- from chat.ui import create_chat_tab
160
-
161
- with gr.Tabs():
162
- with gr.Tab("🎛️ Classic Mode"):
163
- # ... existing button interface ...
164
-
165
- # NEW: Add chat tab
166
- chat_interface = create_chat_tab()
167
-
168
- return interface
169
- ```
170
-
171
- ### Testing Phase 1
172
-
173
- ```bash
174
- # Test basic chat interaction
175
- python demo_prep.py
176
-
177
- # In browser, go to "Chat Mode" tab
178
- # Type: "Create a demo for Amazon.com in supply chain"
179
- # Should trigger research stage
180
- # Should see streaming output in chat
181
- # Should prompt for next action
182
- ```
183
-
184
- **Success Criteria:**
185
- - ✅ Chat UI appears in new tab
186
- - ✅ User can type natural language
187
- - ✅ System identifies intent (research, ddl, etc.)
188
- - ✅ Existing workflow stages execute
189
- - ✅ Output formatted nicely in chat
190
-
191
- ---
192
-
193
- ## 🎯 Phase 2: Approval Gates (Week 3-4)
194
-
195
- ### What to Build
196
-
197
- 1. **Add approval state to controller**
198
- ```python
199
- class ConversationControllerV2:
200
- def __init__(self):
201
- self.pending_approval = None # What's waiting for approval
202
- self.state = 'initial'
203
-
204
- async def process_message(self, message: str):
205
- # Check if message is approval/rejection
206
- if self.pending_approval:
207
- if is_approval(message):
208
- await self.handle_approval()
209
- elif is_rejection(message):
210
- await self.handle_rejection()
211
- else:
212
- yield "⏸️ Please approve or reject the current output first."
213
- return
214
-
215
- # Otherwise, process as intent
216
- # ...
217
- ```
218
-
219
- 2. **Add approval UI elements**
220
- ```python
221
- def create_chat_tab():
222
- # ... existing code ...
223
-
224
- with gr.Row():
225
- approve_btn = gr.Button("✅ Approve", visible=False)
226
- reject_btn = gr.Button("❌ Redo", visible=False)
227
-
228
- # Update visibility based on controller state
229
- def update_buttons(ctrl):
230
- has_pending = ctrl.pending_approval is not None
231
- return gr.update(visible=has_pending), gr.update(visible=has_pending)
232
- ```
233
-
234
- 3. **Request approval after each stage**
235
- ```python
236
- async def run_ddl_stage(self):
237
- # ... generate DDL ...
238
-
239
- # Store for approval
240
- self.pending_approval = {
241
- 'stage': 'ddl',
242
- 'output': ddl_results,
243
- 'timestamp': datetime.now()
244
- }
245
-
246
- yield "\n\n---\n"
247
- yield "👉 **Please review the DDL above.**\n"
248
- yield "Reply 'approve' to continue or 'redo' to regenerate.\n"
249
- ```
250
-
251
- **Test:** User must approve before moving to next stage
252
-
253
- ---
254
-
255
- ## 🎨 Phase 3: Refinement (Week 5-6)
256
-
257
- ### What to Build
258
-
259
- 1. **DDL Refinement**
260
- ```python
261
- class DDLRefiner:
262
- """Handles targeted DDL modifications"""
263
-
264
- async def refine(self, current_ddl: str, instruction: str):
265
- """
266
- E.g., instruction = "add email column to customers table"
267
- """
268
- # Use LLM to modify specific part
269
- prompt = f"""Modify this DDL:
270
- {current_ddl}
271
-
272
- Change requested: {instruction}
273
-
274
- Return the complete updated DDL.
275
- """
276
-
277
- # Stream modified DDL
278
- ```
279
-
280
- 2. **Detect refinement intent**
281
- ```python
282
- def classify(self, message: str, context: dict) -> str:
283
- # ... existing code ...
284
-
285
- # Check for refinement keywords
286
- if context.get('pending_approval'):
287
- if 'add' in message or 'change' in message or 'modify' in message:
288
- return 'refine'
289
-
290
- # ...
291
- ```
292
-
293
- 3. **Handle refinement in controller**
294
- ```python
295
- async def process_message(self, message: str):
296
- # ... existing code ...
297
-
298
- if intent == 'refine' and self.pending_approval:
299
- stage = self.pending_approval['stage']
300
-
301
- if stage == 'ddl':
302
- yield "🎨 Refining DDL...\n\n"
303
- refined = await self.ddl_refiner.refine(
304
- self.pending_approval['output'],
305
- message
306
- )
307
- self.pending_approval['output'] = refined
308
- yield "\n\nUpdated! Approve now?"
309
- ```
310
-
311
- **Test:** User can say "add category column" and get targeted update
312
-
313
- ---
314
-
315
- ## 📊 Phase 4: Visualization Refinement (Week 7)
316
-
317
- ### What to Build
318
-
319
- ```python
320
- class VisualizationRefiner:
321
- """Refine specific visualizations without regenerating all"""
322
-
323
- async def refine_viz(self, liveboard: dict, viz_id: int, instruction: str):
324
- """
325
- E.g., instruction = "change to bar chart" or "add region filter"
326
- """
327
- current_viz = liveboard['visualizations'][viz_id]
328
-
329
- # Classify refinement type
330
- if 'chart' in instruction or 'type' in instruction:
331
- return await self.change_chart_type(current_viz, instruction)
332
- elif 'filter' in instruction:
333
- return await self.add_filter(current_viz, instruction)
334
- elif 'measure' in instruction or 'metric' in instruction:
335
- return await self.change_measures(current_viz, instruction)
336
- ```
337
-
338
- **Test:** User can say "change viz 3 to bar chart" after liveboard created
339
-
340
- ---
341
-
342
- ## 🌐 Phase 5: New Stages (Week 8-10)
343
-
344
- ### Site Creator
345
-
346
- ```python
347
- class SiteCreator:
348
- """Generate demo website with embedded ThoughtSpot"""
349
-
350
- async def create_site(self, demo_context: dict):
351
- """
352
- Generates HTML site with:
353
- - Company branding
354
- - Embedded liveboard
355
- - Demo narrative
356
- """
357
- html = f"""
358
- <!DOCTYPE html>
359
- <html>
360
- <head>
361
- <title>{demo_context['company']} Demo</title>
362
- <style>{self.generate_css(demo_context['brand_colors'])}</style>
363
- </head>
364
- <body>
365
- <h1>{demo_context['use_case']} Analytics</h1>
366
-
367
- <!-- Embedded ThoughtSpot -->
368
- <div id="thoughtspot-embed"></div>
369
-
370
- <script>
371
- // ThoughtSpot embed code
372
- </script>
373
- </body>
374
- </html>
375
- """
376
- return html
377
- ```
378
-
379
- ### Bot Creator
380
-
381
- ```python
382
- class BotCreator:
383
- """Generate chatbot config for demo"""
384
-
385
- async def create_bot(self, demo_context: dict):
386
- """
387
- Creates bot that can:
388
- - Answer questions about the data
389
- - Generate ThoughtSpot searches
390
- - Explain visualizations
391
- """
392
- bot_config = {
393
- 'name': f"{demo_context['company']} Demo Bot",
394
- 'knowledge_base': self.build_knowledge_base(demo_context),
395
- 'thoughtspot_connection': demo_context['ts_connection'],
396
- 'sample_questions': self.generate_sample_questions(demo_context)
397
- }
398
- return bot_config
399
- ```
400
-
401
- ---
402
-
403
- ## 📦 Deliverables by Phase
404
-
405
- ### Phase 1 (Weeks 1-2): Foundation
406
- - [ ] `chat/intent_classifier.py`
407
- - [ ] `chat/conversation_controller.py`
408
- - [ ] `chat/ui.py`
409
- - [ ] Update `demo_prep.py` with chat tab
410
- - [ ] Basic intent recognition (rule-based)
411
- - [ ] Chat triggers existing workflow
412
- - [ ] Formatted output in chat
413
-
414
- ### Phase 2 (Weeks 3-4): Approval Gates
415
- - [ ] Approval state management
416
- - [ ] Approve/reject buttons in UI
417
- - [ ] Block advancement without approval
418
- - [ ] Redo functionality
419
- - [ ] Tests for approval flow
420
-
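The approval-gate mechanics in this phase can be sketched as a tiny state holder that blocks advancement until the user approves (a sketch only; `ApprovalGate` and its method names are illustrative, not taken from the codebase):

```python
class ApprovalGate:
    """Minimal approval gate: blocks stage advancement until user approves."""

    def __init__(self):
        self.pending = True   # stage output awaits a decision
        self.redo_count = 0   # how many times the user asked for a redo

    def approve(self):
        self.pending = False

    def reject(self):
        # A rejection keeps the gate closed and records a redo request
        self.pending = True
        self.redo_count += 1

    def can_advance(self) -> bool:
        return not self.pending


gate = ApprovalGate()
assert not gate.can_advance()   # blocked until the user approves
gate.reject()                   # "no, redo it"
gate.approve()                  # "looks good"
assert gate.can_advance()
```

The controller would own one gate per stage, so the UI only ever asks `can_advance()` rather than tracking approval state itself.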
421
- ### Phase 3 (Weeks 5-6): Refinement
422
- - [ ] DDL refinement
423
- - [ ] Population code refinement
424
- - [ ] Intent detection for refinements
425
- - [ ] Partial regeneration
426
- - [ ] Tests for refinement accuracy
427
-
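Partial regeneration can start from something as simple as scoping the instruction to the tables it actually names, and only regenerating those statements (a sketch under that assumption; the helper name is illustrative):

```python
import re

def find_target_tables(instruction: str, ddl: str) -> list[str]:
    """Pick out which tables a refinement instruction touches, so only
    those CREATE TABLE statements are regenerated, not the whole schema."""
    tables = re.findall(r"CREATE TABLE (\w+)", ddl, flags=re.IGNORECASE)
    mentioned = [t for t in tables if t.lower() in instruction.lower()]
    return mentioned or tables  # no match: fall back to full regeneration

ddl = "CREATE TABLE products (id INT);\nCREATE TABLE orders (id INT);"
print(find_target_tables("add category column to products", ddl))  # ['products']
```

The same scoping idea applies to population-code refinement: regenerate only the inserts for the affected tables.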
428
- ### Phase 4 (Week 7): Viz Refinement
429
- - [ ] Visualization refiner class
430
- - [ ] Chart type changes
431
- - [ ] Filter modifications
432
- - [ ] Measure/dimension updates
433
- - [ ] Tests for viz refinement
434
-
435
- ### Phase 5 (Weeks 8-10): New Stages
436
- - [ ] Site creator executor
437
- - [ ] Bot creator executor
438
- - [ ] HTML generation
439
- - [ ] Bot config generation
440
- - [ ] End-to-end tests
441
-
442
- ---
443
-
444
- ## 🧪 Testing Strategy
445
-
446
- ### Unit Tests
447
- ```python
448
- # test_intent_classifier.py
449
- def test_research_intent():
450
- classifier = SimpleIntentClassifier()
451
- assert classifier.classify("research Amazon", {}) == 'start_research'
452
- assert classifier.classify("analyze this company", {}) == 'start_research'
453
-
454
- def test_approval_intent():
455
- classifier = SimpleIntentClassifier()
456
- context = {'pending_approval': True}
457
- assert classifier.classify("looks good", context) == 'approve'
458
- assert classifier.classify("no redo it", context) == 'reject'
459
- ```
460
-
461
- ### Integration Tests
462
- ```python
463
- # test_conversation_flow.py
464
- async def test_full_workflow():
465
- controller = ConversationControllerV2()
466
-
467
- # Start research
468
- responses = []
469
- async for chunk in controller.process_message("Create demo for Amazon in supply chain"):
470
- responses.append(chunk)
471
-
472
- assert "research" in "".join(responses).lower()
473
- assert controller.state == 'awaiting_approval'
474
-
475
- # Approve
476
- async for chunk in controller.process_message("approve"):
477
- pass
478
-
479
- assert controller.state == 'ready_for_ddl'
480
- ```
481
-
482
- ### Manual Test Script
483
- ```markdown
484
- 1. Open app in browser
485
- 2. Go to Chat Mode tab
486
- 3. Type: "Create a demo for Amazon.com in supply chain"
487
- 4. ✅ Verify research starts
488
- 5. ✅ Verify streaming output appears
489
- 6. ✅ Verify approval prompt appears
490
- 7. Type: "approve"
491
- 8. ✅ Verify advances to DDL
492
- 9. Type: "add category column to products"
493
- 10. ✅ Verify DDL is refined (not fully regenerated)
494
- 11. ✅ Verify updated DDL shown
495
- 12. Type: "approve"
496
- 13. ✅ Verify advances to population
497
- ... continue through all stages
498
- ```
499
-
500
- ---
501
-
502
- ## 🔧 Development Tips
503
-
504
- ### 1. Use Existing Code
505
- Don't rewrite what works! Wrap existing functions:
506
-
507
- ```python
508
- # Good: Reuse existing
509
- async def run_research_stage(self):
510
- for result in progressive_workflow_handler(...):
511
- yield self.format_for_chat(result)
512
-
513
- # Bad: Rewrite from scratch
514
- async def run_research_stage(self):
515
- # ... 200 lines of duplicated code ...
516
- ```
517
-
518
- ### 2. Start Simple
519
- Phase 1 doesn't need LLM for intent classification:
520
-
521
- ```python
522
- # V1: Simple rules
523
- if 'research' in message:
524
- return 'start_research'
525
-
526
- # V2 (later): LLM-based
527
- intent = await llm.classify(message, context)
528
- ```
529
-
530
- ### 3. Stream Everything
531
- Users want to see progress:
532
-
533
- ```python
534
- # Good
535
- async def generate_ddl(self):
536
- yield "Creating schema...\n"
537
- for chunk in llm.stream(...):
538
- yield chunk
539
- yield "\n✅ Done!\n"
540
-
541
- # Bad
542
- async def generate_ddl(self):
543
- result = llm.complete(...) # User waits...
544
- return result
545
- ```
546
-
547
- ### 4. Test in Isolation
548
- Each component should work standalone:
549
-
550
- ```python
551
- # Test classifier alone
552
- classifier = SimpleIntentClassifier()
553
- assert classifier.classify("research Amazon", {}) == 'start_research'
554
-
555
- # Test controller alone (mock LLM)
556
- controller = ConversationControllerV1()
557
- controller.llm = MockLLM()
558
- ```
559
-
560
- ---
561
-
562
- ## 🚨 Common Pitfalls
563
-
564
- ### Pitfall 1: Over-engineering Phase 1
565
- **Problem:** Trying to build perfect intent classification from day 1
566
- **Solution:** Start with simple rules, add LLM later
567
-
568
- ### Pitfall 2: Breaking Existing Functionality
569
- **Problem:** Modifying shared code breaks button UI
570
- **Solution:** Keep chat in separate modules, wrap existing code
571
-
572
- ### Pitfall 3: No Streaming
573
- **Problem:** User waits 30 seconds with no feedback
574
- **Solution:** Yield partial results frequently
575
-
576
- ### Pitfall 4: Complex State Management
577
- **Problem:** State gets out of sync between UI and controller
578
- **Solution:** Single source of truth (controller), UI just displays
579
-
580
- ### Pitfall 5: Ignoring Errors
581
- **Problem:** LLM fails, app crashes
582
- **Solution:** Try/except everywhere, graceful error messages
583
-
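Pitfall 5 can be handled with a small wrapper that turns exceptions into chat-friendly messages instead of crashes (a sketch; `llm_call` here is a stand-in for whatever client function the app actually uses):

```python
import logging

logger = logging.getLogger(__name__)

def safe_llm_call(llm_call, *args, fallback="❌ Something went wrong - try again?"):
    """Run an LLM call, returning a graceful message instead of raising."""
    try:
        return llm_call(*args)
    except Exception as exc:  # broad on purpose: never crash the chat loop
        logger.exception("LLM call failed")
        return f"{fallback} ({type(exc).__name__})"

def flaky(prompt):
    # Simulates an LLM client that times out
    raise TimeoutError("model timed out")

print(safe_llm_call(flaky, "generate DDL"))
# → ❌ Something went wrong - try again? (TimeoutError)
```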
584
- ---
585
-
586
- ## 📝 Code Snippets Library
587
-
588
- ### Extract Company URL from Message
589
- ```python
590
- import re
591
-
592
- def extract_url(message: str) -> Optional[str]:
593
- """Extract URL from user message"""
594
- # Match various URL formats
595
- url_pattern = r'https?://(?:www\.)?[\w\-\.]+\.[\w]{2,}'
596
- match = re.search(url_pattern, message)
597
- if match:
598
- return match.group(0)
599
-
600
- # Match domain names
601
- domain_pattern = r'(?:www\.)?[\w\-]+\.(?:com|io|net|org)'
602
- match = re.search(domain_pattern, message)
603
- if match:
604
- return f"https://{match.group(0)}"
605
-
606
- return None
607
- ```
608
-
609
- ### Format Output for Chat
610
- ```python
611
- def format_for_chat(workflow_output: str) -> str:
612
- """Convert workflow output to chat-friendly format"""
613
-
614
- # Add emoji for status
615
- output = workflow_output
616
- output = output.replace("✅", "✅") # Already good
617
- output = output.replace("ERROR:", "❌ ERROR:")
618
- output = output.replace("WARNING:", "⚠️")
619
-
620
- # Format code blocks
621
- if "CREATE TABLE" in output:
622
- output = f"```sql\n{output}\n```"
623
- elif "import " in output or "def " in output:
624
- output = f"```python\n{output}\n```"
625
-
626
- return output
627
- ```
628
-
629
- ### Check if Message is Approval
630
- ```python
631
- def is_approval(message: str) -> bool:
632
- """Check if user is approving"""
633
- approval_words = [
634
- 'approve', 'approved', 'yes', 'looks good', 'perfect',
635
- 'great', 'proceed', 'continue', 'go ahead', 'lgtm',
636
- 'ok', 'okay', 'good', '👍'
637
- ]
638
- message_lower = message.lower().strip()
639
- # Check rejection first: phrases like "not good" contain approval words
- if is_rejection(message):
- return False
- return any(word in message_lower for word in approval_words)
640
-
641
- def is_rejection(message: str) -> bool:
642
- """Check if user is rejecting"""
643
- rejection_words = [
644
- 'no', 'reject', 'redo', 'retry', 'again', 'wrong',
645
- 'incorrect', 'bad', 'not good', "don't like", '👎'
646
- ]
647
- message_lower = message.lower().strip()
648
- return any(word in message_lower for word in rejection_words)
649
- ```
650
-
651
- ---
652
-
653
- ## 🎯 Success Metrics
654
-
655
- ### Phase 1
656
- - [ ] Chat UI loads without errors
657
- - [ ] User can complete research via chat
658
- - [ ] Output formatted readably
659
- - [ ] No regression in button UI
660
-
661
- ### Phase 2
662
- - [ ] User must approve before advancing
663
- - [ ] Reject → redo works
664
- - [ ] Approve → advance works
665
- - [ ] State persists across reloads
666
-
667
- ### Phase 3
668
- - [ ] DDL refinement doesn't break schema
669
- - [ ] Changes are targeted (not full regen)
670
- - [ ] User can iterate multiple times
671
- - [ ] Refinement intent detected accurately
672
-
673
- ### Phase 4
674
- - [ ] Viz changes reflected in ThoughtSpot
675
- - [ ] Multiple vizs can be refined independently
676
- - [ ] Chart type changes work
677
- - [ ] Filter additions work
678
-
679
- ### Phase 5
680
- - [ ] Site HTML generated and valid
681
- - [ ] Bot config valid and usable
682
- - [ ] Branding applied correctly
683
- - [ ] End-to-end flow completes
684
-
685
- ---
686
-
687
- ## 🆘 Help & Resources
688
-
689
- ### When Things Break
690
-
691
- 1. **Check logs**
692
- ```python
693
- import logging
694
- logging.basicConfig(level=logging.DEBUG)
695
- logger = logging.getLogger(__name__)
696
-
697
- logger.debug(f"Intent classified as: {intent}")
698
- logger.debug(f"Controller state: {self.state}")
699
- ```
700
-
701
- 2. **Test in isolation**
702
- ```python
703
- # Don't test full flow, isolate the issue
704
- classifier = SimpleIntentClassifier()
705
- result = classifier.classify("your problematic message", {})
706
- print(result) # What did it return?
707
- ```
708
-
709
- 3. **Use print statements liberally**
710
- ```python
711
- async def process_message(self, message: str):
712
- print(f"[DEBUG] Received: {message}")
713
- intent = self.classifier.classify(message, self.context)
714
- print(f"[DEBUG] Intent: {intent}")
715
- print(f"[DEBUG] State: {self.state}")
716
- ```
717
-
718
- ### Getting Unstuck
719
-
720
- **Problem:** Intent classification not working
721
- **Solution:** Add debug output to see what's being classified
722
-
723
- **Problem:** Approval state not updating
724
- **Solution:** Check whether the `is_approval()` function detects your test phrase
725
-
726
- **Problem:** Streaming not showing in UI
727
- **Solution:** Verify `yield` is used (not `return`)
728
-
729
- **Problem:** Error on LLM call
730
- **Solution:** Check API key, rate limits, try with mock first
731
-
732
- ---
733
-
734
- ## 🎬 Quick Start Commands
735
-
736
- ```bash
737
- # Set up dev environment
738
- cd /Users/mike.boone/cursor_demowire/DemoPrep
739
- source venv/bin/activate
740
-
741
- # Create new directory structure
742
- mkdir -p chat executors tests/chat tests/executors
743
-
744
- # Create Phase 1 files
745
- touch chat/__init__.py
746
- touch chat/intent_classifier.py
747
- touch chat/conversation_controller.py
748
- touch chat/ui.py
749
-
750
- # Run with chat mode
751
- python demo_prep.py
752
-
753
- # Run tests
754
- pytest tests/chat/
755
-
756
- # Check for errors
757
- python -m pylint chat/
758
- ```
759
-
760
- ---
761
-
762
- ## 📞 Next Steps
763
-
764
- 1. ✅ Read this roadmap
765
- 2. ✅ Review CHAT_ARCHITECTURE_PLAN.md for detailed design
766
- 3. ⬜ Create `chat/` directory structure
767
- 4. ⬜ Implement Phase 1 (intent classifier + basic chat UI)
768
- 5. ⬜ Test Phase 1 manually
769
- 6. ⬜ Get feedback on UX
770
- 7. ⬜ Move to Phase 2 (approval gates)
771
-
772
- **Let's build this! 🚀**
773
-
MCP_VS_TML_ANALYSIS.md DELETED
@@ -1,501 +0,0 @@
1
- # MCP vs TML Liveboard Creation: Deep Dive Analysis
2
-
3
- ## Executive Summary
4
-
5
- **Goal:** Make MCP liveboard creation produce results as compelling as TML-based creation for realistic demos.
6
-
7
- **Current State:**
8
- - ✅ MCP works functionally (creates liveboards)
9
- - ❌ MCP creates "ugly" basic visualizations
10
- - ✅ TML creates beautiful, well-designed visualizations
11
- - ❌ The two methods are completely disconnected
12
-
13
- **Key Finding:** The TML method has YEARS of intelligence built in (outlier parsing, AI visualization generation, chart type inference, smart layouts, color palettes) while MCP just asks generic questions. We need to bring all that intelligence INTO the MCP workflow.
14
-
15
- ---
16
-
17
- ## Detailed Comparison
18
-
19
- ### TML Method (The Good One)
20
-
21
- #### **Workflow:**
22
- ```
23
- 1. Parse outliers from population script (structured comments)
24
- 2. Use outliers to create targeted visualizations
25
- - Extract semantic types (measure_type, dimension_type)
26
- - Map to specific chart types (VIZ_TYPE)
27
- - Generate ThoughtSpot search queries
28
- - Create companion KPIs
29
- 3. If more viz needed, use AI (GPT-4) to generate additional ones
30
- 4. Add styled text tiles for context
31
- 5. Create smart layout based on chart types
32
- 6. Apply color palettes and formatting
33
- 7. Deploy via TML import
34
- ```
35
-
36
- #### **Key Intelligence:**
37
-
38
- 1. **Outlier-Driven Viz** (lines 686-780):
39
- - Reads structured comments from population script:
40
- ```python
41
- # DEMO_OUTLIER: High-Value Customers at Risk
42
- # INSIGHT: Top 5 customers (>$50K LTV) showing declining satisfaction
43
- # VIZ_TYPE: COLUMN
44
- # VIZ_MEASURE_TYPE: customer_lifetime_value, satisfaction_score
45
- # VIZ_DIMENSION_TYPES: customer_name, customer_segment
46
- # SHOW_ME: Show customers where lifetime_value > 50000 and satisfaction < 3
47
- # KPI_METRIC: total_at_risk_revenue
48
- # IMPACT: $250K annual revenue at risk
49
- # TALKING_POINT: Notice how ThoughtSpot surfaces...
50
- ```
51
- - Maps semantic types to actual model columns
52
- - Generates precise ThoughtSpot search queries
53
- - Creates companion KPIs automatically
54
- - Infers chart types intelligently (geo detection, etc.)
55
-
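Reading those structured comments is mechanical; a parser could look roughly like this (a sketch only, the real implementation lives in the TML pipeline and may differ):

```python
def parse_outlier_comments(script: str) -> list[dict]:
    """Parse DEMO_OUTLIER comment blocks from a population script."""
    outliers, current = [], None
    for line in script.splitlines():
        line = line.strip()
        if not line.startswith("#"):
            continue
        body = line.lstrip("#").strip()
        if body.startswith("DEMO_OUTLIER:"):
            # Start of a new outlier block
            current = {"title": body.split(":", 1)[1].strip()}
            outliers.append(current)
        elif current is not None and ":" in body:
            # KEY: value lines attach to the current outlier
            key, value = body.split(":", 1)
            current[key.strip().lower()] = value.strip()
    return outliers

script = """
# DEMO_OUTLIER: High-Value Customers at Risk
# VIZ_TYPE: COLUMN
# KPI_METRIC: total_at_risk_revenue
"""
print(parse_outlier_comments(script))
```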
56
- 2. **AI-Driven Generation** (lines 1175-1265):
57
- - Uses GPT-4 with context about:
58
- - Company name and use case
59
- - Available measures, dimensions, dates from model
60
- - Generates diverse chart types (KPI, LINE, COLUMN, etc.)
61
- - Creates business-friendly titles
62
- - Applies time filters appropriately
63
- - Ensures chart type diversity
64
-
65
- 3. **Smart Layout** (lines 782-861):
66
- - 12-column grid system
67
- - Chart-type-specific sizing:
68
- - KPIs: 3 cols × 3 height (compact)
69
- - Maps: 12 cols × 7 height (full width)
70
- - Scatter: 6 cols × 7 height
71
- - Tables: 8 cols × 5 height
72
- - Auto-wrapping rows
73
- - Text tiles for context
74
-
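The sizing rules above amount to a simple left-to-right packing pass over a 12-column grid; a minimal sketch (tile sizes taken from the list above, function name illustrative):

```python
SIZES = {  # (columns, height) per chart type, per the rules above
    "KPI": (3, 3), "MAP": (12, 7), "SCATTER": (6, 7), "TABLE": (8, 5),
}
DEFAULT = (6, 5)

def layout(viz_types: list[str]) -> list[dict]:
    """Place tiles left-to-right on a 12-column grid, wrapping full rows."""
    tiles, x, y, row_h = [], 0, 0, 0
    for vt in viz_types:
        w, h = SIZES.get(vt, DEFAULT)
        if x + w > 12:                      # doesn't fit: wrap to next row
            x, y, row_h = 0, y + row_h, 0
        tiles.append({"type": vt, "x": x, "y": y, "w": w, "h": h})
        x += w
        row_h = max(row_h, h)               # row is as tall as its tallest tile
    return tiles

# Four compact KPIs fill a row; the full-width map wraps below them
print(layout(["KPI", "KPI", "KPI", "KPI", "MAP"]))
```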
75
- 4. **Professional Styling** (lines 886-1003):
76
- - Curated color palettes (teal, purple, pink, blue, etc.)
77
- - KPI-specific: sparklines, comparisons, anomalies
78
- - Geo maps: gradient heat maps
79
- - Stacked charts: diverse region colors
80
- - Chart-specific configurations
81
-
82
- 5. **Text Tiles** (lines 1363-1383):
83
- - Adds context tiles with markdown
84
- - Colored backgrounds (#2E3D4D, #85016b)
85
- - Dashboard overview, key insights
86
-
87
- ---
88
-
89
- ### MCP Method (The Current One)
90
-
91
- #### **Workflow:**
92
- ```
93
- 1. Call getRelevantQuestions with use_case (e.g., "sales analytics")
94
- 2. Get back 5 generic questions
95
- 3. Call getAnswer for each question
96
- 4. Send all answers to createLiveboard
97
- 5. Done (no control over viz types, layout, styling)
98
- ```
99
-
100
- #### **What It Does:**
101
- - Lines 1696-1710: Calls `getRelevantQuestions` with simple query
102
- - Lines 1726-1753: Gets answers for questions
103
- - Lines 1775-1798: Creates basic HTML note tile
104
- - Lines 1815-1819: Calls `createLiveboard` with answers
105
-
106
- #### **What It DOESN'T Do:**
107
- - ❌ No outlier awareness
108
- - ❌ No chart type control
109
- - ❌ No layout control
110
- - ❌ No color/styling control
111
- - ❌ No AI-driven question generation
112
- - ❌ No companion KPIs
113
- - ❌ No text tiles/context
114
- - ❌ Generic questions not tied to actual data patterns
115
-
116
- ---
117
-
118
- ## The Problem
119
-
120
- **MCP delegates everything to ThoughtSpot's AI**, which:
121
- - Generates generic questions ("What is total sales?")
122
- - Creates default visualizations
123
- - Uses default layout
124
- - Applies basic styling
125
-
126
- **Result:** Functional but not demo-worthy.
127
-
128
- ---
129
-
130
- ## The Solution: Hybrid Intelligent MCP
131
-
132
- ### Strategy: Bring TML Intelligence into MCP
133
-
134
- Instead of asking MCP generic questions, we:
135
- 1. ✅ **Use outliers** to generate targeted questions
136
- 2. ✅ **Use AI (GPT-4)** to generate additional smart questions
137
- 3. ✅ **Specify chart types** in questions (if MCP supports it)
138
- 4. ✅ **Create text tiles** separately or in note tile
139
- 5. ⚠️ **Layout** - may be limited by MCP API
140
-
141
- ### Implementation Plan
142
-
143
- #### **Phase 1: Outlier Integration** (HIGHEST IMPACT)
144
-
145
- **Goal:** Feed outlier-driven questions to MCP instead of generic ones.
146
-
147
- **Changes to `create_liveboard_from_model_mcp()`:**
148
-
149
- ```python
150
- def create_liveboard_from_model_mcp(
151
- ts_client,
152
- model_id: str,
153
- model_name: str,
154
- company_data: Dict,
155
- use_case: str,
156
- num_visualizations: int = 6,
157
- liveboard_name: str = None,
158
- outliers: Optional[List[Dict]] = None # ← ADD THIS
159
- ) -> Dict:
160
- ```
161
-
162
- **New Logic:**
163
- ```python
164
- # BEFORE calling getRelevantQuestions:
165
-
166
- questions_to_ask = []
167
-
168
- # 1. If we have outliers, use them first
169
- if outliers:
170
- for outlier in outliers:
171
- # Convert outlier to MCP-friendly question
172
- question = _convert_outlier_to_question(outlier)
173
- questions_to_ask.append(question)
174
-
175
- # Add companion KPI if specified
176
- if outlier.get('kpi_metric'):
177
- kpi_question = _create_kpi_question(outlier)
178
- questions_to_ask.append(kpi_question)
179
-
180
- # 2. If we need more questions, use AI to generate them
181
- remaining = num_visualizations - len(questions_to_ask)
182
- if remaining > 0:
183
- ai_questions = _generate_smart_questions_with_ai(
184
- company_data, use_case, model_columns, remaining
185
- )
186
- questions_to_ask.extend(ai_questions)
187
-
188
- # 3. Use MCP getAnswer directly with our smart questions
189
- answers = []
190
- for question in questions_to_ask:
191
- answer = await session.call_tool("getAnswer", {
192
- "question": question,
193
- "datasourceId": model_id
194
- })
195
- answers.append(answer)
196
-
197
- # 4. Create liveboard (MCP controls viz types, but at least questions are smart)
198
- ```
199
-
200
- **Helper Functions Needed:**
201
- ```python
202
- def _convert_outlier_to_question(outlier: Dict) -> str:
203
- """
204
- Convert outlier metadata to natural language question for MCP.
205
-
206
- Input: {
207
- 'title': 'High-Value Customers at Risk',
208
- 'show_me_query': 'Show customers where lifetime_value > 50000 and satisfaction < 3',
209
- 'viz_type': 'COLUMN',
210
- 'viz_measure_types': 'customer_lifetime_value, satisfaction_score',
211
- 'viz_dimension_types': 'customer_name'
212
- }
213
-
214
- Output: "Show me customers with lifetime value greater than 50000 and satisfaction less than 3"
215
- """
216
- # Parse SHOW_ME query into natural language
217
- # Remove quotes, standardize
218
- return outlier['show_me_query'].replace('"', '').replace("'", '')
219
-
220
- def _create_kpi_question(outlier: Dict) -> str:
221
- """Create companion KPI question"""
222
- # Example: "What is the total revenue for high-value at-risk customers?"
223
- kpi_metric = outlier.get('kpi_metric', '')
224
- return f"What is the {kpi_metric}?"
225
-
226
- def _generate_smart_questions_with_ai(
227
- company_data: Dict,
228
- use_case: str,
229
- model_columns: List[Dict],
230
- num_questions: int
231
- ) -> List[str]:
232
- """
233
- Use GPT-4 to generate smart, targeted questions
234
- (Similar to generate_visualizations_from_research but for questions)
235
- """
236
- # Extract measures, dimensions from model_columns
237
- # Prompt GPT-4: "Generate X business questions for [company] [use_case]"
238
- # Return natural language questions
239
- ```
240
-
241
- ---
242
-
243
- #### **Phase 2: Enhanced Note Tile** (QUICK WIN)
244
-
245
- **Current:** Basic HTML gradient box with generic text.
246
-
247
- **Better:** Rich dashboard header with:
248
- - Company logo (if available)
249
- - Use case-specific insights
250
- - Key metrics summary
251
- - Outlier highlights
252
-
253
- ```python
254
- def _create_rich_note_tile(
255
- company_data: Dict,
256
- use_case: str,
257
- outliers: List[Dict],
258
- num_viz: int
259
- ) -> str:
260
- """Create compelling dashboard header"""
261
-
262
- outlier_highlights = ""
263
- if outliers:
264
- outlier_highlights = "<h3>🎯 Strategic Insights</h3><ul>"
265
- for outlier in outliers[:3]: # Top 3
266
- outlier_highlights += f"<li><strong>{outlier['title']}</strong>: {outlier['insight']}</li>"
267
- outlier_highlights += "</ul>"
268
-
269
- return f"""
270
- <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
271
- padding: 40px; border-radius: 20px; color: white;">
272
- <h1 style="margin: 0 0 20px 0; font-size: 40px;">
273
- {company_data.get('name', 'Company')} {use_case} Analytics
274
- </h1>
275
- <p style="font-size: 18px; opacity: 0.9;">
276
- {company_data.get('description', 'AI-powered insights and analytics')}
277
- </p>
278
- {outlier_highlights}
279
- <div style="margin-top: 30px; padding: 20px; background: rgba(255,255,255,0.1);
280
- border-radius: 12px;">
281
- <p style="margin: 0; font-size: 14px;">
282
- 📊 {num_viz} AI-generated visualizations |
283
- 🎯 {len(outliers)} strategic outliers |
284
- ⚡ Created with MCP
285
- </p>
286
- </div>
287
- </div>
288
- """
289
- ```
290
-
291
- ---
292
-
293
- #### **Phase 3: Question Quality Enhancement** (MEDIUM EFFORT)
294
-
295
- **Problem:** Generic questions like "What is total sales?" aren't compelling.
296
-
297
- **Solution:** Generate questions that:
298
- - Reference specific time periods ("last quarter", "this year vs last year")
299
- - Include filters and segments ("top 10 products", "high-value customers")
300
- - Ask about trends and patterns ("how has X changed over time?")
301
- - Use business terminology from the use case
302
-
303
- **Example Transformation:**
304
- ```
305
- ❌ Generic: "What is total sales?"
306
- ✅ Better: "What are the top 10 products by sales in the last quarter?"
307
-
308
- ❌ Generic: "Show sales by region"
309
- ✅ Better: "How do sales compare across regions this year versus last year?"
310
-
311
- ❌ Generic: "What is customer count?"
312
- ✅ Better: "Which customers have lifetime value over $50K but declining satisfaction?"
313
- ```
314
-
315
- **Implementation:**
316
- ```python
317
- def _enhance_question_quality(question: str, use_case: str) -> str:
318
- """Add business context and time filters to questions"""
319
-
320
- # Use GPT-4 to enhance:
321
- prompt = f"""
322
- Enhance this analytics question for a {use_case} demo:
323
- "{question}"
324
-
325
- Make it more specific by:
326
- - Adding time filters (last quarter, this year vs last year, etc.)
327
- - Adding top N limits where appropriate
328
- - Using business terminology
329
- - Making it sound like a real business question
330
-
331
- Return only the enhanced question, no explanation.
332
- """
333
-
334
- response = openai.chat.completions.create(
335
- model="gpt-4o",
336
- messages=[{"role": "user", "content": prompt}]
337
- )
338
-
339
- return response.choices[0].message.content.strip()
340
- ```
341
-
342
- ---
343
-
344
- #### **Phase 4: Model Column Introspection** (IMPORTANT)
345
-
346
- **Problem:** MCP questions should reference actual columns in the model.
347
-
348
- **Solution:** Fetch model schema first, use it to generate questions.
349
-
350
- ```python
351
- async def _create_mcp_liveboard():
352
- # Step 0: Fetch model columns (like TML method does)
353
- model_columns = _fetch_model_columns_for_mcp(model_id, ts_client)
354
-
355
- measures = [col for col in model_columns if col['type'] == 'MEASURE']
356
- dimensions = [col for col in model_columns if col['type'] == 'ATTRIBUTE']
357
- date_columns = [col for col in model_columns if col['type'] == 'DATE']
358
-
359
- # Step 1: Generate questions using actual column names
360
- questions = _generate_questions_from_outliers_and_columns(
361
- outliers, measures, dimensions, date_columns, use_case
362
- )
363
-
364
- # Rest of MCP workflow...
365
- ```
366
-
367
- ---
368
-
369
- ## Comparison Table
370
-
371
- | Feature | TML Method | MCP Method (Current) | MCP Method (Proposed) |
372
- |---------|-----------|---------------------|---------------------|
373
- | **Outlier Integration** | ✅ Full support | ❌ None | ✅ Full support |
374
- | **Question Quality** | ✅ AI-generated, precise | ❌ Generic | ✅ AI-enhanced |
375
- | **Chart Type Control** | ✅ Full control | ❌ MCP decides | ⚠️ Via questions? |
376
- | **Layout Control** | ✅ Smart grid | ❌ MCP decides | ⚠️ Limited |
377
- | **Color/Styling** | ✅ Professional | ❌ Default | ⚠️ Limited |
378
- | **Text Tiles** | ✅ Multiple styled | ⚠️ One note tile | ✅ Rich note tile |
379
- | **Companion KPIs** | ✅ Auto-created | ❌ None | ✅ From outliers |
380
- | **Model Introspection** | ✅ Full schema | ❌ None | ✅ Add it |
381
- | **Demo Readiness** | ✅ Beautiful | ❌ Basic | ✅ Much better |
382
-
383
- ---
384
-
385
- ## Recommendations
386
-
387
- ### **MUST DO (Phase 1 & 2):**
388
- 1. ✅ Add `outliers` parameter to `create_liveboard_from_model_mcp()`
389
- 2. ✅ Convert outliers to targeted questions
390
- 3. ✅ Add AI question generation for additional viz
391
- 4. ✅ Enhance note tile with outlier highlights
392
- 5. ✅ Fetch model schema to reference actual columns
393
-
394
- **Estimated Effort:** 4-6 hours
395
- **Impact:** 🔥🔥🔥 Massive improvement in demo quality
396
-
397
- ### **SHOULD DO (Phase 3):**
398
- 6. ✅ Add question quality enhancement
399
- 7. ✅ Create helper functions for outlier→question conversion
400
- 8. ✅ Add companion KPI generation
401
-
402
- **Estimated Effort:** 2-3 hours
403
- **Impact:** 🔥🔥 Significant improvement
404
-
405
- ### **NICE TO HAVE (Phase 4):**
406
- 9. ⚠️ Investigate if MCP supports chart type hints
407
- 10. ⚠️ See if MCP allows layout customization
408
- 11. ⚠️ Test color palette specifications
409
-
410
- **Estimated Effort:** 2-4 hours research + implementation
411
- **Impact:** 🔥 Moderate (depends on MCP API capabilities)
412
-
413
- ---
414
-
415
- ## Implementation Priority
416
-
417
- ### **Immediate (Today):**
418
- ```python
419
- # 1. Update function signature
420
- def create_liveboard_from_model_mcp(..., outliers=None):
421
-
422
- # 2. Add outlier→question conversion
423
- if outliers:
424
- questions = [_outlier_to_question(o) for o in outliers]
425
- else:
426
- # Fall back to generic questions
427
- questions = await getRelevantQuestions(use_case)
428
-
429
- # 3. Use questions with MCP getAnswer
430
- ```
431
-
432
- ### **Next (This Week):**
433
- - Add AI question generation for remaining viz
434
- - Enhance note tile with outlier highlights
435
- - Fetch model schema first
436
-
437
- ### **Future (Nice to Have):**
438
- - Research MCP API capabilities for styling/layout
439
- - Add question quality enhancement pass
440
- - Create comprehensive outlier→viz mapping
441
-
442
- ---
443
-
444
- ## Key Insight
445
-
446
- **The TML method is smart because of the DATA (outliers) and AI (GPT-4), not because of TML itself.**
447
-
448
- We can bring the same intelligence to MCP by:
449
- 1. Feeding it outlier-driven questions (not generic ones)
450
- 2. Using AI to generate additional smart questions
451
- 3. Referencing actual model columns
452
- 4. Creating a rich note tile
453
-
454
- **MCP can be just as good as TML if we give it the same quality inputs!**
455
-
456
- ---
457
-
458
- ## Next Steps
459
-
460
- 1. **Review this document** with stakeholder
461
- 2. **Decide on priority** (recommend: Phase 1 + 2)
462
- 3. **Implement changes** to `create_liveboard_from_model_mcp()`
463
- 4. **Test with real demo** to compare quality
464
- 5. **Iterate** based on results
465
-
466
- ---
467
-
468
- ## Questions to Explore
469
-
470
- 1. Does MCP `getAnswer` support chart type hints?
471
- - Can we say: "Show X as a bar chart"?
472
-
473
- 2. Does MCP `createLiveboard` allow layout specification?
474
- - Or does it always auto-layout?
475
-
476
- 3. Can we pass multiple note tiles?
477
- - Or just one `noteTile` parameter?
478
-
479
- 4. Does MCP respect viz ordering?
480
- - If we pass answers in a specific order, does it layout in that order?
481
-
482
- 5. What's the model ID compatibility issue?
483
- - Why do newer models not work with MCP?
484
- - Is there a model configuration we need to set?
485
-
486
- ---
487
-
488
- ## Conclusion
489
-
490
- **Current State:** MCP creates functional but basic liveboards.
491
-
492
- **Root Cause:** We're asking MCP generic questions instead of leveraging the outlier intelligence and AI that makes TML liveboards compelling.
493
-
494
- **Solution:** Hybrid approach - use TML's intelligence (outliers, AI, schema introspection) to generate smart questions for MCP.
495
-
496
- **Expected Outcome:** MCP liveboards that are just as demo-ready as TML liveboards, with the added benefit of MCP's AI-driven answer generation.
497
-
498
- **Time Investment:** ~6-10 hours total for Phase 1-3.
499
-
500
- **ROI:** 🎯 Transform "ugly" basic liveboards into compelling demo assets.
501
-
MCP_liveboard_creation.md DELETED
@@ -1,530 +0,0 @@
- # ThoughtSpot MCP Implementation Guide
-
- ## Overview
- This document provides a comprehensive guide for implementing ThoughtSpot's Model Context Protocol (MCP) to create automated, AI-driven analytics liveboards.
-
- ---
-
- ## Table of Contents
- 1. [What is MCP](#what-is-mcp)
- 2. [Architecture](#architecture)
- 3. [Prerequisites](#prerequisites)
- 4. [Available MCP Tools](#available-mcp-tools)
- 5. [Implementation Workflow](#implementation-workflow)
- 6. [Code Examples](#code-examples)
- 7. [Best Practices](#best-practices)
- 8. [Troubleshooting](#troubleshooting)
-
- ---
-
- ## What is MCP
-
- **Model Context Protocol (MCP)** is a standardized protocol that enables AI agents and applications to interact with ThoughtSpot's analytics capabilities programmatically.
-
- ### Key Benefits
- - 🤖 **AI-Native**: Designed for AI agents like Claude, ChatGPT, etc.
- - 🔄 **Standardized**: Uses JSON-RPC over stdio (stdin/stdout)
- - 🎯 **Intent-Based**: Converts natural language queries into precise data questions
- - 📊 **End-to-End**: From question generation to liveboard creation
-
- ### Communication Method
- - **NOT HTTP/REST** - MCP uses stdio (subprocess communication)
- - Uses the `mcp-remote` proxy for OAuth authentication
- - Spawns the MCP server as a subprocess and communicates via stdin/stdout
-
- ---
-
- ## Architecture
-
- ```
- ┌─────────────────┐
- │  Your Python    │
- │  Application    │
- └────────┬────────┘
-          │
-          ▼
- ┌─────────────────┐
- │  MCP Python     │
- │  SDK            │
- └────────┬────────┘
-          │ stdio
-          ▼
- ┌─────────────────┐
- │  mcp-remote     │
- │  (OAuth Proxy)  │
- └────────┬────────┘
-          │ HTTPS
-          ▼
- ┌─────────────────┐
- │  ThoughtSpot    │
- │  MCP Server     │
- └─────────────────┘
- ```
-
- ### Components
- 1. **Your Application**: Python code using the MCP SDK
- 2. **MCP Python SDK**: Handles stdio client communication
- 3. **mcp-remote**: npx package that handles OAuth and proxies requests
- 4. **ThoughtSpot MCP Server**: `https://agent.thoughtspot.app/mcp`
-
- ---
-
- ## Prerequisites
-
- ### Required Software
- - **Python**: 3.8 or higher
- - **Node.js/NPX**: For running `mcp-remote`
- - **MCP Python SDK**: `pip install mcp`
-
- ### Required Credentials
- - ThoughtSpot instance URL (e.g., `se-thoughtspot-cloud.thoughtspot.cloud`)
- - ThoughtSpot username and password (for OAuth)
- - Datasource/Model GUIDs from your ThoughtSpot instance
-
- ### Environment Setup
- ```bash
- # Install MCP SDK
- pip install mcp
-
- # Verify npx is available
- npx --version
- ```
-
- ---
-
- ## Available MCP Tools
-
- ThoughtSpot MCP provides 4 core tools:
-
- ### 1. ping
- **Purpose**: Health check to verify connection
-
- **Parameters**: None
-
- **Returns**: "Pong"
-
- **Example**:
- ```python
- result = await session.call_tool("ping", {})
- # Returns: "Pong"
- ```
-
- ---
-
- ### 2. getRelevantQuestions
- **Purpose**: Convert vague queries into precise, answerable questions based on the datasource schema
-
- **Parameters**:
- - `query` (string, **required**): High-level question or task (e.g., "sales performance", "top products")
- - `datasourceIds` (array, **required**): Array of datasource/model GUIDs
- - `additionalContext` (string, optional): Extra context to improve question generation
-
- **Returns**: JSON array of suggested questions
- ```json
- {
-   "questions": [
-     {
-       "question": "What is the product with the highest total sales amount?",
-       "datasourceId": "eb600ad2-ad91-4640-819a-f953602bd4c1"
-     }
-   ]
- }
- ```
-
- **Use Case**: Turn the user's natural language into specific data queries
-
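The `questions` payload arrives as a JSON string inside the tool result (`result.content[0].text`), so a small helper can flatten it into `(question, datasourceId)` pairs before calling `getAnswer`. A minimal sketch (the helper name is illustrative; the payload shape matches the example above):

```python
import json

def parse_questions(result_text):
    """Flatten a getRelevantQuestions JSON payload into (question, datasourceId) pairs."""
    payload = json.loads(result_text)
    return [(q["question"], q["datasourceId"]) for q in payload.get("questions", [])]
```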
- ---
-
- ### 3. getAnswer
- **Purpose**: Execute a question against ThoughtSpot and retrieve data/visualization
-
- **Parameters**:
- - `question` (string, **required**): The specific question to answer (typically from `getRelevantQuestions`)
- - `datasourceId` (string, **required**): Single datasource/model GUID
-
- **Returns**: JSON with data, metadata, and viewing URL
- ```json
- {
-   "data": "CSV formatted data...",
-   "question": "What is the product with the highest total sales amount?",
-   "session_identifier": "uuid",
-   "generation_number": 2,
-   "frame_url": "https://instance.thoughtspot.cloud/#/embed/..."
- }
- ```
-
- **Use Case**: Get actual data and visualizations for specific questions
-
- ---
-
- ### 4. createLiveboard
- **Purpose**: Create a ThoughtSpot liveboard (dashboard) with multiple visualizations
-
- **Parameters**:
- - `name` (string, **required**): Liveboard title
- - `answers` (array, **required**): Array of answer objects from `getAnswer` calls
- - `noteTile` (string, **required**): HTML content for the summary/note tile
-
- **Returns**: Success message with liveboard URL
- ```json
- {
-   "message": "Liveboard created successfully",
-   "url": "https://instance.thoughtspot.cloud/#/pinboard/[GUID]"
- }
- ```
-
- **Use Case**: Build comprehensive dashboards from multiple analyses
-
- ---
-
- ## Implementation Workflow
-
- ### Standard 4-Step Process
-
- ```
- 1. ping                 → Verify connection
- 2. getRelevantQuestions → Generate data questions
- 3. getAnswer (multiple) → Get data for each question
- 4. createLiveboard      → Build dashboard
- ```
-
- ### Detailed Flow
-
- ```python
- # Step 1: Connect and verify
- session = ClientSession(...)
- await session.call_tool("ping", {})
-
- # Step 2: Generate questions (parse the JSON text from the tool result)
- result = await session.call_tool("getRelevantQuestions", {
-     "query": "sales performance",
-     "datasourceIds": ["datasource-guid"]
- })
- questions = json.loads(result.content[0].text)["questions"]
-
- # Step 3: Get answers for each question (again parsing each result)
- answers = []
- for q in questions:
-     answer = await session.call_tool("getAnswer", {
-         "question": q["question"],
-         "datasourceId": q["datasourceId"]
-     })
-     answers.append(json.loads(answer.content[0].text))
-
- # Step 4: Create liveboard
- liveboard = await session.call_tool("createLiveboard", {
-     "name": "Sales Performance Dashboard",
-     "answers": answers,
-     "noteTile": "<html>...</html>"
- })
- ```
-
- ---
-
- ## Code Examples
-
- ### Minimal Working Example
-
- ```python
- import asyncio
- import json
-
- from mcp import ClientSession, StdioServerParameters
- from mcp.client.stdio import stdio_client
-
- async def create_liveboard():
-     # Configure MCP connection
-     server_params = StdioServerParameters(
-         command="npx",
-         args=["mcp-remote@latest", "https://agent.thoughtspot.app/mcp"]
-     )
-
-     async with stdio_client(server_params) as (read, write):
-         async with ClientSession(read, write) as session:
-             await session.initialize()
-
-             # Your datasource GUID
-             datasource_id = "your-datasource-guid-here"
-
-             # Get relevant questions
-             result = await session.call_tool("getRelevantQuestions", {
-                 "query": "top products",
-                 "datasourceIds": [datasource_id]
-             })
-
-             # Parse questions from the JSON payload
-             data = json.loads(result.content[0].text)
-             questions = data["questions"]
-
-             # Get answer for first question
-             answer_result = await session.call_tool("getAnswer", {
-                 "question": questions[0]["question"],
-                 "datasourceId": datasource_id
-             })
-
-             answer_data = json.loads(answer_result.content[0].text)
-
-             # Create liveboard
-             liveboard_result = await session.call_tool("createLiveboard", {
-                 "name": "Product Analysis",
-                 "answers": [answer_data],
-                 "noteTile": "<h2>Product Analysis</h2><p>Top products by sales</p>"
-             })
-
-             print(liveboard_result.content[0].text)
-
- asyncio.run(create_liveboard())
- ```
-
- ### Comprehensive Multi-Visualization Example
-
- ```python
- import asyncio
- import json
-
- from mcp import ClientSession, StdioServerParameters
- from mcp.client.stdio import stdio_client
-
- async def create_comprehensive_analysis():
-     server_params = StdioServerParameters(
-         command="npx",
-         args=["mcp-remote@latest", "https://agent.thoughtspot.app/mcp"]
-     )
-
-     async with stdio_client(server_params) as (read, write):
-         async with ClientSession(read, write) as session:
-             await session.initialize()
-
-             datasource_id = "your-datasource-guid"
-
-             # Multiple query perspectives
-             queries = [
-                 "top selling products",
-                 "sales trends over time",
-                 "product performance comparison"
-             ]
-
-             all_questions = []
-             all_answers = []
-
-             # Generate questions from multiple angles
-             for query in queries:
-                 result = await session.call_tool("getRelevantQuestions", {
-                     "query": query,
-                     "datasourceIds": [datasource_id]
-                 })
-
-                 data = json.loads(result.content[0].text)
-                 all_questions.extend(data["questions"][:3])  # Top 3 from each
-
-             # Get answers for all questions
-             for q in all_questions[:10]:  # Limit to 10 visualizations
-                 try:
-                     answer = await session.call_tool("getAnswer", {
-                         "question": q["question"],
-                         "datasourceId": datasource_id
-                     })
-                     answer_data = json.loads(answer.content[0].text)
-                     all_answers.append(answer_data)
-                 except Exception as e:
-                     print(f"Failed to get answer: {e}")
-
-             # Create rich liveboard
-             note_tile = """
-             <div style="background: linear-gradient(135deg, #1e3a8a 0%, #3b82f6 100%);
-                         padding: 40px; border-radius: 20px; color: white;">
-                 <h1>📊 Comprehensive Sales Analysis</h1>
-                 <div style="background: rgba(255,255,255,0.15); padding: 25px;
-                             border-radius: 15px; margin: 20px 0;">
-                     <h2>🎯 Executive Summary</h2>
-                     <p>Analysis of product performance across multiple dimensions</p>
-                 </div>
-                 <div style="margin-top: 20px;">
-                     <h3>🔍 Key Findings</h3>
-                     <ul>
-                         <li>Top product performance metrics</li>
-                         <li>Sales trends and patterns</li>
-                         <li>Comparative analysis across products</li>
-                     </ul>
-                 </div>
-             </div>
-             """
-
-             liveboard = await session.call_tool("createLiveboard", {
-                 "name": "📊 Comprehensive Product Analysis",
-                 "answers": all_answers,
-                 "noteTile": note_tile
-             })
-
-             return liveboard.content[0].text
-
- asyncio.run(create_comprehensive_analysis())
- ```
-
- ---
-
- ## Best Practices
-
- ### 1. Query Design
- - ✅ Use broad, natural language queries: "sales performance", "customer trends"
- - ❌ Avoid overly specific SQL-like queries
- - ✅ Let ThoughtSpot's AI interpret the schema
- - ✅ Use multiple query angles for comprehensive analysis
-
- ### 2. Error Handling
- ```python
- try:
-     answer = await session.call_tool("getAnswer", {...})
- except Exception as e:
-     print(f"Question failed: {str(e)}")
-     # Continue with other questions
- ```
-
- ### 3. Datasource Selection
- - Use models (joined tables) instead of single tables when possible
- - Models provide richer context for question generation
- - Verify the datasource has data before using it
-
- ### 4. Liveboard Design
- - Include rich HTML note tiles with:
-   - Executive summary
-   - Key findings
-   - Visual styling (gradients, colors, emojis)
-   - Methodology explanation
- - Aim for 7-10 visualizations for a comprehensive analysis
- - Group related visualizations together
-
- ### 5. Authentication
- - OAuth is handled automatically by `mcp-remote`
- - A browser window opens for first-time authentication
- - Subsequent calls reuse the session
- - The OAuth callback server runs on `localhost:9414`
-
- ---
-
- ## Troubleshooting
-
- ### Common Issues
-
- #### 1. "No answer found for your query"
- **Cause**: The datasource is empty or the question doesn't match the schema
-
- **Solution**:
- - Verify the datasource has data
- - Use system tables (TS: Search, TS: Database) for testing
- - Try simpler questions first
-
- #### 2. "Expected object, received string" (createLiveboard)
- **Cause**: Passing a string instead of a parsed JSON object
-
- **Solution**:
- ```python
- # ❌ Wrong
- answers = [result.content[0].text]
-
- # ✅ Correct
- import json
- answer_data = json.loads(result.content[0].text)
- answers = [answer_data]
- ```
-
- #### 3. Connection timeouts
- **Cause**: Network issues or MCP server unavailable
-
- **Solution**:
- - Test with `ping` first
- - Verify npx is installed: `npx --version`
- - Check that the ThoughtSpot instance is accessible
-
- #### 4. Authentication loop
- **Cause**: OAuth token expired or not saved
-
- **Solution**:
- - Close the browser and restart
- - Clear the OAuth cache at `~/.mcp-remote/`
- - Ensure the OAuth callback server on port 9414 is not blocked
-
- ---
-
- ## Getting Datasource GUIDs
-
- ### Method 1: ThoughtSpot UI
- 1. Log into the ThoughtSpot instance
- 2. Navigate to **Data** → **Connections** or **Models**
- 3. Click on the datasource/model
- 4. Copy the GUID from the URL or details page
-
- ### Method 2: REST API
- ```python
- import requests
-
- ts_instance = "your-instance.thoughtspot.cloud"
-
- # Authenticate
- auth_url = f"https://{ts_instance}/api/rest/2.0/auth/token/full"
- response = requests.post(auth_url, json={
-     "username": "your_username",
-     "password": "your_password"
- })
- token = response.json()["token"]
-
- # List datasources
- search_url = f"https://{ts_instance}/api/rest/2.0/metadata/search"
- response = requests.post(search_url,
-     headers={"Authorization": f"Bearer {token}"},
-     json={"metadata": [{"type": "LOGICAL_TABLE"}]}
- )
-
- for item in response.json():
-     print(f"{item['metadata_name']}: {item['metadata_id']}")
- ```
-
- ---
-
- ## File Structure
-
- Recommended project structure:
-
- ```
- project/
- ├── mcp/
- │   ├── mcp_working_example.py    # Basic example
- │   ├── test_get_questions.py     # Comprehensive example
- │   ├── list_mcp_tools.py         # Tool documentation
- │   └── get_datasources.py        # Helper to get GUIDs
- ├── .env                          # ThoughtSpot credentials
- └── requirements.txt              # mcp, python-dotenv
- ```
-
- Note: if `import mcp` ever resolves to this local folder instead of the installed SDK, rename the folder (e.g., `mcp_scripts/`).
-
- ---
-
- ## Environment Variables
-
- ```properties
- # .env file
- THOUGHTSPOT_URL=your-instance.thoughtspot.cloud
- THOUGHTSPOT_USERNAME=your_username
- THOUGHTSPOT_PASSWORD=your_password
- ```
-
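These variables are typically loaded with `python-dotenv` (listed in requirements.txt above); for illustration, here is a dependency-free sketch that parses simple `KEY=VALUE` lines and falls back to the process environment. The function name and fallback behavior are assumptions, not project code:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file, falling back to os.environ."""
    config = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                # Skip blanks, comments, and malformed lines
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                config[key.strip()] = value.strip()
    except FileNotFoundError:
        pass
    # Fall back to real environment variables for any missing keys
    for key in ("THOUGHTSPOT_URL", "THOUGHTSPOT_USERNAME", "THOUGHTSPOT_PASSWORD"):
        config.setdefault(key, os.environ.get(key, ""))
    return config
```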
- ---
-
- ## Complete Reference Implementation
-
- See `test_get_questions.py` in this repository for a complete, production-ready implementation with:
- - Multiple query generation
- - Error handling
- - Rich HTML formatting
- - 7+ visualizations
- - Professional liveboard styling
-
- ---
-
- ## Support & Resources
-
- - **ThoughtSpot MCP Server**: https://agent.thoughtspot.app/mcp
- - **MCP Python SDK**: https://github.com/modelcontextprotocol/python-sdk
- - **ThoughtSpot REST API Docs**: https://developers.thoughtspot.com
-
- ---
-
- ## Version History
-
- - **v1.0** (November 2025): Initial implementation guide
-   - MCP SDK version: 1.21.1
-   - mcp-remote version: 0.1.30
-
- ---
-
- *Document created: November 14, 2025*
- *Last updated: November 14, 2025*
POPULATION_FIX_SUMMARY.md DELETED
@@ -1,160 +0,0 @@
- # Population Code Generation Fix - Summary
-
- ## Problem
- The population code was failing with "unexpected indent" errors on line 75, even though the template generated clean code.
-
- ## Root Causes Identified
-
- ### 1. **Code Modification After Generation**
- - `execute_population_script()` was applying dangerous string replacements to clean template code
- - These replacements (lines 352-381 in demo_prep.py) were breaking indentation
-
- ### 2. **Template Logic Bug**
- - Table names were being added to the list BEFORE validating columns
- - This caused function calls to non-existent functions
- - Result: incomplete try/except/finally blocks
-
- ### 3. **No Distinction Between Template and LLM Code**
- - All code was treated the same way
- - Template code doesn't need the safety fixes that LLM code needs
-
- ## Solutions Implemented
-
- ### Solution 1: Flag System for Code Source ✅
- **Files:** `demo_prep.py`, `chat_interface.py`
-
- - Added `skip_modifications` parameter to `execute_population_script()`
- - Template code now bypasses all dangerous string replacements
- - Only the safe schema-name replacement is applied
- - LLM code still gets safety fixes
-
- **Usage:**
- ```python
- execute_population_script(code, schema_name, skip_modifications=True)   # For template code
- execute_population_script(code, schema_name, skip_modifications=False)  # For LLM code
- ```
-
- ### Solution 2: Comprehensive Diagnostics ✅
- **Files:** `demo_prep.py`
-
- Saves code at each step to `/tmp/demowire_debug/`:
- - `1_original_code.py` - Code before any modifications
- - `2_after_modifications.py` - After string replacements
- - `3_validated_code.py` - Final validated code
-
- **Benefits:**
- - Easy to see exactly what code is being executed
- - Can debug indentation issues visually
- - Compare before/after modifications
-
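The saving logic amounts to writing the same code string to the debug directory under a stage-numbered name. A hedged sketch of what such a helper could look like (the function name and signature are illustrative, not the actual demo_prep.py implementation):

```python
import os

def save_debug_snapshot(code, filename, debug_dir="/tmp/demowire_debug"):
    """Write a snapshot of generated code to the debug directory for later comparison."""
    os.makedirs(debug_dir, exist_ok=True)
    path = os.path.join(debug_dir, filename)
    with open(path, "w") as f:
        f.write(code)
    return path
```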
- ### Solution 3: Bulletproof Template Generator ✅
- **Files:** `chat_interface.py`
-
- Improvements:
- 1. **Column Validation Before Table Addition**
-    - Only adds a table name after validating that the table has insertable columns
-    - Prevents orphaned function calls
-
- 2. **Better Type Handling**
-    - Handles VARCHAR(n) length specifications
-    - Supports BIGINT, DOUBLE, NUMERIC, BOOLEAN
-    - Auto-detects IDENTITY/AUTOINCREMENT columns
-    - More robust column name filtering
-
- 3. **Safety Check**
-    - Raises a clear error if no valid tables are found
-    - Prevents generation of empty main() functions
-
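To illustrate the kind of type handling the template needs, here is a hedged sketch of a column-type-to-sample-value mapper covering the types listed above (the function name and sample values are illustrative, not the chat_interface.py code):

```python
import random
import re

def sample_value(col_type):
    """Return a plausible sample value for a SQL column type declaration."""
    t = col_type.strip().upper()
    m = re.match(r"VARCHAR\((\d+)\)", t)
    if m:
        return "sample"[: int(m.group(1))]   # Respect VARCHAR(n) length limits
    if t in ("BIGINT", "INT", "INTEGER"):
        return random.randint(1, 1000)
    if t in ("DOUBLE", "NUMERIC", "FLOAT", "DECIMAL"):
        return round(random.uniform(1.0, 1000.0), 2)
    if t == "BOOLEAN":
        return random.choice([True, False])
    return "sample"                          # Fallback for unrecognized types
```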
69
- **Files:** `chat_interface.py`
70
-
71
- - Added `demo_builder.population_code_source` attribute
72
- - Tracks whether code came from "template" or "llm"
73
- - All execution paths now check this flag
74
-
75
- ## Testing
76
-
77
- ### Debug Scripts Created:
78
- 1. `debug_template_generation.py` - Test template with sample DDL
79
- 2. `debug_execution_modifications.py` - Trace code modifications
80
-
81
- ### Test Results:
82
- - Template generates clean, valid Python (59-72 lines)
83
- - Code compiles successfully before modifications
84
- - Modified code only fails when replacements break indentation
85
-
86
- ## Next Steps
87
-
88
- ### Completed ✅:
89
- 1. ✅ Fix template approach - make bulletproof
90
- 2. ✅ Stop execute_population_script from modifying template code
91
- 3. ✅ Add comprehensive diagnostics
92
-
93
- ### Remaining:
94
- 1. Add hybrid LLM approach as fallback (if template fails)
95
- 2. Test with actual user DDL
96
-
97
- ## How to Use
98
-
99
- ### For Template Code:
100
- ```python
101
- # Generation
102
- code = interface.get_fallback_population_code(schema_info)
103
- interface.demo_builder.population_code_source = "template"
104
-
105
- # Execution
106
- success, msg = execute_population_script(
107
- code,
108
- schema_name,
109
- skip_modifications=True
110
- )
111
- ```
112
-
113
- ### For LLM Code:
114
- ```python
115
- # Generation (via LLM)
116
- code = generate_from_llm(...)
117
- interface.demo_builder.population_code_source = "llm"
118
-
119
- # Execution (with safety fixes)
120
- success, msg = execute_population_script(
121
- code,
122
- schema_name,
123
- skip_modifications=False
124
- )
125
- ```
126
-
127
- ## Debugging
128
-
129
- If errors still occur:
130
- 1. Check `/tmp/demowire_debug/` for saved code files
131
- 2. Compare the 3 versions to see what changed
132
- 3. Look for console output showing which path was taken:
133
- - "🎯 Template-generated code detected"
134
- - "⚠️ LLM-generated code - applying safety fixes"
135
-
136
- ## Key Files Modified
137
-
138
- 1. **demo_prep.py**
139
- - Lines 302-309: Added `skip_modifications` parameter
140
- - Lines 346-355: Added debug file saving
141
- - Lines 356-382: Added conditional modification logic
142
- - Lines 473-476: Added validated code saving
143
-
144
- 2. **chat_interface.py**
145
- - Line 1251: Added `population_code_source` tracking
146
- - Lines 1040-1106: Improved template column/type handling
147
- - Lines 1315-1359: Added source checking before execution
148
- - Multiple locations: Updated all execute_population_script calls
149
-
150
- ## Summary
151
-
152
- The fix ensures that:
153
- - ✅ Template code stays clean (no modifications)
154
- - ✅ LLM code gets safety fixes
155
- - ✅ All code is saved for debugging
156
- - ✅ Template handles edge cases better
157
- - ✅ Clear distinction between code sources
158
-
159
- The template approach is now production-ready!
160
-
START_HERE.md DELETED
@@ -1,523 +0,0 @@
- # 🚀 START HERE: Chat Transformation Guide
- ## Your 5-Minute Orientation to the AI-Centric Demo Builder
-
- ---
-
- ## 🎯 What Are We Building?
-
- Transforming your ThoughtSpot demo builder from **button-driven** to **conversation-driven**.
-
- ### Before ❌
- ```
- ┌──────────────────────────────────────────┐
- │  [Input: Company URL]                    │
- │  [Input: Use Case]                       │
- │  [Button: Start Research]    ────────►   │
- │       ↓ (auto-advance)                   │
- │  [Button: Create DDL]        ────────►   │
- │       ↓ (auto-advance)                   │
- │  [Button: Create Population] ────────►   │
- │       ↓ (auto-advance)                   │
- │  [Button: Deploy]            ────────►   │
- │       ↓                                  │
- │  Done ✓                                  │
- └──────────────────────────────────────────┘
-
- Problems:
- - No control over outputs
- - Can't make small changes
- - No approval before advancing
- - Errors propagate
- ```
-
- ### After ✅
- ```
- ┌──────────────────────────────────────────┐
- │  💬 Chat: "Create supply chain demo      │
- │     for Amazon.com"                      │
- │       ↓                                  │
- │  🤖 AI: [Researches...]                  │
- │     "Here's what I found. Approve?"      │
- │       ↓                                  │
- │  💬 You: "Looks good but focus more on   │
- │     last-mile delivery"                  │
- │       ↓                                  │
- │  🤖 AI: [Refines...] "Better?"           │
- │       ↓                                  │
- │  💬 You: "Perfect, approve"              │
- │       ↓                                  │
- │  🤖 AI: [Creates DDL...] "5 tables       │
- │     created. Approve?"                   │
- │       ↓                                  │
- │  💬 You: "Add email to customers"        │
- │       ↓                                  │
- │  🤖 AI: [Updates...] "Done! Approve?"    │
- │       ↓                                  │
- │  💬 You: "Yes"                           │
- │       ↓                                  │
- │  [Continue conversationally...]          │
- └──────────────────────────────────────────┘
-
- Benefits:
- - Full control with approval gates ✅
- - Targeted refinements ✅
- - Natural language interface ✅
- - Guided workflow ✅
- ```
-
- ---
-
- ## 📚 Documentation (6 Documents)
-
- ### 1️⃣ **START_HERE.md** ← You are here!
- **Purpose:** 5-minute overview
- **Read this:** Right now
-
- ### 2️⃣ **CHAT_TRANSFORMATION_README.md**
- **Purpose:** Navigation hub for all docs
- **Read this:** Next (5 min)
-
- ### 3️⃣ **TRANSFORMATION_SUMMARY.md**
- **Purpose:** Strategic overview
- **Read this:** If you're a PM/executive (15 min)
-
- ### 4️⃣ **IMPLEMENTATION_ROADMAP.md** ⭐ DEVELOPERS START HERE
- **Purpose:** Hands-on coding guide
- **Read this:** If you're building this (1 hour)
-
- ### 5️⃣ **CHAT_ARCHITECTURE_PLAN.md**
- **Purpose:** Deep technical specification
- **Read this:** For architectural decisions (2 hours)
-
- ### 6️⃣ **CONVERSATION_PATTERNS.md**
- **Purpose:** UX patterns and examples
- **Read this:** For UX design (30 min)
-
- ---
-
- ## 🎯 Your Role? Start Here:
-
- ### 👔 I'm a Product Manager / Executive
- 1. Read **TRANSFORMATION_SUMMARY.md** (15 min)
-    - Understand goals, timeline, ROI
- 2. Review **CONVERSATION_PATTERNS.md** for UX (15 min)
- 3. Decide on priorities and timeline
- 4. Review open questions in TRANSFORMATION_SUMMARY.md
-
- **Total time:** 30 minutes
-
- ---
-
- ### 👨‍💻 I'm a Developer / Engineer
- 1. Read **IMPLEMENTATION_ROADMAP.md** (1 hour)
-    - Focus on Phase 1: Foundation
-    - Review code examples
-    - Understand testing strategy
- 2. Skim **CHAT_ARCHITECTURE_PLAN.md** for architecture (30 min)
- 3. Start coding Phase 1!
-
- **Total time:** 1.5 hours to start coding
-
- ---
-
- ### 🎨 I'm a UX Designer
- 1. Read **CONVERSATION_PATTERNS.md** (30 min)
-    - Study all user interaction patterns
-    - Review conversation examples
- 2. Read **TRANSFORMATION_SUMMARY.md** UX section (10 min)
- 3. Design conversation flows
-
- **Total time:** 40 minutes
-
- ---
-
- ## 🏗️ Architecture (30 Second Version)
-
- ```
- USER MESSAGE
-       ↓
- "Create demo for Amazon"
-       ↓
- INTENT CLASSIFIER ──────► What does user want?
-       ↓                   (approve, reject, refine, advance, info, config)
- CONVERSATION CONTROLLER ► Orchestrate workflow
-       ↓                   Manage state & approvals
- STAGE EXECUTOR ─────────► Execute specific stage
-       ↓                   (Research, DDL, Population, etc.)
- RESPONSE FORMATTER ─────► Format for chat
-       ↓
- AI RESPONSE
- ```
-
- **Key Insight:** Each component has a single responsibility, making the system modular and testable.
-
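The four-component flow above can be sketched end to end in a few lines. Everything here (the function names, the stage list, the toy keyword classifier) is illustrative scaffolding, not the project's real API:

```python
def classify(message):
    """INTENT CLASSIFIER: toy keyword rule standing in for the real classifier."""
    return "approve" if "good" in message.lower() else "advance"

def decide(intent, state):
    """CONVERSATION CONTROLLER: record approvals and pick the next stage."""
    if intent == "approve":
        state["approved"] = True
    return "ddl" if state.get("approved") else "research"

def run_stage(stage, state):
    """STAGE EXECUTOR: placeholder for the stage-specific handler."""
    return f"ran {stage}"

def format_response(result):
    """RESPONSE FORMATTER: wrap the result for the chat window."""
    return f"🤖 {result}"

def handle_message(message, state):
    intent = classify(message)          # INTENT CLASSIFIER
    stage = decide(intent, state)       # CONVERSATION CONTROLLER
    result = run_stage(stage, state)    # STAGE EXECUTOR
    return format_response(result)      # RESPONSE FORMATTER
```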
- ---
-
- ## 🚀 Implementation Timeline
-
- ```
- ┌─────────────┬──────────────┬─────────────────┬──────────┐
- │  Phase 1    │  Phase 2     │  Phase 3        │ Phase 4-5│
- │  Foundation │  Approval    │  Refinement     │ New      │
- │  (2 weeks)  │  Gates       │  (2 weeks)      │ Stages   │
- │             │  (2 weeks)   │                 │ (4 weeks)│
- ├─────────────┼──────────────┼─────────────────┼──────────┤
- │ • Chat UI   │ • Approval   │ • DDL refine    │ • Viz    │
- │ • Intent    │   state      │ • Population    │   refine │
- │   classify  │ • Approve/   │   refine        │ • Site   │
- │ • Bridge to │   reject     │ • Targeted      │   creator│
- │   existing  │   buttons    │   changes       │ • Bot    │
- │   workflow  │ • Block      │                 │   creator│
- │             │   auto-      │                 │          │
- │             │   advance    │                 │          │
- └─────────────┴──────────────┴─────────────────┴──────────┘
-
- Total: ~10 weeks
- Quick Win: Phase 1 in 2 weeks!
- ```
-
- ---
-
- ## ✨ Key Features
-
- ### 1. **Natural Language Interface**
- ```
- ❌ Before: Click buttons, fill forms
- ✅ After: Just chat naturally
- ```
-
- ### 2. **Approval Gates**
- ```
- ❌ Before: Auto-advances (can't stop)
- ✅ After: Must approve before advancing
- ```
-
- ### 3. **Granular Refinement**
- ```
- ❌ Before: Redo entire DDL
- ✅ After: "Add email column to customers"
- ```
-
- ### 4. **AI Guidance**
- ```
- ❌ Before: Must know what to click
- ✅ After: AI guides you through
- ```
-
- ### 5. **Error Recovery**
- ```
- ❌ Before: Manual troubleshooting
- ✅ After: AI suggests fixes
- ```
-
- ---
-
- ## 🎯 Success Metrics
-
- **User Experience:**
- - 50% faster demo creation ⚡
- - 80% first-time success 🎯
- - 4.5/5 user satisfaction ⭐
-
- **Technical:**
- - 90%+ intent accuracy 🎯
- - <500ms response time ⚡
- - 85%+ test coverage ✅
-
- **Business:**
- - 70% adoption rate 📈
- - 40% fewer support tickets 📉
- - Higher win rates 💰
-
- ---
-
- ## 💡 Core Concepts (Learn These)
-
- ### Intent Classification
- AI determines what the user wants from their message.
-
- **Example:**
- - "Looks good" → APPROVE intent
- - "Add email column" → REFINE intent
- - "Show me the DDL" → INFO intent
-
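A rule-based fallback for this mapping could look like the sketch below (the keyword lists and intent names are illustrative; the real system would likely use an LLM for classification):

```python
INTENT_KEYWORDS = {
    "APPROVE": ("approve", "looks good", "perfect", "yes"),
    "REJECT":  ("reject", "start over", "try again"),
    "REFINE":  ("add", "change", "remove", "rename", "focus"),
    "INFO":    ("show", "what", "explain"),
    "CONFIG":  ("set", "configure"),
}

def classify_intent(message):
    """Return the first intent whose keywords match; default to ADVANCE."""
    text = message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in text for k in keywords):
            return intent
    return "ADVANCE"
```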
- ### Approval Gate
- A required checkpoint before advancing.
-
- **Example:**
- ```
- AI: "DDL created. Approve?"
- [Blocks here until user responds]
- User: "Approve"
- AI: "Moving to Population..."
- ```
-
- ### Refinement
- Targeted modification without full regeneration.
-
- **Example:**
- ```
- User: "Add email to customers"
- AI: [Modifies just that table, not all DDL]
- ```
-
- ### Stage Executor
- A specialized handler for each workflow stage.
-
- **Example:**
- - ResearchExecutor - Handles research
- - DDLExecutor - Handles DDL creation
- - PopulationExecutor - Handles data population
-
- ---
-
- ## 🚦 Getting Started Checklist
-
- ### Before You Begin
- - [ ] Read this document (5 min) ← You're doing it!
- - [ ] Read CHAT_TRANSFORMATION_README.md (5 min)
- - [ ] Identify your role (PM, Dev, UX)
- - [ ] Read role-specific documentation
-
- ### For Developers
- - [ ] Read IMPLEMENTATION_ROADMAP.md Phase 1
- - [ ] Review existing codebase
- - [ ] Set up development environment
- - [ ] Create chat/ directory
- - [ ] Implement Phase 1
-
- ### For Product/UX
- - [ ] Read TRANSFORMATION_SUMMARY.md
- - [ ] Review conversation examples
- - [ ] Define success criteria
- - [ ] Plan user testing
-
- ---
-
- ## 🎬 Quick Start (Developers)
-
- ```bash
- # 1. Navigate to project
- cd /Users/mike.boone/cursor_demowire/DemoPrep
-
- # 2. Activate virtual environment
- source venv/bin/activate
-
- # 3. Create directory structure
- mkdir -p chat executors tests/chat tests/executors
-
- # 4. Create Phase 1 files
- touch chat/__init__.py
- touch chat/intent_classifier.py
- touch chat/conversation_controller.py
- touch chat/ui.py
-
- # 5. Open IMPLEMENTATION_ROADMAP.md
- #    Follow Phase 1 code examples
-
- # 6. Run app
- python demo_prep.py
-
- # 7. Test chat interface
- #    Go to browser → Chat Mode tab
- ```
-
- ---
-
- ## 🔥 What Makes This Special?
-
- ### 1. **Approval Gates = Quality Control**
- Bad outputs can't advance. The user stays in control.
-
- ### 2. **Refinement = Speed**
- Don't regenerate everything. Just fix what's wrong.
-
- ### 3. **Chat = Natural**
- No training needed. Just talk to the AI.
-
- ### 4. **Modular = Extensible**
- Easy to add new stages (site, bot, etc.)
-
- ### 5. **Streaming = Responsive**
- See progress in real time, not after it's done.
-
- ---
-
- ## 🎓 5-Minute Tutorial
-
- ### Conversation Example
-
- **Step 1: Initialize**
- ```
- You: "Create a supply chain demo for Amazon"
- AI:  🔍 Starting research...
-      [streams findings]
-      ✅ Research complete!
-      👉 Approve or request changes?
- ```
-
- **Step 2: Approve**
- ```
- You: "Looks good"
- AI:  ✅ Research approved!
-      🏗️ Creating database schema...
-      [streams DDL]
-      Generated 5 tables. Approve?
- ```
-
- **Step 3: Refine**
- ```
- You: "Add category column to products"
- AI:  🎨 Adding category to products...
-      ✅ Updated! Approve?
- ```
-
- **Step 4: Continue**
- ```
- You: "Yes"
- AI:  ✅ DDL approved!
-      Moving to population code...
-      [continues...]
- ```
-
- **That's it!** Natural conversation, full control.
-
- ---
-
- ## 🎯 Next Steps
-
- ### Right Now (5 min)
- 1. ✅ Finish reading this document
- 2. ⬜ Open **CHAT_TRANSFORMATION_README.md**
- 3. ⬜ Find your role-specific path
-
- ### Today (30-60 min)
- 1. ⬜ Read role-appropriate documentation
- 2. ⬜ Understand the architecture
- 3. ⬜ Review conversation examples
-
- ### This Week
- 1. ⬜ Set up development environment
- 2. ⬜ Review open questions
402
- 3. ⬜ Make go/no-go decision
403
- 4. ⬜ If go: Start Phase 1 implementation
404
-
405
- ---
406
-
407
- ## 🤔 Common Questions
408
-
409
- **Q: Do we keep the button UI?**
410
- A: Yes initially (for safety). Can deprecate later based on adoption.
411
-
412
- **Q: How long to see results?**
413
- A: Phase 1 (basic chat) works in 2 weeks!
414
-
415
- **Q: What if intent classification fails?**
416
- A: Fallback to clarification questions. See CONVERSATION_PATTERNS.md.
417
-
418
- **Q: Does this replace all existing code?**
419
- A: No! We wrap existing functions. Low risk.
420
-
421
- **Q: What about LLM costs?**
422
- A: Use cheap models for classification, premium for generation.
423
-
424
- **Q: Can users still use buttons?**
425
- A: Yes! Button UI stays during transition.
426
-
427
- ---
428
-
429
- ## 🆘 Help & Resources
430
-
431
- **Question: "Where do I start coding?"**
432
- Answer: IMPLEMENTATION_ROADMAP.md → Phase 1
433
-
434
- **Question: "How should conversations work?"**
435
- Answer: CONVERSATION_PATTERNS.md → Examples
436
-
437
- **Question: "What's the architecture?"**
438
- Answer: CHAT_ARCHITECTURE_PLAN.md → Component Design
439
-
440
- **Question: "What's the big picture?"**
441
- Answer: TRANSFORMATION_SUMMARY.md → Executive Summary
442
-
443
- **Question: "I'm confused, what should I read?"**
444
- Answer: CHAT_TRANSFORMATION_README.md → Your role path
445
-
446
- ---
447
-
448
- ## 🎯 TL;DR (Too Long; Didn't Read)
449
-
450
- **What:** Transform demo builder to chat interface
451
- **Why:** Faster, easier, higher quality demos
452
- **How:** 5 phases over 10 weeks
453
- **Quick Win:** Basic chat in 2 weeks
454
-
455
- **Key Features:**
456
- - ✅ Natural language chat
457
- - ✅ Approval gates for quality
458
- - ✅ Granular refinement for speed
459
- - ✅ AI guidance throughout
460
-
461
- **Next:** Read CHAT_TRANSFORMATION_README.md (5 min)
462
-
463
- ---
464
-
465
- ## 📊 Visual Summary
466
-
467
- ```
468
- ┌─────────────────────────────────────────────────────┐
469
- │ TRANSFORMATION │
470
- │ │
471
- │ FROM: Linear Button Workflow │
472
- │ ❌ No control │
473
- │ ❌ Full regeneration only │
474
- │ ❌ Auto-advances │
475
- │ │
476
- │ TO: Conversational AI Workflow │
477
- │ ✅ Approval gates │
478
- │ ✅ Granular refinement │
479
- │ ✅ Natural language │
480
- │ ✅ Error recovery │
481
- │ │
482
- │ RESULT: 50% faster, higher quality demos │
483
- └─────────────────────────────────────────────────────┘
484
- ```
485
-
486
- ---
487
-
488
- ## 🏁 Ready?
489
-
490
- **Your next step depends on your role:**
491
-
492
- 👔 **Product/Executive?**
493
- → Open **TRANSFORMATION_SUMMARY.md**
494
-
495
- 👨‍💻 **Developer?**
496
- → Open **IMPLEMENTATION_ROADMAP.md**
497
-
498
- 🎨 **UX Designer?**
499
- → Open **CONVERSATION_PATTERNS.md**
500
-
501
- ❓ **Not sure?**
502
- → Open **CHAT_TRANSFORMATION_README.md**
503
-
504
- ---
505
-
506
- **Welcome to the future of demo creation! 🚀**
507
-
508
- *This will change how demos are made. Let's build it together.*
509
-
510
- ---
511
-
512
- **Remember:** Start small (Phase 1), iterate quickly, ship incrementally.
513
-
514
- **You've got this!** 💪
515
-
516
- ---
517
-
518
- Created: November 12, 2025
519
- Version: 1.0
520
- Status: Ready for Implementation
521
-
522
- **Next → CHAT_TRANSFORMATION_README.md**
523
-
TRANSFORMATION_SUMMARY.md DELETED
@@ -1,574 +0,0 @@
1
- # Chat-Based Demo Builder: Transformation Summary
2
- ## From Linear Workflow to AI-Centric Conversational System
3
-
4
- **Created:** November 12, 2025
5
- **Status:** Ready for Implementation
6
-
7
- ---
8
-
9
- ## 📋 Executive Summary
10
-
11
- We're transforming your ThoughtSpot demo preparation tool from a button-driven linear workflow into an **AI-powered conversational assistant** that guides users through demo creation via natural language chat.
12
-
13
- ### Current State ❌
14
- - Button-based UI with fixed progression
15
- - No approval gates - auto-advances through stages
16
- - Limited iteration capability (full stage redo only)
17
- - Difficult to make targeted changes
18
- - Linear: Research → DDL → Population → Deploy
19
-
20
- ### Future State ✅
21
- - Natural language chat interface
22
- - AI interprets user intent and takes appropriate action
23
- - Approval required at each major stage
24
- - Granular refinement without regeneration
25
- - Iterative: Create, review, refine, approve, advance
26
- - New capabilities: Visualization refinement, site creation, bot creation
27
-
28
- ---
29
-
30
- ## 🎯 Core Transformation Goals
31
-
32
- | Goal | Current | Target | Impact |
33
- |------|---------|--------|--------|
34
- | **User Experience** | Button clicks | Natural conversation | ⭐⭐⭐⭐⭐ |
35
- | **Approval Gates** | None (auto-advance) | Required at each stage | ⭐⭐⭐⭐⭐ |
36
- | **Refinement** | Full regeneration only | Targeted modifications | ⭐⭐⭐⭐ |
37
- | **Error Recovery** | Manual | AI-assisted | ⭐⭐⭐ |
38
- | **Iteration Speed** | Slow (full regen) | Fast (incremental) | ⭐⭐⭐⭐ |
39
- | **Learning Curve** | Requires training | Self-explanatory | ⭐⭐⭐⭐ |
40
-
41
- ---
42
-
43
- ## 🏗️ Architecture Overview
44
-
45
- ### High-Level Flow
46
-
47
- ```
48
- ┌─────────────────────────────────────────────────────────────┐
49
- │ USER │
50
- │ "Create a supply chain demo for Amazon" │
51
- └───────────────────────────┬─────────────────────────────────┘
52
-                             │
53
-                             ▼
54
- ┌─────────────────────────────────────────────────────────────┐
55
- │ INTENT CLASSIFIER │
56
- │ What does the user want? (start, approve, refine...) │
57
- └───────────────────────────┬─────────────────────────────────┘
58
-                             │
59
-                             ▼
60
- ┌─────────────────────────────────────────────────────────────┐
61
- │ CONVERSATION CONTROLLER │
62
- │ Maintains state, orchestrates workflow, manages approvals │
63
- └───────────────────────────┬─────────────────────────────────┘
64
-                             │
65
-                             ▼
66
- ┌─────────────────────────────────────────────────────────────┐
67
- │ STAGE EXECUTORS │
68
- │ ResearchExecutor │ DDLExecutor │ PopulationExecutor │ ... │
69
- └───────────────────────────┬─────────────────────────────────┘
70
-                             │
71
-                             ▼
72
- ┌─────────────────────────────────────────────────────────────┐
73
- │ RESPONSE FORMATTER │
74
- │ Streams formatted chat responses to user │
75
- └─────────────────────────────────────────────────────────────┘
76
- ```
77
-
78
- ### Key Components
79
-
80
- 1. **Intent Classifier** (`chat/intent_classifier.py`)
81
- - Determines what user wants from their message
82
- - Types: APPROVE, REJECT, REFINE, ADVANCE, INFO, CONFIGURE
83
- - Phase 1: Simple rule-based
84
- - Phase 2: LLM-powered with context
85
-
86
- 2. **Conversation Controller** (`chat/conversation_controller.py`)
87
- - Orchestrates entire workflow
88
- - Manages stage progression
89
- - Enforces approval gates
90
- - Tracks conversation history
91
- - Routes to appropriate executor
92
-
93
- 3. **Stage Executors** (`executors/`)
94
- - Specialized handlers for each workflow stage
95
- - Research, DDL, Population, Deploy, Model, Liveboard, etc.
96
- - Each knows how to execute and refine its stage
97
- - Reuses existing code where possible
98
-
99
- 4. **Chat UI** (`chat/ui.py`)
100
- - Gradio-based conversational interface
101
- - Streaming responses
102
- - Quick action buttons (Approve/Reject)
103
- - Progress visualization
104
- - Asset display
105
-
106
- ---
107
-
108
- ## 🎨 User Experience Transformation
109
-
110
- ### Example: Creating a Demo
111
-
112
- **BEFORE (Button UI):**
113
- ```
114
- [Input company URL]
115
- [Input use case]
116
- [Click: Start Research]
117
- ... wait ...
118
- [Auto-advances to DDL]
119
- [Click: Create DDL]
120
- ... wait ...
121
- [Auto-advances to Population]
122
- [If error: manual troubleshooting]
123
- ```
124
-
125
- **AFTER (Chat UI):**
126
- ```
127
- User: Create a supply chain demo for Amazon.com
128
-
129
- AI: 🔍 Starting research on Amazon and supply chain analytics...
130
- [Streams findings in real-time]
131
- ✅ Research complete!
132
- 👉 Please review and approve, or tell me what to adjust.
133
-
134
- User: Looks good but focus more on last-mile delivery
135
-
136
- AI: 🔄 Refining research with focus on last-mile delivery...
137
- [Streams updated research]
138
- Better?
139
-
140
- User: Perfect, approve
141
-
142
- AI: ✅ Research approved!
143
- 🏗️ Creating database schema...
144
- [Streams DDL]
145
- Generated 5 tables. Approve?
146
-
147
- User: Add an email column to the customers table
148
-
149
- AI: 🎨 Adding email column to customers...
150
- [Shows updated table]
151
- ✅ Updated! Approve now?
152
-
153
- User: Yes
154
-
155
- AI: ✅ DDL approved!
156
- [Continues to next stage...]
157
- ```
158
-
159
- ---
160
-
161
- ## 📊 New Workflow Stages
162
-
163
- Expanding beyond the current 4-stage workflow:
164
-
165
- | Stage | Current | New Capability |
166
- |-------|---------|----------------|
167
- | **Research** | ✅ Exists | ✅ + Approval gate + Refinement |
168
- | **DDL Creation** | ✅ Exists | ✅ + Approval gate + Targeted edits |
169
- | **Population Code** | ✅ Exists | ✅ + Approval gate + Parameter tuning |
170
- | **Deployment** | ✅ Exists | ✅ + Error recovery + Rollback |
171
- | **Model Creation** | ✅ Exists | ✅ + Approval gate |
172
- | **Liveboard Creation** | ✅ Exists | ✅ + Approval gate |
173
- | **Visualization Refinement** | ❌ New | 🆕 Chart type changes, filters, measures |
174
- | **Site Creation** | ❌ New | 🆕 Generate demo website with branding |
175
- | **Bot Creation** | ❌ New | 🆕 Create demo chatbot |
176
-
177
- ---
178
-
179
- ## 🔄 Approval & Iteration Flow
180
-
181
- ### Approval Gate Pattern
182
-
183
- ```
184
- ┌──────────────────────────┐
185
- │ Stage Executes │
186
- │ (streams results) │
187
- └──────────┬───────────────┘
188
-
189
-
190
- ┌──────────────────────────┐
191
- │ Present Results │
192
- │ Request Approval │
193
- └──────────┬───────────────┘
194
-
195
-
196
- ┌─────┴─────┐
197
- │ │
198
- ▼ ▼
199
- ┌─────────┐ ┌──────────┐
200
- │ APPROVE │ │ REJECT/ │
201
- │ │ │ REFINE │
202
- └────┬────┘ └────┬─────┘
203
- │ │
204
- │ ▼
205
- │ ┌──────────┐
206
- │ │ Re-exec │
207
- │ │ with mods│
208
- │ └────┬─────┘
209
- │ │
210
- │ ▼
211
- │ ┌──────────┐
212
- │ │ Present │
213
- │ │ Updated │
214
- │ └────┬─────┘
215
- │ │
216
- └────────────┘
217
-
218
-
219
- ┌─────────────┐
220
- │ Next Stage │
221
- └─────────────┘
222
- ```
223
-
224
- ### Refinement Pattern
225
-
226
- ```
227
- User: "Add email column to customers"
228
-
229
- Classify: REFINE intent
230
-
231
- Extract: target="customers", modification="add email column"
232
-
233
- Route to: DDLExecutor.refine()
234
-
235
- Execute: Targeted modification (not full regen)
236
-
237
- Validate: Check schema integrity
238
-
239
- Present: Updated DDL
240
-
241
- Request: Approval (again)
242
- ```
243
-
244
- ---
245
-
246
- ## 📁 Documentation Structure
247
-
248
- We've created 4 comprehensive documents:
249
-
250
- ### 1. **CHAT_ARCHITECTURE_PLAN.md** (Main Technical Design)
251
- - Detailed architecture diagrams
252
- - Component specifications
253
- - Data models and state management
254
- - Migration strategy (10 phases)
255
- - Risk mitigation
256
- - Success metrics
257
-
258
- **Use this for:** Understanding the full system design and technical decisions
259
-
260
- ### 2. **IMPLEMENTATION_ROADMAP.md** (Practical Guide)
261
- - Phase-by-phase implementation steps
262
- - Code snippets and examples
263
- - Testing strategies
264
- - Quick wins (Phase 1 in 2 weeks)
265
- - Common pitfalls to avoid
266
- - Help & troubleshooting
267
-
268
- **Use this for:** Actually building the system, day-to-day development
269
-
270
- ### 3. **CONVERSATION_PATTERNS.md** (UX Patterns)
271
- - User intent categories
272
- - Example conversations
273
- - Response templates
274
- - Clarification strategies
275
- - Edge case handling
276
- - Quality checklist
277
-
278
- **Use this for:** Designing conversation flows, training intent classifier
279
-
280
- ### 4. **TRANSFORMATION_SUMMARY.md** (This Document)
281
- - High-level overview
282
- - Strategic goals
283
- - Document roadmap
284
- - Quick reference
285
-
286
- **Use this for:** Understanding the big picture, presenting to stakeholders
287
-
288
- ---
289
-
290
- ## 🚀 Implementation Phases
291
-
292
- ### Phase 1: Foundation (Weeks 1-2) ⭐ QUICK WIN
293
- **Goal:** Basic chat that triggers existing workflow
294
-
295
- **Deliverables:**
296
- - Chat UI in new Gradio tab
297
- - Simple rule-based intent classification
298
- - Bridge to existing workflow functions
299
- - Formatted streaming output
300
-
301
- **Complexity:** 🟢 Low
302
- **Risk:** 🟢 Low (no changes to existing code)
303
- **Value:** 🟡 Medium (proves concept)
304
-
305
- ---
306
-
307
- ### Phase 2: Approval Gates (Weeks 3-4) ⭐ HIGH VALUE
308
- **Goal:** Users must approve before advancing
309
-
310
- **Deliverables:**
311
- - Approval state management
312
- - Approve/reject buttons
313
- - Block auto-advancement
314
- - Redo functionality
315
-
316
- **Complexity:** 🟡 Medium
317
- **Risk:** 🟡 Medium (state management)
318
- **Value:** 🟢 High (major UX improvement)
319
-
320
- ---
321
-
322
- ### Phase 3: Refinement (Weeks 5-6) ⭐ HIGH VALUE
323
- **Goal:** Targeted modifications without full regeneration
324
-
325
- **Deliverables:**
326
- - DDL refinement (table/column changes)
327
- - Population refinement (data params)
328
- - Refinement intent detection
329
- - Partial regeneration logic
330
-
331
- **Complexity:** 🟡 Medium
332
- **Risk:** 🟡 Medium (must maintain integrity)
333
- **Value:** 🟢 High (huge time saver)
334
-
335
- ---
336
-
337
- ### Phase 4: Viz Refinement (Week 7)
338
- **Goal:** Modify visualizations independently
339
-
340
- **Deliverables:**
341
- - Visualization refiner class
342
- - Chart type changes
343
- - Filter modifications
344
- - Measure/dimension updates
345
-
346
- **Complexity:** 🟡 Medium
347
- **Risk:** 🟢 Low (isolated feature)
348
- **Value:** 🟡 Medium (nice to have)
349
-
350
- ---
351
-
352
- ### Phase 5: New Stages (Weeks 8-10)
353
- **Goal:** Site and bot creation
354
-
355
- **Deliverables:**
356
- - Site creator (HTML generation)
357
- - Bot creator (config generation)
358
- - Branding application
359
- - End-to-end flow
360
-
361
- **Complexity:** 🔴 High
362
- **Risk:** 🟡 Medium (new territory)
363
- **Value:** 🟢 High (differentiator)
364
-
365
- ---
366
-
367
- ## 📊 Success Metrics
368
-
369
- ### User Experience Metrics
370
- - [ ] **Demo completion time**: Reduce by 40%
371
- - [ ] **User errors**: Reduce by 60%
372
- - [ ] **Refinement iterations**: Enable 3+ per stage
373
- - [ ] **User satisfaction**: > 4.5/5 rating
374
- - [ ] **First-time success**: 80% complete without help
375
-
376
- ### Technical Metrics
377
- - [ ] **Intent accuracy**: > 90% correct classification
378
- - [ ] **Response time**: < 500ms to first token
379
- - [ ] **Uptime**: 99.5% availability
380
- - [ ] **Error recovery**: 80% auto-resolved
381
- - [ ] **Test coverage**: > 85%
382
-
383
- ### Business Metrics
384
- - [ ] **Adoption rate**: 70% of users prefer chat mode
385
- - [ ] **Demo quality**: Higher win rates
386
- - [ ] **Time to value**: 50% faster
387
- - [ ] **Support tickets**: 40% reduction
388
-
389
- ---
390
-
391
- ## 🎯 Key Design Decisions
392
-
393
- ### 1. Keep Existing Code
394
- **Decision:** Wrap existing functions, don't rewrite
395
- **Rationale:** Lower risk, faster implementation, proven functionality
396
- **Trade-off:** Some technical debt, not perfectly optimized
397
-
398
- ### 2. Gradio for Chat UI
399
- **Decision:** Extend existing Gradio app with chat tab
400
- **Rationale:** Consistent UX, faster development, user familiarity
401
- **Trade-off:** Gradio limitations vs custom UI
402
-
403
- ### 3. Approval Required for Major Stages
404
- **Decision:** Research, DDL, Population, Liveboard require approval
405
- **Rationale:** Prevent bad outputs from propagating, give user control
406
- **Trade-off:** More clicks, but better quality
407
-
408
- ### 4. Two-Tier Intent Classification
409
- **Decision:** Phase 1 = rules, Phase 2 = LLM
410
- **Rationale:** Quick MVP, then upgrade as needed
411
- **Trade-off:** Phase 1 less accurate, but faster/cheaper
412
-
413
- ### 5. Streaming Everything
414
- **Decision:** All AI responses stream
415
- **Rationale:** Better perceived performance, real-time feedback
416
- **Trade-off:** More complex code, state management
417
-
418
- ### 6. Stage Executors Pattern
419
- **Decision:** One executor class per stage
420
- **Rationale:** Separation of concerns, easier to test, extend
421
- **Trade-off:** More files, need consistent interface
422
-
423
- ---
424
-
425
- ## 🚧 Risks & Mitigation
426
-
427
- | Risk | Probability | Impact | Mitigation |
428
- |------|------------|--------|------------|
429
- | **Intent misclassification** | High | Medium | Clarification questions, confidence thresholds |
430
- | **State management bugs** | Medium | High | Comprehensive testing, state snapshots |
431
- | **LLM API failures** | Medium | High | Retry logic, fallback options, error messages |
432
- | **User confusion** | Medium | Medium | Clear prompts, help command, tutorials |
433
- | **Performance degradation** | Low | Medium | Caching, async operations, load testing |
434
- | **Breaking existing functionality** | Low | High | Separate modules, integration tests |
435
-
436
- ---
437
-
438
- ## 🎓 Learning Resources
439
-
440
- ### For Developers
441
-
442
- **Must Read:**
443
- 1. IMPLEMENTATION_ROADMAP.md - Start here
444
- 2. Phase 1 code examples - Understand the pattern
445
- 3. Stage executor interface - See how to extend
446
-
447
- **Reference:**
448
- - CHAT_ARCHITECTURE_PLAN.md - Full technical design
449
- - CONVERSATION_PATTERNS.md - UX patterns
450
- - Existing codebase - Reuse existing functions
451
-
452
- ### For Product/UX
453
-
454
- **Must Read:**
455
- 1. This document (TRANSFORMATION_SUMMARY.md)
456
- 2. CONVERSATION_PATTERNS.md - All user interactions
457
- 3. Phase 2 approval gates - Key UX improvement
458
-
459
- **Reference:**
460
- - Example conversations in CONVERSATION_PATTERNS.md
461
- - Success metrics section
462
- - User experience transformation section
463
-
464
- ---
465
-
466
- ## 📞 Next Steps
467
-
468
- ### Immediate (This Week)
469
- 1. ✅ Review all documentation
470
- 2. ⬜ Discuss priorities and timeline
471
- 3. ⬜ Approve Phase 1 scope
472
- 4. ⬜ Set up development environment
473
- 5. ⬜ Create GitHub issues for Phase 1
474
-
475
- ### Short Term (Weeks 1-2)
476
- 1. ⬜ Implement Phase 1 (foundation)
477
- 2. ⬜ Internal testing and feedback
478
- 3. ⬜ Iterate on conversation patterns
479
- 4. ⬜ Plan Phase 2 approval gates
480
-
481
- ### Medium Term (Weeks 3-6)
482
- 1. ⬜ Implement Phases 2-3 (approval + refinement)
483
- 2. ⬜ User testing with pilot group
484
- 3. ⬜ Gather metrics on usage patterns
485
- 4. ⬜ Refine intent classification
486
-
487
- ### Long Term (Weeks 7-10)
488
- 1. ⬜ Implement Phases 4-5 (viz + new stages)
489
- 2. ⬜ Full rollout to users
490
- 3. ⬜ Monitor adoption and satisfaction
491
- 4. ⬜ Plan next enhancements
492
-
493
- ---
494
-
495
- ## 🤔 Open Questions for Discussion
496
-
497
- ### Strategy
498
- 1. Should we keep button UI as an option or fully migrate to chat?
499
- 2. Which stages require approval vs auto-advance?
500
- 3. Priority: refinement capability or new stages (site/bot)?
501
-
502
- ### Technical
503
- 1. Which LLM for intent classification (speed vs accuracy)?
504
- 2. Should we cache intent classification results?
505
- 3. How to handle very long conversations (context limits)?
506
-
507
- ### UX
508
- 1. How much guidance vs. letting AI interpret freely?
509
- 2. Should quick action buttons always be visible?
510
- 3. What's the ideal refinement UX (buttons, forms, pure chat)?
511
-
512
- ### Business
513
- 1. Timeline constraints or flexibility?
514
- 2. Budget for LLM API costs?
515
- 3. Success criteria for each phase?
516
-
517
- ---
518
-
519
- ## 📖 Glossary
520
-
521
- **Stage** - A major step in the workflow (e.g., Research, DDL Creation)
522
-
523
- **Intent** - What the user wants to do (e.g., approve, refine, get info)
524
-
525
- **Approval Gate** - A required checkpoint where user must explicitly approve before proceeding
526
-
527
- **Refinement** - Targeted modification of output without full regeneration
528
-
529
- **Executor** - A specialized handler for a particular stage
530
-
531
- **Controller** - The orchestrator that manages workflow and state
532
-
533
- **Streaming** - Sending partial results as they're generated (not waiting for completion)
534
-
535
- **Context** - The current state of the conversation and workflow
536
-
537
- ---
538
-
539
- ## 🎉 Vision Statement
540
-
541
- **"Create perfect ThoughtSpot demos through natural conversation, not button clicks."**
542
-
543
- Imagine a future where:
544
- - Sales engineers chat naturally with AI to create demos
545
- - No training required - the AI guides them
546
- - Bad outputs never make it to production (approval gates)
547
- - Refinement is instant (no waiting for regeneration)
548
- - Demos are created in 15 minutes instead of hours
549
- - Quality is consistent and high across all demos
550
-
551
- **This transformation makes that future possible.**
552
-
553
- ---
554
-
555
- ## 📝 Version History
556
-
557
- | Version | Date | Changes |
558
- |---------|------|---------|
559
- | 1.0 | 2025-11-12 | Initial comprehensive plan created |
560
-
561
- ---
562
-
563
- **Questions? Start with IMPLEMENTATION_ROADMAP.md for hands-on guidance!**
564
-
565
- **Ready to build? Begin with Phase 1 foundation!**
566
-
567
- **Need technical details? See CHAT_ARCHITECTURE_PLAN.md!**
568
-
569
- **Designing conversations? Check CONVERSATION_PATTERNS.md!**
570
-
571
- ---
572
-
573
- *This transformation will revolutionize how demos are created. Let's build it! 🚀*
574
-
chat_data_adjuster.py ADDED
@@ -0,0 +1,163 @@
1
+ """
2
+ Simple Chat Interface for Data Adjustment
3
+
4
+ A basic command-line chat to test the conversational data adjuster.
5
+ User can keep making adjustments until they type 'done' or 'exit'.
6
+ """
7
+
8
+ from dotenv import load_dotenv
9
+ import os
10
+ from conversational_data_adjuster import ConversationalDataAdjuster
11
+
12
+ load_dotenv()
13
+
14
+
15
+ def chat_loop():
16
+ """Main chat loop for data adjustment"""
17
+
18
+ print("""
19
+ ╔════════════════════════════════════════════════════════════╗
20
+ ║ ║
21
+ ║ Data Adjustment Chat ║
22
+ ║ ║
23
+ ╚════════════════════════════════════════════════════════════╝
24
+
25
+ Commands:
26
+ - Type your adjustment request naturally
27
+ - "done" or "exit" to quit
28
+ - "help" for examples
29
+
30
+ Examples:
31
+ - "increase 1080p Webcam sales to 50B"
32
+ - "set profit margin to 25% for electronics"
33
+ - "make Tablet revenue 100 billion"
34
+
35
+ """)
36
+
37
+ # Initialize adjuster
38
+ database = os.getenv('SNOWFLAKE_DATABASE')
39
+ schema = "20251116_140933_AMAZO_SAL"
40
+ model_id = "3c97b0d6-448b-440a-b628-bac1f3d73049"
41
+
42
+ print(f"📊 Connected to: {database}.{schema}")
43
+ print(f"🎯 Model: {model_id}\n")
44
+
45
+ adjuster = ConversationalDataAdjuster(database, schema, model_id)
46
+ adjuster.connect()
47
+
48
+ tables = adjuster.get_available_tables()
49
+ print(f"📋 Available tables: {', '.join(tables)}\n")
50
+ print("="*80)
51
+ print("Ready! Type your adjustment request...")
52
+ print("="*80 + "\n")
53
+
54
+ while True:
55
+ # Get user input
56
+ user_input = input("\n💬 You: ").strip()
57
+
58
+ if not user_input:
59
+ continue
60
+
61
+ # Check for exit commands
62
+ if user_input.lower() in ['done', 'exit', 'quit', 'bye']:
63
+ print("\n👋 Goodbye!")
64
+ break
65
+
66
+ # Check for help
67
+ if user_input.lower() == 'help':
68
+ print("""
69
+ 📚 Help - How to make adjustments:
70
+
71
+ Format: "make/increase/set [entity] [metric] to [value]"
72
+
73
+ Examples:
74
+ ✅ "increase 1080p Webcam revenue to 50 billion"
75
+ ✅ "set profit margin to 25% for electronics"
76
+ ✅ "make Laptop sales 100B"
77
+ ✅ "increase customer segment premium revenue by 30%"
78
+
79
+ You'll see 3 strategy options - type A, B, or C to pick one.
80
+ """)
81
+ continue
82
+
83
+ try:
84
+ # Step 1: Parse the request
85
+ print(f"\n🤔 Parsing your request...")
86
+ adjustment = adjuster.parse_adjustment_request(user_input, tables)
87
+
88
+ if 'error' in adjustment:
89
+ print(f"❌ {adjustment['error']}")
90
+ print("💡 Try rephrasing or type 'help' for examples")
91
+ continue
92
+
93
+ # Step 2: Analyze current data
94
+ print(f"📊 Analyzing current data...")
95
+ analysis = adjuster.analyze_current_data(adjustment)
96
+
97
+ if analysis['current_total'] == 0:
98
+ print(f"⚠️ No data found for '{adjustment['entity_value']}'")
99
+ print("💡 Try a different product/entity name")
100
+ continue
101
+
102
+ # Step 3: Generate strategies
103
+ strategies = adjuster.generate_strategy_options(adjustment, analysis)
104
+
105
+ # Step 4: Present options
106
+ adjuster.present_options(adjustment, analysis, strategies)
107
+
108
+ # Step 5: Get user's strategy choice
109
+ print("\n" + "="*80)
110
+ choice = input("Which strategy? [A/B/C or 'skip']: ").strip().upper()
111
+
112
+ if choice == 'SKIP' or not choice:
113
+ print("⏭️ Skipping this adjustment")
114
+ continue
115
+
116
+ # Find the chosen strategy
117
+ chosen = None
118
+ for s in strategies:
119
+ if s['id'] == choice:
120
+ chosen = s
121
+ break
122
+
123
+ if not chosen:
124
+ print(f"❌ Invalid choice: {choice}")
125
+ continue
126
+
127
+ # Step 6: Confirm
128
+ print(f"\n⚠️ About to execute: {chosen['name']}")
129
+ print(f" This will affect {chosen.get('details', {}).get('rows_affected', 'some')} rows")
130
+ confirm = input(" Confirm? [yes/no]: ").strip().lower()
131
+
132
+ if confirm not in ['yes', 'y']:
133
+ print("❌ Cancelled")
134
+ continue
135
+
136
+ # Step 7: Execute
137
+ result = adjuster.execute_strategy(chosen)
138
+
139
+ if result['success']:
140
+ print(f"\n✅ {result['message']}")
141
+ print(f"🔄 Data updated! Refresh your ThoughtSpot liveboard to see changes.")
142
+ else:
143
+ print(f"\n❌ Failed: {result.get('error')}")
144
+
145
+ except KeyboardInterrupt:
146
+ print("\n\n⚠️ Interrupted")
147
+ break
148
+ except Exception as e:
149
+ print(f"\n❌ Error: {e}")
150
+ import traceback
151
+ print(traceback.format_exc())
152
+
153
+ # Cleanup
154
+ adjuster.close()
155
+ print("\n✅ Connection closed")
156
+
157
+
158
+ if __name__ == "__main__":
159
+ try:
160
+ chat_loop()
161
+ except KeyboardInterrupt:
162
+ print("\n\n👋 Goodbye!")
163
+
chat_interface.py CHANGED
@@ -42,7 +42,8 @@ class ChatDemoInterface:
42
  'model': 'claude-sonnet-4.5',
43
  'fact_table_size': '1000',
44
  'dim_table_size': '100',
45
- 'stage': 'initialization'
 
46
  }
47
 
48
  try:
@@ -64,6 +65,8 @@ class ChatDemoInterface:
64
  defaults['fact_table_size'] = settings.get('fact_table_size')
65
  if settings.get('dim_table_size'):
66
  defaults['dim_table_size'] = settings.get('dim_table_size')
 
 
67
  except Exception as e:
68
  print(f"Could not load settings from Supabase: {e}")
69
 
@@ -507,6 +510,9 @@ Cannot deploy to ThoughtSpot without tables."""
507
  # Get currently selected model
508
  llm_model = self.settings.get('model', 'claude-sonnet-4.5')
509
 
 
 
 
510
  results = deployer.deploy_all(
511
  ddl=ddl,
512
  database=database,
@@ -515,6 +521,7 @@ Cannot deploy to ThoughtSpot without tables."""
515
  use_case=use_case,
516
  liveboard_name=liveboard_name,
517
  llm_model=llm_model, # Pass selected model
 
518
  progress_callback=progress_callback
519
  )
520
 
@@ -739,144 +746,211 @@ Type **'done'** to finish."""
739
 
740
  # Handle deployment errors (usually population failures)
741
  if hasattr(self, '_last_population_error'):
742
- if 'retry' in message_lower:
 
743
  # Retry with same code
744
  chat_history[-1] = (message, "🔄 Retrying population...")
745
  yield chat_history, current_stage, current_model, company, use_case, ""
746
 
747
- from demo_prep import execute_population_script
748
- is_template = getattr(self.demo_builder, 'population_code_source', 'llm') == 'template'
749
- success, msg = execute_population_script(
750
- self.demo_builder.data_population_results,
751
- self._last_schema_name,
752
- skip_modifications=is_template
753
- )
754
-
755
- if success:
756
- response = f" **Population Successful!**\n\n{msg}\n\nDemo deployed to Snowflake! 🎉"
757
- del self._last_population_error
758
- del self._last_schema_name
759
- else:
760
- response = f"❌ Still failed: {msg[:200]}...\n\nTry 'truncate' or 'fix'?"
761
 
762
  chat_history[-1] = (message, response)
763
  yield chat_history, current_stage, current_model, company, use_case, ""
764
  return
765
 
766
- elif 'truncate' in message_lower:
767
  # Truncate tables and retry
768
  chat_history[-1] = (message, "🗑️ Truncating tables and retrying...")
769
  yield chat_history, current_stage, current_model, company, use_case, ""
770
 
771
- from cdw_connector import SnowflakeDeployer
772
- from demo_prep import execute_population_script
773
-
774
- deployer = SnowflakeDeployer()
775
- deployer.connect()
776
-
777
- # Truncate all tables in schema
778
  try:
779
- cursor = deployer.connection.cursor()
780
- cursor.execute(f"USE SCHEMA {self._last_schema_name}")
781
- cursor.execute("SHOW TABLES")
782
- tables = cursor.fetchall()
783
 
784
- for table in tables:
785
- table_name = table[1]
786
- self.log_feedback(f"Truncating {table_name}...")
787
- cursor.execute(f"TRUNCATE TABLE {table_name}")
788
 
789
- cursor.close()
790
- self.log_feedback("✅ Tables truncated")
791
- except Exception as e:
792
- self.log_feedback(f"⚠️ Truncate warning: {e}")
793
-
794
- # Retry population
795
- is_template = getattr(self.demo_builder, 'population_code_source', 'llm') == 'template'
796
- success, msg = execute_population_script(
797
- self.demo_builder.data_population_results,
798
- self._last_schema_name,
799
- skip_modifications=is_template
800
- )
801
 
802
- if success:
803
- response = f"✅ **Population Successful!**\n\n{msg}\n\nDemo deployed to Snowflake! 🎉"
804
- del self._last_population_error
805
- del self._last_schema_name
806
- else:
807
- response = f"❌ Still failed: {msg[:200]}...\n\nTry 'fix' to let AI correct the code?"
808
 
809
  chat_history[-1] = (message, response)
810
  yield chat_history, current_stage, current_model, company, use_case, ""
811
  return
812
 
813
- elif 'fix' in message_lower:
814
  # Regenerate the code using the fixed template
815
  chat_history[-1] = (message, "🔧 Regenerating population code with fixed template...")
816
  yield chat_history, current_stage, current_model, company, use_case, ""
817
 
818
- self.log_feedback("🔧 Regenerating population code from scratch...")
819
-
820
- # Regenerate using the reliable template
821
- from schema_utils import parse_ddl_schema
822
-
823
- schema_info = parse_ddl_schema(self.demo_builder.schema_generation_results)
824
- if not schema_info:
825
- response = "❌ Failed to parse DDL schema. Cannot regenerate."
826
- chat_history[-1] = (message, response)
827
- yield chat_history, current_stage, current_model, company, use_case, ""
828
- return
829
-
830
- # Generate new code using the template (which includes all fixes)
831
- fixed_code = self.get_fallback_population_code(schema_info)
832
-
833
- # Validate it compiles
834
  try:
835
- compile(fixed_code, '<regenerated>', 'exec')
836
- self.log_feedback("✅ Regenerated code validated")
837
- except SyntaxError as e:
838
- response = f"❌ Template generation bug: {e}\n\nPlease contact support."
839
- chat_history[-1] = (message, response)
840
- yield chat_history, current_stage, current_model, company, use_case, ""
841
- return
842
-
843
- # Update the code and mark as template-generated
844
- self.demo_builder.data_population_results = fixed_code
845
- self.population_code = fixed_code
846
- self.demo_builder.population_code_source = "template" # Mark as template
847
-
848
- self.log_feedback("🔧 Code regenerated, retrying deployment...")
849
-
850
- # Truncate and retry
851
- from cdw_connector import SnowflakeDeployer
852
- from demo_prep import execute_population_script
853
-
854
- deployer = SnowflakeDeployer()
855
- deployer.connect()
856
 
857
- try:
858
- cursor = deployer.connection.cursor()
859
- cursor.execute(f"USE SCHEMA {self._last_schema_name}")
860
- cursor.execute("SHOW TABLES")
861
- tables = cursor.fetchall()
862
- for table in tables:
863
- cursor.execute(f"TRUNCATE TABLE {table[1]}")
864
- cursor.close()
865
  except Exception as e:
866
- self.log_feedback(f"⚠️ Truncate warning: {e}")
867
-
868
- success, msg = execute_population_script(
869
- fixed_code,
870
- self._last_schema_name,
871
- skip_modifications=True # Template code, don't modify
872
- )
873
-
874
- if success:
875
- response = f"✅ **Fixed and Successful!**\n\n{msg}\n\nDemo deployed to Snowflake! 🎉"
876
- del self._last_population_error
877
- del self._last_schema_name
878
- else:
879
- response = f"❌ AI fix didn't work: {msg[:200]}...\n\nTry 'fix' again or 'retry'?"
880
 
881
  chat_history[-1] = (message, response)
882
  yield chat_history, current_stage, current_model, company, use_case, ""
@@ -1082,16 +1156,20 @@ Try a different request or type **'done'** to finish."""
1082
  import re
1083
 
1084
  # Pattern: "company: XYZ" or "for the company: XYZ"
 
1085
  patterns = [
1086
- r'company:\s*([^,\n]+?)(?:\s+and|\s+for|$)',
1087
- r'for\s+(?:the\s+)?company:\s*([^,\n]+?)(?:\s+and|\s+for|$)',
1088
- r'demo for\s+(?:the\s+)?company:\s*([^,\n]+?)(?:\s+and|\s+for|$)'
1089
  ]
1090
 
1091
  for pattern in patterns:
1092
  match = re.search(pattern, message, re.IGNORECASE)
1093
  if match:
1094
- return match.group(1).strip()
1095
 
1096
  return None
1097
 
@@ -1182,9 +1260,20 @@ To change settings, use:
1182
  yield progress_message
1183
 
1184
  try:
1185
- # Initialize demo builder if needed
1186
- if not self.demo_builder:
1187
- self.log_feedback("Initializing DemoBuilder...")
1188
  progress_message += "✓ Initializing DemoBuilder...\n"
1189
  yield progress_message
1190
 
@@ -1193,8 +1282,12 @@ To change settings, use:
1193
  company_url=company
1194
  )
1195
 
1196
- # Prepare URL
1197
- url = company if company.startswith('http') else f"https://{company}"
1198
 
1199
  # Check for cached research results
1200
  domain = url.replace('https://', '').replace('http://', '').replace('www.', '').split('/')[0]
@@ -1275,6 +1368,25 @@ To change settings, use:
1275
 
1276
  website = Website(url)
1277
  self.demo_builder.website_data = website
1278
  self.log_feedback(f"Extracted {len(website.text)} characters from {website.title}")
1279
  progress_message += f"✓ Extracted {len(website.text)} characters from {website.title}\n\n"
1280
  yield progress_message
@@ -1442,9 +1554,20 @@ To change settings, use:
1442
  self.log_feedback(f"🔍 Starting research for {company} - {use_case}")
1443
 
1444
  try:
1445
- # Initialize demo builder if needed
1446
- if not self.demo_builder:
1447
- self.log_feedback("Initializing DemoBuilder...")
1448
  self.demo_builder = DemoBuilder(
1449
  use_case=use_case,
1450
  company_url=company
@@ -1455,6 +1578,22 @@ To change settings, use:
1455
  url = company if company.startswith('http') else f"https://{company}"
1456
  website = Website(url)
1457
  self.demo_builder.website_data = website
1458
  self.log_feedback(f"Extracted {len(website.text)} characters from {website.title}")
1459
 
1460
  # Get LLM provider
@@ -1586,6 +1725,13 @@ TECHNICAL REQUIREMENTS:
1586
  - Include realistic column names that match the business context
1587
  - Add proper constraints and relationships
 
1589
  SNOWFLAKE SYNTAX EXAMPLES:
1590
  - Auto-increment: ColumnID INT IDENTITY(1,1) PRIMARY KEY
1591
  - NOT: ColumnID INT PRIMARY KEY AUTO_INCREMENT
@@ -1783,46 +1929,59 @@ Generate complete CREATE TABLE statements with proper Snowflake syntax and depen
1783
  fake_values.append("random.choice(['A', 'B', 'C'])")
1784
 
1785
  elif 'VARCHAR' in col_type or 'TEXT' in col_type or 'STRING' in col_type or 'CHAR' in col_type:
1786
- # Generate domain-specific realistic data based on column name
1787
  if 'NAME' in col_name_upper and 'COMPANY' not in col_name_upper:
1788
  if 'PRODUCT' in col_name_upper:
1789
- fake_values.append("random.choice(['Laptop Pro 15', 'Wireless Mouse 2.4GHz', 'USB-C Cable 6ft', 'Monitor Stand Adjustable', 'Mechanical Keyboard RGB', 'Noise Canceling Headphones', '1080p Webcam', 'Portable SSD 1TB', 'Power Bank 20000mAh', 'Tablet 10 inch', 'Smart Watch', 'Bluetooth Speaker', 'Gaming Mouse Pad', 'Phone Case', 'Screen Protector', 'Charging Cable', 'Desk Lamp LED', 'Laptop Bag', 'Wireless Earbuds', 'USB Hub'])")
1790
  elif 'CUSTOMER' in col_name_upper or 'USER' in col_name_upper:
1791
- fake_values.append("fake.name()[:50]") # Truncate to 50 chars
1792
  elif 'SELLER' in col_name_upper or 'VENDOR' in col_name_upper:
1793
- fake_values.append("random.choice(['Amazon', 'Best Buy', 'Walmart', 'Target', 'Costco', 'Home Depot', 'Lowes', 'Macys', 'Nordstrom', 'Kohls'])")
1794
  else:
1795
- fake_values.append("fake.name()[:50]")
1796
  elif 'CATEGORY' in col_name_upper:
1797
- fake_values.append("random.choice(['Electronics', 'Home & Kitchen', 'Books', 'Clothing', 'Sports', 'Toys', 'Beauty', 'Automotive'])")
1798
  elif 'BRAND' in col_name_upper:
1799
- fake_values.append("random.choice(['Samsung', 'Apple', 'Sony', 'LG', 'Dell', 'HP', 'Lenovo', 'Amazon Basics', 'Anker', 'Logitech'])")
1800
  elif 'DESCRIPTION' in col_name_upper or 'DESC' in col_name_upper:
1801
- fake_values.append("random.choice(['High quality product', 'Best seller', 'Customer favorite', 'New arrival', 'Limited edition', 'Premium quality'])")
1802
  elif 'EMAIL' in col_name_upper:
1803
- fake_values.append("fake.email()[:50]") # Truncate to 50 chars
1804
  elif 'PHONE' in col_name_upper:
1805
- fake_values.append("f'{random.randint(200, 999)}-{random.randint(200, 999)}-{random.randint(1000, 9999)}'")
1806
  elif 'ADDRESS' in col_name_upper or 'STREET' in col_name_upper:
1807
- fake_values.append("f'{random.randint(1, 9999)} {random.choice([\"Main\", \"Oak\", \"Park\", \"Maple\", \"Cedar\", \"Elm\", \"Washington\", \"Lake\", \"Hill\", \"Broadway\"])} {random.choice([\"St\", \"Ave\", \"Blvd\", \"Dr\", \"Ln\"])}'")
1808
  elif 'CITY' in col_name_upper:
1809
- fake_values.append("random.choice(['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix', 'Philadelphia', 'San Antonio', 'San Diego', 'Dallas', 'San Jose', 'Austin', 'Seattle', 'Denver', 'Boston', 'Portland', 'Miami', 'Atlanta', 'Detroit', 'Las Vegas', 'Toronto'])")
1810
  elif 'STATE' in col_name_upper or 'PROVINCE' in col_name_upper:
1811
- fake_values.append("random.choice(['California', 'Texas', 'New York', 'Florida', 'Illinois', 'Ohio', 'Georgia', 'Washington', 'Virginia', 'Arizona', 'Colorado', 'Oregon', 'Nevada', 'Utah', 'Iowa'])")
1812
  elif 'COUNTRY' in col_name_upper:
1813
- fake_values.append("random.choice(['USA', 'Canada', 'UK', 'Germany', 'France', 'Japan', 'Australia', 'India', 'China', 'Brazil', 'Mexico', 'Spain', 'Italy', 'Netherlands', 'Sweden'])")
1814
  elif 'ZIP' in col_name_upper or 'POSTAL' in col_name_upper:
1815
- fake_values.append("random.choice(['10001', '90210', '60601', '77001', '85001', '19101', '78201', '92101', '75201', '95101', '78701', '98101', '80201', '02101', '97201'])")
1816
  elif 'COMPANY' in col_name_upper:
1817
- fake_values.append("random.choice(['Amazon', 'Microsoft', 'Apple Inc', 'Google LLC', 'Meta', 'Tesla Inc', 'Netflix', 'Adobe Inc', 'Oracle Corp', 'Salesforce', 'IBM Corp', 'Intel Corp', 'Cisco Systems', 'Dell Technologies', 'HP Inc'])")
1818
  else:
1819
- # Default: generate realistic short text
1820
- import re
1821
- length_match = re.search(r'\((\d+)\)', col_type)
1822
- if length_match and int(length_match.group(1)) < 20:
1823
- fake_values.append("fake.word()[:10]")
1824
- else:
1825
- fake_values.append("fake.word()")
1826
  elif 'INT' in col_type or 'NUMBER' in col_type or 'BIGINT' in col_type:
1827
  fake_values.append("random.randint(1, 1000)")
1828
  elif 'DECIMAL' in col_type or 'FLOAT' in col_type or 'DOUBLE' in col_type or 'NUMERIC' in col_type:
@@ -2406,8 +2565,9 @@ def create_chat_interface():
2406
  settings.get("liveboard_name", ""), # 4. liveboard_name (moved)
2407
  str(settings.get("fact_table_size", "1000")), # 5. fact_table_size (moved)
2408
  str(settings.get("dim_table_size", "100")), # 6. dim_table_size
2409
- settings.get("object_naming_prefix", ""), # 7. object_naming_prefix
2410
- float(settings.get("temperature", 0.3)), # 8. temperature_slider
 
2411
  int(settings.get("max_tokens", 4000)), # 9. max_tokens
2412
  int(settings.get("batch_size", 5000)), # 10. batch_size
2413
  int(settings.get("thread_count", 4)), # 11. thread_count
@@ -2427,8 +2587,9 @@ def create_chat_interface():
2427
  "claude-sonnet-4.5", "", "Sales Analytics", # 1-3
2428
  "", # 4: liveboard_name
2429
  "1000", "100", # 5-6: fact_table_size, dim_table_size
2430
- "", # 7: object_naming_prefix
2431
- 0.3, 4000, 5000, 4, # 8-11: temperature, max_tokens, batch, threads
 
2432
  "", "", "ACCOUNTADMIN", # 12-14: sf settings
2433
  "COMPUTE_WH", "DEMO_DB", "PUBLIC", # 15-17: warehouse, db, schema
2434
  "", "", # 18-19: ts url, ts username
@@ -2446,8 +2607,9 @@ def create_chat_interface():
2446
  settings_components['liveboard_name'], # 4 (moved)
2447
  settings_components['fact_table_size'], # 5 (moved)
2448
  settings_components['dim_table_size'], # 6
2449
- settings_components['object_naming_prefix'], # 7
2450
- settings_components['temperature_slider'], # 8
 
2451
  settings_components['max_tokens'], # 9
2452
  settings_components['batch_size'], # 10
2453
  settings_components['thread_count'], # 11
@@ -2684,6 +2846,13 @@ def create_settings_tab():
2684
  info="Number of rows in dimension tables"
2685
  )
2686
 
 
 
 
 
 
 
 
2687
  object_naming_prefix = gr.Textbox(
2688
  label="Object Naming Prefix",
2689
  placeholder="e.g., 'ACME_' or 'DEMO_'",
@@ -2810,7 +2979,7 @@ def create_settings_tab():
2810
 
2811
  def save_settings_handler(
2812
  ai_model, company_url, use_case,
2813
- lb_name, fact_size, dim_size, obj_prefix,
2814
  temp, max_tok, batch, threads,
2815
  sf_acc, sf_user, sf_role, wh, db, schema,
2816
  ts_url, ts_user
@@ -2832,6 +3001,7 @@ def create_settings_tab():
2832
  "default_use_case": use_case,
2833
  "fact_table_size": fact_size,
2834
  "dim_table_size": dim_size,
 
2835
  "temperature": str(temp),
2836
  "max_tokens": str(int(max_tok)),
2837
  "batch_size": str(int(batch)),
@@ -2865,7 +3035,7 @@ def create_settings_tab():
2865
  fn=save_settings_handler,
2866
  inputs=[
2867
  default_ai_model, default_company_url, default_use_case,
2868
- liveboard_name, fact_table_size, dim_table_size, object_naming_prefix,
2869
  temperature_slider, max_tokens, batch_size, thread_count,
2870
  sf_account, sf_user, sf_role, default_warehouse, default_database, default_schema,
2871
  ts_instance_url, ts_username
@@ -2898,6 +3068,7 @@ def create_settings_tab():
2898
  'default_schema': default_schema,
2899
  'ts_instance_url': ts_instance_url,
2900
  'ts_username': ts_username,
 
2901
  'object_naming_prefix': object_naming_prefix,
2902
  'liveboard_name': liveboard_name,
2903
  'settings_status': settings_status
 
42
  'model': 'claude-sonnet-4.5',
43
  'fact_table_size': '1000',
44
  'dim_table_size': '100',
45
+ 'stage': 'initialization',
46
+ 'tag_name': None
47
  }
48
 
49
  try:
 
65
  defaults['fact_table_size'] = settings.get('fact_table_size')
66
  if settings.get('dim_table_size'):
67
  defaults['dim_table_size'] = settings.get('dim_table_size')
68
+ if settings.get('tag_name'):
69
+ defaults['tag_name'] = settings.get('tag_name')
70
  except Exception as e:
71
  print(f"Could not load settings from Supabase: {e}")
72
 
 
510
  # Get currently selected model
511
  llm_model = self.settings.get('model', 'claude-sonnet-4.5')
512
 
513
+ tag_name_value = self.settings.get('tag_name')
514
+ print(f"🔍 DEBUG: tag_name from settings = '{tag_name_value}'")
515
+
516
  results = deployer.deploy_all(
517
  ddl=ddl,
518
  database=database,
 
521
  use_case=use_case,
522
  liveboard_name=liveboard_name,
523
  llm_model=llm_model, # Pass selected model
524
+ tag_name=tag_name_value, # Pass tag from settings
525
  progress_callback=progress_callback
526
  )
527
 
 
746
 
747
  # Handle deployment errors (usually population failures)
748
  if hasattr(self, '_last_population_error'):
749
+ # Handle '1' or 'retry' - retry with same code
750
+ if 'retry' in message_lower or message_lower.strip() == '1':
751
  # Retry with same code
752
  chat_history[-1] = (message, "🔄 Retrying population...")
753
  yield chat_history, current_stage, current_model, company, use_case, ""
754
 
755
+ try:
756
+ # Check required attributes exist
757
+ if not hasattr(self.demo_builder, 'data_population_results') or not self.demo_builder.data_population_results:
758
+ response = "❌ **Error:** Population code not found. Please run population again first."
759
+ chat_history[-1] = (message, response)
760
+ yield chat_history, current_stage, current_model, company, use_case, ""
761
+ return
762
+
763
+ if not hasattr(self, '_last_schema_name') or not self._last_schema_name:
764
+ response = "❌ **Error:** Schema name not found. Please run deployment again first."
765
+ chat_history[-1] = (message, response)
766
+ yield chat_history, current_stage, current_model, company, use_case, ""
767
+ return
768
+
769
+ from demo_prep import execute_population_script
770
+ is_template = getattr(self.demo_builder, 'population_code_source', 'llm') == 'template'
771
+ success, msg = execute_population_script(
772
+ self.demo_builder.data_population_results,
773
+ self._last_schema_name,
774
+ skip_modifications=is_template
775
+ )
776
+
777
+ if success:
778
+ response = f"✅ **Population Successful!**\n\n{msg}\n\nDemo deployed to Snowflake! 🎉"
779
+ del self._last_population_error
780
+ del self._last_schema_name
781
+ else:
782
+ response = f"❌ Still failed: {msg[:200]}...\n\nTry 'truncate' (or '2') or 'fix' (or '3')?"
783
+
784
+ except Exception as e:
785
+ import traceback
786
+ error_details = traceback.format_exc()
787
+ self.log_feedback(f"❌ Retry error: {error_details}")
788
+ response = f"❌ **Retry failed with error:**\n\n```\n{str(e)}\n```\n\nPlease try 'truncate' (or '2') to clear tables first."
789
 
790
  chat_history[-1] = (message, response)
791
  yield chat_history, current_stage, current_model, company, use_case, ""
792
  return
793
 
794
+ elif 'truncate' in message_lower or message_lower.strip() == '2':
795
  # Truncate tables and retry
796
  chat_history[-1] = (message, "🗑️ Truncating tables and retrying...")
797
  yield chat_history, current_stage, current_model, company, use_case, ""
799
  try:
800
+ # Check required attributes exist
801
+ if not hasattr(self, '_last_schema_name') or not self._last_schema_name:
802
+ response = "❌ **Error:** Schema name not found. Please run deployment again first."
803
+ chat_history[-1] = (message, response)
804
+ yield chat_history, current_stage, current_model, company, use_case, ""
805
+ return
806
+
807
+ if not hasattr(self.demo_builder, 'data_population_results') or not self.demo_builder.data_population_results:
808
+ response = "❌ **Error:** Population code not found. Please run population again first."
809
+ chat_history[-1] = (message, response)
810
+ yield chat_history, current_stage, current_model, company, use_case, ""
811
+ return
812
 
813
+ from cdw_connector import SnowflakeDeployer
814
+ from demo_prep import execute_population_script
 
 
815
 
816
+ deployer = SnowflakeDeployer()
817
+ deployer.connect()
818
+
819
+ # Truncate all tables in schema
820
+ try:
821
+ cursor = deployer.connection.cursor()
822
+ cursor.execute(f"USE SCHEMA {self._last_schema_name}")
823
+ cursor.execute("SHOW TABLES")
824
+ tables = cursor.fetchall()
825
+
826
+ for table in tables:
827
+ table_name = table[1]
828
+ self.log_feedback(f"Truncating {table_name}...")
829
+ cursor.execute(f"TRUNCATE TABLE {table_name}")
830
+
831
+ cursor.close()
832
+ deployer.disconnect()
833
+ self.log_feedback("✅ Tables truncated")
834
+ except Exception as e:
835
+ self.log_feedback(f"⚠️ Truncate warning: {e}")
836
+ if deployer.connection:
837
+ deployer.disconnect()
838
+
839
+ # Retry population
840
+ is_template = getattr(self.demo_builder, 'population_code_source', 'llm') == 'template'
841
+ success, msg = execute_population_script(
842
+ self.demo_builder.data_population_results,
843
+ self._last_schema_name,
844
+ skip_modifications=is_template
845
+ )
846
+
847
+ if success:
848
+ response = f"✅ **Population Successful!**\n\n{msg}\n\nDemo deployed to Snowflake! 🎉"
849
+ del self._last_population_error
850
+ del self._last_schema_name
851
+ else:
852
+ response = f"❌ Still failed: {msg[:200]}...\n\nTry 'fix' (or '3') to let AI correct the code?"
853
 
854
+ except Exception as e:
855
+ import traceback
856
+ error_details = traceback.format_exc()
857
+ self.log_feedback(f"❌ Truncate/retry error: {error_details}")
858
+ response = f"❌ **Truncate/retry failed with error:**\n\n```\n{str(e)}\n```\n\nPlease check the error details above."
 
859
 
860
  chat_history[-1] = (message, response)
861
  yield chat_history, current_stage, current_model, company, use_case, ""
862
  return
863
 
864
+ elif 'fix' in message_lower or message_lower.strip() == '3':
865
  # Regenerate the code using the fixed template
866
  chat_history[-1] = (message, "🔧 Regenerating population code with fixed template...")
867
  yield chat_history, current_stage, current_model, company, use_case, ""
869
  try:
870
+ # Check required attributes exist
871
+ if not hasattr(self.demo_builder, 'schema_generation_results') or not self.demo_builder.schema_generation_results:
872
+ response = "❌ **Error:** DDL schema not found. Please run DDL creation again first."
873
+ chat_history[-1] = (message, response)
874
+ yield chat_history, current_stage, current_model, company, use_case, ""
875
+ return
876
+
877
+ if not hasattr(self, '_last_schema_name') or not self._last_schema_name:
878
+ response = "❌ **Error:** Schema name not found. Please run deployment again first."
879
+ chat_history[-1] = (message, response)
880
+ yield chat_history, current_stage, current_model, company, use_case, ""
881
+ return
882
+
883
+ self.log_feedback("🔧 Regenerating population code from scratch...")
884
+
885
+ # Regenerate using the reliable template
886
+ from schema_utils import parse_ddl_schema
887
+
888
+ schema_info = parse_ddl_schema(self.demo_builder.schema_generation_results)
889
+ if not schema_info:
890
+ response = "❌ Failed to parse DDL schema. Cannot regenerate."
891
+ chat_history[-1] = (message, response)
892
+ yield chat_history, current_stage, current_model, company, use_case, ""
893
+ return
894
+
895
+ # Generate new code using the template (which includes all fixes)
896
+ fixed_code = self.get_fallback_population_code(schema_info)
897
+
898
+ # Validate it compiles
899
+ try:
900
+ compile(fixed_code, '<regenerated>', 'exec')
901
+ self.log_feedback("✅ Regenerated code validated")
902
+ except SyntaxError as e:
903
+ response = f"❌ Template generation bug: {e}\n\nPlease contact support."
904
+ chat_history[-1] = (message, response)
905
+ yield chat_history, current_stage, current_model, company, use_case, ""
906
+ return
907
+
908
+ # Update the code and mark as template-generated
909
+ self.demo_builder.data_population_results = fixed_code
910
+ self.population_code = fixed_code
911
+ self.demo_builder.population_code_source = "template" # Mark as template
912
+
913
+ self.log_feedback("🔧 Code regenerated, retrying deployment...")
914
+
915
+ # Truncate and retry
916
+ from cdw_connector import SnowflakeDeployer
917
+ from demo_prep import execute_population_script
918
+
919
+ deployer = SnowflakeDeployer()
920
+ deployer.connect()
921
+
922
+ try:
923
+ cursor = deployer.connection.cursor()
924
+ cursor.execute(f"USE SCHEMA {self._last_schema_name}")
925
+ cursor.execute("SHOW TABLES")
926
+ tables = cursor.fetchall()
927
+ for table in tables:
928
+ cursor.execute(f"TRUNCATE TABLE {table[1]}")
929
+ cursor.close()
930
+ deployer.disconnect()
931
+ except Exception as e:
932
+ self.log_feedback(f"⚠️ Truncate warning: {e}")
933
+ if deployer.connection:
934
+ deployer.disconnect()
935
+
936
+ success, msg = execute_population_script(
937
+ fixed_code,
938
+ self._last_schema_name,
939
+ skip_modifications=True # Template code, don't modify
940
+ )
941
+
942
+ if success:
943
+ response = f"✅ **Fixed and Successful!**\n\n{msg}\n\nDemo deployed to Snowflake! 🎉"
944
+ del self._last_population_error
945
+ del self._last_schema_name
946
+ else:
947
+ response = f"❌ AI fix didn't work: {msg[:200]}...\n\nTry 'fix' again or 'retry'?"
949
  except Exception as e:
950
+ import traceback
951
+ error_details = traceback.format_exc()
952
+ self.log_feedback(f"❌ Fix/regenerate error: {error_details}")
953
+ response = f"❌ **Fix/regenerate failed with error:**\n\n```\n{str(e)}\n```\n\nPlease check the error details above."
954
 
955
  chat_history[-1] = (message, response)
956
  yield chat_history, current_stage, current_model, company, use_case, ""
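The numbered retry options handled above ('1'/'retry', '2'/'truncate', '3'/'fix') reduce to a small dispatch; a minimal sketch of just the matching, mirroring the diff's `in message_lower or == 'N'` checks:

```python
# Sketch of the numbered-option matching added in this commit.
# Accepts the bare number or a message containing the keyword.

OPTIONS = {'1': 'retry', '2': 'truncate', '3': 'fix'}

def resolve_option(message: str):
    m = message.strip().lower()
    if m in OPTIONS:
        return OPTIONS[m]
    for action in OPTIONS.values():
        if action in m:
            return action
    return None
```

Resolving to a canonical action name up front would keep the three handler branches free of duplicated string checks.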
 
1156
  import re
1157
 
1158
  # Pattern: "company: XYZ" or "for the company: XYZ"
1159
+ # Stop at "use case:", "and", "for", or end of string
1160
  patterns = [
1161
+ r'company:\s*([^,\n]+?)(?:\s+use\s+case:|\s+and|\s+for|$)',
1162
+ r'for\s+(?:the\s+)?company:\s*([^,\n]+?)(?:\s+use\s+case:|\s+and|\s+for|$)',
1163
+ r'demo for\s+(?:the\s+)?company:\s*([^,\n]+?)(?:\s+use\s+case:|\s+and|\s+for|$)'
1164
  ]
1165
 
1166
  for pattern in patterns:
1167
  match = re.search(pattern, message, re.IGNORECASE)
1168
  if match:
1169
+ company = match.group(1).strip()
1170
+ # Additional cleanup: remove any trailing "use case" text that might have been captured
1171
+ company = re.sub(r'\s+use\s+case:.*$', '', company, flags=re.IGNORECASE).strip()
1172
+ return company
1173
 
1174
  return None
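The tightened extraction can be exercised standalone; this sketch reuses the first pattern and the trailing-"use case:" cleanup exactly as they appear in the diff:

```python
import re

# First company pattern from the diff, with the new "use case:" stop token,
# plus the follow-up cleanup applied to the captured group.
PATTERN = r'company:\s*([^,\n]+?)(?:\s+use\s+case:|\s+and|\s+for|$)'

def extract_company(message: str):
    match = re.search(PATTERN, message, re.IGNORECASE)
    if not match:
        return None
    company = match.group(1).strip()
    return re.sub(r'\s+use\s+case:.*$', '', company, flags=re.IGNORECASE).strip()
```

The lazy capture stops at the first alternative, so "company: acme.com use case: Sales Analytics" now yields just the domain instead of swallowing the use case.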
1175
 
 
1260
  yield progress_message
1261
 
1262
  try:
1263
+ # Initialize demo builder if needed OR if company/use_case changed
1264
+ # CRITICAL: Always create fresh DemoBuilder when company/use_case changes
1265
+ # to avoid persisting prompts/data from previous runs
1266
+ needs_new_builder = (
1267
+ not self.demo_builder or
1268
+ self.demo_builder.company_url != company or
1269
+ self.demo_builder.use_case != use_case
1270
+ )
1271
+
1272
+ if needs_new_builder:
1273
+ if self.demo_builder:
1274
+ self.log_feedback(f"🔄 Company/use case changed - creating fresh DemoBuilder (was: {self.demo_builder.company_url}/{self.demo_builder.use_case})")
1275
+ else:
1276
+ self.log_feedback("Initializing DemoBuilder...")
1277
  progress_message += "✓ Initializing DemoBuilder...\n"
1278
  yield progress_message
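The staleness check above is a pure predicate over (company, use_case); a sketch with a stub `DemoBuilder` (the real class carries more state than these two fields):

```python
# Stub carrying only the two fields the freshness check reads.
class DemoBuilder:
    def __init__(self, use_case, company_url):
        self.use_case = use_case
        self.company_url = company_url

def needs_new_builder(builder, company, use_case):
    """True when there is no builder yet or either of its inputs changed."""
    return (
        builder is None
        or builder.company_url != company
        or builder.use_case != use_case
    )
```

Rebuilding on any input change is what prevents prompts and cached data from a previous company leaking into the next run.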
1279
 
 
1282
  company_url=company
1283
  )
1284
 
1285
+ # Prepare URL - clean up any extra text that might have been captured
1286
+ # Remove "use case:" and anything after it, and clean whitespace
1287
+ import re
1288
+ clean_company = re.sub(r'\s+use\s+case:.*$', '', company, flags=re.IGNORECASE).strip()
1289
+ clean_company = re.sub(r'\s+and\s+.*$', '', clean_company, flags=re.IGNORECASE).strip()
1290
+ url = clean_company if clean_company.startswith('http') else f"https://{clean_company}"
1291
 
1292
  # Check for cached research results
1293
  domain = url.replace('https://', '').replace('http://', '').replace('www.', '').split('/')[0]
 
1368
 
1369
  website = Website(url)
1370
  self.demo_builder.website_data = website
1371
+
1372
+ # Check if website extraction failed
1373
+ if not website.text or len(website.text) == 0:
1374
+ error_msg = f"❌ **Couldn't access the website**\n\n"
1375
+ if website.error_message:
1376
+ error_msg += f"**Error:** {website.error_message}\n\n"
1377
+ error_msg += f"**URL:** {url}\n\n"
1378
+ error_msg += "**Troubleshooting:**\n"
1379
+ error_msg += "- Verify the URL is correct and accessible in your browser\n"
1380
+ error_msg += "- Check if the site requires authentication\n"
1381
+ error_msg += "- The site may be blocking automated requests\n"
1382
+ error_msg += "- Try accessing the site manually to confirm it's working\n\n"
1383
+ error_msg += "Please check the URL and try again."
1384
+
1385
+ self.log_feedback(error_msg)
1386
+ progress_message += error_msg
1387
+ yield progress_message
1388
+ return
1389
+
1390
  self.log_feedback(f"Extracted {len(website.text)} characters from {website.title}")
1391
  progress_message += f"✓ Extracted {len(website.text)} characters from {website.title}\n\n"
1392
  yield progress_message
 
1554
  self.log_feedback(f"🔍 Starting research for {company} - {use_case}")
1555
 
1556
  try:
1557
+ # Initialize demo builder if needed OR if company/use_case changed
1558
+ # CRITICAL: Always create fresh DemoBuilder when company/use_case changes
+ # to avoid persisting prompts/data from previous runs
+ needs_new_builder = (
+ not self.demo_builder or
+ self.demo_builder.company_url != company or
+ self.demo_builder.use_case != use_case
+ )
+
+ if needs_new_builder:
+ if self.demo_builder:
+ self.log_feedback(f"🔄 Company/use case changed - creating fresh DemoBuilder (was: {self.demo_builder.company_url}/{self.demo_builder.use_case})")
+ else:
+ self.log_feedback("Initializing DemoBuilder...")
  self.demo_builder = DemoBuilder(
  use_case=use_case,
  company_url=company

  url = company if company.startswith('http') else f"https://{company}"
  website = Website(url)
  self.demo_builder.website_data = website
+
+ # Check if website extraction failed
+ if not website.text or len(website.text) == 0:
+ error_msg = f"❌ **Couldn't access the website**\n\n"
+ if website.error_message:
+ error_msg += f"**Error:** {website.error_message}\n\n"
+ error_msg += f"**URL:** {url}\n\n"
+ error_msg += "**Troubleshooting:**\n"
+ error_msg += "- Verify the URL is correct and accessible in your browser\n"
+ error_msg += "- Check if the site requires authentication\n"
+ error_msg += "- The site may be blocking automated requests\n"
+ error_msg += "- Try accessing the site manually to confirm it's working\n\n"
+ error_msg += "Please check the URL and try again."
+ self.log_feedback(error_msg)
+ return None
+
  self.log_feedback(f"Extracted {len(website.text)} characters from {website.title}")

  # Get LLM provider

  - Include realistic column names that match the business context
  - Add proper constraints and relationships

+ TABLE NAMING REQUIREMENTS:
+ - **DO NOT use DIM_ or FACT_ prefixes** (e.g., NOT DIM_PRODUCT or FACT_SALES)
+ - Use simple, descriptive table names (e.g., PRODUCTS, CUSTOMERS, SALES, ORDERS)
+ - Dimension tables: Use plural nouns (CUSTOMERS, PRODUCTS, WAREHOUSES)
+ - Fact tables: Use descriptive names (SALES, TRANSACTIONS, ORDERS, INVENTORY_MOVEMENTS)
+ - Keep names concise and business-friendly
+
  SNOWFLAKE SYNTAX EXAMPLES:
  - Auto-increment: ColumnID INT IDENTITY(1,1) PRIMARY KEY
  - NOT: ColumnID INT PRIMARY KEY AUTO_INCREMENT

  fake_values.append("random.choice(['A', 'B', 'C'])")

  elif 'VARCHAR' in col_type or 'TEXT' in col_type or 'STRING' in col_type or 'CHAR' in col_type:
+ # Extract VARCHAR length - always truncate generated values to fit
+ import re
+ length_match = re.search(r'\((\d+)\)', col_type)
+ varchar_length = int(length_match.group(1)) if length_match else 255
+
+ # Generate domain-specific realistic data based on column name, then truncate to fit
+ base_value = None
  if 'NAME' in col_name_upper and 'COMPANY' not in col_name_upper:
  if 'PRODUCT' in col_name_upper:
+ base_value = "random.choice(['Laptop Pro 15', 'Wireless Mouse 2.4GHz', 'USB-C Cable 6ft', 'Monitor Stand Adjustable', 'Mechanical Keyboard RGB', 'Noise Canceling Headphones', '1080p Webcam', 'Portable SSD 1TB', 'Power Bank 20000mAh', 'Tablet 10 inch', 'Smart Watch', 'Bluetooth Speaker', 'Gaming Mouse Pad', 'Phone Case', 'Screen Protector', 'Charging Cable', 'Desk Lamp LED', 'Laptop Bag', 'Wireless Earbuds', 'USB Hub'])"
  elif 'CUSTOMER' in col_name_upper or 'USER' in col_name_upper:
+ base_value = "fake.name()"
  elif 'SELLER' in col_name_upper or 'VENDOR' in col_name_upper:
+ base_value = "random.choice(['Amazon', 'Best Buy', 'Walmart', 'Target', 'Costco', 'Home Depot', 'Lowes', 'Macys', 'Nordstrom', 'Kohls'])"
  else:
+ base_value = "fake.name()"
  elif 'CATEGORY' in col_name_upper:
+ base_value = "random.choice(['Electronics', 'Home & Kitchen', 'Books', 'Clothing', 'Sports', 'Toys', 'Beauty', 'Automotive'])"
  elif 'BRAND' in col_name_upper:
+ base_value = "random.choice(['Samsung', 'Apple', 'Sony', 'LG', 'Dell', 'HP', 'Lenovo', 'Amazon Basics', 'Anker', 'Logitech'])"
+ elif 'CHANNEL' in col_name_upper or 'SOURCE' in col_name_upper:
+ # Marketing channels for lead generation / call tracking
+ base_value = "random.choice(['Google Ads Search', 'Bing Ads', 'Facebook Ads', 'LinkedIn Ads', 'Instagram Ads', 'Twitter Ads', 'Display Network', 'Programmatic Display', 'Retargeting', 'TV Commercial', 'Radio Ads', 'Billboard', 'Print Ads', 'Direct Mail', 'Email Newsletter', 'Organic Search', 'Social Media Organic', 'Google My Business', 'Referral', 'Affiliate Marketing', 'Content Marketing', 'Webinar', 'Podcast Sponsorship'])"
+ elif 'CAMPAIGN' in col_name_upper and ('NAME' in col_name_upper or col_name_upper == 'CAMPAIGN_NAME'):
+ # Marketing campaign names (usually reference the channel)
+ base_value = "random.choice(['Google Ads Q4 Lead Gen', 'Facebook Black Friday Promo', 'LinkedIn Spring Campaign', 'Instagram New Product Launch', 'Email Brand Awareness', 'Display Holiday Special', 'Google Ads Summer Sale', 'Facebook Back to School', 'LinkedIn Valentine Promo', 'Google Shopping Cyber Monday', 'Email Free Trial Offer', 'Webinar Registration Q3', 'Email Nurture Series', 'Display Retargeting Q3', 'Google Ads Demo Request', 'Referral Rewards Program', 'Google Ads Year End Sale', 'Facebook New Year Campaign', 'Instagram Flash Sale', 'Email Limited Time Offer', 'Google Ads Early Bird', 'LinkedIn VIP Member Drive', 'Facebook Product Teaser', 'Display Conference Promo', 'Email Partner Campaign', 'Google Ads Seasonal', 'Facebook Customer Appreciation', 'Email Win Back Campaign', 'LinkedIn Upsell Drive', 'Display Cross-Sell Q4'])"
+ elif ('CENTER' in col_name_upper and 'NAME' in col_name_upper) or ('CALL' in col_name_upper and 'CENTER' in col_name_upper):
+ # Call center names
+ base_value = "random.choice(['New York Contact Center', 'Los Angeles Support Hub', 'Chicago Call Center', 'Dallas Operations Center', 'Phoenix Customer Care', 'Philadelphia Service Center', 'San Diego Support Center', 'Miami Contact Hub', 'Atlanta Operations', 'Denver Call Center', 'Seattle Support Center', 'Boston Customer Service', 'Portland Contact Center', 'Austin Operations Hub', 'Las Vegas Call Center', 'Toronto Support Center', 'Offshore Manila Center', 'Offshore Bangalore Hub', 'Remote East Coast Team', 'Remote West Coast Team', 'Central Support Center', 'National Call Center', 'Regional North Hub', 'Regional South Hub', 'Enterprise Support Center'])"
  elif 'DESCRIPTION' in col_name_upper or 'DESC' in col_name_upper:
+ base_value = "random.choice(['High quality product', 'Best seller', 'Customer favorite', 'New arrival', 'Limited edition', 'Premium quality'])"
  elif 'EMAIL' in col_name_upper:
+ base_value = "fake.email()"
  elif 'PHONE' in col_name_upper:
+ base_value = "f'{random.randint(200, 999)}-{random.randint(200, 999)}-{random.randint(1000, 9999)}'"
  elif 'ADDRESS' in col_name_upper or 'STREET' in col_name_upper:
+ base_value = "f'{random.randint(1, 9999)} {random.choice([\"Main\", \"Oak\", \"Park\", \"Maple\", \"Cedar\", \"Elm\", \"Washington\", \"Lake\", \"Hill\", \"Broadway\"])} {random.choice([\"St\", \"Ave\", \"Blvd\", \"Dr\", \"Ln\"])}'"
  elif 'CITY' in col_name_upper:
+ base_value = "random.choice(['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix', 'Philadelphia', 'San Antonio', 'San Diego', 'Dallas', 'San Jose', 'Austin', 'Seattle', 'Denver', 'Boston', 'Portland', 'Miami', 'Atlanta', 'Detroit', 'Las Vegas', 'Toronto'])"
  elif 'STATE' in col_name_upper or 'PROVINCE' in col_name_upper:
+ base_value = "random.choice(['California', 'Texas', 'New York', 'Florida', 'Illinois', 'Ohio', 'Georgia', 'Washington', 'Virginia', 'Arizona', 'Colorado', 'Oregon', 'Nevada', 'Utah', 'Iowa'])"
  elif 'COUNTRY' in col_name_upper:
+ base_value = "random.choice(['USA', 'Canada', 'UK', 'Germany', 'France', 'Japan', 'Australia', 'India', 'China', 'Brazil', 'Mexico', 'Spain', 'Italy', 'Netherlands', 'Sweden'])"
  elif 'ZIP' in col_name_upper or 'POSTAL' in col_name_upper:
+ base_value = "random.choice(['10001', '90210', '60601', '77001', '85001', '19101', '78201', '92101', '75201', '95101', '78701', '98101', '80201', '02101', '97201'])"
  elif 'COMPANY' in col_name_upper:
+ base_value = "random.choice(['Amazon', 'Microsoft', 'Apple Inc', 'Google LLC', 'Meta', 'Tesla Inc', 'Netflix', 'Adobe Inc', 'Oracle Corp', 'Salesforce', 'IBM Corp', 'Intel Corp', 'Cisco Systems', 'Dell Technologies', 'HP Inc'])"
  else:
+ # Default: use faker word
+ base_value = "fake.word()"
+
+ # Always truncate to VARCHAR length - simple and works for all cases
+ fake_values.append(f"({base_value})[:{varchar_length}]")

  elif 'INT' in col_type or 'NUMBER' in col_type or 'BIGINT' in col_type:
  fake_values.append("random.randint(1, 1000)")
  elif 'DECIMAL' in col_type or 'FLOAT' in col_type or 'DOUBLE' in col_type or 'NUMERIC' in col_type:

  settings.get("liveboard_name", ""), # 4. liveboard_name (moved)
  str(settings.get("fact_table_size", "1000")), # 5. fact_table_size (moved)
  str(settings.get("dim_table_size", "100")), # 6. dim_table_size
+ settings.get("tag_name", ""), # 7. tag_name
+ settings.get("object_naming_prefix", ""), # 8. object_naming_prefix
+ float(settings.get("temperature", 0.3)), # 9. temperature_slider
  int(settings.get("max_tokens", 4000)), # 9. max_tokens
  int(settings.get("batch_size", 5000)), # 10. batch_size
  int(settings.get("thread_count", 4)), # 11. thread_count

  "claude-sonnet-4.5", "", "Sales Analytics", # 1-3
  "", # 4: liveboard_name
  "1000", "100", # 5-6: fact_table_size, dim_table_size
+ "", # 7: tag_name
+ "", # 8: object_naming_prefix
+ 0.3, 4000, 5000, 4, # 9-12: temperature, max_tokens, batch, threads
  "", "", "ACCOUNTADMIN", # 12-14: sf settings
  "COMPUTE_WH", "DEMO_DB", "PUBLIC", # 15-17: warehouse, db, schema
  "", "", # 18-19: ts url, ts username

  settings_components['liveboard_name'], # 4 (moved)
  settings_components['fact_table_size'], # 5 (moved)
  settings_components['dim_table_size'], # 6
+ settings_components['tag_name'], # 7
+ settings_components['object_naming_prefix'], # 8
+ settings_components['temperature_slider'], # 9
  settings_components['max_tokens'], # 9
  settings_components['batch_size'], # 10
  settings_components['thread_count'], # 11

  info="Number of rows in dimension tables"
  )

+ tag_name = gr.Textbox(
+ label="Tag Name",
+ placeholder="e.g., 'Sales_Demo' or 'Q4_2024'",
+ value="",
+ info="Tag to apply to ThoughtSpot objects (tables and models)"
+ )
+
  object_naming_prefix = gr.Textbox(
  label="Object Naming Prefix",
  placeholder="e.g., 'ACME_' or 'DEMO_'",

  def save_settings_handler(
  ai_model, company_url, use_case,
+ lb_name, fact_size, dim_size, tag, obj_prefix,
  temp, max_tok, batch, threads,
  sf_acc, sf_user, sf_role, wh, db, schema,
  ts_url, ts_user

  "default_use_case": use_case,
  "fact_table_size": fact_size,
  "dim_table_size": dim_size,
+ "tag_name": tag or "",
  "temperature": str(temp),
  "max_tokens": str(int(max_tok)),
  "batch_size": str(int(batch)),

  fn=save_settings_handler,
  inputs=[
  default_ai_model, default_company_url, default_use_case,
+ liveboard_name, fact_table_size, dim_table_size, tag_name, object_naming_prefix,
  temperature_slider, max_tokens, batch_size, thread_count,
  sf_account, sf_user, sf_role, default_warehouse, default_database, default_schema,
  ts_instance_url, ts_username

  'default_schema': default_schema,
  'ts_instance_url': ts_instance_url,
  'ts_username': ts_username,
+ 'tag_name': tag_name,
  'object_naming_prefix': object_naming_prefix,
  'liveboard_name': liveboard_name,
  'settings_status': settings_status
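The VARCHAR branch above extracts the declared length from the column type string and wraps every generated expression in a slice so the value always fits. A minimal standalone sketch of that pattern (the column types and `fake.*` expressions here are illustrative, not taken from a real schema):

```python
import re

def varchar_length(col_type: str, default: int = 255) -> int:
    """Return the declared VARCHAR length, or a default when none is given."""
    match = re.search(r'\((\d+)\)', col_type)
    return int(match.group(1)) if match else default

def truncate_expr(base_value: str, col_type: str) -> str:
    """Wrap a generated-value expression so it is sliced to the column length."""
    return f"({base_value})[:{varchar_length(col_type)}]"

print(varchar_length("VARCHAR(50)"))   # 50
print(varchar_length("TEXT"))          # 255
print(truncate_expr("fake.name()", "VARCHAR(30)"))  # (fake.name())[:30]
```

Slicing the expression rather than validating lengths up front is the simple choice: it works for every column-name branch, at the cost of occasionally cutting a value mid-word.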
conversational_data_adjuster.py ADDED
@@ -0,0 +1,447 @@
+ """
+ Conversational Data Adjuster
+
+ Allows natural language data adjustments with strategy selection:
+ User: "Make 1080p webcam sales 50B"
+ System: Analyzes data, presents options
+ User: Picks strategy
+ System: Executes SQL
+ """
+
+ import os
+ from typing import Dict, List, Optional
+ from openai import OpenAI
+ from snowflake_auth import get_snowflake_connection
+ import json
+
+
+ class ConversationalDataAdjuster:
+ """Interactive data adjustment with user choice of strategy"""
+
+ def __init__(self, database: str, schema: str, model_id: str):
+ self.database = database
+ self.schema = schema
+ self.model_id = model_id
+ self.conn = None
+ self.openai_client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
+ self.current_context = {}
+
+ def connect(self):
+ """Connect to Snowflake"""
+ self.conn = get_snowflake_connection()
+ cursor = self.conn.cursor()
+ cursor.execute(f"USE DATABASE {self.database}")
+ cursor.execute(f'USE SCHEMA "{self.schema}"') # Quote schema name (may start with number)
+ print(f"✅ Connected to {self.database}.{self.schema}")
+
+ def parse_adjustment_request(self, request: str, available_tables: List[str]) -> Dict:
+ """
+ Parse natural language request to identify what to adjust
+
+ Args:
+ request: e.g., "increase 1080p webcam sales to 50B"
+ available_tables: List of table names in schema
+
+ Returns:
+ {
+ 'table': 'SALES_TRANSACTIONS',
+ 'entity_column': 'product_name',
+ 'entity_value': '1080p webcam',
+ 'metric_column': 'total_revenue',
+ 'target_value': 50000000000,
+ 'current_value': 30000000000 # if known
+ }
+ """
+ prompt = f"""Parse this data adjustment request.
+
+ Request: "{request}"
+
+ Available tables: {', '.join(available_tables)}
+
+ Common columns:
+ - SALES_TRANSACTIONS: PRODUCT_ID, CUSTOMER_ID, SELLER_ID, TOTAL_REVENUE, QUANTITY_SOLD, PROFIT_MARGIN, ORDER_DATE
+ - PRODUCTS: PRODUCT_ID, PRODUCT_NAME, CATEGORY
+ - CUSTOMERS: CUSTOMER_ID, CUSTOMER_SEGMENT
+
+ IMPORTANT - Column Meanings:
+ - TOTAL_REVENUE = dollar value of sales (e.g., $50B means fifty billion dollars)
+ - QUANTITY_SOLD = number of units sold (e.g., 1000 units)
+
+ When user says "sales", "revenue", or dollar amounts → use TOTAL_REVENUE
+ When user says "quantity", "units", or "items sold" → use QUANTITY_SOLD
+
+ Note: To filter by product name, you'll need to reference PRODUCTS table or use PRODUCT_ID directly
+
+ Extract:
+ 1. table: Which table to modify (likely SALES_TRANSACTIONS for revenue/sales changes)
+ 2. entity_column: Column to filter by (e.g., product_name, customer_segment)
+ 3. entity_value: Specific value to filter (e.g., "1080p webcam", "Electronics")
+ 4. metric_column: Numeric column to change
+ - If request mentions "sales", "revenue", or dollar amounts → TOTAL_REVENUE
+ - If request mentions "quantity", "units", "items" → QUANTITY_SOLD
+ - If request mentions "profit margin" → PROFIT_MARGIN
+ 5. target_value: The target numeric value (convert B to billions, M to millions)
+
+ Return ONLY valid JSON: {{"table": "...", "entity_column": "...", "entity_value": "...", "metric_column": "...", "target_value": 123}}
+
+ Examples:
+ - "increase 1080p webcam sales to 50B" → {{"table": "SALES_TRANSACTIONS", "entity_column": "PRODUCT_ID", "entity_value": "1080p Webcam", "metric_column": "TOTAL_REVENUE", "target_value": 50000000000, "needs_join": "PRODUCTS", "join_column": "PRODUCT_NAME"}}
+ - "make tablet revenue 100 billion" → {{"table": "SALES_TRANSACTIONS", "entity_column": "PRODUCT_ID", "entity_value": "Tablet", "metric_column": "TOTAL_REVENUE", "target_value": 100000000000, "needs_join": "PRODUCTS", "join_column": "PRODUCT_NAME"}}
+ - "increase laptop quantity to 50000 units" → {{"table": "SALES_TRANSACTIONS", "entity_column": "PRODUCT_ID", "entity_value": "Laptop", "metric_column": "QUANTITY_SOLD", "target_value": 50000, "needs_join": "PRODUCTS", "join_column": "PRODUCT_NAME"}}
+ - "set profit margin to 25% for electronics" → {{"table": "SALES_TRANSACTIONS", "entity_column": "PRODUCT_ID", "entity_value": "electronics", "metric_column": "PROFIT_MARGIN", "target_value": 25, "needs_join": "PRODUCTS", "join_column": "CATEGORY"}}
+
+ If the entity refers to a column not in the target table (e.g., product_name when modifying SALES_TRANSACTIONS),
+ include "needs_join" with the table name and "join_column" with the column to match on.
+ """
+
+ response = self.openai_client.chat.completions.create(
+ model="gpt-4o",
+ messages=[{"role": "user", "content": prompt}],
+ temperature=0
+ )
+
+ content = response.choices[0].message.content
+
+ # Strip markdown code blocks if present
+ if content.startswith('```'):
+ lines = content.split('\n')
+ content = '\n'.join(lines[1:-1]) # Remove first and last line (``` markers)
+
+ try:
+ result = json.loads(content)
+ print(f"✅ Parsed request: {result.get('entity_value')} - {result.get('metric_column')}")
+ return result
+ except json.JSONDecodeError as e:
+ print(f"❌ Failed to parse JSON: {e}")
+ print(f"Content was: {content}")
+ return {'error': f'Failed to parse request: {content}'}
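parse_adjustment_request strips a surrounding markdown code fence before calling json.loads, because chat models often wrap JSON answers in ``` blocks even when told not to. The stripping logic in isolation (a sketch; the `raw` value is an illustrative model response):

```python
import json

def strip_code_fence(content: str) -> str:
    """Remove a surrounding ```...``` fence, if present, keeping only the body."""
    if content.startswith('```'):
        lines = content.split('\n')
        content = '\n'.join(lines[1:-1])  # drop the opening and closing fence lines
    return content

raw = '```json\n{"table": "SALES_TRANSACTIONS", "target_value": 50000000000}\n```'
parsed = json.loads(strip_code_fence(raw))
print(parsed["table"])  # SALES_TRANSACTIONS

# Unfenced responses pass through unchanged
assert strip_code_fence('{"a": 1}') == '{"a": 1}'
```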
+
+ def analyze_current_data(self, adjustment: Dict) -> Dict:
+ """
+ Query current state of the data
+
+ Returns:
+ {
+ 'current_total': float,
+ 'row_count': int,
+ 'avg_value': float,
+ 'min_value': float,
+ 'max_value': float,
+ 'gap': float # target - current
+ }
+ """
+ cursor = self.conn.cursor()
+
+ table = adjustment['table']
+ entity_col = adjustment['entity_column']
+ entity_val = adjustment['entity_value']
+ metric_col = adjustment['metric_column']
+ target = adjustment['target_value']
+
+ # Build WHERE clause - handle joins if needed
+ if adjustment.get('needs_join'):
+ join_table = adjustment['needs_join']
+ join_col = adjustment['join_column']
+ where_clause = f"""WHERE {entity_col} IN (
+ SELECT PRODUCT_ID FROM {self.database}."{self.schema}".{join_table}
+ WHERE LOWER({join_col}) = LOWER('{entity_val}')
+ )"""
+ else:
+ where_clause = f"WHERE LOWER({entity_col}) = LOWER('{entity_val}')"
+
+ # Query current state
+ query = f"""
+ SELECT
+ SUM({metric_col}) as total,
+ COUNT(*) as row_count,
+ AVG({metric_col}) as avg_value,
+ MIN({metric_col}) as min_value,
+ MAX({metric_col}) as max_value
+ FROM {self.database}."{self.schema}".{table}
+ {where_clause}
+ """
+
+ print(f"\n🔍 Analyzing current data...")
+ print(f" Query: {query}")
+
+ cursor.execute(query)
+ row = cursor.fetchone()
+
+ current_total = float(row[0]) if row[0] else 0
+ row_count = int(row[1])
+ avg_value = float(row[2]) if row[2] else 0
+ min_value = float(row[3]) if row[3] else 0
+ max_value = float(row[4]) if row[4] else 0
+
+ gap = target - current_total
+
+ return {
+ 'current_total': current_total,
+ 'row_count': row_count,
+ 'avg_value': avg_value,
+ 'min_value': min_value,
+ 'max_value': max_value,
+ 'gap': gap
+ }
+
+ def generate_strategy_options(self, adjustment: Dict, analysis: Dict) -> List[Dict]:
+ """
+ Generate 3 strategy options for achieving the target
+
+ Returns list of strategies with details
+ """
+ table = adjustment['table']
+ entity_col = adjustment['entity_column']
+ entity_val = adjustment['entity_value']
+ metric_col = adjustment['metric_column']
+ target = adjustment['target_value']
+
+ # Build WHERE clause - handle joins if needed
+ if adjustment.get('needs_join'):
+ join_table = adjustment['needs_join']
+ join_col = adjustment['join_column']
+ where_clause = f"""{entity_col} IN (
+ SELECT PRODUCT_ID FROM {self.database}."{self.schema}".{join_table}
+ WHERE LOWER({join_col}) = LOWER('{entity_val}')
+ )"""
+ else:
+ where_clause = f"LOWER({entity_col}) = LOWER('{entity_val}')"
+
+ current = analysis['current_total']
+ gap = analysis['gap']
+ row_count = analysis['row_count']
+
+ if gap <= 0:
+ return [{
+ 'id': 'decrease',
+ 'name': 'Decrease All',
+ 'description': f"Current value ({current:,.0f}) already exceeds target ({target:,.0f})",
+ 'sql': None
+ }]
+
+ strategies = []
+
+ # Strategy A: Distribute increase across all rows
+ multiplier = target / current if current > 0 else 1
+ percentage_increase = (multiplier - 1) * 100
+
+ strategies.append({
+ 'id': 'A',
+ 'name': 'Distribute Across All Transactions',
+ 'description': f"Increase all {row_count:,} existing transactions by {percentage_increase:.1f}%",
+ 'details': {
+ 'approach': 'Multiply all existing values',
+ 'rows_affected': row_count,
+ 'new_avg': analysis['avg_value'] * multiplier
+ },
+ 'sql': f"""UPDATE {self.database}."{self.schema}".{table}
+ SET {metric_col} = {metric_col} * {multiplier:.6f}
+ WHERE {where_clause}"""
+ })
+
+ # Strategy B: Add new large transactions
+ num_new_transactions = max(1, int(gap / (analysis['max_value'] * 2))) # Add transactions 2x the current max
+ value_per_new = gap / num_new_transactions
+
+ strategies.append({
+ 'id': 'B',
+ 'name': 'Add New Large Transactions',
+ 'description': f"Insert {num_new_transactions} new transactions of ${value_per_new:,.0f} each",
+ 'details': {
+ 'approach': 'Create new outlier transactions',
+ 'rows_to_add': num_new_transactions,
+ 'value_each': value_per_new
+ },
+ 'sql': f"""-- INSERT new transactions (requires full row data)
+ -- INSERT INTO {self.database}."{self.schema}".{table} ({entity_col}, {metric_col}, ...)
+ -- VALUES ('{entity_val}', {value_per_new}, ...)
+ -- NOTE: This requires knowing all required columns in the table"""
+ })
+
+ # Strategy C: Boost top transactions
+ top_n = min(10, max(1, row_count // 10)) # Top 10% of transactions, capped at 10
+ boost_needed_per_row = gap / top_n
+
+ strategies.append({
+ 'id': 'C',
+ 'name': 'Boost Top Transactions',
+ 'description': f"Increase the top {top_n} transactions by ${boost_needed_per_row:,.0f} each",
+ 'details': {
+ 'approach': 'Create outliers from existing top transactions',
+ 'rows_affected': top_n,
+ 'boost_per_row': boost_needed_per_row
+ },
+ 'sql': f"""WITH top_rows AS (
+ SELECT * FROM {self.database}."{self.schema}".{table}
+ WHERE {where_clause}
+ ORDER BY {metric_col} DESC
+ LIMIT {top_n}
+ )
+ UPDATE {self.database}."{self.schema}".{table} t
+ SET {metric_col} = {metric_col} + {boost_needed_per_row:.2f}
+ WHERE EXISTS (
+ SELECT 1 FROM top_rows
+ WHERE top_rows.rowid = t.rowid
+ )"""
+ })
+
+ return strategies
+
+ def present_options(self, adjustment: Dict, analysis: Dict, strategies: List[Dict]) -> None:
+ """Display options to user in a friendly format"""
+
+ print("\n" + "="*80)
+ print("📊 DATA ADJUSTMENT OPTIONS")
+ print("="*80)
+
+ entity = f"{adjustment['entity_column']}='{adjustment['entity_value']}'"
+ metric = adjustment['metric_column']
+
+ print(f"\n🎯 Goal: Adjust {metric} for {entity}")
+ print(f" Current Total: ${analysis['current_total']:,.0f}")
+ print(f" Target Total: ${adjustment['target_value']:,.0f}")
+ print(f" Gap to Fill: ${analysis['gap']:,.0f} ({analysis['gap']/analysis['current_total']*100:.1f}% increase)")
+
+ print(f"\n📈 Current Data:")
+ print(f" Rows: {analysis['row_count']:,}")
+ print(f" Average: ${analysis['avg_value']:,.0f}")
+ print(f" Range: ${analysis['min_value']:,.0f} - ${analysis['max_value']:,.0f}")
+
+ print(f"\n" + "="*80)
+ print("STRATEGY OPTIONS:")
+ print("="*80)
+
+ for strategy in strategies:
+ print(f"\n[{strategy['id']}] {strategy['name']}")
+ print(f" {strategy['description']}")
+
+ if 'details' in strategy:
+ details = strategy['details']
+ print(f" Details:")
+ for key, value in details.items():
+ if isinstance(value, float):
+ print(f" - {key}: ${value:,.0f}")
+ else:
+ print(f" - {key}: {value}")
+
+ if strategy['sql']:
+ print(f"\n SQL Preview:")
+ sql_preview = strategy['sql'].strip().split('\n')
+ for line in sql_preview[:3]: # Show first 3 lines
+ print(f" {line}")
+ if len(sql_preview) > 3:
+ print(f" ... ({len(sql_preview)-3} more lines)")
+
+ print("\n" + "="*80)
+
+ def execute_strategy(self, strategy: Dict) -> Dict:
+ """Execute the chosen strategy"""
+
+ if not strategy['sql']:
+ return {
+ 'success': False,
+ 'error': 'This strategy requires manual implementation (INSERT statements)'
+ }
+
+ cursor = self.conn.cursor()
+
+ print(f"\n⚙️ Executing strategy: {strategy['name']}")
+ print(f" SQL: {strategy['sql'][:200]}...")
+
+ try:
+ cursor.execute(strategy['sql'])
+ rows_affected = cursor.rowcount
+ self.conn.commit()
+
+ return {
+ 'success': True,
+ 'message': f"✅ Updated {rows_affected} rows",
+ 'rows_affected': rows_affected
+ }
+ except Exception as e:
+ self.conn.rollback()
+ return {
+ 'success': False,
+ 'error': str(e)
+ }
+
+ def get_available_tables(self) -> List[str]:
+ """Get list of tables in the schema"""
+ cursor = self.conn.cursor()
+ cursor.execute(f"""
+ SELECT TABLE_NAME
+ FROM {self.database}.INFORMATION_SCHEMA.TABLES
+ WHERE TABLE_SCHEMA = '{self.schema}'
+ """)
+ tables = [row[0] for row in cursor.fetchall()]
+ return tables
+
+ def close(self):
+ """Close connection"""
+ if self.conn:
+ self.conn.close()
+
+
+ # Test/demo function
+ def demo_conversation():
+ """Simulate the conversational flow"""
+
+ print("""
+ ╔════════════════════════════════════════════════════════════╗
+ ║ ║
+ ║ Conversational Data Adjuster Demo ║
+ ║ ║
+ ╚════════════════════════════════════════════════════════════╝
+ """)
+
+ # Setup from environment variables
+ from dotenv import load_dotenv
+ load_dotenv()
+
+ adjuster = ConversationalDataAdjuster(
+ database=os.getenv('SNOWFLAKE_DATABASE'),
+ schema="20251116_140933_AMAZO_SAL", # Schema from deployment
+ model_id="3c97b0d6-448b-440a-b628-bac1f3d73049"
+ )
+
+ print(f"Using database: {os.getenv('SNOWFLAKE_DATABASE')}")
+ print(f"Using schema: 20251116_140933_AMAZO_SAL")
+
+ adjuster.connect()
+
+ # User request (using actual product from our data)
+ user_request = "increase 1080p Webcam sales to 50 billion"
+ print(f"\n💬 User: \"{user_request}\"")
+ print(f" (Current: ~$17.6B, Target: $50B)")
+
+ # Step 1: Parse request
+ tables = adjuster.get_available_tables()
+ adjustment = adjuster.parse_adjustment_request(user_request, tables)
+
+ # Step 2: Analyze current data
+ analysis = adjuster.analyze_current_data(adjustment)
+
+ # Step 3: Generate strategies
+ strategies = adjuster.generate_strategy_options(adjustment, analysis)
+
+ # Step 4: Present options
+ adjuster.present_options(adjustment, analysis, strategies)
+
+ # Step 5: User picks (simulated)
+ print("\n💬 User: \"Use strategy A\"")
+ chosen_strategy = strategies[0] # Strategy A
+
+ # Step 6: Execute
+ result = adjuster.execute_strategy(chosen_strategy)
+
+ if result['success']:
+ print(f"\n{result['message']}")
+ else:
+ print(f"\n❌ Error: {result.get('error')}")
+
+ adjuster.close()
+
+
+ if __name__ == "__main__":
+ demo_conversation()
+
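The three strategies in generate_strategy_options reduce to simple arithmetic over the analysis numbers. A sketch of the core calculations with made-up figures (current $17.6B, target $50B, 1,000 rows, max row value $5M are illustrative, not from real data):

```python
current, target = 17_600_000_000, 50_000_000_000
row_count, max_value = 1_000, 5_000_000
gap = target - current

# Strategy A: one multiplier applied uniformly to every row
multiplier = target / current
pct_increase = (multiplier - 1) * 100

# Strategy B: new outlier transactions at roughly 2x the current max row
num_new = max(1, int(gap / (max_value * 2)))
value_each = gap / num_new

# Strategy C: spread the gap over the top ~10% of rows (capped at 10)
top_n = min(10, max(1, row_count // 10))
boost_per_row = gap / top_n

print(f"A: x{multiplier:.3f} ({pct_increase:.1f}% increase)")
print(f"B: {num_new} rows of ${value_each:,.0f}")
print(f"C: top {top_n} rows +${boost_per_row:,.0f}")
```

Strategy A preserves the shape of the distribution, B and C deliberately create outliers; which looks most natural on a liveboard depends on the story the demo needs to tell.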
data_adjuster.py ADDED
@@ -0,0 +1,212 @@
1
+ """
2
+ Data Adjustment Module for Liveboard Refinement
3
+
4
+ Allows natural language adjustments to demo data:
5
+ - "make Product A 55% higher"
6
+ - "increase Customer B revenue by 20%"
7
+ - "set profit margin to 15% for Segment C"
8
+ """
9
+
10
+ import re
11
+ from typing import Dict, Optional
12
+ from snowflake_auth import get_snowflake_connection
13
+ from openai import OpenAI
14
+ import os
15
+
16
+
17
+ class DataAdjuster:
18
+ """Adjust demo data based on natural language requests"""
19
+
20
+ def __init__(self, database: str, schema: str):
21
+ self.database = database
22
+ self.schema = schema
23
+ self.conn = None
24
+ self.openai_client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
25
+
26
+ def connect(self):
27
+ """Connect to Snowflake"""
28
+ self.conn = get_snowflake_connection()
29
+ self.conn.cursor().execute(f"USE DATABASE {self.database}")
30
+ self.conn.cursor().execute(f"USE SCHEMA {self.schema}")
31
+ print(f"✅ Connected to {self.database}.{self.schema}")
32
+
33
+ def parse_adjustment_request(self, request: str, available_columns: list) -> Dict:
34
+ """
35
+ Parse natural language adjustment request using AI
36
+
37
+ Args:
38
+ request: e.g., "make Product A 55% higher" or "increase revenue for Customer B by 20%"
39
+ available_columns: List of column names in the data
40
+
41
+ Returns:
42
+ {
43
+ 'entity_column': 'product_name',
44
+ 'entity_value': 'Product A',
45
+ 'metric_column': 'total_revenue',
46
+ 'adjustment_type': 'percentage_increase',
47
+ 'adjustment_value': 55
48
+ }
49
+ """
50
+ prompt = f"""Parse this data adjustment request and extract structured information.
51
+
52
+ Request: "{request}"
53
+
54
+ Available columns in the dataset: {', '.join(available_columns)}
55
+
56
+ Extract:
57
+ 1. entity_column: Which column identifies what to change (e.g., product_name, customer_segment)
58
+ 2. entity_value: The specific value to filter by (e.g., "Product A", "Electronics")
59
+ 3. metric_column: Which numeric column to adjust (e.g., total_revenue, profit_margin, quantity_sold)
60
+ 4. adjustment_type: One of: "percentage_increase", "percentage_decrease", "set_value", "add_value"
61
+ 5. adjustment_value: The numeric value (e.g., 55 for "55%", 1000 for "add 1000")
62
+
63
+ Return ONLY a JSON object with these fields. If you can't parse it, return {{"error": "description"}}.
64
+
65
+ Examples:
66
+ - "make Product A 55% higher" → {{"entity_column": "product_name", "entity_value": "Product A", "metric_column": "total_revenue", "adjustment_type": "percentage_increase", "adjustment_value": 55}}
67
+ - "set profit margin to 15% for Electronics" → {{"entity_column": "category", "entity_value": "Electronics", "metric_column": "profit_margin", "adjustment_type": "set_value", "adjustment_value": 15}}
68
+ """
69
+
70
+ response = self.openai_client.chat.completions.create(
71
+ model="gpt-4o",
72
+ messages=[{"role": "user", "content": prompt}],
73
+ temperature=0
74
+ )
75
+
76
+ import json
77
+ result = json.loads(response.choices[0].message.content)
78
+ return result
79
+
80
+ def get_available_columns(self, table_name: str) -> list:
81
+ """Get list of columns from a table"""
82
+ cursor = self.conn.cursor()
83
+ cursor.execute(f"""
84
+ SELECT COLUMN_NAME
85
+ FROM {self.database}.INFORMATION_SCHEMA.COLUMNS
86
+ WHERE TABLE_SCHEMA = '{self.schema}'
87
+ AND TABLE_NAME = '{table_name.upper()}'
88
+ """)
89
+ columns = [row[0].lower() for row in cursor.fetchall()]
90
+ return columns
91
+
92
+ def apply_adjustment(self, table_name: str, adjustment: Dict) -> Dict:
93
+ """
94
+ Apply the parsed adjustment to the database
95
+
96
+ Returns:
97
+ {'success': bool, 'message': str, 'rows_affected': int}
98
+ """
99
+ if 'error' in adjustment:
100
+ return {'success': False, 'message': adjustment['error']}
101
+
102
+ cursor = self.conn.cursor()
103
+
104
+ # Build the UPDATE statement
105
+ entity_col = adjustment['entity_column']
106
+ entity_val = adjustment['entity_value']
107
+ metric_col = adjustment['metric_column']
108
+ adj_type = adjustment['adjustment_type']
109
+ adj_value = adjustment['adjustment_value']
110
+
111
+ # Calculate new value based on adjustment type
112
+ if adj_type == 'percentage_increase':
113
+ new_value_expr = f"{metric_col} * (1 + {adj_value}/100.0)"
114
+ elif adj_type == 'percentage_decrease':
115
+ new_value_expr = f"{metric_col} * (1 - {adj_value}/100.0)"
116
+ elif adj_type == 'set_value':
117
+ new_value_expr = f"{adj_value}"
118
+ elif adj_type == 'add_value':
119
+ new_value_expr = f"{metric_col} + {adj_value}"
120
+ else:
121
+ return {'success': False, 'message': f"Unknown adjustment type: {adj_type}"}
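The branch above maps each adjustment type to a SQL expression; factored into a pure helper (hypothetical name, same four expressions) it becomes trivially unit-testable:

```python
def new_value_expression(metric_col: str, adj_type: str, adj_value: float) -> str:
    """Build the SQL expression for the adjusted metric value."""
    if adj_type == "percentage_increase":
        return f"{metric_col} * (1 + {adj_value}/100.0)"
    if adj_type == "percentage_decrease":
        return f"{metric_col} * (1 - {adj_value}/100.0)"
    if adj_type == "set_value":
        return f"{adj_value}"
    if adj_type == "add_value":
        return f"{metric_col} + {adj_value}"
    raise ValueError(f"Unknown adjustment type: {adj_type}")
```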
122
+
123
+ # Execute UPDATE
124
+ update_sql = f"""
125
+ UPDATE {self.database}.{self.schema}.{table_name}
126
+ SET {metric_col} = {new_value_expr}
127
+ WHERE LOWER({entity_col}) = LOWER('{entity_val}')
128
+ """
129
+
130
+ print(f"\n🔧 Executing adjustment:")
131
+ print(f" SQL: {update_sql}")
132
+
133
+ try:
134
+ cursor.execute(update_sql)
135
+ rows_affected = cursor.rowcount
136
+ self.conn.commit()
137
+
138
+ return {
139
+ 'success': True,
140
+ 'message': f"Updated {rows_affected} rows: {entity_col}='{entity_val}', adjusted {metric_col} by {adj_type}",
141
+ 'rows_affected': rows_affected
142
+ }
143
+ except Exception as e:
144
+ return {
145
+ 'success': False,
146
+ 'message': f"Database error: {str(e)}"
147
+ }
148
+
149
+ def adjust_data_for_liveboard(self, request: str, table_name: str) -> Dict:
150
+ """
151
+ Full workflow: parse request, update data
152
+
153
+ Args:
154
+ request: Natural language request like "make Product A 55% higher"
155
+ table_name: Name of the table to update
156
+
157
+ Returns:
158
+ Result dictionary with success status and details
159
+ """
160
+ if not self.conn:
161
+ self.connect()
162
+
163
+ # Get available columns
164
+ columns = self.get_available_columns(table_name)
165
+ print(f"📋 Available columns: {', '.join(columns)}")
166
+
167
+ # Parse the request
168
+ print(f"\n🤔 Parsing request: '{request}'")
169
+ adjustment = self.parse_adjustment_request(request, columns)
170
+ print(f"✅ Parsed: {adjustment}")
171
+
172
+ if 'error' in adjustment:
173
+ return {'success': False, 'error': adjustment['error']}
174
+
175
+ # Apply the adjustment
176
+ result = self.apply_adjustment(table_name, adjustment)
177
+
178
+ return result
179
+
180
+ def close(self):
181
+ """Close database connection"""
182
+ if self.conn:
183
+ self.conn.close()
184
+
185
+
186
+ # Example usage function
187
+ def test_data_adjustment():
188
+ """Test the data adjustment functionality"""
189
+ adjuster = DataAdjuster(
190
+ database="DEMO_DATABASE",
191
+ schema="DEMO_SCHEMA"
192
+ )
193
+
194
+ # Example: "make Product A 55% higher"
195
+ result = adjuster.adjust_data_for_liveboard(
196
+ request="make Product A 55% higher",
197
+ table_name="FACT_SALES"
198
+ )
199
+
200
+ print(f"\n{'='*60}")
201
+ if result['success']:
202
+ print(f"✅ SUCCESS: {result['message']}")
203
+ print(f"📊 Rows affected: {result['rows_affected']}")
204
+ else:
205
+ print(f"❌ FAILED: {result.get('error', result.get('message'))}")
206
+
207
+ adjuster.close()
208
+
209
+
210
+ if __name__ == "__main__":
211
+ test_data_adjustment()
212
+
demo_prep.py CHANGED
@@ -362,33 +362,33 @@ def execute_population_script(python_code, schema_name, skip_modifications=False
362
  cleaned_code = cleaned_code.replace('os.getenv("SNOWFLAKE_SCHEMA")', f'"{schema_name}"')
363
  else:
364
  print("⚠️ LLM-generated code - applying safety fixes")
365
- # CRITICAL FIX: Remove schema from conn_params to avoid duplicate schema parameter
366
  # Only add if not already present (new templates include it by default)
367
  if "conn_params.pop('schema'" not in clean_code:
368
  cleaned_code = replace_with_indentation(
369
  clean_code,
370
- "conn_params = get_snowflake_connection_params()",
371
  ["conn_params.pop('schema', None) # Remove schema to avoid duplicate"]
372
- )
373
  else:
374
  cleaned_code = clean_code
375
  print("✅ Schema pop already in code, skipping injection")
376
-
377
- # Simple and safe schema replacement - just replace the placeholder
378
- cleaned_code = cleaned_code.replace("os.getenv('SNOWFLAKE_SCHEMA')", f"'{schema_name}'")
379
- cleaned_code = cleaned_code.replace('os.getenv("SNOWFLAKE_SCHEMA")', f'"{schema_name}"')
380
-
381
- # FIX: Remove fake.unique() calls that cause "duplicated values after 1,000 iterations" error
382
- cleaned_code = cleaned_code.replace("fake.unique.word()", "fake.word()")
383
- cleaned_code = cleaned_code.replace("fake.unique.email()", "fake.email()")
384
- cleaned_code = cleaned_code.replace("fake.unique.company()", "fake.company()")
385
 
386
- # FIX: Truncate phone numbers to avoid extension overflow (e.g., '790-923-3730x07350')
387
- cleaned_code = cleaned_code.replace("fake.phone_number()", "fake.phone_number()[:20]")
388
 
389
- # FIX: Convert SQLite-style ? placeholders to Snowflake-style %s placeholders
390
- cleaned_code = re.sub(r'\bVALUES\s*\(\?', 'VALUES (%s', cleaned_code)
391
- cleaned_code = re.sub(r',\s*\?', ', %s', cleaned_code)
392
 
393
  # DEBUG: Save modified code
394
  with open(os_module.path.join(debug_dir, '2_after_modifications.py'), 'w') as f:
@@ -1548,6 +1548,13 @@ TECHNICAL REQUIREMENTS:
1548
  - Include realistic column names that match the business context
1549
  - Add proper constraints and relationships
1550
 
1551
  SNOWFLAKE SYNTAX EXAMPLES:
1552
  - Auto-increment: ColumnID INT IDENTITY(1,1) PRIMARY KEY
1553
  - NOT: ColumnID INT PRIMARY KEY AUTO_INCREMENT
@@ -1664,15 +1671,21 @@ SCRIPT REQUIREMENTS:
1664
  **SNOWFLAKE SYNTAX**: Use %s placeholders, NOT ? placeholders
1665
  Example: cursor.executemany("INSERT INTO table (col1, col2, col3) VALUES (%s, %s, %s)", batch_data_list)
1666
  3. Build data in Python lists/arrays first, then batch insert (do NOT use individual cursor.execute in loops)
1667
- 4. Populate tables with realistic data volumes (1000+ rows per table)
1668
- 5. Create baseline normal data patterns
1669
- 6. Inject strategic outliers with STRUCTURED COMMENTS (see format below)
1670
- 7. Include scenarios showcasing: {persona_config['demo_objectives']}
1671
- 8. NO explanatory text, just executable Python code
1672
- 9. Focus on scenarios that resonate with {persona_config['target_persona']} and prove ROI
1673
- 10. Include data validation to ensure referential integrity
1674
- 11. Add progress logging for each table population
1675
- 12. Ensure all foreign key relationships are maintained
1676
 
1677
  OUTLIER DOCUMENTATION FORMAT (REQUIRED):
1678
  For each strategic outlier, add structured comments BEFORE the code that creates it:
 
362
  cleaned_code = cleaned_code.replace('os.getenv("SNOWFLAKE_SCHEMA")', f'"{schema_name}"')
363
  else:
364
  print("⚠️ LLM-generated code - applying safety fixes")
365
+ # CRITICAL FIX: Remove schema from conn_params to avoid duplicate schema parameter
366
  # Only add if not already present (new templates include it by default)
367
  if "conn_params.pop('schema'" not in clean_code:
368
  cleaned_code = replace_with_indentation(
369
  clean_code,
370
+ "conn_params = get_snowflake_connection_params()",
371
  ["conn_params.pop('schema', None) # Remove schema to avoid duplicate"]
372
+ )
373
  else:
374
  cleaned_code = clean_code
375
  print("✅ Schema pop already in code, skipping injection")
376
+
377
+ # Simple and safe schema replacement - just replace the placeholder
378
+ cleaned_code = cleaned_code.replace("os.getenv('SNOWFLAKE_SCHEMA')", f"'{schema_name}'")
379
+ cleaned_code = cleaned_code.replace('os.getenv("SNOWFLAKE_SCHEMA")', f'"{schema_name}"')
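Both quote styles of the `os.getenv` lookup are covered; the substitution can be sketched as a standalone helper (name is mine):

```python
def inline_schema(code: str, schema_name: str) -> str:
    """Replace os.getenv lookups of SNOWFLAKE_SCHEMA with a literal, both quote styles."""
    code = code.replace("os.getenv('SNOWFLAKE_SCHEMA')", f"'{schema_name}'")
    return code.replace('os.getenv("SNOWFLAKE_SCHEMA")', f'"{schema_name}"')
```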
380
+
381
+ # FIX: Remove fake.unique() calls that cause "duplicated values after 1,000 iterations" error
382
+ cleaned_code = cleaned_code.replace("fake.unique.word()", "fake.word()")
383
+ cleaned_code = cleaned_code.replace("fake.unique.email()", "fake.email()")
384
+ cleaned_code = cleaned_code.replace("fake.unique.company()", "fake.company()")
385
 
386
+ # FIX: Truncate phone numbers to avoid extension overflow (e.g., '790-923-3730x07350')
387
+ cleaned_code = cleaned_code.replace("fake.phone_number()", "fake.phone_number()[:20]")
388
 
389
+ # FIX: Convert SQLite-style ? placeholders to Snowflake-style %s placeholders
390
+ cleaned_code = re.sub(r'\bVALUES\s*\(\?', 'VALUES (%s', cleaned_code)
391
+ cleaned_code = re.sub(r',\s*\?', ', %s', cleaned_code)
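The two regexes can be exercised in isolation. The first rewrites the `?` that directly follows `VALUES (`; the second rewrites every comma-preceded `?`; a lone `= ?` in a WHERE clause is left untouched by both patterns:

```python
import re

def to_snowflake_placeholders(sql: str) -> str:
    """Rewrite SQLite-style ? placeholders in a VALUES list as %s."""
    sql = re.sub(r'\bVALUES\s*\(\?', 'VALUES (%s', sql)   # first placeholder
    return re.sub(r',\s*\?', ', %s', sql)                 # the rest
```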
392
 
393
  # DEBUG: Save modified code
394
  with open(os_module.path.join(debug_dir, '2_after_modifications.py'), 'w') as f:
 
1548
  - Include realistic column names that match the business context
1549
  - Add proper constraints and relationships
1550
 
1551
+ TABLE NAMING REQUIREMENTS:
1552
+ - **DO NOT use DIM_ or FACT_ prefixes** (e.g., NOT DIM_PRODUCT or FACT_SALES)
1553
+ - Use simple, descriptive table names (e.g., PRODUCTS, CUSTOMERS, SALES, ORDERS)
1554
+ - Dimension tables: Use plural nouns (CUSTOMERS, PRODUCTS, WAREHOUSES)
1555
+ - Fact tables: Use descriptive names (SALES, TRANSACTIONS, ORDERS, INVENTORY_MOVEMENTS)
1556
+ - Keep names concise and business-friendly
1557
+
1558
  SNOWFLAKE SYNTAX EXAMPLES:
1559
  - Auto-increment: ColumnID INT IDENTITY(1,1) PRIMARY KEY
1560
  - NOT: ColumnID INT PRIMARY KEY AUTO_INCREMENT
 
1671
  **SNOWFLAKE SYNTAX**: Use %s placeholders, NOT ? placeholders
1672
  Example: cursor.executemany("INSERT INTO table (col1, col2, col3) VALUES (%s, %s, %s)", batch_data_list)
1673
  3. Build data in Python lists/arrays first, then batch insert (do NOT use individual cursor.execute in loops)
1674
+ 4. Populate tables with realistic data volumes (10,000+ rows for transactions)
1675
+ 5. **REALISTIC TRANSACTION AMOUNTS**:
1676
+ - For e-commerce/retail: $50-$2,000 per order (rare large orders up to $50,000)
1677
+ - For B2B: $1,000-$50,000 per order (enterprise orders up to $500,000)
1678
+ - **NEVER create individual transactions over $1M** - use many small transactions
1679
+ - To reach high totals: generate MANY transactions, not huge individual amounts
1680
+ - Example: $40B total = 100,000+ transactions averaging $400k each
1681
+ 6. Create baseline normal data patterns
1682
+ 7. Inject strategic outliers with STRUCTURED COMMENTS (see format below)
1683
+ 8. Include scenarios showcasing: {persona_config['demo_objectives']}
1684
+ 9. NO explanatory text, just executable Python code
1685
+ 10. Focus on scenarios that resonate with {persona_config['target_persona']} and prove ROI
1686
+ 11. Include data validation to ensure referential integrity
1687
+ 12. Add progress logging for each table population
1688
+ 13. Ensure all foreign key relationships are maintained
1689
 
1690
  OUTLIER DOCUMENTATION FORMAT (REQUIRED):
1691
  For each strategic outlier, add structured comments BEFORE the code that creates it:
launch_chat.py CHANGED
@@ -4,8 +4,15 @@ Quick launcher for the new chat-based interface
4
  """
5
 
6
  from chat_interface import create_chat_interface
 
7
 
8
  if __name__ == "__main__":
9
  print("🚀 Starting Chat-Based Demo Builder...")
10
  print("=" * 60)
11
  print()
@@ -16,14 +23,14 @@ if __name__ == "__main__":
16
  print(" • Editable AI model selector")
17
  print(" • Quick action buttons")
18
  print()
19
- print("🌐 Opening in browser at http://localhost:7862")
20
  print("=" * 60)
21
 
22
  app = create_chat_interface()
23
 
24
  app.launch(
25
  server_name="0.0.0.0",
26
- server_port=7862,
27
  share=False,
28
  inbrowser=True,
29
  debug=True
 
4
  """
5
 
6
  from chat_interface import create_chat_interface
7
+ from datetime import datetime
8
 
9
  if __name__ == "__main__":
10
+ # Write directly to log file
11
+ with open('/tmp/chat_output.log', 'a') as f:
12
+ f.write(f"\n{'='*70}\n")
13
+ f.write(f"TSDB APP START: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
14
+ f.write(f"{'='*70}\n\n")
15
+
16
  print("🚀 Starting Chat-Based Demo Builder...")
17
  print("=" * 60)
18
  print()
 
23
  print(" • Editable AI model selector")
24
  print(" • Quick action buttons")
25
  print()
26
+ print("🌐 Opening in browser at http://localhost:7863")
27
  print("=" * 60)
28
 
29
  app = create_chat_interface()
30
 
31
  app.launch(
32
  server_name="0.0.0.0",
33
+ server_port=7863,
34
  share=False,
35
  inbrowser=True,
36
  debug=True
liveboard_creator.py CHANGED
@@ -15,6 +15,7 @@ import json
15
  import yaml
16
  import os
17
  import re
 
18
  from typing import Dict, List, Optional
19
 
20
 
@@ -145,6 +146,132 @@ class OutlierParser:
145
  return outliers
146
 
147
 
148
  class QueryTranslator:
149
  """Translate natural language queries to ThoughtSpot search syntax"""
150
 
@@ -302,7 +429,8 @@ class LiveboardCreator:
302
  self.model_columns = self._fetch_model_columns()
303
 
304
  # Use selected LLM model instead of hardcoded OpenAI
305
- from main_research import MultiLLMResearcher, map_llm_display_to_provider
 
306
  model_to_use = llm_model or 'claude-sonnet-4.5'
307
  provider_name, model_name_llm = map_llm_display_to_provider(model_to_use)
308
  self.llm_researcher = MultiLLMResearcher(provider=provider_name, model=model_name_llm)
@@ -1013,7 +1141,7 @@ Examples:
1013
  'name': viz_config.get('name', 'Text'),
1014
  'description': viz_config.get('description', ''),
1015
  'tables': [{
1016
- 'id': self.model_name,
1017
  'name': self.model_name
1018
  }],
1019
  'text_tile': {
@@ -1134,7 +1262,7 @@ Examples:
1134
  'name': viz_config['name'],
1135
  'description': viz_config.get('description', ''),
1136
  'tables': [{
1137
- 'id': self.model_name, # Use model name not GUID for TML
1138
  'name': self.model_name
1139
  }],
1140
  'search_query': search_query,
@@ -1219,6 +1347,9 @@ For each visualization, create a JSON object with:
1219
  Guidelines:
1220
  - Mix chart types (don't use all the same type)
1221
  - Include at least 1-2 KPI charts for key metrics
1222
  - Include trend analysis with LINE or AREA charts
1223
  - Include comparisons with COLUMN or BAR charts
1224
  - Use appropriate time filters for business context
@@ -1239,10 +1370,31 @@ Return ONLY a valid JSON object with structure:
1239
  try:
1240
  messages = [{"role": "user", "content": prompt}]
1241
  response_text = self.llm_researcher.make_request(messages, temperature=0.7, max_tokens=4000, stream=False)
1242
  result = json.loads(response_text)
1243
  return result.get('visualizations', [])
1244
  except Exception as e:
1245
  print(f"Error generating visualizations: {e}")
 
1246
  # Return fallback simple visualizations
1247
  return self._generate_fallback_visualizations(available_measures, date_columns)
1248
 
@@ -1255,11 +1407,13 @@ Return ONLY a valid JSON object with structure:
1255
  fallback_viz = []
1256
 
1257
  if measures and date_columns:
1258
- # KPI of first measure
1259
  fallback_viz.append({
1260
  'name': f'Total {measures[0]}',
1261
  'chart_type': 'KPI',
1262
- 'measure': measures[0]
 
 
1263
  })
1264
 
1265
  # Trend of first measure
@@ -1273,11 +1427,13 @@ Return ONLY a valid JSON object with structure:
1273
  })
1274
 
1275
  if len(measures) > 1:
1276
- # Second measure KPI
1277
  fallback_viz.append({
1278
  'name': f'Total {measures[1]}',
1279
  'chart_type': 'KPI',
1280
- 'measure': measures[1]
 
 
1281
  })
1282
 
1283
  return fallback_viz
@@ -1341,28 +1497,30 @@ Return ONLY a valid JSON object with structure:
1341
  else:
1342
  viz_configs = []
1343
 
 
1344
  # Add text tiles for context (like in sample liveboard)
1345
- text_tiles = [
1346
- {
1347
- 'id': 'Text_1',
1348
- 'name': '📊 Dashboard Overview',
1349
- 'chart_type': 'TEXT',
1350
- 'text_content': f"## {company_data.get('name', 'Company')} Analytics\n\n{use_case} insights and metrics",
1351
- 'background_color': '#2E3D4D' # Dark blue-gray
1352
- },
1353
- {
1354
- 'id': 'Text_2',
1355
- 'name': 'Key Insights',
1356
- 'chart_type': 'TEXT',
1357
- 'text_content': "💡 **Key Performance Indicators**\n\nTrack trends and identify opportunities",
1358
- 'background_color': '#85016b' # Pink (from sample)
1359
- }
1360
- ]
 
1361
 
1362
  # Create text tile visualizations
1363
- for text_config in text_tiles:
1364
- viz_tml = self.create_visualization_tml(text_config)
1365
- visualizations.append(viz_tml)
1366
 
1367
  # Create visualization TML objects
1368
  if viz_configs:
@@ -1392,7 +1550,8 @@ Return ONLY a valid JSON object with structure:
1392
  }
1393
  }
1394
 
1395
- return json.dumps(liveboard_tml, indent=2)
 
1396
 
1397
  def _check_liveboard_errors(self, liveboard_id: str) -> Dict:
1398
  """
@@ -1484,6 +1643,11 @@ Return ONLY a valid JSON object with structure:
1484
  - error: Error message if failed
1485
  """
1486
  try:
1487
  response = self.ts_client.session.post(
1488
  f"{self.ts_client.base_url}/api/rest/2.0/metadata/tml/import",
1489
  headers=self.ts_client.headers,
@@ -1634,6 +1798,9 @@ def _create_kpi_question_from_outlier(outlier: Dict) -> Optional[str]:
1634
  """
1635
  Create companion KPI question from outlier if KPI metric is specified.
1636
 
 
 
 
1637
  Args:
1638
  outlier: Dictionary with outlier metadata
1639
 
@@ -1645,14 +1812,14 @@ def _create_kpi_question_from_outlier(outlier: Dict) -> Optional[str]:
1645
 
1646
  kpi_metric = outlier.get('kpi_metric', '')
1647
  if kpi_metric:
1648
- # Create a question about the total/aggregate
1649
- return f"What is the total {kpi_metric}?"
1650
 
1651
  # Fallback: extract first measure from viz_measure_types
1652
  measure_types = outlier.get('viz_measure_types', '')
1653
  if measure_types:
1654
  first_measure = measure_types.split(',')[0].strip()
1655
- return f"What is the total {first_measure}?"
1656
 
1657
  return None
1658
 
@@ -1677,7 +1844,8 @@ def _generate_smart_questions_with_ai(
1677
  """
1678
  try:
1679
  # Use the selected LLM model
1680
- from main_research import MultiLLMResearcher, map_llm_display_to_provider
 
1681
 
1682
  model_to_use = llm_model or 'claude-sonnet-4.5'
1683
  provider_name, model_name = map_llm_display_to_provider(model_to_use)
@@ -1690,18 +1858,57 @@ def _generate_smart_questions_with_ai(
1690
  Company: {company_data.get('name', 'Unknown Company')}
1691
  Use Case: {use_case}
1692
 
1693
- Generate questions that would make compelling visualizations for a demo. Each question should:
1694
- - Be specific and actionable (not generic like "What is total sales?")
1695
- - Include time periods when relevant (last quarter, this year vs last year, etc.)
1696
- - Reference business concepts relevant to {use_case}
1697
- - Ask about trends, patterns, comparisons, or top/bottom N
1698
- - Sound like questions a real business analyst would ask
1699
-
1700
- Examples of GOOD questions:
1701
- - "What are the top 10 products by revenue in the last quarter?"
1702
- - "How do sales compare across regions this year versus last year?"
1703
- - "Which customers have the highest lifetime value but declining engagement?"
1704
- - "What is the monthly revenue trend for the past 12 months?"
1705
 
1706
  Return ONLY a JSON object with this exact structure (no other text):
1707
  {{
@@ -1720,6 +1927,17 @@ Return ONLY a JSON object with this exact structure (no other text):
1720
  if not content or content.strip() == '':
1721
  print(f" ⚠️ AI returned empty content")
1722
  raise ValueError("Empty AI response")
1723
 
1724
  result = json.loads(content)
1725
 
@@ -1736,11 +1954,16 @@ Return ONLY a JSON object with this exact structure (no other text):
1736
 
1737
  except Exception as e:
1738
  print(f" ⚠️ AI question generation failed: {e}")
1739
- # Fallback to simple questions
1740
  return [
1741
- f"What is the total revenue for {use_case}?",
1742
- f"Show me the trend over time for {use_case}",
1743
- f"What are the top categories in {use_case}?"
 
  ][:num_questions]
1745
 
1746
 
@@ -1982,25 +2205,14 @@ def create_liveboard_from_model_mcp(
1982
  if mcp_count > 0:
1983
  source_info.append(f"⚡ {mcp_count} MCP-suggested")
1984
 
1985
- note_tile = f"""
1986
- <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
1987
- padding: 40px; border-radius: 20px; color: white; font-family: system-ui;">
1988
- <h1 style="margin: 0 0 20px 0; font-size: 36px;">
1989
- {company_data.get('name', 'Company')} {use_case}
1990
- </h1>
1991
- <p style="margin: 0 0 25px 0; font-size: 18px; opacity: 0.95; line-height: 1.5;">
1992
- {company_data.get('description', 'AI-powered analytics dashboard')}
1993
- </p>
1994
- {outlier_section}
1995
- <div style="margin-top: 25px; padding: 20px; background: rgba(255,255,255,0.1);
1996
- border-radius: 12px;">
1997
- <p style="margin: 0; font-size: 13px; opacity: 0.9;">
1998
- 📊 {len(answers)} visualizations | {' | '.join(source_info) if source_info else 'AI-powered insights'} |
1999
- 🚀 Created with ThoughtSpot MCP
2000
- </p>
2001
- </div>
2002
- </div>
2003
- """
2004
 
2005
  print(f"🎨 Creating liveboard: {final_liveboard_name}")
2006
  print(f"📊 Preparing to send {len(answers)} answers to createLiveboard")
@@ -2042,6 +2254,16 @@ def create_liveboard_from_model_mcp(
2042
  result_text = liveboard_result.content[0].text
2043
  print(f"🔍 DEBUG: Parsing response text for URL...")
2044
 
2045
  except Exception as create_error:
2046
  print(f"❌ createLiveboard failed: {str(create_error)}")
2047
  print(f" Error type: {type(create_error).__name__}")
@@ -2067,6 +2289,136 @@ def create_liveboard_from_model_mcp(
2067
  print(f"🔗 URL: {liveboard_url}")
2068
  print(f"🆔 GUID: {liveboard_guid}")
2069
 
2070
  return {
2071
  'success': True,
2072
  'liveboard_name': final_liveboard_name,
 
15
  import yaml
16
  import os
17
  import re
18
+ import requests
19
  from typing import Dict, List, Optional
20
 
21
 
 
146
  return outliers
147
 
148
 
149
+ def clean_viz_title(question: str) -> str:
150
+ """
151
+ Extract a clean, short title from a verbose question
152
+
153
+ Examples:
154
+ "What is the total revenue? Show only..." -> "Total Revenue"
155
+ "Show the top 10 products by revenue..." -> "Top 10 Products by Revenue"
156
+ """
157
+ # Remove instructions after question mark or period
158
+ question = question.split('?')[0].strip()
159
+ question = question.split('.')[0].strip()
160
+
161
+ # Common patterns to clean
162
+ patterns = [
163
+ (r'^What is the ', ''),
164
+ (r'^What are the ', ''),
165
+ (r'^Show me the ', ''),
166
+ (r'^Show the ', ''),
167
+ (r'^Show ', ''),
168
+ (r'^Create a detailed table showing ', ''),
169
+ (r'^How (?:does|do) ', ''),
170
+ (r' for Amazon\.com', ''),
171
+ (r' for the company', ''),
172
+ ]
173
+
174
+ clean = question
175
+ for pattern, replacement in patterns:
176
+ clean = re.sub(pattern, replacement, clean, flags=re.IGNORECASE)
177
+
178
+ # Capitalize first letter
179
+ if clean:
180
+ clean = clean[0].upper() + clean[1:]
181
+
182
+ # Limit length
183
+ if len(clean) > 80:
184
+ clean = clean[:77] + "..."
185
+
186
+ return clean or question[:80]
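A condensed, self-contained reimplementation shows the behavior. One caveat worth noting: the `split('.')` step runs before the ` for Amazon\.com` pattern, so a dotted name is truncated at its first period before that pattern can ever match:

```python
import re

# Prefixes stripped by the full function, abbreviated here
PREFIXES = [r'^What is the ', r'^What are the ', r'^Show me the ',
            r'^Show the ', r'^Show ']

def short_title(question: str) -> str:
    """Condensed reimplementation of clean_viz_title, for illustration only."""
    q = question.split('?')[0].split('.')[0].strip()
    for pattern in PREFIXES:
        q = re.sub(pattern, '', q, flags=re.IGNORECASE)
    return (q[0].upper() + q[1:]) if q else question[:80]
```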
187
+
188
+
189
+ def extract_brand_colors_from_css(css_url: str) -> List[str]:
190
+ """Extract color codes from a CSS file, filtering for brand-appropriate colors"""
191
+ try:
192
+ response = requests.get(css_url, timeout=5)
193
+ if response.status_code == 200:
194
+ css_content = response.text
195
+ # Find all hex colors
196
+ hex_colors = re.findall(r'#([0-9A-Fa-f]{6}|[0-9A-Fa-f]{3})\b', css_content)
197
+
198
+ # Convert 3-digit to 6-digit and calculate brightness
199
+ color_data = []
200
+ for color in hex_colors:
201
+ if len(color) == 3:
202
+ color = ''.join([c*2 for c in color])
203
+
204
+ # Calculate brightness
205
+ r, g, b = int(color[0:2], 16), int(color[2:4], 16), int(color[4:6], 16)
206
+ brightness = (r * 299 + g * 587 + b * 114) / 1000
207
+
208
+ # Filter out too dark (< 30) or too light (> 240) colors
209
+ # These are usually backgrounds, not brand colors
210
+ if 30 < brightness < 240:
211
+ # Calculate saturation
212
+ max_c = max(r, g, b)
213
+ min_c = min(r, g, b)
214
+ saturation = (max_c - min_c) / max_c if max_c > 0 else 0
215
+
216
+ color_data.append({
217
+ 'color': f'#{color}',
218
+ 'brightness': brightness,
219
+ 'saturation': saturation
220
+ })
221
+
222
+ # Deduplicate
223
+ seen = set()
224
+ unique_colors = []
225
+ for c in color_data:
226
+ if c['color'] not in seen:
227
+ seen.add(c['color'])
228
+ unique_colors.append(c)
229
+
230
+ # Sort by saturation (vibrant colors first) then brightness
231
+ unique_colors.sort(key=lambda x: (x['saturation'], x['brightness']), reverse=True)
232
+
233
+ # Take top 5 most vibrant colors
234
+ brand_colors = [c['color'] for c in unique_colors[:5]]
235
+ return brand_colors if brand_colors else None
236
+ except Exception:
237
+ pass
238
+ return None
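The brightness and saturation math used for filtering can be isolated into a small helper (name is mine). Brightness is the common perceived-luminance weighting of the RGB channels; saturation is (max − min) / max:

```python
def color_stats(hex_color: str):
    """Perceived brightness (0-255) and saturation (0-1) of a '#rgb'/'#rrggbb' color."""
    c = hex_color.lstrip('#')
    if len(c) == 3:                        # expand shorthand like '#f80'
        c = ''.join(ch * 2 for ch in c)
    r, g, b = int(c[0:2], 16), int(c[2:4], 16), int(c[4:6], 16)
    brightness = (r * 299 + g * 587 + b * 114) / 1000
    max_c, min_c = max(r, g, b), min(r, g, b)
    saturation = (max_c - min_c) / max_c if max_c > 0 else 0
    return brightness, saturation
```

In the extractor above, colors with brightness outside (30, 240) are treated as backgrounds and dropped; the survivors are ranked by saturation so vibrant brand colors come first.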
239
+
240
+
241
+ def create_branded_note_tile(company_data: Dict, use_case: str, answers: List[Dict],
242
+ source_info: List[str], outlier_section: str = "") -> str:
243
+ """
244
+ Create a branded note tile matching golden demo format
245
+ Uses ThoughtSpot's native theme classes for proper rendering
246
+
247
+ Args:
248
+ company_data: Company information including logo_url and brand_colors
249
+ use_case: Use case name
250
+ answers: List of visualization answers
251
+ source_info: List of source information strings
252
+ outlier_section: HTML for outlier highlights section
253
+
254
+ Returns:
255
+ HTML string for the note tile
256
+ """
257
+ # Extract company info
258
+ company_name = company_data.get('name', 'Company')
259
+ industry = company_data.get('industry', 'business')
260
+ viz_count = len(answers)
261
+
262
+ # Build source info text
263
+ if source_info:
264
+ source_text = ', '.join(source_info)
265
+ else:
266
+ source_text = 'business data'
267
+
268
+ # Golden demo format: EXACT structure from "Weekly Updates" tile
269
+ # Multiple paragraphs with white text and bold teal highlights
270
+ note_tile = f"""<h2 class="theme-module__editor-h2" dir="ltr"><span style="color: rgb(255, 255, 255); white-space: pre-wrap;">{company_name}</span></h2><hr><p class="theme-module__editor-paragraph" dir="ltr"><span style="color: rgb(255, 255, 255); white-space: pre-wrap;">This liveboard features </span><b><strong class="theme-module__editor-text-bold" style="color: rgb(64, 193, 192); white-space: pre-wrap;">{viz_count} AI-generated visualizations</strong></b><span style="color: rgb(255, 255, 255); white-space: pre-wrap;"> analyzing {industry} performance</span></p><p class="theme-module__editor-paragraph"><br></p><p class="theme-module__editor-paragraph" dir="ltr"><span style="color: rgb(255, 255, 255); white-space: pre-wrap;">Insights powered by </span><b><strong class="theme-module__editor-text-bold" style="color: rgb(64, 193, 192); white-space: pre-wrap;">ThoughtSpot AI</strong></b><span style="color: rgb(255, 255, 255); white-space: pre-wrap;"> across </span><b><strong class="theme-module__editor-text-bold" style="color: rgb(64, 193, 192); white-space: pre-wrap;">{source_text}</strong></b></p><p class="theme-module__editor-paragraph"><br></p><p class="theme-module__editor-paragraph" dir="ltr"><span style="color: rgb(255, 255, 255); white-space: pre-wrap;">Created with </span><b><strong class="theme-module__editor-text-bold" style="color: rgb(64, 193, 192); white-space: pre-wrap;">Demo Wire</strong></b><span style="color: rgb(255, 255, 255); white-space: pre-wrap;"> automation platform</span></p>"""
271
+
272
+ return note_tile
273
+
274
+
275
  class QueryTranslator:
276
  """Translate natural language queries to ThoughtSpot search syntax"""
277
 
 
429
  self.model_columns = self._fetch_model_columns()
430
 
431
  # Use selected LLM model instead of hardcoded OpenAI
432
+ from main_research import MultiLLMResearcher
433
+ from demo_prep import map_llm_display_to_provider
434
  model_to_use = llm_model or 'claude-sonnet-4.5'
435
  provider_name, model_name_llm = map_llm_display_to_provider(model_to_use)
436
  self.llm_researcher = MultiLLMResearcher(provider=provider_name, model=model_name_llm)
 
1141
  'name': viz_config.get('name', 'Text'),
1142
  'description': viz_config.get('description', ''),
1143
  'tables': [{
1144
+ 'id': self.model_name, # Use model NAME for TML (not GUID!)
1145
  'name': self.model_name
1146
  }],
1147
  'text_tile': {
 
1262
  'name': viz_config['name'],
1263
  'description': viz_config.get('description', ''),
1264
  'tables': [{
1265
+ 'id': self.model_name, # Use model NAME for TML (not GUID!)
1266
  'name': self.model_name
1267
  }],
1268
  'search_query': search_query,
 
1347
  Guidelines:
1348
  - Mix chart types (don't use all the same type)
1349
  - Include at least 1-2 KPI charts for key metrics
1350
+ * IMPORTANT: For KPIs, ALWAYS include time_column and granularity (monthly/quarterly/yearly)
1351
+ * This enables sparklines and percent change comparisons (MoM/QoQ/YoY)
1352
+ * Example KPI: measure="Total_revenue", time_column="Order_date", granularity="monthly"
1353
  - Include trend analysis with LINE or AREA charts
1354
  - Include comparisons with COLUMN or BAR charts
1355
  - Use appropriate time filters for business context
 
1370
  try:
1371
  messages = [{"role": "user", "content": prompt}]
1372
  response_text = self.llm_researcher.make_request(messages, temperature=0.7, max_tokens=4000, stream=False)
1373
+
1374
+ # Debug: Check what we got back
1375
+ print(f"🔍 DEBUG: AI response type: {type(response_text)}")
1376
+ print(f"🔍 DEBUG: AI response length: {len(response_text) if response_text else 0}")
1377
+ if response_text:
1378
+ print(f"🔍 DEBUG: AI response first 200 chars: {response_text[:200]}")
1379
+ else:
1380
+ print(f"❌ ERROR: AI returned empty response!")
1381
+ return self._generate_fallback_visualizations(available_measures, date_columns)
1382
+
1383
+ # Strip markdown code fences if present
1384
+ response_text = response_text.strip()
1385
+ if response_text.startswith('```'):
1386
+ # Remove opening fence (```json or ```)
1387
+ lines = response_text.split('\n')
1388
+ response_text = '\n'.join(lines[1:])
1389
+ # Remove closing fence
1390
+ if response_text.endswith('```'):
1391
+ response_text = response_text[:-3].strip()
1392
+
1393
  result = json.loads(response_text)
1394
  return result.get('visualizations', [])
1395
  except Exception as e:
1396
  print(f"Error generating visualizations: {e}")
1397
+ print(f" Response text was: {response_text[:500] if response_text else 'None'}")
1398
  # Return fallback simple visualizations
1399
  return self._generate_fallback_visualizations(available_measures, date_columns)
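The fence-stripping logic added above is reusable on its own; a self-contained sketch of the same steps:

```python
def strip_code_fences(text: str) -> str:
    """Remove a leading ```lang fence and a trailing ``` from an LLM reply."""
    text = text.strip()
    if text.startswith('```'):
        text = '\n'.join(text.split('\n')[1:])   # drop the opening fence line
        if text.endswith('```'):
            text = text[:-3].strip()             # drop the closing fence
    return text
```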
1400
 
 
1407
  fallback_viz = []
1408
 
1409
  if measures and date_columns:
1410
+ # KPI of first measure with time-series for sparkline
1411
  fallback_viz.append({
1412
  'name': f'Total {measures[0]}',
1413
  'chart_type': 'KPI',
1414
+ 'measure': measures[0],
1415
+ 'time_column': date_columns[0],
1416
+ 'granularity': 'monthly'
1417
  })
1418
 
1419
  # Trend of first measure
 
1427
  })
1428
 
1429
  if len(measures) > 1:
1430
+ # Second measure KPI with time-series for sparkline
1431
  fallback_viz.append({
1432
  'name': f'Total {measures[1]}',
1433
  'chart_type': 'KPI',
1434
+ 'measure': measures[1],
1435
+ 'time_column': date_columns[0],
1436
+ 'granularity': 'quarterly'
1437
  })
1438
 
1439
  return fallback_viz
 
1497
  else:
1498
  viz_configs = []
1499
 
1500
+ # TEMPORARILY DISABLED - Text tiles causing TML import errors
1501
  # Add text tiles for context (like in sample liveboard)
1502
+ text_tiles = []
1503
+ # text_tiles = [
1504
+ # {
1505
+ # 'id': 'Text_1',
1506
+ # 'name': '📊 Dashboard Overview',
1507
+ # 'chart_type': 'TEXT',
1508
+ # 'text_content': f"## {company_data.get('name', 'Company')} Analytics\n\n{use_case} insights and metrics",
1509
+ # 'background_color': '#2E3D4D' # Dark blue-gray
1510
+ # },
1511
+ # {
1512
+ # 'id': 'Text_2',
1513
+ # 'name': 'Key Insights',
1514
+ # 'chart_type': 'TEXT',
1515
+ # 'text_content': "💡 **Key Performance Indicators**\n\nTrack trends and identify opportunities",
1516
+ # 'background_color': '#85016b' # Pink (from sample)
1517
+ # }
1518
+ # ]
1519
 
1520
  # Create text tile visualizations
1521
+ # for text_config in text_tiles:
1522
+ # viz_tml = self.create_visualization_tml(text_config)
1523
+ # visualizations.append(viz_tml)
1524
 
1525
  # Create visualization TML objects
1526
  if viz_configs:
 
1550
  }
1551
  }
1552
 
1553
+ # Convert to YAML format (TML is YAML, not JSON)
1554
+ return yaml.dump(liveboard_tml, default_flow_style=False, sort_keys=False)
1555
 
1556
  def _check_liveboard_errors(self, liveboard_id: str) -> Dict:
1557
  """
 
1643
  - error: Error message if failed
1644
  """
1645
  try:
1646
+ # Debug: Log the TML being sent
1647
+ print(f"🔍 DEBUG: Sending TML to ThoughtSpot")
1648
+ print(f"🔍 DEBUG: TML length: {len(liveboard_tml_json)}")
1649
+ print(f"🔍 DEBUG: TML first 500 chars:\n{liveboard_tml_json[:500]}")
1650
+
1651
  response = self.ts_client.session.post(
1652
  f"{self.ts_client.base_url}/api/rest/2.0/metadata/tml/import",
1653
  headers=self.ts_client.headers,
 
1798
  """
1799
  Create companion KPI question from outlier if KPI metric is specified.
1800
 
1801
+ For sparklines, KPI questions need time dimension!
1802
+ Example: "What is the total revenue by week over the last 8 quarters?"
1803
+
1804
  Args:
1805
  outlier: Dictionary with outlier metadata
1806
 
 
1812
 
1813
  kpi_metric = outlier.get('kpi_metric', '')
1814
  if kpi_metric:
1815
+ # Include time dimension for sparkline visualization
1816
+ return f"What is the total {kpi_metric} by week over the last 8 quarters?"
1817
 
1818
  # Fallback: extract first measure from viz_measure_types
1819
  measure_types = outlier.get('viz_measure_types', '')
1820
  if measure_types:
1821
  first_measure = measure_types.split(',')[0].strip()
1822
+ return f"What is the total {first_measure} by week over the last 8 quarters?"
1823
 
1824
  return None
1825
 
 
1844
  """
1845
  try:
1846
  # Use the selected LLM model
1847
+ from main_research import MultiLLMResearcher
1848
+ from demo_prep import map_llm_display_to_provider
1849
 
1850
  model_to_use = llm_model or 'claude-sonnet-4.5'
1851
  provider_name, model_name = map_llm_display_to_provider(model_to_use)
 
1858
  Company: {company_data.get('name', 'Unknown Company')}
1859
  Use Case: {use_case}
1860
 
1861
+ Create questions that will produce high-quality visualizations. Each question MUST:
1862
+ - Be specific and actionable
1863
+ - Include a visualization type (line chart, bar chart, stacked column chart, pie chart, table, or single number KPI)
1864
+ - Use "top N" with specific numbers (top 5, top 10, top 15)
1865
+ - Include time periods (last 12 months, this year vs last year, last quarter, past 18 months)
1866
+ - Specify sorting (ranked from highest to lowest, ordered by)
1867
+ - Use "exactly N" when appropriate to enforce result counts
1868
+
1869
+ REQUIRED FORMAT - Each question should be SHORT (under 80 chars preferred) but include viz type at the end:
1870
+
1871
+ KPI (Single Number with Sparkline):
1872
+ "Total revenue by week - last 8 quarters (KPI)"
1873
+ "Profit margin weekly over time (single number)"
1874
+ NOTE: KPIs need time dimension for sparklines and trend comparisons!
1875
+
1876
+ Time Trend:
1877
+ "Monthly sales - last 12 months (line chart)"
1878
+ "Revenue trend this year vs last year (line chart)"
1879
+
1880
+ Top N Ranking:
1881
+ "Top 10 products by revenue (bar chart)"
1882
+ "Top 15 customers by orders - last 18 months (bar chart)"
1883
+
1884
+ Comparison Chart:
1885
+ "Revenue by region and quarter (stacked column chart)"
1886
+ "Sales by customer segment and month (stacked column)"
1887
+
1888
+ Detailed Table:
1889
+ "Revenue, units sold, avg order value by month and category (table)"
1890
+
1891
+ Examples of PERFECT questions (SHORT with viz type):
1892
+ - "Total profit margin by week - last 8 quarters (KPI)"
1893
+ - "Monthly sales trend - 12 months (line chart)"
1894
+ - "Top 10 customers by revenue (bar chart)"
1895
+ - "Revenue by region and quarter (stacked column)"
1896
+ - "Profit margins by category (horizontal bar chart)"
1897
+ - "Sales detail by month and category (table)"
1898
+
1899
+ REQUIRED QUESTIONS - Include these in THIS ORDER:
1900
+ 1. "Total revenue by week - last 8 quarters (KPI)"
1901
+ 2. "Total revenue by month month over month (line chart)"
1902
+ 3. "Top 10 product_name by total revenue (bar chart)"
1903
+ 4. "Profit margin weekly week over week (line chart)"
1904
+ 5. "Profit margin by category_l1 (bar chart)"
1905
+ 6. "Product performance by category and brand (table)"
1906
+ 7. "Total revenue by region (geo map)"
1907
+ 8. "Top products by region last month (stacked bar chart)"
1908
+
1909
+ Then generate {num_questions - 8} additional creative questions if needed.
1910
+
1911
+ CRITICAL: Keep questions SHORT (50-80 chars) since they become visualization titles!
1912
 
1913
  Return ONLY a JSON object with this exact structure (no other text):
1914
  {{
 
1927
  if not content or content.strip() == '':
1928
  print(f" ⚠️ AI returned empty content")
1929
  raise ValueError("Empty AI response")
1930
+
1931
+ # Try to extract JSON if AI added extra text
1932
+ content = content.strip()
1933
+ if not content.startswith('{') and not content.startswith('['):
1934
+ # Look for JSON in the response
1935
+ json_match = re.search(r'(\{[\s\S]*\})', content)
1936
+ if json_match:
1937
+ content = json_match.group(1)
1938
+ else:
1939
+ print(f" ⚠️ No JSON found in AI response")
1940
+ raise ValueError("No JSON in response")
1941
 
1942
  result = json.loads(content)
1943
 
 
1954
 
1955
  except Exception as e:
1956
  print(f" ⚠️ AI question generation failed: {e}")
1957
+ print(f" 🔍 DEBUG: AI response was: {response_text[:500] if 'response_text' in locals() else 'N/A'}")
1958
+ # Fallback to generic questions that work for any dataset (not using use_case literally)
1959
1960
  return [
1961
+ "Total revenue (KPI)",
1962
+ "Monthly revenue trend - last 12 months (line chart)",
1963
+ "Top 10 products by revenue (bar chart)",
1964
+ "Revenue by category (bar chart)",
1965
+ "Sales by region and month (stacked column)",
1966
+ "Revenue and profit detail by category (table)"
1967
  ][:num_questions]
1968
 
1969
 
 
2205
  if mcp_count > 0:
2206
  source_info.append(f"⚡ {mcp_count} MCP-suggested")
2207
 
2208
+ # Create branded note tile with logo and colors
2209
+ note_tile = create_branded_note_tile(
2210
+ company_data=company_data,
2211
+ use_case=use_case,
2212
+ answers=answers,
2213
+ source_info=source_info,
2214
+ outlier_section=outlier_section
2215
+ )
 
2216
 
2217
  print(f"🎨 Creating liveboard: {final_liveboard_name}")
2218
  print(f"📊 Preparing to send {len(answers)} answers to createLiveboard")
 
2254
  result_text = liveboard_result.content[0].text
2255
  print(f"🔍 DEBUG: Parsing response text for URL...")
2256
 
2257
+ # Check if MCP returned an error
2258
+ if result_text.startswith('ERROR:'):
2259
+ error_msg = result_text.removeprefix('ERROR:').strip()
2260
+ print(f"❌ MCP createLiveboard failed: {error_msg}")
2261
+ return {
2262
+ 'success': False,
2263
+ 'error': f'MCP liveboard creation failed: {error_msg}',
2264
+ 'liveboard_name': final_liveboard_name
2265
+ }
2266
+
2267
  except Exception as create_error:
2268
  print(f"❌ createLiveboard failed: {str(create_error)}")
2269
  print(f" Error type: {type(create_error).__name__}")
 
2289
  print(f"🔗 URL: {liveboard_url}")
2290
  print(f"🆔 GUID: {liveboard_guid}")
2291
 
2292
+ # POST-MCP PROCESSING: Fix note tile layout
2293
+ # MCP creates note tiles with height: 8, but we want height: 2 like golden demo
2294
+ if liveboard_guid:
2295
+ print(f"🔧 Post-processing: Fixing note tile layout...")
2296
+ try:
2297
+ # Use ts_client session (already authenticated)
2298
+ ts_base_url = ts_client.base_url
2299
+
2300
+ # Export liveboard TML using authenticated session
2301
+ export_response = ts_client.session.post(
2302
+ f"{ts_base_url}/api/rest/2.0/metadata/tml/export",
2303
+ json={
2304
+ 'metadata': [{'identifier': liveboard_guid}],
2305
+ 'export_associated': False
2306
+ }
2307
+ )
2308
+
2309
+ if export_response.status_code == 200:
2310
+ tml_data = export_response.json()
2311
+ if tml_data and len(tml_data) > 0:
2312
+ # Parse YAML TML
2313
+ import yaml
2314
+ tml_str = tml_data[0].get('edoc', '')
2315
+ liveboard_tml = yaml.safe_load(tml_str)
2316
+
2317
+ # Fix Viz_1 layout (note tile)
2318
+ layout = liveboard_tml.get('liveboard', {}).get('layout', {})
2319
+ tiles = layout.get('tiles', [])
2320
+
2321
+ # Find and fix Viz_1 note tile dimensions (readable size)
2322
+ for tile in tiles:
2323
+ if tile.get('visualization_id') == 'Viz_1':
2324
+ # Make it readable: height 4, width 6 (half screen)
2325
+ tile['height'] = 4
2326
+ tile['width'] = 6
2327
+ print(f" ✓ Fixed Viz_1: height={tile['height']}, width={tile['width']}")
2328
+ break
2329
+
2330
+ # Replace Viz_1 content with company name using golden demo styling
2331
+ company_name = company_data.get('name', 'Company')
2332
+
2333
+ # Use the actual create_branded_note_tile function
2334
+ company_note = create_branded_note_tile(
2335
+ company_data=company_data,
2336
+ use_case=use_case or '',
2337
+ answers=answers,
2338
+ source_info=source_info,
2339
+ outlier_section=''
2340
+ )
2341
+
2342
+ # Find and update Viz_1 content in visualizations
2343
+ visualizations = liveboard_tml.get('liveboard', {}).get('visualizations', [])
2344
+ for viz in visualizations:
2345
+ if viz.get('id') == 'Viz_1':
2346
+ # Handle different TML structures
2347
+ if 'note_tile' in viz:
2348
+ # MCP creates note tiles with note_tile structure
2349
+ # note_tile is a DICT with html_parsed_string key
2350
+ if isinstance(viz['note_tile'], dict) and 'html_parsed_string' in viz['note_tile']:
2351
+ viz['note_tile']['html_parsed_string'] = company_note
2352
+ else:
2353
+ viz['note_tile'] = {'html_parsed_string': company_note}
2354
+ elif 'answer' in viz:
2355
+ viz['answer']['text_data'] = company_note
2356
+ viz['answer']['name'] = f'{company_name} Info'
2357
+ else:
2358
+ # Direct structure without wrappers
2359
+ if 'text_data' in viz:
2360
+ viz['text_data'] = company_note
2361
+ if 'name' in viz:
2362
+ viz['name'] = f'{company_name} Info'
2363
+ print(f" ✓ Replaced Viz_1 content with {company_name}")
2364
+ break
2365
+
2366
+ # Add style_properties to make note tile dark themed (like golden demo)
2367
+ style = liveboard_tml.get('liveboard', {}).get('style', {})
2368
+ if 'overrides' not in style:
2369
+ style['overrides'] = []
2370
+ liveboard_tml['liveboard']['style'] = style
2371
+
2372
+ # Check if Viz_1 already has style override
2373
+ viz_1_has_style = False
2374
+ for override in style['overrides']:
2375
+ if override.get('object_id') == 'Viz_1':
2376
+ viz_1_has_style = True
2377
+ # Ensure it has tile_brand_color for dark background
2378
+ if 'style_properties' not in override:
2379
+ override['style_properties'] = []
2380
+ has_brand_color = any(prop.get('name') == 'tile_brand_color' for prop in override['style_properties'])
2381
+ if not has_brand_color:
2382
+ override['style_properties'].append({
2383
+ 'name': 'tile_brand_color',
2384
+ 'value': 'TBC_I'
2385
+ })
2386
+ print(f" ✓ Added dark theme to Viz_1")
2387
+ break
2388
+
2389
+ if not viz_1_has_style:
2390
+ # Add new style override for Viz_1 with dark background
2391
+ style['overrides'].append({
2392
+ 'object_id': 'Viz_1',
2393
+ 'style_properties': [{
2394
+ 'name': 'tile_brand_color',
2395
+ 'value': 'TBC_I'
2396
+ }]
2397
+ })
2398
+ print(f" ✓ Added dark theme style to Viz_1")
2399
+
2400
+ # Re-import fixed TML using authenticated session
2401
+ import_response = ts_client.session.post(
2402
+ f"{ts_base_url}/api/rest/2.0/metadata/tml/import",
2403
+ json={
2404
+ 'metadata_tmls': [yaml.dump(liveboard_tml, default_flow_style=False, sort_keys=False)],
2405
+ 'import_policy': 'PARTIAL',
2406
+ 'create_new': False
2407
+ }
2408
+ )
2409
+
2410
+ if import_response.status_code == 200:
2411
+ print(f" ✅ Layout fixed successfully!")
2412
+ else:
2413
+ print(f" ⚠️ Could not re-import TML: {import_response.status_code}")
2414
+ else:
2415
+ print(f" ⚠️ No TML data in export response")
2416
+ else:
2417
+ print(f" ⚠️ Could not export TML: {export_response.status_code}")
2418
+ except Exception as fix_error:
2419
+ print(f" ⚠️ Layout fix failed: {str(fix_error)}")
2420
+ # Don't fail the whole operation if post-processing fails
2421
+
2422
  return {
2423
  'success': True,
2424
  'liveboard_name': final_liveboard_name,
main_research.py CHANGED
@@ -39,8 +39,23 @@ class Website:
39
  Enhanced Website object creation with better content extraction
40
  """
41
  self.url = url
 
 
 
42
  try:
43
- response = requests.get(url, headers=headers, timeout=10)
 
44
  response.raise_for_status()
45
  soup = BeautifulSoup(response.content, 'html.parser')
46
 
@@ -98,8 +113,47 @@ class Website:
98
  else:
99
  self.text = soup.get_text(separator="\n", strip=True)
100
 
101
  except Exception as e:
102
- print(f"Error processing website {url}: {str(e)}")
 
 
103
  self.title = "Error loading website"
104
  self.text = ""
105
  self.css_links = []
 
39
  Enhanced Website object creation with better content extraction
40
  """
41
  self.url = url
42
+ self.error_message = None
43
+ self.error_type = None
44
+
45
  try:
46
+ # Ensure URL has protocol
47
+ if not url.startswith(('http://', 'https://')):
48
+ url = 'https://' + url
49
+ self.url = url
50
+
51
+ # Try with SSL verification first, then without if it fails
52
+ try:
53
+ response = requests.get(url, headers=headers, timeout=30, verify=True, allow_redirects=True)
54
+ except requests.exceptions.SSLError:
55
+ # Retry without SSL verification for sites with certificate issues
56
+ print(f"⚠️ SSL verification failed for {url}, retrying without verification...")
57
+ response = requests.get(url, headers=headers, timeout=30, verify=False, allow_redirects=True)
58
+
59
  response.raise_for_status()
60
  soup = BeautifulSoup(response.content, 'html.parser')
61
 
 
113
  else:
114
  self.text = soup.get_text(separator="\n", strip=True)
115
 
116
+ except requests.exceptions.Timeout:
117
+ self.error_type = "timeout"
118
+ self.error_message = "Request timed out after 30 seconds. The website may be slow or unresponsive."
119
+ print(f"❌ Timeout accessing {url}: {self.error_message}")
120
+ self.title = "Error loading website - Timeout"
121
+ self.text = ""
122
+ self.css_links = []
123
+ self.inline_styles = []
124
+ self.logo_candidates = []
125
+ except requests.exceptions.ConnectionError as e:
126
+ self.error_type = "connection"
127
+ self.error_message = f"Could not connect to {url}. Check if the URL is correct and the site is accessible."
128
+ print(f"❌ Connection error accessing {url}: {str(e)}")
129
+ self.title = "Error loading website - Connection Failed"
130
+ self.text = ""
131
+ self.css_links = []
132
+ self.inline_styles = []
133
+ self.logo_candidates = []
134
+ except requests.exceptions.HTTPError as e:
135
+ self.error_type = "http"
136
+ # requests.Response is falsy for 4xx/5xx statuses, so compare against None explicitly
+ status_code = e.response.status_code if getattr(e, 'response', None) is not None else 'unknown'
137
+
138
+ if status_code == 404:
139
+ self.error_message = f"HTTP 404 - Page not found at {url}. The URL may be incorrect, the page may have been moved, or the site may require authentication."
140
+ elif status_code == 403:
141
+ self.error_message = "HTTP 403 - Access forbidden. The site may be blocking automated requests or require authentication."
142
+ elif status_code == 401:
143
+ self.error_message = "HTTP 401 - Authentication required. This site requires login credentials."
144
+ else:
145
+ self.error_message = f"HTTP error {status_code} accessing {url}. The server returned an error."
146
+
147
+ print(f"❌ HTTP error accessing {url}: {status_code} - {self.error_message}")
148
+ self.title = f"Error loading website - HTTP {status_code}"
149
+ self.text = ""
150
+ self.css_links = []
151
+ self.inline_styles = []
152
+ self.logo_candidates = []
153
  except Exception as e:
154
+ self.error_type = "unknown"
155
+ self.error_message = f"Unexpected error accessing {url}: {str(e)}"
156
+ print(f"❌ Error processing website {url}: {str(e)}")
157
  self.title = "Error loading website"
158
  self.text = ""
159
  self.css_links = []
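The status-code branching above amounts to a lookup table; a compact sketch (hypothetical helper, not part of the module):

```python
def describe_http_error(status_code, url):
    """Map common HTTP status codes to user-facing error messages."""
    messages = {
        404: f"HTTP 404 - Page not found at {url}. The URL may be incorrect or the page moved.",
        403: "HTTP 403 - Access forbidden. The site may be blocking automated requests.",
        401: "HTTP 401 - Authentication required. This site requires login credentials.",
    }
    return messages.get(status_code, f"HTTP error {status_code} accessing {url}.")

print(describe_http_error(404, "https://example.com/missing"))
```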
prompts.py CHANGED
@@ -256,8 +256,17 @@ PERFORMANCE REQUIREMENTS:
256
  - Generate data in batches of 100-500 rows per executemany call
257
  - Use efficient data generation (avoid expensive operations in loops)
258
 
259
  OUTLIER REQUIREMENTS:
260
  - Create 5-10 specific dramatic outliers (5-10x normal values)
 
261
  - Don't rely on random chance for interesting patterns
262
  - Include clear before/after patterns for demo storytelling
263
 
 
256
  - Generate data in batches of 100-500 rows per executemany call
257
  - Use efficient data generation (avoid expensive operations in loops)
258
 
259
+ REALISTIC DATA REQUIREMENTS:
260
+ - **CRITICAL**: Individual transaction amounts MUST be realistic (e.g., $20-$50,000)
261
+ - For e-commerce/retail: typical orders are $50-$2,000, with rare large orders up to $50,000
262
+ - For B2B: typical orders are $1,000-$50,000, with enterprise orders up to $500,000
263
+ - **NEVER create individual transactions over $1 million** - use many small transactions instead
264
+ - To reach high totals (e.g., $40B revenue), generate MANY transactions (10,000+), not huge individual amounts
265
+ - Example: Product with $40B total revenue = 100,000 transactions × $400,000 avg (NOT 5 transactions × $8B each!)
266
+
267
  OUTLIER REQUIREMENTS:
268
  - Create 5-10 specific dramatic outliers (5-10x normal values)
269
+ - Outliers should be relative to realistic baselines (e.g., $10,000 order vs $500 baseline)
270
  - Don't rely on random chance for interesting patterns
271
  - Include clear before/after patterns for demo storytelling
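The "many small transactions" rule the prompt enforces can be sanity-checked with a quick sketch: reach a large total by accumulating realistic order amounts (all numbers and the helper name are illustrative):

```python
import random

def generate_transactions(target_total, min_amt=50.0, max_amt=2000.0, seed=42):
    """Accumulate realistic order amounts until the target total is reached."""
    rng = random.Random(seed)
    amounts = []
    total = 0.0
    while total < target_total:
        amt = round(rng.uniform(min_amt, max_amt), 2)
        amounts.append(amt)
        total += amt
    return amounts

orders = generate_transactions(1_000_000)  # ~$1M from roughly a thousand realistic orders
print(len(orders), max(orders))
```

No single order exceeds the cap, so high totals come from volume, exactly as the prompt requires.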
272
 
requirements.txt CHANGED
@@ -21,5 +21,8 @@ supabase>=2.0.0 # PostgreSQL-based settings persistence
21
  faker>=20.1.0
22
  pandas>=2.0.0
23
 
 
 
 
24
  # Existing dependencies (if any)
25
  # Add any other dependencies your project needs
 
21
  faker>=20.1.0
22
  pandas>=2.0.0
23
 
24
+ # MCP - Model Context Protocol for ThoughtSpot
25
+ mcp>=1.0.0
26
+
27
  # Existing dependencies (if any)
28
  # Add any other dependencies your project needs
smart_chat.py ADDED
@@ -0,0 +1,225 @@
1
+ """
2
+ Smart Chat Interface for Data Adjustment
3
+
4
+ Uses liveboard context to intelligently understand requests
5
+ and bundle confirmations into smart prompts.
6
+ """
7
+
8
+ from dotenv import load_dotenv
9
+ import os
10
+ from smart_data_adjuster import SmartDataAdjuster
11
+
12
+ load_dotenv()
13
+
14
+
15
+ def chat_loop():
16
+ """Main smart chat loop"""
17
+
18
+ print("""
19
+ ╔════════════════════════════════════════════════════════════╗
20
+ ║ ║
21
+ ║ Smart Data Adjustment Chat ║
22
+ ║ ║
23
+ ╚════════════════════════════════════════════════════════════╝
24
+
25
+ This chat understands your liveboard and visualizations!
26
+
27
+ Commands:
28
+ - Type your adjustment request naturally
29
+ - Reference visualizations by number (e.g., "viz 2")
30
+ - "done" or "exit" to quit
31
+ - "help" for examples
32
+
33
+ Examples:
34
+ - "make 1080p Webcam 40 billion"
35
+ - "increase tablet revenue to 100B"
36
+ - "in viz 2, set laptops to 50B"
37
+ - "set profit margin to 30% for electronics"
38
+
39
+ """)
40
+
41
+ # Setup - can be overridden by environment or passed as arguments
42
+ database = os.getenv('SNOWFLAKE_DATABASE')
43
+ schema = os.getenv('DEMO_SCHEMA', "20251116_140933_AMAZO_SAL")
44
+ liveboard_guid = os.getenv('DEMO_LIVEBOARD_GUID', "9a30c9e4-efba-424a-8359-b16eb3a43ec3")
45
+
46
+ print(f"📊 Initializing...")
47
+ adjuster = SmartDataAdjuster(database, schema, liveboard_guid)
48
+ adjuster.connect()
49
+
50
+ # Load liveboard context
51
+ if not adjuster.load_liveboard_context():
52
+ print("❌ Failed to load liveboard context")
53
+ return
54
+
55
+ print("\n" + "="*80)
56
+ print("Ready! I understand your liveboard context.")
57
+ print("="*80 + "\n")
58
+
59
+ # Show numbered visualizations
60
+ print("📊 Available Visualizations:")
61
+ print("-" * 80)
62
+ for idx, viz in enumerate(adjuster.visualizations, start=1):
63
+ print(f" [{idx}] {viz['name']}")
64
+ cols = ', '.join(viz['columns'][:5]) # Show first 5 columns
65
+ if len(viz['columns']) > 5:
66
+ cols += f"... (+{len(viz['columns'])-5} more)"
67
+ print(f" Columns: {cols}")
68
+ print("-" * 80)
69
+ print("💡 TIP: You can reference visualizations by number (e.g., 'viz 2') or naturally!")
70
+ print("="*80 + "\n")
71
+
72
+ while True:
73
+ # Get user input
74
+ user_input = input("\n💬 You: ").strip()
75
+
76
+ if not user_input:
77
+ continue
78
+
79
+ # Check for exit
80
+ if user_input.lower() in ['done', 'exit', 'quit', 'bye']:
81
+ print("\n👋 Goodbye!")
82
+ break
83
+
84
+ # Check for help
85
+ if user_input.lower() == 'help':
86
+ print("""
87
+ 📚 Help - Smart Data Adjustment
88
+
89
+ I understand your liveboard context, so you can be natural:
90
+
91
+ Examples:
92
+ ✅ "make 1080p webcam 40 billion"
93
+ → I'll find the viz with products and TOTAL_REVENUE
94
+
95
+ ✅ "increase tablet to 100B"
96
+ → I'll match to the right product visualization
97
+
98
+ ✅ "in viz 2, set laptops to 50B"
99
+ → Use viz numbers to be specific!
100
+
101
+ ✅ "set profit margin to 30% for electronics"
102
+ → I'll find the viz with profit margin and categories
103
+
104
+ I'll show you what I understood and ask for one yes/no confirmation!
105
+ """)
106
+ continue
107
+
108
+ try:
109
+ # Check if user specified a viz number
110
+ viz_number = None
111
+ import re
112
+ viz_match = re.search(r'\bviz\s+(\d+)\b', user_input, re.IGNORECASE)
113
+ if viz_match:
114
+ viz_number = int(viz_match.group(1))
115
+ if viz_number < 1 or viz_number > len(adjuster.visualizations):
116
+ print(f"❌ Invalid viz number. Please use 1-{len(adjuster.visualizations)}")
117
+ continue
118
+ # Remove viz reference from request
119
+ user_input = re.sub(r'\bviz\s+\d+\b', '', user_input, flags=re.IGNORECASE).strip()
120
+ user_input = re.sub(r'^,?\s*', '', user_input) # Clean up leading comma/space
121
+
122
+ # Match request to visualization
123
+ print(f"\n🤔 Analyzing request...")
124
+ match = adjuster.match_request_to_viz(user_input)
125
+
126
+ if not match:
127
+ print("❌ I couldn't understand that request.")
128
+ print("💡 Try being more specific or type 'help' for examples")
129
+ continue
130
+
131
+ # If user specified viz number, override the match
132
+ if viz_number:
133
+ match['viz'] = adjuster.visualizations[viz_number - 1]
134
+ print(f" → Using specified viz: [{match['viz']['name']}]")
135
+ else:
136
+ print(f" → Matched to: [{match['viz']['name']}]")
137
+
138
+ print(f" → Entity: {match['entity_value']}")
139
+ print(f" → Confidence: {match['confidence'].upper()}")
140
+
141
+ # If low confidence, ask for confirmation
142
+ if match['confidence'] == 'low':
143
+ print(f"\n⚠️ I'm not very confident about this match.")
144
+ confirm_match = input(" Is this correct? [yes/no]: ").strip().lower()
145
+ if confirm_match not in ['yes', 'y']:
146
+ print("💡 Try rephrasing your request")
147
+ continue
148
+
149
+ # Get current value
150
+ print(f"\n📊 Querying current data...")
151
+ current_value = adjuster.get_current_value(
152
+ match['entity_value'],
153
+ match['metric_column']
154
+ )
155
+
156
+ if current_value == 0:
157
+ print(f"⚠️ No data found for '{match['entity_value']}'")
158
+ print("💡 Check the spelling or try a different entity")
159
+ continue
160
+
161
+ # Calculate target value if percentage
162
+ target_value = match.get('target_value')
163
+ if match.get('is_percentage'):
164
+ percentage = match.get('percentage', 0)
165
+ target_value = current_value * (1 + percentage / 100)
166
+ print(f" 💡 {percentage:+.1f}% change = ${current_value:,.0f} → ${target_value:,.0f}")
167
+
168
+ # Generate strategy
169
+ strategy = adjuster.generate_strategy(
170
+ match['entity_value'],
171
+ match['metric_column'],
172
+ current_value,
173
+ target_value
174
+ )
175
+
176
+ # Present smart bundled confirmation
177
+ # Update match with calculated target if percentage
178
+ if match.get('is_percentage'):
179
+ match['target_value'] = target_value
180
+ confirmation = adjuster.present_smart_confirmation(match, current_value, strategy)
181
+ print(confirmation)
182
+
183
+ # Get user decision
184
+ response = input("Run SQL? [yes/no]: ").strip().lower()
185
+
186
+ if response not in ['yes', 'y']:
187
+ print("\n❌ Cancelled - no changes made")
188
+ print("💡 You can try a different adjustment or rephrase")
189
+ continue
190
+
191
+ # Execute
192
+ print(f"\n⚙️ Executing SQL...")
193
+ result = adjuster.execute_sql(strategy['sql'])
194
+
195
+ if result['success']:
196
+ print(f"\n✅ SUCCESS! Updated {result['rows_affected']} rows")
197
+ print(f"🔄 Refresh your ThoughtSpot liveboard to see changes")
198
+ print(f" URL: https://se-thoughtspot-cloud.thoughtspot.cloud/#/pinboard/{liveboard_guid}")
199
+ else:
200
+ print(f"\n❌ FAILED: {result['error']}")
201
+
202
+ # Check for common errors
203
+ if 'out of representable range' in result['error'].lower():
204
+ print("\n💡 The number is too large for the database column.")
205
+ print(" Try a smaller target value (e.g., 40B instead of 50B)")
206
+
207
+ except KeyboardInterrupt:
208
+ print("\n\n⚠️ Interrupted")
209
+ break
210
+ except Exception as e:
211
+ print(f"\n❌ Error: {e}")
212
+ import traceback
213
+ print(traceback.format_exc())
214
+
215
+ # Cleanup
216
+ adjuster.close()
217
+ print("\n✅ Connection closed")
218
+
219
+
220
+ if __name__ == "__main__":
221
+ try:
222
+ chat_loop()
223
+ except KeyboardInterrupt:
224
+ print("\n\n👋 Goodbye!")
225
+
smart_data_adjuster.py ADDED
@@ -0,0 +1,604 @@
1
+ """
2
+ Smart Conversational Data Adjuster
3
+
4
+ Understands liveboard context and asks smart qualifying questions.
5
+ Bundles confirmations into one step when confident.
6
+ """
7
+
8
+ import os
9
+ from typing import Dict, List, Optional, Tuple
10
+ from openai import OpenAI
11
+ from snowflake_auth import get_snowflake_connection
12
+ from thoughtspot_deployer import ThoughtSpotDeployer
13
+ import json
14
+
15
+
16
+ class SmartDataAdjuster:
17
+ """Smart adjuster with liveboard context and conversational flow"""
18
+
19
+ def __init__(self, database: str, schema: str, liveboard_guid: str):
20
+ self.database = database
21
+ self.schema = schema
22
+ self.liveboard_guid = liveboard_guid
23
+ self.conn = None
24
+ self.ts_client = None
25
+ self.openai_client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
26
+
27
+ # Context about the liveboard
28
+ self.liveboard_name = None
29
+ self.visualizations = [] # List of viz metadata
30
+
31
+ def connect(self):
32
+ """Connect to Snowflake and ThoughtSpot"""
33
+ # Snowflake
34
+ self.conn = get_snowflake_connection()
35
+ cursor = self.conn.cursor()
36
+ cursor.execute(f"USE DATABASE {self.database}")
37
+ cursor.execute(f'USE SCHEMA "{self.schema}"')
38
+
39
+ # ThoughtSpot
40
+ self.ts_client = ThoughtSpotDeployer()
41
+ self.ts_client.authenticate()
42
+
43
+ print(f"✅ Connected to {self.database}.{self.schema}")
44
+ print(f"✅ Connected to ThoughtSpot")
45
+
46
+ def load_liveboard_context(self):
47
+ """Load liveboard metadata and visualization details"""
48
+ print(f"\n📊 Loading liveboard context...")
49
+
50
+ # Get liveboard metadata
51
+ response = self.ts_client.session.post(
52
+ f"{self.ts_client.base_url}/api/rest/2.0/metadata/search",
53
+ json={
54
+ "metadata": [{"type": "LIVEBOARD", "identifier": self.liveboard_guid}],
55
+ "include_visualization_headers": True
56
+ }
57
+ )
58
+
59
+ if response.status_code != 200:
60
+ print(f"❌ Failed to load liveboard")
61
+ return False
62
+
63
+ data = response.json()[0]
64
+ self.liveboard_name = data.get('metadata_name', 'Unknown Liveboard')
65
+
66
+ viz_headers = data.get('visualization_headers', [])
67
+
68
+ print(f" Liveboard: {self.liveboard_name}")
69
+ print(f" Visualizations: {len(viz_headers)}")
70
+
71
+ # Extract viz details
72
+ for viz in viz_headers:
73
+ name = viz.get('name', '')
74
+ viz_id = viz.get('id')
75
+
76
+ # Skip note tiles
77
+ if 'note-tile' in name.lower():
78
+ continue
79
+
80
+ # Parse the name to extract columns used
81
+ # Names like "top 10 product_name by total revenue"
82
+ viz_info = {
83
+ 'id': viz_id,
84
+ 'name': name,
85
+ 'columns': self._extract_columns_from_name(name)
86
+ }
87
+
88
+ self.visualizations.append(viz_info)
89
+ print(f" - {name}")
90
+
91
+ return True
92
+
93
+ def _extract_columns_from_name(self, name: str) -> List[str]:
94
+ """Extract column names from visualization name"""
95
+ # Simple heuristic: look for column-like words
96
+ # e.g., "top 10 product_name by total revenue" → [product_name, total_revenue]
97
+
98
+ columns = []
99
+ name_lower = name.lower()
100
+
101
+ # Common column patterns
102
+ if 'product_name' in name_lower:
103
+ columns.append('PRODUCT_NAME')
104
+ if 'total revenue' in name_lower or 'total_revenue' in name_lower:
105
+ columns.append('TOTAL_AMOUNT')
106
+ if 'quantity' in name_lower:
107
+ columns.append('QUANTITY_SOLD')
108
+ if 'profit margin' in name_lower or 'profit_margin' in name_lower:
109
+ columns.append('PROFIT_MARGIN')
110
+ if 'customer_segment' in name_lower:
111
+ columns.append('CUSTOMER_SEGMENT')
112
+ if 'category' in name_lower:
113
+ columns.append('CATEGORY')
114
+ if 'seller' in name_lower:
115
+ columns.append('SELLER_NAME')
116
+
117
+ return columns
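The keyword heuristic above reduces to a substring-to-column lookup table; a minimal standalone sketch using the same mappings as the method:

```python
def extract_columns_from_name(name):
    """Infer physical column names from a visualization title via keyword matching."""
    mappings = [
        ('product_name', 'PRODUCT_NAME'),
        ('total revenue', 'TOTAL_AMOUNT'),
        ('total_revenue', 'TOTAL_AMOUNT'),
        ('quantity', 'QUANTITY_SOLD'),
        ('profit margin', 'PROFIT_MARGIN'),
        ('customer_segment', 'CUSTOMER_SEGMENT'),
        ('category', 'CATEGORY'),
        ('seller', 'SELLER_NAME'),
    ]
    name_lower = name.lower()
    columns = []
    for keyword, column in mappings:
        if keyword in name_lower and column not in columns:
            columns.append(column)
    return columns

extract_columns_from_name("top 10 product_name by total revenue")
# → ['PRODUCT_NAME', 'TOTAL_AMOUNT']
```

A table keeps the keyword-to-column pairs in one place, so adding a new column is a one-line change instead of another `if` branch.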
118
+
119
+ def _simple_parse(self, message: str) -> Optional[Dict]:
120
+ """Simple regex-based parser for common patterns like 'decrease phone case by 10%' or 'decrease seller acme by 10%'"""
121
+ import re
122
+
123
+ print(f"🔍 DEBUG _simple_parse: message='{message}'")
124
+ msg_lower = message.lower()
125
+
126
+ # Detect if user specified a viz number
127
+ viz_match = re.search(r'(?:viz|visualization)\s+(\d+)', msg_lower)
128
+ viz_number = int(viz_match.group(1)) if viz_match else None
129
+
130
+ # Detect entity type (product or seller)
131
+ # Check explicit "seller" keyword, or infer from viz number
132
+ is_seller = 'seller' in msg_lower
133
+
134
+ # If viz number is specified, check if it's a seller viz
135
+ if viz_number and not is_seller and len(self.visualizations) >= viz_number:
136
+ viz = self.visualizations[viz_number - 1]
137
+ if 'seller' in viz['name'].lower():
138
+ is_seller = True
139
+
140
+ entity_type = 'seller' if is_seller else 'product'
141
+
142
+ # Extract entity name - try quotes first, then words after action verbs
143
+ entity_match = re.search(r'"([^"]+)"', message)
144
+ if not entity_match:
145
+ # Try to find entity name after action words, but stop before numbers
146
+ # Include "seller" keyword if present
147
+ if is_seller:
148
+ # Match: "decrease seller home depot by 20%"
149
+ action_pattern = r'(?:decrease|increase|make|set|adjust)\s+(?:the\s+)?(?:profit\s+margin\s+for\s+)?seller\s+([a-z\s]+?)(?:\s+\d|\s+by|\s+to|\s*$)'
150
+ else:
151
+ # Match: "decrease bluetooth speaker by 10%" OR "decrease the revenue for bluetooth speaker by 10%"
152
+ action_pattern = r'(?:decrease|increase|make|set|adjust)\s+(?:the\s+)?(?:revenue\s+for\s+|profit\s+margin\s+for\s+)?([a-z\s]+?)(?:\s+\d|\s+by|\s+to|\s*$)'
153
+ entity_match = re.search(action_pattern, msg_lower, re.I)
154
+
155
+ if not entity_match:
156
+ return None
157
+
158
+ entity = entity_match.group(1).strip()
159
+
160
+ # Find percentage or absolute value
161
+ is_percentage = False
162
+ percentage = None
163
+ target_value = None
164
+
165
+ # Look for percentage like "by 10%" or "10%"
166
+ pct_match = re.search(r'by\s+(\d+\.?\d*)%|(\d+\.?\d*)%', msg_lower)
167
+ if pct_match:
168
+ is_percentage = True
169
+ percentage = float(pct_match.group(1) or pct_match.group(2))
170
+ # Check if it's decrease or increase
171
+ if 'decrease' in msg_lower or 'reduce' in msg_lower or 'lower' in msg_lower:
172
+ percentage = -percentage
173
+ else:
174
+ # Look for absolute value like "50B", "50 billion", "1000000"
175
+ val_match = re.search(r'(\d+\.?\d*)\s*([bBmMkK]|billion|million|thousand)?', message)
176
+ if val_match:
177
+ num = float(val_match.group(1))
178
+ unit = (val_match.group(2) or '').lower()
179
+ if unit in ['b', 'billion']:
180
+ target_value = num * 1_000_000_000
181
+ elif unit in ['m', 'million']:
182
+ target_value = num * 1_000_000
183
+ elif unit in ['k', 'thousand']:
184
+ target_value = num * 1_000
185
+ else:
186
+ target_value = num
187
+
188
+ if not is_percentage and not target_value:
189
+ return None
190
+
191
+ # Find appropriate viz and determine metric column
192
+ # If user specified viz number, use it; otherwise search for matching viz
193
+ if viz_number:
194
+ viz_num = viz_number
195
+ elif is_seller:
196
+ # Look for seller-related viz
197
+ viz_num = 1 # Default
198
+ for i, viz in enumerate(self.visualizations, 1):
199
+ if 'seller' in viz['name'].lower():
200
+ viz_num = i
201
+ break
202
+ else:
203
+ # Look for product-related viz
204
+ viz_num = 1 # Default
205
+ for i, viz in enumerate(self.visualizations, 1):
206
+ if 'product' in viz['name'].lower():
207
+ viz_num = i
208
+ break
209
+
210
+ # Determine metric based on entity type
211
+ if is_seller:
212
+ metric_column = 'PROFIT_MARGIN' # Sellers typically use profit margin
213
+ else:
214
+ metric_column = 'TOTAL_AMOUNT' # Products typically use revenue (column is TOTAL_AMOUNT)
215
+
216
+ result = {
217
+ 'viz_number': viz_num,
218
+ 'entity_value': entity,
219
+ 'entity_type': entity_type,
220
+ 'metric_column': metric_column,
221
+ 'target_value': target_value,
222
+ 'is_percentage': is_percentage,
223
+ 'percentage': percentage,
224
+ 'confidence': 'medium',
225
+ 'reasoning': f'Simple {entity_type} parse'
226
+ }
227
+
228
+ print(f"🔍 DEBUG _simple_parse result: entity='{entity}', percentage={percentage}, metric={metric_column}, viz_num={viz_num}")
229
+
230
+ if 1 <= viz_num <= len(self.visualizations):
231
+ result['viz'] = self.visualizations[viz_num - 1]
232
+
233
+ return result
234
+
235
+ def match_request_to_viz(self, user_request: str) -> Optional[Dict]:
236
+ """
237
+ Use AI to match user request to specific visualization
238
+
239
+ Returns:
240
+ {
241
+ 'viz': {...},
242
+ 'confidence': 'high'|'medium'|'low',
243
+ 'entity_value': '1080p Webcam',
244
+ 'metric_column': 'TOTAL_AMOUNT',
245
+ 'target_value': 50000000000
246
+ }
247
+ """
248
+ # Try simple parse first (faster, no AI needed)
249
+ simple_result = self._simple_parse(user_request)
250
+ if simple_result:
251
+ print(f" ⚡ Quick parse: '{simple_result['entity_value']}' → {simple_result.get('percentage', simple_result.get('target_value'))}")
252
+ return simple_result
253
+
254
+ viz_list = "\n".join([
255
+ f"{i+1}. {v['name']} (columns: {', '.join(v['columns'])})"
256
+ for i, v in enumerate(self.visualizations)
257
+ ])
258
+
259
+ prompt = f"""User is looking at a ThoughtSpot liveboard and wants to adjust data.
260
+
261
+ User request: "{user_request}"
262
+
263
+ Available visualizations on the liveboard:
264
+ {viz_list}
265
+
266
+ Analyze the request and determine:
267
+ 1. Which visualization (by number) is the user referring to?
268
+ 2. What entity/product are they talking about? (e.g., "1080p Webcam")
269
+ 3. What metric should be adjusted? (TOTAL_AMOUNT, QUANTITY_SOLD, PROFIT_MARGIN)
270
+ 4. What's the target value?
271
+ - If absolute value (e.g., "40B", "100M"): convert to number (40B = 40000000000)
272
+ - If percentage increase (e.g., "increase by 20%"): set is_percentage=true and percentage=20
273
+ 5. How confident are you? (high/medium/low)
274
+
275
+ Return JSON:
276
+ {{
277
+ "viz_number": 1,
278
+ "entity_value": "1080p Webcam",
279
+ "metric_column": "TOTAL_AMOUNT",
280
+ "target_value": 50000000000,
281
+ "is_percentage": false,
282
+ "percentage": null,
283
+ "confidence": "high",
284
+ "reasoning": "User mentioned product and the top 10 products viz uses PRODUCT_NAME and TOTAL_AMOUNT"
285
+ }}
286
+
287
+ OR for percentage increase:
288
+ {{
289
+ "viz_number": 1,
290
+ "entity_value": "1080p Webcam",
291
+ "metric_column": "TOTAL_AMOUNT",
292
+ "target_value": null,
293
+ "is_percentage": true,
294
+ "percentage": 20,
295
+ "confidence": "high",
296
+ "reasoning": "User wants to increase revenue by 20%"
297
+ }}
298
+
299
+ CRITICAL: target_value and percentage must be numbers, never strings.
300
+ If unsure about ANY field, set confidence to "low" or "medium".
301
+ """
302
+
303
+ response = self.openai_client.chat.completions.create(
304
+ model="gpt-4o",
305
+ messages=[{"role": "user", "content": prompt}],
306
+ temperature=0
307
+ )
308
+
309
+ content = response.choices[0].message.content
310
+ if content.startswith('```'):
311
+ lines = content.split('\n')
312
+ content = '\n'.join(lines[1:-1])
313
+
314
+ try:
315
+ result = json.loads(content)
316
+
317
+ # Add the actual viz object
318
+ viz_num = result.get('viz_number', 1)
319
+ if 1 <= viz_num <= len(self.visualizations):
320
+ result['viz'] = self.visualizations[viz_num - 1]
321
+
322
+ return result
323
+ except Exception:
324
+ return None
325
+
326
+ def _find_closest_entity(self, entity_value: str, entity_type: str = 'product') -> Optional[str]:
327
+ """Find the closest matching entity name (product or seller) in the database"""
328
+ cursor = self.conn.cursor()
329
+
330
+ # Get all entity names based on type
331
+ if entity_type == 'seller':
332
+ cursor.execute(f"""
333
+ SELECT DISTINCT SELLER_NAME
334
+ FROM {self.database}."{self.schema}".SELLERS
335
+ """)
336
+ else: # product
337
+ cursor.execute(f"""
338
+ SELECT DISTINCT PRODUCT_NAME
339
+ FROM {self.database}."{self.schema}".PRODUCTS
340
+ """)
341
+
342
+ entities = [row[0] for row in cursor.fetchall()]
343
+
344
+ # Normalize: lowercase and remove spaces for comparison
345
+ def normalize(s):
346
+ return s.lower().replace(' ', '').replace('-', '').replace('_', '')
347
+
348
+ entity_normalized = normalize(entity_value)
349
+
350
+ # First try exact case-insensitive match
351
+ entity_lower = entity_value.lower()
352
+ for entity in entities:
353
+ if entity.lower() == entity_lower:
354
+ return entity
355
+
356
+ # Try normalized match (ignoring spaces/dashes)
357
+ for entity in entities:
358
+ if normalize(entity) == entity_normalized:
359
+ return entity
360
+
361
+ # Try partial match (contains)
362
+ for entity in entities:
363
+ if entity_lower in entity.lower() or entity.lower() in entity_lower:
364
+ return entity
365
+
366
+ # Try normalized partial match
367
+ for entity in entities:
368
+ if entity_normalized in normalize(entity) or normalize(entity) in entity_normalized:
369
+ return entity
370
+
371
+ return None
372
+
373
+ def _find_closest_product(self, entity_value: str) -> Optional[str]:
374
+ """Backward compatibility wrapper"""
375
+ return self._find_closest_entity(entity_value, 'product')
376
+
377
+ def get_current_value(self, entity_value: str, metric_column: str, entity_type: str = 'product') -> float:
378
+ """Query current value from Snowflake"""
379
+ cursor = self.conn.cursor()
380
+
381
+ # Find closest matching entity
382
+ matched_entity = self._find_closest_entity(entity_value, entity_type)
383
+
384
+ if not matched_entity:
385
+ print(f"⚠️ Could not find {entity_type} matching '{entity_value}'")
386
+ return 0
387
+
388
+ if matched_entity.lower() != entity_value.lower():
389
+ print(f" 📝 Using closest match: '{matched_entity}'")
390
+
391
+ # Build query based on entity type
392
+ if entity_type == 'seller':
393
+ query = f"""
394
+ SELECT AVG(st.{metric_column})
395
+ FROM {self.database}."{self.schema}".SALES_TRANSACTIONS st
396
+ JOIN {self.database}."{self.schema}".SELLERS s ON st.SELLER_ID = s.SELLER_ID
397
+ WHERE LOWER(s.SELLER_NAME) = LOWER('{matched_entity}')
398
+ """
399
+ else: # product
400
+ query = f"""
401
+ SELECT SUM(st.{metric_column})
402
+ FROM {self.database}."{self.schema}".SALES_TRANSACTIONS st
403
+ JOIN {self.database}."{self.schema}".PRODUCTS p ON st.PRODUCT_ID = p.PRODUCT_ID
404
+ WHERE LOWER(p.PRODUCT_NAME) = LOWER('{matched_entity}')
405
+ """
406
+
407
+ cursor.execute(query)
408
+ result = cursor.fetchone()
409
+ return float(result[0]) if result and result[0] else 0
410
+
411
+ def generate_strategy(self, entity_value: str, metric_column: str, current_value: float, target_value: float = None, percentage: float = None, entity_type: str = 'product') -> Dict:
412
+ """Generate the best strategy (default to Strategy A for now)"""
413
+
414
+ print(f"🔍 DEBUG generate_strategy: entity='{entity_value}', metric={metric_column}, percentage={percentage}, current={current_value}")
415
+
416
+ # Find the actual entity name
417
+ matched_entity = self._find_closest_entity(entity_value, entity_type)
418
+ if not matched_entity:
419
+ matched_entity = entity_value # Fallback
420
+
421
+ print(f"🔍 DEBUG matched_entity: '{matched_entity}'")
422
+
423
+ # Calculate multiplier
424
+ if percentage is not None:
425
+ # Percentage-based: "decrease by 10%" means multiply by 0.9
426
+ multiplier = 1 + (percentage / 100)
427
+ percentage_change = percentage
428
+ target_value = current_value * multiplier
429
+ else:
430
+ # Absolute target value
431
+ multiplier = target_value / current_value if current_value > 0 else 1
432
+ percentage_change = (multiplier - 1) * 100
433
+
434
+ # Build SQL based on entity type
435
+ if entity_type == 'seller':
436
+ sql = f"""UPDATE {self.database}."{self.schema}".SALES_TRANSACTIONS
437
+ SET {metric_column} = {metric_column} * {multiplier:.6f}
438
+ WHERE SELLER_ID IN (
439
+ SELECT SELLER_ID FROM {self.database}."{self.schema}".SELLERS
440
+ WHERE LOWER(SELLER_NAME) = LOWER('{matched_entity}')
441
+ )"""
442
+ else: # product
443
+ sql = f"""UPDATE {self.database}."{self.schema}".SALES_TRANSACTIONS
444
+ SET {metric_column} = {metric_column} * {multiplier:.6f}
445
+ WHERE PRODUCT_ID IN (
446
+ SELECT PRODUCT_ID FROM {self.database}."{self.schema}".PRODUCTS
447
+ WHERE LOWER(PRODUCT_NAME) = LOWER('{matched_entity}')
448
+ )"""
449
+
450
+ print(f"🔍 DEBUG SQL generated:\n{sql}")
451
+
452
+ return {
453
+ 'id': 'A',
454
+ 'name': 'Distribute Across All Transactions',
455
+ 'description': f"Multiply all transactions by {multiplier:.2f}x ({percentage_change:+.1f}%)",
456
+ 'sql': sql,
457
+ 'matched_product': matched_entity, # Keep key name for compatibility
458
+ 'target_value': target_value
459
+ }
460
+
461
+ def present_smart_confirmation(self, match: Dict, current_value: float, strategy: Dict) -> str:
462
+ """Create a bundled confirmation prompt"""
463
+
464
+ viz_name = match['viz']['name']
465
+ entity = match['entity_value']
466
+ matched_product = strategy.get('matched_product', entity)
467
+ metric = match['metric_column']
468
+ target = strategy.get('target_value', match.get('target_value')) # Use calculated target from strategy
469
+ confidence = match['confidence']
470
+
471
+ # Show if we fuzzy matched
472
+ entity_display = entity
473
+ if matched_product.lower() != entity.lower():
474
+ entity_display = f"{entity} → '{matched_product}'"
475
+
476
+ confirmation = f"""
477
+ {'='*80}
478
+ 📋 SMART CONFIRMATION
479
+ {'='*80}
480
+
481
+ Liveboard: {self.liveboard_name}
482
+ Visualization: [{viz_name}]
483
+
484
+ Adjustment:
485
+ Entity: {entity_display}
486
+ Metric: {metric}
487
+ Current Value: ${current_value:,.0f}
488
+ Target Value: ${target:,.0f}
489
+ Change: ${target - current_value:+,.0f} ({((target/current_value - 1)*100 if current_value else 0):+.1f}%)
490
+
491
+ Strategy: {strategy['name']}
492
+ {strategy['description']}
493
+
494
+ Confidence: {confidence.upper()}
495
+ {match.get('reasoning', '')}
496
+
497
+ SQL Preview:
498
+ {strategy['sql'][:200]}...
499
+
500
+ """
501
+
502
+ if confidence == 'low':
503
+ confirmation += "\n⚠️ Low confidence - please verify this is correct\n"
504
+
505
+ confirmation += "\n" + "="*80 + "\n"
506
+ return confirmation
507
+
508
+ def execute_sql(self, sql: str) -> Dict:
509
+ """Execute the SQL update"""
510
+ print(f"🔍 DEBUG execute_sql: About to execute SQL")
511
+ print(f"SQL:\n{sql}")
512
+ cursor = self.conn.cursor()
513
+
514
+ try:
515
+ cursor.execute(sql)
516
+ rows_affected = cursor.rowcount
517
+ self.conn.commit()
518
+ print(f"✅ SQL executed successfully, rows affected: {rows_affected}")
519
+
520
+ return {
521
+ 'success': True,
522
+ 'rows_affected': rows_affected
523
+ }
524
+ except Exception as e:
525
+ self.conn.rollback()
526
+ return {
527
+ 'success': False,
528
+ 'error': str(e)
529
+ }
530
+
531
+ def close(self):
532
+ """Close connections"""
533
+ if self.conn:
534
+ self.conn.close()
535
+
536
+
537
+ def test_smart_adjuster():
538
+ """Test the smart adjuster"""
539
+ from dotenv import load_dotenv
540
+ load_dotenv()
541
+
542
+ print("""
543
+ ╔════════════════════════════════════════════════════════════╗
544
+ ║ ║
545
+ ║ Smart Data Adjuster Test ║
546
+ ║ ║
547
+ ╚════════════════════════════════════════════════════════════╝
548
+ """)
549
+
550
+ adjuster = SmartDataAdjuster(
551
+ database=os.getenv('SNOWFLAKE_DATABASE'),
552
+ schema="20251116_140933_AMAZO_SAL",
553
+ liveboard_guid="9a30c9e4-efba-424a-8359-b16eb3a43ec3"
554
+ )
555
+
556
+ adjuster.connect()
557
+ adjuster.load_liveboard_context()
558
+
559
+ # Test request
560
+ user_request = "make 1080p Webcam 50 billion"
561
+ print(f"\n💬 User: \"{user_request}\"")
562
+
563
+ # Match to viz
564
+ print(f"\n🤔 Analyzing request...")
565
+ match = adjuster.match_request_to_viz(user_request)
566
+
567
+ if not match:
568
+ print("❌ Could not understand request")
569
+ return
570
+
571
+ # Get current value
572
+ current = adjuster.get_current_value(match['entity_value'], match['metric_column'])
573
+
574
+ # Generate strategy (handle both absolute and percentage)
575
+ strategy = adjuster.generate_strategy(
576
+ match['entity_value'],
577
+ match['metric_column'],
578
+ current,
579
+ target_value=match.get('target_value'),
580
+ percentage=match.get('percentage')
581
+ )
582
+
583
+ # Present confirmation
584
+ confirmation = adjuster.present_smart_confirmation(match, current, strategy)
585
+ print(confirmation)
586
+
587
+ # Ask for confirmation
588
+ response = input("Run SQL? [yes/no]: ").strip().lower()
589
+
590
+ if response in ['yes', 'y']:
591
+ result = adjuster.execute_sql(strategy['sql'])
592
+ if result['success']:
593
+ print(f"\n✅ Success! Updated {result['rows_affected']} rows")
594
+ else:
595
+ print(f"\n❌ Failed: {result['error']}")
596
+ else:
597
+ print("\n❌ Cancelled")
598
+
599
+ adjuster.close()
600
+
601
+
602
+ if __name__ == "__main__":
603
+ test_smart_adjuster()
604
+
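The `_find_closest_entity` method above resolves user-typed names through a four-step cascade: exact case-insensitive match, normalized match (spaces, dashes, and underscores stripped), substring match, then normalized substring match. A minimal standalone sketch of that cascade, using a hypothetical in-memory product list in place of the Snowflake cursor:

```python
# Sketch of the _find_closest_entity matching cascade; the entity list is a
# stand-in for the DISTINCT names fetched from Snowflake.
from typing import List, Optional

def find_closest_entity(entity_value: str, entities: List[str]) -> Optional[str]:
    def normalize(s: str) -> str:
        return s.lower().replace(' ', '').replace('-', '').replace('_', '')

    entity_lower = entity_value.lower()
    entity_normalized = normalize(entity_value)

    # 1. Exact case-insensitive match
    for entity in entities:
        if entity.lower() == entity_lower:
            return entity
    # 2. Normalized match (ignores spaces/dashes/underscores)
    for entity in entities:
        if normalize(entity) == entity_normalized:
            return entity
    # 3. Partial (substring) match in either direction
    for entity in entities:
        if entity_lower in entity.lower() or entity.lower() in entity_lower:
            return entity
    # 4. Normalized partial match
    for entity in entities:
        if entity_normalized in normalize(entity) or normalize(entity) in entity_normalized:
            return entity
    return None

products = ['Bluetooth Speaker', '1080p Webcam', 'Phone Case']
print(find_closest_entity('bluetooth-speaker', products))  # Bluetooth Speaker
print(find_closest_entity('webcam', products))             # 1080p Webcam
```

Note the ordering matters: the earlier, stricter passes win before the looser substring passes can return a different entity.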
supabase_client.py CHANGED
@@ -355,7 +355,7 @@ def load_gradio_settings(email: str) -> Dict[str, Any]:
355
  "default_data_volume": "Medium (10K rows)",
356
  "default_warehouse": "COMPUTE_WH",
357
  "default_database": "DEMO_DB",
358
-
359
  # Data Generation Settings
360
  "fact_table_size": "10000",
361
  "dim_table_size": "100",
@@ -370,6 +370,10 @@ def load_gradio_settings(email: str) -> Dict[str, Any]:
370
  "snowflake_user": "",
371
  "snowflake_role": "ACCOUNTADMIN",
372
  "default_schema": "PUBLIC",
373
 
374
  # Advanced Options
375
  "batch_size": 5000,
 
355
  "default_data_volume": "Medium (10K rows)",
356
  "default_warehouse": "COMPUTE_WH",
357
  "default_database": "DEMO_DB",
358
+
359
  # Data Generation Settings
360
  "fact_table_size": "10000",
361
  "dim_table_size": "100",
 
370
  "snowflake_user": "",
371
  "snowflake_role": "ACCOUNTADMIN",
372
  "default_schema": "PUBLIC",
373
+
374
+ # Demo Configuration
375
+ "tag_name": "",
376
+ "object_naming_prefix": "",
377
 
378
  # Advanced Options
379
  "batch_size": 5000,
test_mcp_liveboard_isolated.py DELETED
@@ -1,205 +0,0 @@
1
- """
2
- Isolated MCP Liveboard Creation Test
3
-
4
- Test MCP liveboard creation with existing ThoughtSpot objects.
5
- Run this repeatedly to debug MCP issues without full deployment.
6
-
7
- Usage:
8
- python test_mcp_liveboard_isolated.py
9
-
10
- Requirements:
11
- - Existing ThoughtSpot connection, schema, and model
12
- - Update the CONFIG section below with your details
13
- """
14
-
15
- import os
16
- import asyncio
17
- import json
18
- from dotenv import load_dotenv
19
- from datetime import datetime
20
-
21
- load_dotenv()
22
-
23
- # ============================================================================
24
- # CONFIG - UPDATE THESE WITH YOUR THOUGHTSPOT OBJECTS
25
- # ============================================================================
26
- CONFIG = {
27
- # ThoughtSpot credentials
28
- 'ts_url': os.getenv('THOUGHTSPOT_URL', 'https://your-instance.thoughtspot.cloud'),
29
- 'ts_username': os.getenv('THOUGHTSPOT_USERNAME', 'your-username'),
30
- 'ts_password': os.getenv('THOUGHTSPOT_PASSWORD', ''),
31
- 'ts_secret': os.getenv('THOUGHTSPOT_SECRET_KEY', ''),
32
-
33
- # Existing ThoughtSpot objects (get these from a successful deployment)
34
- 'model_id': 'eb600ad2-ad91-4640-819a-f953602bd4c1', # Working model from user's test
35
- 'model_name': 'Working_Model', # Working model from user's test
36
-
37
- # Company/use case for liveboard
38
- 'company_name': 'Amazon.com',
39
- 'use_case': 'Sales Analytics',
40
-
41
- # Liveboard settings
42
- 'liveboard_name': 'Test MCP Liveboard',
43
- 'num_visualizations': 3, # Start small for testing
44
-
45
- # Timeout (seconds)
46
- 'timeout': 120 # 2 minutes max
47
- }
48
-
49
- # ============================================================================
50
- # TEST RUNNER
51
- # ============================================================================
52
-
53
- def print_section(title):
54
- """Print a formatted section header"""
55
- print(f"\n{'='*60}")
56
- print(f" {title}")
57
- print(f"{'='*60}\n")
58
-
59
-
60
- async def test_mcp_liveboard_creation():
61
- """Test MCP liveboard creation with timeout"""
62
-
63
- print_section("🧪 MCP Liveboard Creation Test")
64
- print(f"⏰ Start Time: {datetime.now().strftime('%H:%M:%S')}")
65
- print(f"⏱️ Timeout: {CONFIG['timeout']} seconds")
66
-
67
- # Validate config
68
- if not CONFIG['model_id'] or not CONFIG['model_name']:
69
- print("❌ ERROR: Please update CONFIG with your model_id and model_name!")
70
- print("\nTo find your model ID:")
71
- print("1. Go to ThoughtSpot and navigate to your model")
72
- print("2. The URL will contain the GUID: /data/models/[GUID]")
73
- print("3. Copy that GUID and paste it in CONFIG['model_id']")
74
- return False
75
-
76
- print(f"📊 Model ID: {CONFIG['model_id']}")
77
- print(f"📋 Model Name: {CONFIG['model_name']}")
78
- print(f"🏢 Company: {CONFIG['company_name']}")
79
- print(f"📈 Use Case: {CONFIG['use_case']}")
80
- print(f"🎨 Liveboard Name: {CONFIG['liveboard_name']}")
81
- print(f"📊 Visualizations: {CONFIG['num_visualizations']}")
82
-
83
- print_section("1️⃣ Calling MCP Liveboard Creation (with timeout)")
84
- print(" ℹ️ Note: MCP handles OAuth authentication automatically via mcp-remote")
85
- print(" ℹ️ Browser may open for first-time authentication")
86
-
87
- try:
88
- from liveboard_creator import create_liveboard_from_model_mcp
89
-
90
- # Build company data
91
- company_data = {
92
- 'name': CONFIG['company_name'],
93
- 'use_case': CONFIG['use_case']
94
- }
95
-
96
- print(f"🚀 Starting MCP liveboard creation...")
97
- print(f" Watch for progress below...")
98
- print(f" Will timeout after {CONFIG['timeout']}s if no response\n")
99
-
100
- # Call with timeout (no to_thread needed - function handles its own asyncio.run)
101
- start_time = datetime.now()
102
-
103
- # Run in executor to avoid blocking
104
- loop = asyncio.get_event_loop()
105
- result = await asyncio.wait_for(
106
- loop.run_in_executor(
107
- None,
108
- lambda: create_liveboard_from_model_mcp(
109
- ts_client=None, # MCP doesn't need REST client - uses OAuth via mcp-remote
110
- model_id=CONFIG['model_id'],
111
- model_name=CONFIG['model_name'],
112
- company_data=company_data,
113
- use_case=CONFIG['use_case'],
114
- num_visualizations=CONFIG['num_visualizations'],
115
- liveboard_name=CONFIG['liveboard_name']
116
- )
117
- ),
118
- timeout=CONFIG['timeout']
119
- )
120
-
121
- elapsed = (datetime.now() - start_time).total_seconds()
122
-
123
- if result.get('success'):
124
- print_section("✅ SUCCESS!")
125
- print(f"⏱️ Time: {elapsed:.1f}s")
126
- print(f"📊 Liveboard: {result.get('liveboard_name')}")
127
- print(f"🆔 GUID: {result.get('liveboard_guid')}")
128
- print(f"🔗 URL: {result.get('liveboard_url', 'N/A')}")
129
- print(f"📈 Visualizations: {result.get('visualizations_created', 'N/A')}")
130
- return True
131
- else:
132
- print_section("❌ FAILED")
133
- print(f"⏱️ Time: {elapsed:.1f}s")
134
- print(f"❌ Error: {result.get('error', 'Unknown error')}")
135
- return False
136
-
137
- except asyncio.TimeoutError:
138
- elapsed = (datetime.now() - start_time).total_seconds()
139
- print_section("⏰ TIMEOUT")
140
- print(f"❌ MCP liveboard creation timed out after {elapsed:.1f}s")
141
- print(f"\nPossible causes:")
142
- print("1. MCP server not responding")
143
- print("2. Network connectivity issues")
144
- print("3. ThoughtSpot instance not accessible")
145
- print("4. npx not installed or not in PATH")
146
- print("\nTry:")
147
- print("- Check if 'npx' command works: npx --version")
148
- print("- Check network connection to ThoughtSpot")
149
- print("- Try with fewer visualizations (CONFIG['num_visualizations'] = 1)")
150
- return False
151
-
152
- except Exception as e:
153
- elapsed = (datetime.now() - start_time).total_seconds()
154
- print_section("❌ ERROR")
155
- print(f"⏱️ Time: {elapsed:.1f}s")
156
- print(f"❌ Error: {e}")
157
- import traceback
158
- print(f"\nTraceback:")
159
- print(traceback.format_exc())
160
- return False
161
-
162
-
163
- def run_test():
164
- """Run the async test"""
165
- try:
166
- result = asyncio.run(test_mcp_liveboard_creation())
167
-
168
- print_section("🏁 Test Complete")
169
- print(f"⏰ End Time: {datetime.now().strftime('%H:%M:%S')}")
170
-
171
- if result:
172
- print("✅ Test PASSED")
173
- exit(0)
174
- else:
175
- print("❌ Test FAILED")
176
- exit(1)
177
-
178
- except KeyboardInterrupt:
179
- print("\n\n⚠️ Test interrupted by user (Ctrl+C)")
180
- exit(130)
181
-
182
-
183
- if __name__ == "__main__":
184
- print("""
185
- ╔════════════════════════════════════════════════════════════╗
186
- ║ ║
187
- ║ MCP Liveboard Creation Test Suite ║
188
- ║ ║
189
- ╚════════════════════════════════════════════════════════════╝
190
- """)
191
-
192
- # Check for required environment variables
193
- if not os.getenv('THOUGHTSPOT_URL'):
194
- print("⚠️ WARNING: THOUGHTSPOT_URL not set in environment")
195
- print(" Using value from CONFIG\n")
196
-
197
- print("📝 INSTRUCTIONS:")
198
- print(" 1. Update CONFIG section in this file with your model ID")
199
- print(" 2. Run: python test_mcp_liveboard_isolated.py")
200
- print(" 3. Press Ctrl+C to cancel if it hangs\n")
201
-
202
- print("🚀 Starting test...\n")
203
-
204
- run_test()
205
-
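The deleted test above bounded a blocking MCP call by running it in an executor thread and wrapping it with `asyncio.wait_for`. A minimal sketch of that pattern, with `slow_task` as a hypothetical stand-in for the blocking call:

```python
# Run a blocking function off the event loop and enforce a timeout on it.
import asyncio
import time

def slow_task(duration: float) -> str:
    time.sleep(duration)  # simulates a blocking network/MCP call
    return "done"

async def run_with_timeout(duration: float, timeout: float) -> str:
    loop = asyncio.get_running_loop()
    try:
        return await asyncio.wait_for(
            loop.run_in_executor(None, lambda: slow_task(duration)),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        return "timed out"

print(asyncio.run(run_with_timeout(0.01, 1.0)))  # done
print(asyncio.run(run_with_timeout(0.3, 0.05)))  # timed out
```

One caveat the original test inherited: a thread already executing the blocking call cannot be cancelled, so on timeout the worker thread keeps running until the call returns, even though the caller has already received `TimeoutError`.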
thoughtspot_deployer.py CHANGED
@@ -166,7 +166,29 @@ class ThoughtSpotDeployer:
166
 
167
  for line in column_lines:
168
  line = line.strip()
169
- if line and not line.upper().startswith(('PRIMARY KEY', 'FOREIGN KEY', 'CONSTRAINT')):
 
170
  # Parse: COLUMNNAME DATATYPE(params) [IDENTITY] [NOT NULL]
171
  parts = line.split()
172
  if len(parts) >= 2:
@@ -175,8 +197,6 @@ class ThoughtSpotDeployer:
175
  col_type_match = re.match(r'(\w+(?:\([^)]+\))?)', parts[1])
176
  col_type = col_type_match.group(1).upper() if col_type_match else parts[1].upper()
177
 
178
- # DEBUG: Removed - data type parsing is working correctly
179
-
180
  columns.append({
181
  'name': col_name,
182
  'type': col_type,
@@ -185,7 +205,7 @@ class ThoughtSpotDeployer:
185
 
186
  tables[table_name] = columns
187
 
188
- print(f"📊 Found {len(tables)} tables in DDL")
189
  return tables, foreign_keys
190
 
191
 
@@ -237,11 +257,12 @@ class ThoughtSpotDeployer:
237
  print(f" 📋 Response: {response.text}")
238
 
239
  def create_table_tml(self, table_name: str, columns: List, connection_name: str,
240
- database: str, schema: str, all_tables: Dict = None, table_guid: str = None) -> str:
241
  """Generate table TML matching working example structure
242
 
243
  Args:
244
  table_guid: If provided, use this GUID (for updating existing tables with joins)
 
245
  """
246
  tml_columns = []
247
 
@@ -304,7 +325,7 @@ class ThoughtSpotDeployer:
304
 
305
  # Add joins_with relationships (matching working example)
306
  if all_tables:
307
- joins_with = self._generate_table_joins(table_name, columns, all_tables)
308
  if joins_with:
309
  table_tml['table']['joins_with'] = joins_with
310
 
@@ -313,43 +334,38 @@ class ThoughtSpotDeployer:
313
  # Keep quotes around 'on' key as shown in working example
314
  return yaml_output
315
 
316
- def _generate_table_joins(self, table_name: str, columns: List, all_tables: Dict) -> List:
317
- """Generate joins_with structure matching working example"""
318
  joins = []
319
  table_name_upper = table_name.upper()
320
- table_cols = [col['name'].upper() for col in columns]
321
-
322
- # Find foreign key relationships
323
- for col_name in table_cols:
324
- if col_name.endswith('ID') and col_name != f"{table_name_upper}ID":
325
- # This looks like a foreign key - find the target table
326
- # Handle both CUSTOMER_ID and CUSTOMERID formats
327
- if col_name.endswith('_ID'):
328
- # CUSTOMER_ID -> CUSTOMERS
329
- potential_target = col_name[:-3] + 'S'
330
- else:
331
- # CUSTOMERID -> CUSTOMERS
332
- potential_target = col_name[:-2] + 'S'
333
 
334
- # Check if target table exists in THIS deployment AND it's not the same table
335
- # IMPORTANT: Only create joins to tables in the same schema/connection
336
  available_tables_upper = [t.upper() for t in all_tables.keys()]
337
- if (potential_target in available_tables_upper and
338
- potential_target != table_name_upper):
339
  constraint_id = f"SYS_CONSTRAINT_{self._generate_constraint_id()}"
340
  join_def = {
341
  'name': constraint_id,
342
  'destination': {
343
- 'name': potential_target
344
  },
345
- 'on': f"[{table_name_upper}::{col_name}] = [{potential_target}::{col_name}]",
346
  'type': 'INNER'
347
  }
348
  joins.append(join_def)
349
- print(f" 🔗 Generated join: {table_name_upper} -> {potential_target} on {col_name}")
350
  else:
351
- if potential_target not in available_tables_upper and potential_target != table_name_upper:
352
- print(f" ⏭️ Skipping join: {table_name_upper}.{col_name} -> {potential_target} (table not in this deployment)")
353
 
354
  return joins
355
 
@@ -1117,6 +1133,49 @@ class ThoughtSpotDeployer:
1117
  print(f" ⚠️ Could not create schema: {e}")
1118
  print(f" 📝 Will proceed assuming schema exists or will be created by table operations")
1119
 
1120
  def _generate_demo_names(self, company_name: str = None, use_case: str = None):
1121
  """Generate standardized demo names using DM convention"""
1122
  from datetime import datetime
@@ -1152,7 +1211,8 @@ class ThoughtSpotDeployer:
1152
  def deploy_all(self, ddl: str, database: str, schema: str,
1153
  connection_name: str = None, company_name: str = None,
1154
  use_case: str = None, liveboard_name: str = None,
1155
- llm_model: str = None, progress_callback=None) -> Dict:
 
1156
  """
1157
  Deploy complete data model to ThoughtSpot
1158
 
@@ -1324,7 +1384,7 @@ class ThoughtSpotDeployer:
1324
 
1325
  for table_name, columns in tables.items():
1326
  print(f"[ThoughtSpot] Preparing {table_name.upper()}...", flush=True)
1327
- table_tml = self.create_table_tml(table_name, columns, connection_name, database, schema, all_tables=None)
1328
  table_tmls_batch1.append(table_tml)
1329
  table_names_order.append(table_name.upper())
1330
 
@@ -1378,6 +1438,12 @@ class ThoughtSpotDeployer:
1378
  log_progress(" ❌ No tables were created successfully in Batch 1")
1379
  return results
1380
 
 
1381
  batch1_time = time.time() - batch1_start
1382
  log_progress(f"✅ Batch 1 complete: {len(table_guids)} tables created ({batch1_time:.1f}s)")
1383
 
@@ -1404,7 +1470,7 @@ class ThoughtSpotDeployer:
1404
  # Create table TML WITH joins_with section AND the table GUID
1405
  table_tml = self.create_table_tml(
1406
  table_name, columns, connection_name, database, schema,
1407
- all_tables=tables, table_guid=table_guid
1408
  )
1409
  table_tmls_batch2.append(table_tml)
1410
  table_names_order_batch2.append(table_name_upper)
@@ -1527,6 +1593,11 @@ class ThoughtSpotDeployer:
1527
  log_progress(f"✅ Model created ({model_time:.1f}s)")
1528
  results['model'] = model_name
1529
  results['model_guid'] = model_guid
1530
 
1531
  # Step 3.5: Enable Spotter on the model via API
1532
  try:
@@ -1591,7 +1662,7 @@ class ThoughtSpotDeployer:
                 liveboard_name=liveboard_name,
                 llm_model=llm_model  # Pass model selection
             )
-
             # Check result (for both MCP and TML methods)
             print(f"🔍 DEBUG: Liveboard result received: {liveboard_result}")
             print(f"🔍 DEBUG: Success flag: {liveboard_result.get('success')}")
@@ -1617,24 +1688,70 @@ class ThoughtSpotDeployer:
                     obj_response = objects[0].get('response', {})
                     status = obj_response.get('status', {})
                     error_message = status.get('error_message', 'Unknown error')
                     error_code = status.get('error_code', 'N/A')

                     # Get any additional error details
                     full_response = json.dumps(objects[0], indent=2)

                     # Build comprehensive error message
                     error = f"Model validation failed: {error_message}"
                     if error_code != 'N/A':
                         error += f" (Error code: {error_code})"

                     print(f"📋 Full model response: {full_response}")  # DEBUG: Show full response
                     print(f"   ❌ {error}")
                     log_progress(f"   ❌ {error}")
                     log_progress(f"   📋 Full response details:")
                     log_progress(f"{full_response}")

                     results['errors'].append(error)
                     results['errors'].append(f"Full API response: {full_response}")
                 else:
                     error = "Model failed: No objects in response"
                     log_progress(f"   ❌ {error}")

         for line in column_lines:
             line = line.strip()
+            line_upper = line.upper()
+
+            # Parse FOREIGN KEY constraints
+            if line_upper.startswith('FOREIGN KEY'):
+                # FOREIGN KEY (CUSTOMER_ID) REFERENCES CUSTOMERS(CUSTOMER_ID)
+                fk_match = re.match(
+                    r'FOREIGN\s+KEY\s*\((\w+)\)\s*REFERENCES\s+(\w+)\s*\((\w+)\)',
+                    line,
+                    re.IGNORECASE
+                )
+                if fk_match:
+                    from_col = fk_match.group(1).upper()
+                    to_table = fk_match.group(2).upper()
+                    to_col = fk_match.group(3).upper()
+                    foreign_keys.append({
+                        'from_table': table_name,
+                        'from_column': from_col,
+                        'to_table': to_table,
+                        'to_column': to_col
+                    })
+                    print(f"   🔗 Found FK: {table_name}.{from_col} -> {to_table}.{to_col}")
+
+            elif not line_upper.startswith(('PRIMARY KEY', 'CONSTRAINT')):
                 # Parse: COLUMNNAME DATATYPE(params) [IDENTITY] [NOT NULL]
                 parts = line.split()
                 if len(parts) >= 2:

                     col_type_match = re.match(r'(\w+(?:\([^)]+\))?)', parts[1])
                     col_type = col_type_match.group(1).upper() if col_type_match else parts[1].upper()

                     columns.append({
                         'name': col_name,
                         'type': col_type,

         tables[table_name] = columns

+        print(f"📊 Found {len(tables)} tables and {len(foreign_keys)} foreign keys in DDL")
         return tables, foreign_keys
 
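The FK-parsing regex added above can be exercised in isolation; here is a minimal sketch, where the sample DDL constraint line is invented for illustration:

```python
import re

# Hypothetical DDL constraint line in the shape the parser expects
line = "FOREIGN KEY (CUSTOMER_ID) REFERENCES CUSTOMERS(CUSTOMER_ID)"

# Same pattern as in the diff: captures the local column, target table, target column
fk_match = re.match(
    r'FOREIGN\s+KEY\s*\((\w+)\)\s*REFERENCES\s+(\w+)\s*\((\w+)\)',
    line,
    re.IGNORECASE
)

fk = {
    'from_column': fk_match.group(1).upper(),
    'to_table': fk_match.group(2).upper(),
    'to_column': fk_match.group(3).upper(),
}
print(fk)  # {'from_column': 'CUSTOMER_ID', 'to_table': 'CUSTOMERS', 'to_column': 'CUSTOMER_ID'}
```

Note the pattern only handles single-column FKs with `\w+` identifiers; quoted or composite keys would fall through silently.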
             print(f"   📋 Response: {response.text}")

     def create_table_tml(self, table_name: str, columns: List, connection_name: str,
+                         database: str, schema: str, all_tables: Dict = None, table_guid: str = None, foreign_keys: List = None) -> str:
         """Generate table TML matching working example structure

         Args:
             table_guid: If provided, use this GUID (for updating existing tables with joins)
+            foreign_keys: List of foreign key relationships parsed from DDL
         """
         tml_columns = []

         # Add joins_with relationships (matching working example)
         if all_tables:
+            joins_with = self._generate_table_joins(table_name, columns, all_tables, foreign_keys)
             if joins_with:
                 table_tml['table']['joins_with'] = joins_with

         # Keep quotes around 'on' key as shown in working example
         return yaml_output
+    def _generate_table_joins(self, table_name: str, columns: List, all_tables: Dict, foreign_keys: List = None) -> List:
+        """Generate joins_with structure based on parsed foreign keys from DDL"""
         joins = []
         table_name_upper = table_name.upper()
+
+        if not foreign_keys:
+            print(f"   ⚠️ No foreign keys provided for {table_name_upper}")
+            return joins
+
+        # Use actual foreign keys from DDL
+        for fk in foreign_keys:
+            if fk['from_table'] == table_name_upper:
+                to_table = fk['to_table']
+                from_col = fk['from_column']
+                to_col = fk['to_column']
+
+                # Check if target table exists in THIS deployment
                 available_tables_upper = [t.upper() for t in all_tables.keys()]
+                if to_table in available_tables_upper:
                     constraint_id = f"SYS_CONSTRAINT_{self._generate_constraint_id()}"
                     join_def = {
                         'name': constraint_id,
                         'destination': {
+                            'name': to_table
                         },
+                        'on': f"[{table_name_upper}::{from_col}] = [{to_table}::{to_col}]",
                         'type': 'INNER'
                     }
                     joins.append(join_def)
+                    print(f"   🔗 Generated join: {table_name_upper}.{from_col} -> {to_table}.{to_col}")
                 else:
+                    print(f"   ⏭️ Skipping join: {table_name_upper}.{from_col} -> {to_table} (table not in this deployment)")

         return joins
 
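For illustration, a standalone sketch of the `joins_with` entry shape this method produces. `build_join` and `constraint_suffix` are hypothetical stand-ins (the real code uses `self._generate_constraint_id()`), and the table names are made up:

```python
def build_join(table_name, fk, available_tables, constraint_suffix="ABC123"):
    """Mimic the joins_with entry built in _generate_table_joins.
    constraint_suffix stands in for the real self._generate_constraint_id()."""
    to_table = fk['to_table']
    # Skip joins whose target table is not part of this deployment
    if to_table not in (t.upper() for t in available_tables):
        return None
    return {
        'name': f"SYS_CONSTRAINT_{constraint_suffix}",
        'destination': {'name': to_table},
        'on': f"[{table_name}::{fk['from_column']}] = [{to_table}::{fk['to_column']}]",
        'type': 'INNER',
    }

fk = {'from_column': 'CUSTOMER_ID', 'to_table': 'CUSTOMERS', 'to_column': 'CUSTOMER_ID'}
join = build_join('SALESTRANSACTIONS', fk, ['customers', 'salestransactions'])
print(join['on'])  # [SALESTRANSACTIONS::CUSTOMER_ID] = [CUSTOMERS::CUSTOMER_ID]
```

Skipping joins to tables outside the deployment avoids TML validation failures on dangling references.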
             print(f"   ⚠️ Could not create schema: {e}")
             print(f"   📝 Will proceed assuming schema exists or will be created by table operations")

+    def assign_tags_to_objects(self, object_guids: List[str], object_type: str, tag_name: str) -> bool:
+        """
+        Assign tags to ThoughtSpot objects using REST API v1.
+
+        Args:
+            object_guids: List of object GUIDs to tag
+            object_type: Type of objects (LOGICAL_TABLE for tables/models, PINBOARD_ANSWER_BOOK for liveboards)
+            tag_name: Tag name to assign
+
+        Returns:
+            True if successful, False otherwise
+        """
+        if not tag_name or not object_guids:
+            return False
+
+        try:
+            import json as json_module
+
+            # Use V1 API which actually works
+            assign_response = self.session.post(
+                f"{self.base_url}/tspublic/v1/metadata/assigntag",
+                data={
+                    'id': json_module.dumps(object_guids),
+                    'type': object_type,
+                    'tagname': json_module.dumps([tag_name])
+                },
+                headers={
+                    'X-Requested-By': 'ThoughtSpot',
+                    'Content-Type': 'application/x-www-form-urlencoded'
+                }
+            )
+
+            if assign_response.status_code in [200, 204]:
+                print(f"[ThoughtSpot] ✅ Tagged {len(object_guids)} {object_type} objects with '{tag_name}'", flush=True)
+                return True
+            else:
+                print(f"[ThoughtSpot] ⚠️ Tag assignment failed: {assign_response.status_code}", flush=True)
+                return False
+
+        except Exception as e:
+            print(f"[ThoughtSpot] ⚠️ Tag assignment error: {str(e)}", flush=True)
+            return False
+
     def _generate_demo_names(self, company_name: str = None, use_case: str = None):
         """Generate standardized demo names using DM convention"""
         from datetime import datetime
 
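The v1 `assigntag` endpoint takes a form-encoded body whose `id` and `tagname` fields are themselves JSON-encoded arrays; a sketch of just the payload construction (the GUID and tag name below are invented):

```python
import json

object_guids = ["11111111-2222-3333-4444-555555555555"]  # hypothetical GUIDs
tag_name = "demo-2025"  # hypothetical tag

# Both 'id' and 'tagname' are JSON-encoded lists nested inside the form body
payload = {
    'id': json.dumps(object_guids),
    'type': 'LOGICAL_TABLE',
    'tagname': json.dumps([tag_name]),
}
print(payload['tagname'])  # ["demo-2025"]
```

Passing the lists raw (without `json.dumps`) would make `requests` serialize them as repeated form fields, which this endpoint does not accept.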
     def deploy_all(self, ddl: str, database: str, schema: str,
                    connection_name: str = None, company_name: str = None,
                    use_case: str = None, liveboard_name: str = None,
+                   llm_model: str = None, tag_name: str = None,
+                   progress_callback=None) -> Dict:
         """
         Deploy complete data model to ThoughtSpot

         for table_name, columns in tables.items():
             print(f"[ThoughtSpot] Preparing {table_name.upper()}...", flush=True)
+            table_tml = self.create_table_tml(table_name, columns, connection_name, database, schema, all_tables=None, foreign_keys=foreign_keys)
             table_tmls_batch1.append(table_tml)
             table_names_order.append(table_name.upper())
                 log_progress("   ❌ No tables were created successfully in Batch 1")
                 return results

+            # Assign tags to tables
+            table_guid_list = list(table_guids.values())
+            print(f"🔍 DEBUG BEFORE TAG CALL: tag_name='{tag_name}', table_guid_list={table_guid_list}")
+            log_progress(f"🏷️ Assigning tag '{tag_name}' to {len(table_guid_list)} tables...")
+            self.assign_tags_to_objects(table_guid_list, 'LOGICAL_TABLE', tag_name)
+
             batch1_time = time.time() - batch1_start
             log_progress(f"✅ Batch 1 complete: {len(table_guids)} tables created ({batch1_time:.1f}s)")
             # Create table TML WITH joins_with section AND the table GUID
             table_tml = self.create_table_tml(
                 table_name, columns, connection_name, database, schema,
+                all_tables=tables, table_guid=table_guid, foreign_keys=foreign_keys
             )
             table_tmls_batch2.append(table_tml)
             table_names_order_batch2.append(table_name_upper)
             log_progress(f"✅ Model created ({model_time:.1f}s)")
             results['model'] = model_name
             results['model_guid'] = model_guid
+
+            # Assign tag to model
+            print(f"🔍 DEBUG BEFORE TAG CALL: tag_name='{tag_name}', model_guid='{model_guid}'")
+            log_progress(f"🏷️ Assigning tag '{tag_name}' to model...")
+            self.assign_tags_to_objects([model_guid], 'LOGICAL_TABLE', tag_name)

             # Step 3.5: Enable Spotter on the model via API
             try:
                 liveboard_name=liveboard_name,
                 llm_model=llm_model  # Pass model selection
             )
+
             # Check result (for both MCP and TML methods)
             print(f"🔍 DEBUG: Liveboard result received: {liveboard_result}")
             print(f"🔍 DEBUG: Success flag: {liveboard_result.get('success')}")
                     obj_response = objects[0].get('response', {})
                     status = obj_response.get('status', {})
                     error_message = status.get('error_message', 'Unknown error')
+
+                    # Clean HTML tags from error message (ThoughtSpot sometimes returns HTML)
+                    error_message = re.sub(r'<[^>]+>', '', error_message).strip()
+                    if not error_message:
+                        error_message = 'Schema validation failed (no details provided)'
+
                     error_code = status.get('error_code', 'N/A')

+                    # Try to extract additional error details from various response fields
+                    error_details = []
+
+                    # Check for detailed error messages in different response structures
+                    if 'error_details' in status:
+                        error_details.append(f"Error details: {status.get('error_details')}")
+
+                    if 'validation_errors' in obj_response:
+                        error_details.append(f"Validation errors: {obj_response.get('validation_errors')}")
+
+                    if 'warnings' in obj_response:
+                        error_details.append(f"Warnings: {obj_response.get('warnings')}")
+
+                    # Check header for additional info
+                    header = obj_response.get('header', {})
+                    if 'error' in header:
+                        error_details.append(f"Header error: {header.get('error')}")
+
                     # Get any additional error details
                     full_response = json.dumps(objects[0], indent=2)

+                    # Save the TML that failed for debugging
+                    import tempfile
+                    # os is already imported at module level
+                    try:
+                        debug_dir = os.path.join(tempfile.gettempdir(), 'thoughtspot_debug')
+                        os.makedirs(debug_dir, exist_ok=True)
+                        failed_tml_path = os.path.join(debug_dir, f'failed_model_{datetime.now().strftime("%Y%m%d_%H%M%S")}.tml')
+                        with open(failed_tml_path, 'w') as f:
+                            f.write(model_tml)
+                        log_progress(f"💾 Failed TML saved to: {failed_tml_path}")
+                        print(f"💾 Failed TML saved to: {failed_tml_path}")
+                    except Exception as save_error:
+                        log_progress(f"⚠️ Could not save failed TML: {save_error}")
+
                     # Build comprehensive error message
                     error = f"Model validation failed: {error_message}"
                     if error_code != 'N/A':
                         error += f" (Error code: {error_code})"

+                    if error_details:
+                        error += f"\n\nAdditional details:\n" + "\n".join(error_details)
+
                     print(f"📋 Full model response: {full_response}")  # DEBUG: Show full response
                     print(f"   ❌ {error}")
                     log_progress(f"   ❌ {error}")
                     log_progress(f"   📋 Full response details:")
                     log_progress(f"{full_response}")

+                    # Include the TML snippet in error for quick debugging
+                    tml_preview = model_tml[:500] + "..." if len(model_tml) > 500 else model_tml
+                    log_progress(f"\n📄 TML that was sent (first 500 chars):\n{tml_preview}")
+
                     results['errors'].append(error)
                     results['errors'].append(f"Full API response: {full_response}")
+                    results['errors'].append(f"Failed TML saved to: {failed_tml_path if 'failed_tml_path' in locals() else 'N/A'}")
                 else:
                     error = "Model failed: No objects in response"
                     log_progress(f"   ❌ {error}")
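The HTML-stripping step added to the error handler can be checked on its own; the sample error string below is invented for illustration:

```python
import re

# Hypothetical HTML-wrapped error message, as ThoughtSpot sometimes returns
raw = "<b>Invalid column</b> <i>REVENUE</i> in formula"

# Drop anything that looks like a tag, then trim; fall back if nothing is left
cleaned = re.sub(r'<[^>]+>', '', raw).strip()
if not cleaned:
    cleaned = 'Schema validation failed (no details provided)'
print(cleaned)  # Invalid column REVENUE in formula
```

A message that is nothing but markup (e.g. a bare `<br/>`) strips down to an empty string, which is why the fallback text is needed.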
verify_outliers.py DELETED
@@ -1,165 +0,0 @@
-#!/usr/bin/env python3
-"""
-Verify the strategic outliers that were injected
-"""
-
-from dotenv import load_dotenv
-import os
-import snowflake.connector
-
-def get_snowflake_connection():
-    """Get Snowflake connection for the specific schema"""
-    from snowflake_auth import get_snowflake_connection_params
-
-    conn_params = get_snowflake_connection_params()
-    conn_params.pop('schema', None)  # Remove schema to avoid duplicate
-
-    conn = snowflake.connector.connect(
-        **conn_params,
-        schema='20250923_090309_THOUG_SAL'
-    )
-    return conn
-
-def verify_outliers():
-    """Verify the strategic outliers that were created"""
-
-    print("🔍 VERIFYING STRATEGIC OUTLIERS")
-    print("=" * 50)
-
-    conn = get_snowflake_connection()
-    cursor = conn.cursor()
-
-    try:
-        # 1. Check high-value customers with poor outcomes
-        print("📊 High-value customers with poor interaction outcomes:")
-        cursor.execute("""
-            SELECT
-                c.CUSTOMERID,
-                c.PREFERENCES,
-                COUNT(DISTINCT st.TRANSACTIONID) as transaction_count,
-                AVG(st.AMOUNT) as avg_transaction_value,
-                COUNT(DISTINCT ci.INTERACTIONID) as interaction_count,
-                SUM(CASE WHEN ci.OUTCOME = 'Unsuccessful' THEN 1 ELSE 0 END) as unsuccessful_interactions
-            FROM CUSTOMERS c
-            LEFT JOIN SALESTRANSACTIONS st ON c.CUSTOMERID = st.CUSTOMERID
-            LEFT JOIN CUSTOMERINTERACTIONS ci ON c.CUSTOMERID = ci.CUSTOMERID
-            WHERE c.PREFERENCES LIKE '%High-value%'
-            GROUP BY c.CUSTOMERID, c.PREFERENCES
-            ORDER BY avg_transaction_value DESC
-            LIMIT 5
-        """)
-
-        results = cursor.fetchall()
-        for row in results:
-            print(f"   Customer {row[0]}: {row[2]} transactions, avg ${row[3]:.2f}, {row[5]} unsuccessful interactions")
-
-        # 2. Check channel performance
-        print("\n📊 Channel performance analysis:")
-        cursor.execute("""
-            SELECT
-                c.CHANNELID,
-                c.NAME,
-                c.TYPE,
-                COUNT(ci.INTERACTIONID) as total_interactions,
-                SUM(CASE WHEN ci.OUTCOME = 'Successful' THEN 1 ELSE 0 END) as successful_interactions,
-                SUM(CASE WHEN ci.OUTCOME = 'Unsuccessful' THEN 1 ELSE 0 END) as unsuccessful_interactions,
-                ROUND(SUM(CASE WHEN ci.OUTCOME = 'Successful' THEN 1 ELSE 0 END) * 100.0 / COUNT(ci.INTERACTIONID), 2) as success_rate
-            FROM CHANNELS c
-            LEFT JOIN CUSTOMERINTERACTIONS ci ON c.CHANNELID = ci.CHANNELID
-            GROUP BY c.CHANNELID, c.NAME, c.TYPE
-            ORDER BY success_rate DESC
-        """)
-
-        results = cursor.fetchall()
-        for row in results:
-            print(f"   Channel {row[0]} ({row[1]}): {row[3]} interactions, {row[4]} successful, {row[5]} unsuccessful, {row[6]}% success rate")
-
-        # 3. Check recent performance degradation
-        print("\n📊 Recent performance (last 30 days):")
-        cursor.execute("""
-            SELECT
-                DATE(ci.DATE) as interaction_date,
-                COUNT(ci.INTERACTIONID) as total_interactions,
-                SUM(CASE WHEN ci.OUTCOME = 'Successful' THEN 1 ELSE 0 END) as successful_interactions,
-                SUM(CASE WHEN ci.OUTCOME = 'Unsuccessful' THEN 1 ELSE 0 END) as unsuccessful_interactions,
-                ROUND(SUM(CASE WHEN ci.OUTCOME = 'Unsuccessful' THEN 1 ELSE 0 END) * 100.0 / COUNT(ci.INTERACTIONID), 2) as failure_rate
-            FROM CUSTOMERINTERACTIONS ci
-            WHERE ci.DATE >= CURRENT_DATE - 30
-            GROUP BY DATE(ci.DATE)
-            ORDER BY interaction_date DESC
-            LIMIT 10
-        """)
-
-        results = cursor.fetchall()
-        for row in results:
-            print(f"   {row[0]}: {row[1]} interactions, {row[2]} successful, {row[3]} unsuccessful, {row[4]}% failure rate")
-
-        # 4. Check cross-channel inconsistency
-        print("\n📊 Cross-channel inconsistency patterns:")
-        cursor.execute("""
-            SELECT
-                ci.CUSTOMERID,
-                c.NAME as channel_name,
-                COUNT(ci.INTERACTIONID) as interactions,
-                SUM(CASE WHEN ci.OUTCOME = 'Successful' THEN 1 ELSE 0 END) as successful,
-                SUM(CASE WHEN ci.OUTCOME = 'Unsuccessful' THEN 1 ELSE 0 END) as unsuccessful
-            FROM CUSTOMERINTERACTIONS ci
-            JOIN CHANNELS c ON ci.CHANNELID = c.CHANNELID
-            WHERE ci.CUSTOMERID IN (
-                SELECT CUSTOMERID FROM CUSTOMERINTERACTIONS
-                WHERE CUSTOMERID IN (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
-                GROUP BY CUSTOMERID
-                HAVING COUNT(DISTINCT CHANNELID) > 1
-            )
-            GROUP BY ci.CUSTOMERID, c.NAME
-            ORDER BY ci.CUSTOMERID, c.NAME
-        """)
-
-        results = cursor.fetchall()
-        current_customer = None
-        for row in results:
-            if current_customer != row[0]:
-                print(f"   Customer {row[0]}:")
-                current_customer = row[0]
-            print(f"      {row[1]}: {row[2]} interactions, {row[3]} successful, {row[4]} unsuccessful")
-
-        # 5. Check missed opportunities
-        print("\n📊 High-value recent transactions without interactions:")
-        cursor.execute("""
-            SELECT
-                st.CUSTOMERID,
-                COUNT(st.TRANSACTIONID) as recent_transactions,
-                AVG(st.AMOUNT) as avg_amount,
-                COUNT(ci.INTERACTIONID) as recent_interactions
-            FROM SALESTRANSACTIONS st
-            LEFT JOIN CUSTOMERINTERACTIONS ci ON st.CUSTOMERID = ci.CUSTOMERID
-                AND ci.DATE >= CURRENT_DATE - 7
-            WHERE st.DATE >= CURRENT_DATE - 7
-                AND st.AMOUNT > 800
-            GROUP BY st.CUSTOMERID
-            HAVING COUNT(ci.INTERACTIONID) = 0
-            ORDER BY avg_amount DESC
-            LIMIT 5
-        """)
-
-        results = cursor.fetchall()
-        for row in results:
-            print(f"   Customer {row[0]}: {row[1]} high-value transactions (avg ${row[2]:.2f}), 0 recent interactions")
-
-        print("\n" + "=" * 50)
-        print("✅ OUTLIER VERIFICATION COMPLETE!")
-        print("These patterns are perfect for demonstrating:")
-        print("• Poor customer experience across channels")
-        print("• Missed revenue opportunities")
-        print("• Channel performance disparities")
-        print("• Need for AI-driven insights and next best actions")
-
-    except Exception as e:
-        print(f"❌ Error verifying outliers: {e}")
-        raise
-    finally:
-        cursor.close()
-        conn.close()
-
-if __name__ == "__main__":
-    verify_outliers()