Rajan Sharma committed on
Commit a3c9eb2 · verified · 1 Parent(s): 112ad16

Update app.py

Files changed (1)
  1. app.py +7 -13
app.py CHANGED
@@ -10,9 +10,6 @@ import gradio as gr
 import pandas as pd
 from datetime import datetime
 
-# --- THE FINAL FIX IS HERE: Re-introducing the missing import ---
-import regex as re2
-
 # --- BACKEND IMPORTS ---
 from langchain_cohere import ChatCohere
 
@@ -37,11 +34,11 @@ def load_markdown_text(filepath: str) -> str:
 
 def _sanitize_text(s: str) -> str:
     if not isinstance(s, str): return s
-    # This now works because 're2' is defined from the import above
     return re2.sub(r'[\p{C}--[\n\t]]+', '', s)
 
 def _create_python_script(user_scenario: str, schema_context: str) -> str:
     """Uses an LLM to act as an "AI Coder", writing a complete Python script."""
+    # --- THE FINAL PROMPT FIX IS HERE ---
     prompt_for_coder = f"""
 You are an expert Python data scientist. Your sole job is to write a single, complete, and executable Python script to answer the user's request.
 You have access to a list of pandas dataframes loaded into a variable named `dfs`.
@@ -50,15 +47,12 @@ You have access to a list of pandas dataframes loaded into a variable named `dfs
 {schema_context}
 --- END SCHEMA ---
 
-CRITICAL RULE: You MUST use the exact column names provided in the DATA SCHEMA. Column names are case-sensitive. Pay close attention to capitalization (e.g., 'Zone' vs 'zone'). A KeyError will cause a failure.
-
-Based on the user's scenario below, write a single Python script that performs the entire analysis.
-
-RULES FOR YOUR SCRIPT:
-1. **Use the DataFrames:** Your script MUST use the `dfs` list and the exact column names from the schema.
-2. **Print Your Findings:** Use the `print()` function at each step to output the results as a formatted report.
-3. **No Placeholders:** Do not use placeholder data.
-4. **Self-Contained:** The script must be entirely self-contained, starting with `import pandas as pd`.
+CRITICAL RULES FOR YOUR SCRIPT:
+1. **HANDLE DATA TYPES:** Before performing any mathematical operations (like addition or division), you MUST explicitly convert string values (e.g., '5.5%', '$100') to numeric types (`float` or `int`). Failure to do this will cause a fatal `TypeError`.
+2. **CHECK COLUMN NAMES:** You MUST use the exact, case-sensitive column names provided in the DATA SCHEMA. A `KeyError` will cause a failure.
+3. **USE THE DATAFRAMES:** Your script MUST use the `dfs` list to access the data.
+4. **PRINT FINDINGS:** Use the `print()` function at each step to output your results as a formatted report.
+5. **NO PLACEHOLDERS:** Do not use placeholder data.
 
 --- USER'S SCENARIO ---
 {user_scenario}
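
Review note on the new HANDLE DATA TYPES rule: the prompt asks the generated script to convert strings such as '5.5%' and '$100' to numbers before doing arithmetic. A minimal sketch of what that conversion might look like is below. The helper name `to_number` and the sample values are hypothetical and do not appear in app.py; the sketch uses plain Python so it runs without pandas.

```python
import re

def to_number(value):
    """Hypothetical helper: turn '5.5%' or '$100' into a float.

    Numbers pass through unchanged (as float). A percent string keeps its
    face value, so '5.5%' becomes 5.5, not 0.055.
    """
    if isinstance(value, (int, float)):
        return float(value)
    # Drop currency symbols, percent signs, thousands separators, whitespace.
    cleaned = re.sub(r"[,$%\s]", "", str(value))
    return float(cleaned)

# Made-up sample values, standing in for string columns in `dfs`.
rates = ["5.5%", "4.5%"]
prices = ["$100", "$1,250.50"]

average_rate = sum(to_number(r) for r in rates) / len(rates)
total_price = sum(to_number(p) for p in prices)

print(average_rate)  # 5.0
print(total_price)   # 1350.5
```

On a pandas column the same helper could be applied with `Series.map(to_number)`. Whether the generated scripts should do it this way or with `pd.to_numeric` after stripping symbols is a design choice the prompt leaves open.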
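
Review note on the first hunk: the commit removes `import regex as re2`, but `_sanitize_text` still calls `re2.sub(...)`. Unless `re2` is bound somewhere outside the lines shown here, that call would raise a `NameError` at runtime. In addition, the `--` set difference in the pattern appears to be version-1 behaviour in the `regex` package, so the pattern may also need the `(?V1)` flag to work as intended. The sketch below is one possible dependency-free alternative, not the author's method: it swaps the regex for the standard-library `unicodedata` module and drops every character whose Unicode general category starts with "C" (control, format, and similar), except newline and tab. The name `sanitize_text` is a hypothetical stand-in.

```python
import unicodedata

def sanitize_text(s):
    """Hypothetical stand-in for _sanitize_text without the `regex` package.

    Removes characters in Unicode general categories C* (Cc, Cf, Cs, Co, Cn)
    while keeping newline and tab. Non-string input is returned unchanged.
    """
    if not isinstance(s, str):
        return s
    return "".join(
        ch for ch in s
        if ch in "\n\t" or not unicodedata.category(ch).startswith("C")
    )

# NUL (Cc) and ZERO WIDTH SPACE (Cf) are removed; tab and newline survive.
print(repr(sanitize_text("a\x00b\u200b\tc\n")))  # 'ab\tc\n'
```

Restoring the removed import would be the smaller change. The stdlib version trades some speed on long strings for one fewer third-party dependency.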