Rajan Sharma committed on
Commit ddf056f · verified
1 Parent(s): a990f93

Update app.py

Files changed (1)
  1. app.py +4 -5
app.py CHANGED
@@ -49,16 +49,15 @@ You have access to a list of pandas dataframes loaded into a variable named `dfs
  --- END SCHEMA ---
 
  CRITICAL RULES FOR YOUR SCRIPT:
- 1. **ROBUST STRING CLEANING:** Before converting a string to a number (e.g., with `.astype(float)`), you MUST first remove ALL non-numeric characters that are not a digit or a decimal point. This includes characters like `$`, `%`, `~`, and commas. Use `.str.replace()` with a regular expression like `r'[^0-9.-]'` to do this safely. Failure to do this will cause a fatal `ValueError`.
  2. **CHECK COLUMN NAMES:** You MUST use the exact, case-sensitive column names provided in the DATA SCHEMA. A `KeyError` will cause a failure.
- 3. **USE THE DATAFRAMES:** Your script MUST use the `dfs` list to access the data.
- 4. **PRINT FINDINGS:** Use the `print()` function at each step to output your results as a formatted report.
 
  --- USER'S SCENARIO ---
  {user_scenario}
 
  --- PYTHON SCRIPT ---
- Now, write the complete Python script to be executed.
  ```python
  """
  generated_text = cohere_chat(prompt_for_coder)
@@ -106,7 +105,7 @@ def handle(user_msg: str, files: list) -> str:
  schema_context = "\n".join(schema_parts)
  analysis_script = _create_python_script(safe_in, schema_context)
 
- execution_namespace = {"dfs": dataframes, "pd": pd}
  output_buffer = io.StringIO()
 
  try:
 
  --- END SCHEMA ---
 
  CRITICAL RULES FOR YOUR SCRIPT:
+ 1. **ROBUST STRING CLEANING:** When you extract a SINGLE string value from a dataframe (e.g., using `.loc` or `.iloc`), you MUST clean it using the standard `re.sub()` function before converting it to a number. DO NOT use pandas' `.str` accessor on single strings, as it will cause a fatal `AttributeError`. For example: `my_string = health_indicators.loc[0, 'Value']` -> `cleaned_string = re.sub(r'[^0-9.-]', '', my_string)` -> `my_float = float(cleaned_string)`.
  2. **CHECK COLUMN NAMES:** You MUST use the exact, case-sensitive column names provided in the DATA SCHEMA. A `KeyError` will cause a failure.
+ 3. **PRINT FINDINGS:** Use the `print()` function at each step to output your results as a formatted report.
 
  --- USER'S SCENARIO ---
  {user_scenario}
 
  --- PYTHON SCRIPT ---
+ Now, write the complete Python script to be executed. The script MUST start with `import pandas as pd` and `import re`.
  ```python
  """
  generated_text = cohere_chat(prompt_for_coder)
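
The revised rule 1 separates a single string pulled out with `.loc`/`.iloc` from a whole column of strings. A minimal sketch of both cases follows, assuming a hypothetical `health_indicators` frame with a `Value` column (the names come from the prompt's own example; the data is invented for illustration and is not from app.py):

```python
import re

import pandas as pd

# Hypothetical data for illustration only; not taken from app.py.
health_indicators = pd.DataFrame({"Value": ["$1,234.50", "~45%", "-7.25"]})

# Scalar case: .loc returns a plain Python str, which has no .str accessor,
# so the standard-library re.sub() does the cleaning.
my_string = health_indicators.loc[0, "Value"]
cleaned_string = re.sub(r"[^0-9.-]", "", my_string)
my_float = float(cleaned_string)

# Column case: the .str accessor is valid on a Series of strings.
cleaned_column = (
    health_indicators["Value"]
    .str.replace(r"[^0-9.-]", "", regex=True)
    .astype(float)
)

print(my_float)
print(cleaned_column.tolist())
```

The pattern keeps every `.` and `-`, so a value such as `"3-5"` would pass through unchanged and still fail in `float()`; the sketch does not try to handle that case.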
 
  schema_context = "\n".join(schema_parts)
  analysis_script = _create_python_script(safe_in, schema_context)
 
+ execution_namespace = {"dfs": dataframes, "pd": pd, "re": re}
  output_buffer = io.StringIO()
 
  try:
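
The hunk stops at `try:`, so the diff does not show how the generated script is actually run. One common pattern, sketched here as an assumption and not as the contents of app.py, is to pass the namespace to `exec()` while redirecting stdout into the buffer. The sample `dataframes`, the `analysis_script` string, and the error message are all hypothetical stand-ins:

```python
import contextlib
import io
import re

import pandas as pd

# Stand-ins for values built elsewhere in app.py (hypothetical data).
dataframes = [pd.DataFrame({"Value": ["$10", "$32.5"]})]
analysis_script = """
total = sum(float(re.sub(r'[^0-9.-]', '', v)) for v in dfs[0]['Value'])
print(f"Total: {total}")
"""

# Names listed here are visible to the script without it importing them.
execution_namespace = {"dfs": dataframes, "pd": pd, "re": re}
output_buffer = io.StringIO()

try:
    # Capture everything the script prints instead of writing to the console.
    with contextlib.redirect_stdout(output_buffer):
        exec(analysis_script, execution_namespace)
    result = output_buffer.getvalue()
except Exception as exc:  # assumed behavior: report script errors as text
    result = f"Script failed: {exc!r}"

print(result)
```

Because `re` is already in the namespace, a script like this one runs whether or not it also begins with `import re`, which is presumably why the commit adds it alongside the new prompt instruction.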