CSV

yatabase1 committed on Mar 23, 2025

Commit baed98d (verified)

1 Parent(s): 03b8d2f

Update app.py

Files changed (1)

app.py +6 -1

@@ -69,6 +69,7 @@ def generate_basic_understanding_code(df_preview):
     - For numeric columns, summary statistics (mean, median, std, etc.).
     - For non-numeric columns, counts, unique values, mode, and frequency distributions.
     If charts are generated, ensure plt.show() is called after each chart so they can be captured.
+    Note: When converting dates, use pd.to_datetime() without a fixed format or with dayfirst=True.
     """
     prompt = f"""
 You are a data analysis expert. Write Python code that performs an exploratory analysis of the DataFrame.
@@ -78,6 +79,7 @@ Assign the exploratory summary to a variable named 'basic_info' as a dictionary.
 For each column in df, include its data type.
 - For numeric columns (use pd.api.types.is_numeric_dtype), include summary statistics (mean, median, std, etc.).
 - For non-numeric columns, treat them as categorical variables and include counts, unique values, mode, and frequency distributions.
+When converting date strings to datetime, use pd.to_datetime() without a fixed format or with dayfirst=True.
 If your analysis includes charts, call plt.show() after each chart so they can be captured.
 Note: The following safe built-ins are available: list, dict, set, tuple, abs, min, max, sum, len, range, print, pd, plt, __import__.
@@ -107,6 +109,7 @@ def generate_problem_solving_code(nl_query, df_preview, basic_info):
     The final analysis should be assigned to a variable named 'result' as a dictionary with keys:
     'summary', 'detailed_stats', 'insights', and 'chart_descriptions'.
     If charts are generated, call plt.show() after each chart so they can be captured.
+    Note: When converting date strings to datetime, use pd.to_datetime() without a fixed format or with dayfirst=True.
     """
     prompt = f"""
 You are a data analysis expert. Write Python code that performs the analysis as described below.
@@ -118,6 +121,7 @@ When processing the DataFrame, first inspect each column’s data type:
 - For non-numeric columns, treat them as categorical variables and compute appropriate descriptive statistics (counts, unique values, mode, and frequency distributions).
 - Only generate charts and tables that are relevant to the problem at hand. Exclude fields that are not relevant to the problem from the charts and tables.
 Incorporate insights from 'basic_info' if relevant.
+When converting date strings to datetime, use pd.to_datetime() without a fixed format or with dayfirst=True.
 If your analysis includes charts, call plt.show() after each chart so they can be captured.
 Note: The following safe built-ins are available: list, dict, set, tuple, abs, min, max, sum, len, range, print, pd, plt, __import__.
@@ -198,7 +202,8 @@ def safe_exec_code(code, df, capture_charts=True, interactive=False, extra_globa
                 output = safe_locals.get("basic_info", None)
         except Exception as ex:
             error_details = traceback.format_exc()
-            # Append a hint for KeyError related issues.
+            if "ValueError: time data" in error_details:
+                error_details += "\nHint: The generated code might be using a fixed datetime format. Consider using pd.to_datetime() without a fixed format or with dayfirst=True."
             if "KeyError" in error_details:
                 error_details += "\nHint: The generated code might be referencing columns that do not exist in your DataFrame."
             return f"An error occurred during code execution:\n{error_details}", safe_globals["charts"]
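
Review note: the last hunk matches substrings in the formatted traceback and appends a hint for each match. The sketch below isolates that step so it can be exercised without the rest of app.py. The names `HINTS`, `add_hints`, and `run_and_report` are hypothetical and do not appear in the commit. To stay dependency-free, the failure is triggered with the standard library's `datetime.strptime`, whose message also begins with "time data"; whether every pandas version words its `pd.to_datetime()` error the same way is an assumption the commit itself relies on.

```python
import traceback
from datetime import datetime

# Marker -> hint pairs copied from the diff. The table and the helpers
# below are hypothetical; app.py uses two inline `if` statements instead.
HINTS = [
    (
        "ValueError: time data",
        "\nHint: The generated code might be using a fixed datetime format. "
        "Consider using pd.to_datetime() without a fixed format or with dayfirst=True.",
    ),
    (
        "KeyError",
        "\nHint: The generated code might be referencing columns that do not "
        "exist in your DataFrame.",
    ),
]


def add_hints(error_details: str) -> str:
    """Append every hint whose marker substring occurs in the traceback text."""
    for marker, hint in HINTS:
        if marker in error_details:
            error_details += hint
    return error_details


def run_and_report(snippet: str) -> str:
    """Execute a snippet; return 'ok' or the traceback with hints appended."""
    try:
        exec(snippet, {"datetime": datetime})
    except Exception:
        return add_hints(traceback.format_exc())
    return "ok"


# A day-first date string parsed with a month-first format fails in strptime.
bad_date_report = run_and_report('datetime.strptime("23/03/2025", "%m/%d/%Y")')
# Looking up a missing key stands in for referencing a missing DataFrame column.
bad_key_report = run_and_report('{"a": 1}["b"]')

print(bad_date_report)
```

Because the match is a plain substring test on the traceback text, the hint fires only when the exception message keeps the "time data" wording. A check on the exception type would be less sensitive to message changes, but it would also fire for unrelated `ValueError`s.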