Spaces: k-pavlo/excel-ai-analyzer
Pavlo Kostianov committed on Sep 13

Commit

a6bbf7e

1 Parent(s): 8d3574a

Update loader function to load whole Budget sheet and improve prompt

Files changed (1)

app.py +22 -23

@@ -25,24 +25,13 @@ df2_all = pd.read_excel(
 df1 = {}
 for sheet, raw_df in df1_all.items():
     if sheet == "Budget":
-        # Special preprocess for Budget sheet
-        budget_raw = pd.read_excel(
-            os.path.join("data_source", "OC Onboarding Information.xlsx"),
-            sheet_name="Budget",
-            header=None
-        )
-        # Part 1: detailed breakdown (rows 3–20 with headers at row 3)
-        budget_details = pd.read_excel(
+        # Load rows 3–33 with headers at row 3
+        df1[sheet] = pd.read_excel(
             os.path.join("data_source", "OC Onboarding Information.xlsx"),
             sheet_name="Budget",
             header=2,
-            nrows=17  # stop before totals
+            nrows=31  # rows 3–33 inclusive
         )
-        # Part 2: summary section (rows 21+)
-        budget_summary = budget_raw.iloc[20:].reset_index(drop=True)
-        # Add into df1 dict
-        df1["Budget_Details"] = budget_details
-        df1["Budget_Summary"] = budget_summary
     elif sheet == "Rooms per category":
         # use row 4 as header
         df1[sheet] = pd.read_excel(
@@ -76,14 +65,24 @@ for sheet, raw_df in df2_all.items():
 def get_schema_info():
     lines = ["Report 1 - OC Onboarding Information:"]
     for sheet, df in df1.items():
-        lines.append(f"Sheet: {sheet}, Columns: {list(df.columns)}")
-        sample = df.head(1).to_dict(orient="records")[0]
-        lines.append(f"Example row: {sample}")
+        if sheet == "Budget":
+            lines.append(f'Sheet: {sheet} (rows 3–33). This sheet contains the full hotel budget, including both detailed line items (by channel/segment) and summary metrics (total revenue, occupancy %, RevPAR, capacity).')
+        else:
+            lines.append(f"Sheet: {sheet}, Columns: {list(df.columns)}")
+        # Add one example row for context
+        try:
+            sample = df.head(1).to_dict(orient="records")[0]
+            lines.append(f"Example row: {sample}")
+        except Exception:
+            lines.append("Example row: [no data available]")
     lines.append("\nReport 2 - The Alex Ideas Report:")
     for sheet, df in df2.items():
         lines.append(f"Sheet: {sheet}, Columns: {list(df.columns)}")
-        sample = df.head(1).to_dict(orient="records")[0]
-        lines.append(f"Example row: {sample}")
+        try:
+            sample = df.head(1).to_dict(orient="records")[0]
+            lines.append(f"Example row: {sample}")
+        except Exception:
+            lines.append("Example row: [no data available]")
     return "\n".join(lines)
 schema_info = get_schema_info()
@@ -127,19 +126,19 @@ If absolutely no relation to provided sheets, respond with:
 The reports are OC Onboarding Information and The Alex Ideas Report.
 OC Onboarding Information (df1) is the initial hotel data including a budget set by the hotel for the year.
-- "Budget_Details": line-item breakdown (channels, rates, revenue, etc.)
-- "Budget_Summary": summary rows (total occupancy, RevPar for each month, etc.)
 The Alex Ideas Report (df2) is the real reports about revenue of the hotel for the year.
-They have the following schema:
+They have the following schema (sheet names, columns, and example rows, but Budget sheet is fully loaded):
 {schema_info}
 The user asked:
 {message}
 Rules:
-- Use only pandas, df1, df2, and Python built-ins.
+- Use ONLY pandas, df1, df2, and Python built-ins.
 - Do NOT write import statements (pandas is already imported as pd).
+- Access all dataframes ONLY as df1["SheetName"] or df2["SheetName"].
+  Never assign them to new variables like budget_df, property_df, etc.
 - Always put the answer in a variable named `result`.
 - Return ONLY Python code, nothing else.
 - If multiple values are tied for the maximum, include all of them in a list.
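
The rules in the prompt above only ask the model to comply. The diff does not show how app.py checks or executes the returned code, so the following is a minimal sketch, not the app's implementation, of how three of those rules (no imports, no aliasing of df1/df2 sheets, an assignment to `result`) could be checked before execution. The helper names `check_generated_code` and `_is_sheet_lookup` are hypothetical, and the sketch uses only the standard-library `ast` module.

```python
import ast


def _is_sheet_lookup(node):
    """True for expressions of the form df1[...] or df2[...] (hypothetical helper)."""
    return (
        isinstance(node, ast.Subscript)
        and isinstance(node.value, ast.Name)
        and node.value.id in {"df1", "df2"}
    )


def check_generated_code(code):
    """Return a list of prompt-rule violations found in generated code (hypothetical helper)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"not valid Python: {exc.msg}"]

    problems = []
    assigned = set()
    for node in ast.walk(tree):
        # Rule: do NOT write import statements.
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("contains an import statement")
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if not isinstance(target, ast.Name):
                    continue
                assigned.add(target.id)
                # Rule: never assign a sheet to a new variable such as budget_df.
                if target.id != "result" and _is_sheet_lookup(node.value):
                    problems.append(f"aliases a sheet as {target.id!r}")
    # Rule: always put the answer in a variable named `result`.
    if "result" not in assigned:
        problems.append("never assigns to `result`")
    return problems


# The column name "Revenue" is made up for illustration only.
print(check_generated_code('result = df1["Budget"]["Revenue"].sum()'))
print(check_generated_code('import os\nbudget_df = df1["Budget"]\ntotal = budget_df.sum()'))
```

A static check like this only catches the simple cases covered by the rules. It does not make running model-generated code safe, so any real execution step would still need its own isolation.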