Spaces: VEDAGI1/Medica_DecisionSupportAI
Rajan Sharma committed on Oct 6

Commit c578f08 (verified) · 1 Parent(s): 2cfdf5a

Update app.py


Files changed (1)

app.py +27 -14


@@ -11,7 +11,7 @@ import gradio as gr
 import pandas as pd
 from datetime import datetime
 import regex as re2
-import re
 # --- BACKEND IMPORTS ---
 from langchain_cohere import ChatCohere
@@ -44,26 +44,36 @@ def _sanitize_text(s: str) -> str:
 def _create_python_script(user_scenario: str, schema_context: str) -> str:
     """Asks the AI to write a Python script that outputs raw, structured JSON."""
-    # --- THE FINAL ALIGNMENT FIX IS HERE ---
     prompt_for_coder = f"""
-You are an expert Python data scientist. Your job is to write a script to analyze the provided data and print the findings as a single JSON object.
---- DATA CONTEXT ---
-The data is pre-loaded into a Python list of pandas DataFrames called `dfs`.
 {schema_context}
---- END DATA CONTEXT ---
 CRITICAL RULES:
-1.  **DO NOT READ FILES:** You MUST NOT include `pd.read_csv`. The data is already in the `dfs` variable. You MUST use this variable.
-2.  **JSON OUTPUT ONLY:** Your script's ONLY output must be a single JSON object printed to stdout.
-3.  **JSON SERIALIZATION:** Before adding data to your final dictionary for JSON conversion, you MUST convert any pandas-specific types (like `int64` or `float64`) to standard Python types using `.item()` for single values or `.tolist()` for lists. For example: `my_count = df['column'].count().item()`. Failure to do this will cause a fatal `TypeError`.
-4.  **BE PRECISE:** Use the exact, case-sensitive column names from the schema and robustly clean strings (`re.sub()`) before converting them to numbers.
 --- USER'S SCENARIO ---
 {user_scenario}
 --- PYTHON SCRIPT ---
-Now, write the complete Python script that analyzes the `dfs` variable and prints a single, serializable JSON object.
 ```python
 """
     generated_text = cohere_chat(prompt_for_coder)
@@ -91,9 +101,12 @@ def _generate_long_report(prompt: str) -> str:
 def _generate_final_report(user_scenario: str, raw_data_json: str) -> str:
     """Asks the AI to act as a consultant and write a polished report from the raw data."""
     prompt_for_writer = f"""
-You are an expert management consultant. A data science script has extracted key findings. Your task is to synthesize these findings into a professional report that answers the user's questions.
---- USER'S ORIGINAL SCENARIO ---
 {user_scenario}
 --- END SCENARIO ---
@@ -105,7 +118,7 @@ Now, write the final, polished report. The report MUST:
 1.  Follow the "Expected Output Format" requested by the user.
 2.  Use tables, bullet points, and DETAILED narrative justifications for each recommendation.
 3.  Synthesize the raw data into actionable insights. Do not just copy the raw numbers; interpret them.
-4.  Ensure you fully address ALL evaluation questions.
 """
     return _generate_long_report(prompt_for_writer)

 import pandas as pd
 from datetime import datetime
 import regex as re2
+import re  # Standard library regex module
 # --- BACKEND IMPORTS ---
 from langchain_cohere import ChatCohere
 def _create_python_script(user_scenario: str, schema_context: str) -> str:
     """Asks the AI to write a Python script that outputs raw, structured JSON."""
+    # --- THE FINAL ALIGNMENT AND BUG FIX IS HERE ---
+    EXPERT_ANALYTICAL_GUIDELINES = """
+--- EXPERT ANALYTICAL GUIDELINES ---
+When writing your script, you MUST follow these expert business rules:
+1.  **Linking Datasets Rule:** If you need to connect facilities to health zones, you cannot assume the zone is in the facility list. You must first identify the high-priority zone from the beds data, and then find the major city (by facility count) in the facility list, and *then* assess that city's capacity. Do not try to filter the facility list by a 'zone' column if it does not exist in the schema.
+2.  **Prioritization Rule:** To prioritize locations, you MUST combine the most recent population data with specific high-risk health indicators to create a multi-factor risk score.
+3.  **Capacity Calculation Rule:** For capacity over a 3-month window, assume **60 working days**.
+4.  **Cost Calculation Rule:** Sum 'Startup cost' and 'Ongoing cost' per person before multiplying.
+"""
     prompt_for_coder = f"""
+You are an expert Python data scientist. Your job is to write a script to extract the data needed to answer the user's request.
+You have dataframes in a list `dfs`.
+{EXPERT_ANALYTICAL_GUIDELINES}
+--- DATA SCHEMA ---
 {schema_context}
+--- END SCHEMA ---
 CRITICAL RULES:
+1.  Your script's ONLY output should be a single JSON object printed to stdout containing the raw data findings.
+2.  Use the exact, case-sensitive column names from the schema.
+3.  Before converting strings to numbers, you MUST robustly clean them of all non-numeric characters (e.g., $, %, ~) using `re.sub()`.
 --- USER'S SCENARIO ---
 {user_scenario}
 --- PYTHON SCRIPT ---
+Now, write the complete Python script that performs the analysis and prints a single JSON object with the results.
 ```python
 """
     generated_text = cohere_chat(prompt_for_coder)
 def _generate_final_report(user_scenario: str, raw_data_json: str) -> str:
     """Asks the AI to act as a consultant and write a polished report from the raw data."""
     prompt_for_writer = f"""
+You are an expert management consultant and data analyst.
+A data science script has run to extract key findings. You have the user's original request and the raw JSON data.
+Your task is to synthesize these raw findings into a single, comprehensive, and professional report that directly answers all of the user's questions with detailed justifications.
+--- USER'S ORIGINAL SCENARIO & DELIVERABLES ---
 {user_scenario}
 --- END SCENARIO ---
 1.  Follow the "Expected Output Format" requested by the user.
 2.  Use tables, bullet points, and DETAILED narrative justifications for each recommendation.
 3.  Synthesize the raw data into actionable insights. Do not just copy the raw numbers; interpret them.
+4.  Ensure you fully address ALL evaluation questions, especially the final recommendations.
 """
     return _generate_long_report(prompt_for_writer)
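
Note on the new prompt rules: the updated coder prompt asks the generated script to clean strings with `re.sub()` before numeric conversion, to assume 60 working days for a 3-month window, and to sum 'Startup cost' and 'Ongoing cost' per person before multiplying. The following is a minimal sketch of what a compliant script might do, not code from this commit. The helper `to_number`, the constant name, and all input values are hypothetical and invented for illustration.

```python
import json
import re

# Taken from the Capacity Calculation Rule in the prompt (3-month window).
WORKING_DAYS_PER_QUARTER = 60


def to_number(raw):
    """Strip non-numeric characters (e.g. $, %, ~, commas) and parse a float.

    Returns None when nothing parseable remains.
    """
    cleaned = re.sub(r"[^0-9.\-]", "", str(raw))
    try:
        return float(cleaned)
    except ValueError:
        return None


# Hypothetical inputs; real values would come from the `dfs` DataFrames.
startup_cost = to_number("$1,200")
ongoing_cost = to_number("~$300.50")
people = 40
daily_throughput = 25

findings = {
    # Cost Calculation Rule: sum the per-person costs first, then multiply.
    "cost_per_person": startup_cost + ongoing_cost,
    "total_cost": (startup_cost + ongoing_cost) * people,
    # Capacity Calculation Rule: daily throughput over 60 working days.
    "capacity_3_months": daily_throughput * WORKING_DAYS_PER_QUARTER,
}

# Single JSON object on stdout, as the CRITICAL RULES require.
print(json.dumps(findings))
```

The sketch uses only plain Python numbers, so `json.dumps` works directly. A script that builds its findings from pandas aggregates would still need to convert values such as `int64` to built-in types, as the removed serialization rule described, since the new prompt no longer states this.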