VEDAGI1 committed · Commit 4280841 · verified · 1 Parent(s): c6e557d

Update app.py

Files changed (1)
  1. app.py +2 -2
app.py CHANGED
@@ -677,7 +677,7 @@ When writing your script, you MUST follow these expert analytical principles:
  **DATA INTEGRATION & LINKING:**
  1. When linking datasets, identify the correct join keys by examining column names and values. Never assume column names match across datasets.
  2. If a required column doesn't exist in a dataset, derive it from related data or clearly note its absence in the output.
- 3. Use the most recent/relevant time period data when multiple periods exist (e.g., prefer 2021 over 2013 census data if both available).
+ 3. **DATA RECENCY IS CRITICAL:** Always prefer the most recent data when multiple time periods exist. If you have both 2013 and 2021 data, use 2021 data as the PRIMARY factor in any ranking or prioritization. Older data should only supplement, not override, recent data.
 
  **AGGREGATION & GROUPING:**
  4. When asked about "specialties," "categories," or "types," group by the broadest categorical column first (e.g., 'Specialty' not 'Procedure').
@@ -685,7 +685,7 @@ When writing your script, you MUST follow these expert analytical principles:
  6. Always verify the appropriate level of aggregation matches the user's question.
 
  **PRIORITIZATION & RANKING:**
- 7. To prioritize locations/facilities, create a composite risk score combining: (a) population/volume, (b) relevant health indicators, and (c) recency of data.
+ 7. To prioritize locations/facilities, create a composite score using: (a) most recent population/membership data as PRIMARY weight (60-70%), (b) health risk indicators as SECONDARY weight (30-40%). Recent data reflects current reality better than historical data.
  8. When ranking, consider both absolute values AND relative performance against benchmarks (provincial/national averages).
  9. Include sample sizes/record counts alongside rankings to indicate statistical reliability.
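
For reviewers, a rough sketch of what a script following the revised principles 3 and 7 might do. This is not code from app.py. The record fields (`location`, `year`, `population`, `risk`), the helper names, the min-max normalization, and the 0.65/0.35 split (one point inside the 60-70% / 30-40% ranges) are all assumptions made for illustration.

```python
# Hypothetical illustration of principles 3 and 7; not taken from app.py.

def latest_per_location(records):
    """Principle 3: keep only the most recent year's record per location."""
    latest = {}
    for rec in records:
        loc = rec["location"]
        if loc not in latest or rec["year"] > latest[loc]["year"]:
            latest[loc] = rec
    return latest


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to 0.0 (assumed choice)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def composite_ranking(records, population_weight=0.65, risk_weight=0.35):
    """Principle 7: weighted score, population primary, risk secondary."""
    latest = latest_per_location(records)
    locations = sorted(latest)
    pop = min_max([latest[loc]["population"] for loc in locations])
    risk = min_max([latest[loc]["risk"] for loc in locations])
    scores = {
        loc: population_weight * p + risk_weight * r
        for loc, p, r in zip(locations, pop, risk)
    }
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data: location A has both a 2013 and a 2021 record.
sample = [
    {"location": "A", "year": 2013, "population": 900, "risk": 0.9},
    {"location": "A", "year": 2021, "population": 100, "risk": 0.2},
    {"location": "B", "year": 2021, "population": 500, "risk": 0.5},
    {"location": "C", "year": 2021, "population": 300, "risk": 0.8},
]
ranking = composite_ranking(sample)
```

In this sample the 2013 record for A would have dominated on both population and risk, but only the 2021 record enters the score, which is the behavior the new wording of principle 3 asks for. Note that the removed "(c) recency of data" term no longer appears as a weighted component; recency acts as a filter instead.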