VEDAGI1 committed
Commit d128705 (verified) · 1 Parent(s): f5c4f68

Update app.py

Files changed (1)
  1. app.py +1 -0
app.py CHANGED
@@ -726,6 +726,7 @@ CRITICAL RULES:
  9. **SAFE ITERATION:** When iterating over mixed data structures, always check types before accessing attributes. Not all list items are dicts (some may be strings), not all values have `.items()`.
  10. **KEY-VALUE DATA PATTERN:** Many healthcare datasets use key-value format (e.g., columns: 'Indicator'/'Value' or 'Metric'/'Amount'). To extract a specific value, filter rows by the key column, then access the value column: `df.loc[df['Indicator'] == 'Cost per client', 'Value'].iloc[0]`
  11. **CONVERT STRINGS BEFORE MATH:** Always clean and convert strings to float/int BEFORE performing arithmetic. Use `re.sub(r'[^\\d.]', '', value)` to strip currency symbols ($), percentage signs (%), commas, and other non-numeric characters. For ranges like "8–10", split first, clean each part, convert to float, then calculate: `parts = text.split('–'); avg = (float(re.sub(r'[^\\d.]', '', parts[0])) + float(re.sub(r'[^\\d.]', '', parts[1]))) / 2`
+ 12. **SCALAR VS VECTORIZED:** When applying a cleaning function to DataFrame columns, use `.apply()` for element-wise operations: `df['col'].apply(clean_func)`. Do NOT pass a Series to a function expecting a single value. For a single extracted value, use `.iloc[0]` to get the scalar before processing.
 
  --- USER'S SCENARIO ---
  {user_scenario}
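
Review note: a minimal sketch of the kind of generated code that rules 10 to 12 appear to aim for, assuming pandas is installed. The helper names `clean_number` and `range_midpoint` and the sample DataFrame are hypothetical illustrations and are not taken from app.py. The sketch shows a scalar-only cleaning function being applied to one extracted value via `.iloc[0]` and to a whole column via `.apply()`.

```python
import re

import pandas as pd


def clean_number(value):
    """Convert ONE value such as '$1,200' or '45%' to float (hypothetical helper).

    Expects a scalar, not a Series. Note that the pattern from rule 11
    also strips minus signs, so negative values lose their sign.
    """
    if isinstance(value, (int, float)):
        return float(value)
    cleaned = re.sub(r"[^\d.]", "", str(value))
    return float(cleaned) if cleaned else float("nan")


def range_midpoint(text):
    """Average the parts of a range like '8–10': split first, then clean each part."""
    parts = text.split("–")
    return sum(clean_number(part) for part in parts) / len(parts)


# Hypothetical key-value dataset in the 'Indicator'/'Value' layout from rule 10.
df = pd.DataFrame({
    "Indicator": ["Cost per client", "Retention rate", "Visits per month"],
    "Value": ["$1,200", "45%", "8–10"],
})

# Rules 10 and 12: filter by key, take the scalar with .iloc[0], then clean it.
raw_cost = df.loc[df["Indicator"] == "Cost per client", "Value"].iloc[0]
cost = clean_number(raw_cost)

# Rule 12: for a whole column, apply the scalar function element by element.
amounts = pd.Series(["$1,200", "$350.50", "2,000"]).apply(clean_number)

# Rule 11: ranges are split before cleaning, otherwise '8–10' would become '810'.
raw_visits = df.loc[df["Indicator"] == "Visits per month", "Value"].iloc[0]
visits = range_midpoint(raw_visits)
```

Calling `clean_number(df["Value"])` directly would stringify the whole Series instead of cleaning each element, which is the failure mode rule 12 warns against.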