Commit 1ad58f5 by Donlagon007 · verified · 1 Parent(s): d4390ce

Upload 4 files

Files changed (4)
  1. README.md +148 -12
  2. hypertension_model_fixed2.py +702 -0
  3. personalized_ht4.py +1440 -0
  4. requirements.txt +8 -2
README.md CHANGED
@@ -1,20 +1,156 @@
  ---
- title: Personalized Ht
- emoji: 🚀
  colorFrom: red
- colorTo: red
- sdk: docker
- app_port: 8501
- tags:
- - streamlit
  pinned: false
- short_description: Streamlit template space
  license: mit
  ---

- # Welcome to Streamlit!

- Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:

- If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
- forums](https://discuss.streamlit.io).
  ---
+ title: Hypertension Personalized Cost-Effectiveness Analysis
+ emoji: ❤️
  colorFrom: red
+ colorTo: pink
+ sdk: streamlit
+ sdk_version: "1.40.0"
+ app_file: personalized_ht4.py
  pinned: false
  license: mit
  ---

+ # ❤️ Hypertension Personalized Cost-Effectiveness Analysis

+ An AI-powered cost-effectiveness analysis tool for hypertension management with personalized recommendations using LangChain and OpenAI.

+ ## 🌟 Features
+
+ ### 1. **AI Assistant**
+ - Interactive chatbot to help understand hypertension analysis
+ - Answers questions about Markov models, ICER, QALYs, and risk factors
+
+ ### 2. **Hypertension Progression Analysis**
+ - Visualize disease progression over time
+ - 4-state Markov model: Normal → Prehypertension → Stage 1 HTN → Stage 2 HTN
+ - Calculate lifetime risk and 5/10-year progression probabilities
+
+ ### 3. **Intervention Comparison**
+ - Compare multiple interventions side-by-side
+ - Available interventions:
+   - Weight loss (BMI reduction)
+   - Waist circumference reduction
+   - Smoking cessation
+   - Alcohol reduction
+   - Regular exercise
+   - Uric acid management
+   - Cholesterol control
+   - Glucose management
+
+ ### 4. **Cost-Effectiveness Analysis**
+ - **Deterministic CEA**: Point estimates for costs and QALYs
+ - **Probabilistic CEA**: Monte Carlo simulations with uncertainty
+ - ICER calculations with WTP threshold analysis
+ - Cost-effectiveness plane and CEAC curves
+
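As a rough illustration of what a CEAC summarizes (this is a sketch, not the commit's implementation), the snippet below counts, for each willingness-to-pay value, the share of probabilistic draws whose incremental net monetary benefit is positive. The function name `ceac`, the normal distributions, and every number are assumptions made for this example.

```python
import random


def ceac(delta_costs, delta_qalys, wtp_grid):
    """Return, for each WTP value, the share of draws with positive incremental NMB."""
    n = len(delta_costs)
    curve = []
    for wtp in wtp_grid:
        # Incremental net monetary benefit for one draw: WTP * dQALY - dCost
        wins = sum(1 for dc, dq in zip(delta_costs, delta_qalys) if wtp * dq - dc > 0)
        curve.append(wins / n)
    return curve


# Hypothetical PSA draws (illustrative distributions, not taken from the model)
rng = random.Random(0)
d_costs = [rng.gauss(500.0, 200.0) for _ in range(2000)]
d_qalys = [rng.gauss(0.05, 0.03) for _ in range(2000)]
wtp_grid = [i * 10_000 for i in range(11)]  # $0 to $100,000 per QALY

curve = ceac(d_costs, d_qalys, wtp_grid)
for wtp, p in zip(wtp_grid, curve):
    print(f"WTP ${wtp:>7,}: P(cost-effective) = {p:.2f}")
```

The net-benefit rule used here also credits draws that save money while losing a small amount of health, whereas the `plot_ceac` function in this commit counts only draws with ΔQ > 0, so the two curves can differ slightly.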
+ ### 5. **Personalized Health Chat**
+ - AI health coach with personalized recommendations
+ - Evidence-based advice for lifestyle modifications
+ - Patient-specific risk factor management
+
+ ### 6. **Patient Database Management**
+ - Upload CSV/Excel files with patient data
+ - Quick retrieval by Patient ID
+ - Optional vector store for semantic search
+
+ ## Requirements
+
+ ### OpenAI API Key (Required)
+ This application requires an **OpenAI API key** to enable AI features:
+ 1. Get your API key from [OpenAI Platform](https://platform.openai.com/api-keys)
+ 2. Enter it in the text box at the top right of the app
+ 3. AI Assistant and Personalized Chat features will be enabled
+
+ ## 📊 Input Parameters
+
+ ### Patient Demographics
+ - Sex (Male/Female)
+ - Age
+ - Education level
+
+ ### Anthropometrics
+ - BMI (kg/m²)
+ - Waist circumference (cm)
+
+ ### Laboratory Values
+ - Fasting glucose (mg/dL)
+ - Total cholesterol (mg/dL)
+ - Uric acid (mg/dL)
+
+ ### Lifestyle Factors
+ - Smoking status
+ - Alcohol consumption
+ - Exercise frequency
+ - Betel nut chewing (for males)
+ - Family history of hypertension
+
+ ## 🎯 How to Use
+
+ 1. **Enter your OpenAI API key** in the top-right corner
+ 2. **Input patient information** in the left sidebar:
+    - Option A: Upload a CSV/Excel file with patient data
+    - Option B: Manually enter patient characteristics
+ 3. **Explore different tabs**:
+    - Start with "AI Assistant" to learn about the tool
+    - View "Hypertension Progression" for baseline risk
+    - Compare interventions in "Intervention Comparison"
+    - Run detailed CEA in the analysis tabs
+ 4. **Get personalized recommendations** in "Personalized Chat"
+
+ ## 📈 Cost-Effectiveness Metrics
+
+ - **ICER**: Incremental Cost-Effectiveness Ratio ($/QALY)
+ - **QALY**: Quality-Adjusted Life Years
+ - **NNT**: Number Needed to Treat
+ - **WTP**: Willingness-to-Pay threshold (default: $50,000/QALY)
+
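To make the relationship between ICER, QALYs, and the WTP threshold concrete, here is a minimal sketch. The helper name `summarize_icer` and the input numbers are hypothetical; only the $50,000/QALY default comes from the README.

```python
def summarize_icer(cost_a, qaly_a, cost_b, qaly_b, wtp=50_000.0):
    """Compare strategy B (intervention) against strategy A (baseline)."""
    d_cost = cost_b - cost_a
    d_qaly = qaly_b - qaly_a
    # Net monetary benefit avoids the sign ambiguity of a raw ratio
    nmb = wtp * d_qaly - d_cost

    if d_qaly > 0 and d_cost <= 0:
        verdict = "dominant"            # more health for no extra cost
    elif d_qaly <= 0 and d_cost >= 0:
        verdict = "dominated"           # no health gain and no saving (ties included)
    elif nmb > 0:
        verdict = "cost-effective"
    else:
        verdict = "not cost-effective"

    # The ratio is undefined when there is no QALY difference
    ratio = d_cost / d_qaly if d_qaly != 0 else None
    return {"delta_cost": d_cost, "delta_qaly": d_qaly,
            "icer": ratio, "nmb": nmb, "verdict": verdict}


# Hypothetical totals: baseline $10,000 / 7.50 QALYs, intervention $11,000 / 7.75 QALYs
result = summarize_icer(10_000.0, 7.50, 11_000.0, 7.75)
print(result)
```

With these illustrative inputs the intervention costs $1,000 more for 0.25 extra QALYs, an ICER of $4,000/QALY, which sits below the default threshold.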
+ ## 🔬 Model Details
+
+ ### Markov Model Structure
+ - **4 Health States**: Normal BP, Prehypertension, Stage 1 HTN, Stage 2 HTN
+ - **Transitions**: Based on Cox proportional hazards models
+ - **Risk Factors**: Gender-specific beta coefficients with standard errors
+ - **Time Horizon**: Configurable (5-30 years)
+ - **Discounting**: 3% annual discount rate (adjustable)
+
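The structure above can be sketched as a small cohort simulation. The annual transition matrix `P` below is hypothetical; the per-state costs and utilities are the demo values used in `main_simulation` in this commit, and the first cycle is left undiscounted, as in the commit's `run_markov`.

```python
STATES = ["Normal", "Prehypertension", "Stage 1", "Stage 2"]

# Hypothetical annual transition probabilities (each row sums to 1)
P = [
    [0.90, 0.10, 0.00, 0.00],
    [0.05, 0.85, 0.10, 0.00],
    [0.00, 0.00, 0.88, 0.12],
    [0.00, 0.00, 0.00, 1.00],  # Stage 2 is absorbing
]
COSTS = [200.0, 600.0, 1200.0, 2200.0]  # annual cost per state
UTILS = [1.00, 0.90, 0.70, 0.50]        # annual utility per state


def run_cohort(P, costs, utils, start, cycles=10, discount=0.03):
    """Accumulate discounted cost and QALYs for a cohort over yearly cycles."""
    n = len(start)
    state = list(start)
    total_cost = 0.0
    total_qaly = 0.0
    for t in range(cycles):
        factor = 1.0 / (1.0 + discount) ** t  # cycle 0 is undiscounted
        total_cost += factor * sum(s * c for s, c in zip(state, costs))
        total_qaly += factor * sum(s * u for s, u in zip(state, utils))
        # Advance the cohort one cycle: new_state = state @ P
        state = [sum(state[i] * P[i][j] for i in range(n)) for j in range(n)]
    return total_cost, total_qaly, state


cost, qaly, final = run_cohort(P, COSTS, UTILS, [1.0, 0.0, 0.0, 0.0])
print(f"10-year discounted cost: ${cost:,.0f}, QALYs: {qaly:.3f}")
print({name: round(share, 3) for name, share in zip(STATES, final)})
```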
+ ### Interventions
+ Each intervention modifies specific risk factors to reduce hypertension progression:
+ - Lifestyle modifications (diet, exercise, smoking cessation)
+ - Clinical interventions (medication, monitoring)
+ - Combined approaches for personalized care
+
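As a sketch of how flipping one binary risk factor changes a transition hazard under a proportional-hazards form λ = λ0 · exp(Σ βx): the two coefficients and the 0.08 baseline below are copied from the men's Normal → Prehypertension entries in `hypertension_model_fixed2.py` in this commit, while the helper name `hazard` and the example patient are assumptions for illustration.

```python
import math

BASELINE_N2P = 0.08  # baseline Normal -> Prehypertension hazard (lam10_0 in the model file)

# Subset of the men's N2P log-hazard-ratio coefficients from the model file
BETAS_MEN_N2P = {"BMI_ge25": 0.5275, "Smoking_current": 0.1014}


def hazard(baseline, betas, features):
    """Scale a baseline hazard by exp(sum of beta * feature); missing features count as 0."""
    log_hr = sum(beta * features.get(name, 0) for name, beta in betas.items())
    return baseline * math.exp(log_hr)


# Hypothetical patient: overweight smoker, before and after a weight-loss intervention
before = hazard(BASELINE_N2P, BETAS_MEN_N2P, {"BMI_ge25": 1, "Smoking_current": 1})
after = hazard(BASELINE_N2P, BETAS_MEN_N2P, {"BMI_ge25": 0, "Smoking_current": 1})

print(f"hazard before: {before:.4f}, after: {after:.4f}, ratio: {after / before:.3f}")
```

Because the model is multiplicative, the ratio depends only on the coefficient of the factor that changed, here exp(-0.5275), regardless of the other features.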
+ ## 🛠️ Technical Stack
+
+ - **Frontend**: Streamlit
+ - **AI/ML**: LangChain, OpenAI GPT-4o-mini
+ - **Data Processing**: NumPy, Pandas
+ - **Visualization**: Matplotlib
+ - **Vector Store**: ChromaDB (for RAG)
+ - **Statistical Analysis**: SciPy
+
+ ## 📝 Citation
+
+ If you use this tool in your research, please cite:
+ ```
+ Hypertension Personalized Cost-Effectiveness Analysis Tool
+ AI-powered CEA with LangChain and OpenAI
+ [Year] [Institution]
+ ```
+
+ ## ⚠️ Disclaimer
+
+ This tool is for **educational and research purposes only**. It should not be used as a substitute for professional medical advice, diagnosis, or treatment. Always consult with qualified healthcare providers for medical decisions.
+
+ ## 📄 License
+
+ MIT License - See LICENSE file for details
+
+ ## 🤝 Support
+
+ For issues or questions:
+ - Open an issue on the GitHub repository
+ - Contact the development team
+
+ ---
+
+ **Powered by LangChain & OpenAI | Hypertension CEA Tool v2.0**
hypertension_model_fixed2.py ADDED
@@ -0,0 +1,702 @@
+ # -*- coding: utf-8 -*-
+ """
+ Hypertension Disease Progression and Intervention Effects Model
+ with Beta (SE) structure and stochastic PSA option
+ """
+
+ import numpy as np
+ import pandas as pd
+ from scipy.linalg import expm
+ import matplotlib.pyplot as plt
+ import io
+ import base64
+
+ # -----------------------------------------------------------
+ # 1) β coefficients (log-HR) + SE (from your provided table)
+ # -----------------------------------------------------------
+
+ # Transition names for display
+ TRANSITION_NAMES_EN = ["N→P", "P→S1", "S1→S2", "P→N"]
+
+ beta_men = {
+     "N2P": {  # Normal → Prehypertension (β1)
+         "Education_high": {"beta": -0.2950, "se": 0.1721},
+         "BMI_ge25": {"beta": 0.5275, "se": 0.1568},
+         "Waist_ge90": {"beta": 0.5040, "se": 0.2526},
+         "Fasting_glu_high": {"beta": -0.0164, "se": 0.3720},
+         "TC_ge200": {"beta": 0.1328, "se": 0.1385},
+         "UA_high": {"beta": 0.0398, "se": 0.1469},
+         "Smoking_current": {"beta": 0.1014, "se": 0.1455},
+         "Betel_current": {"beta": -0.3524, "se": 0.1871},
+         "Alcohol_current": {"beta": -0.2193, "se": 0.1458},
+         "Exercise_freq": {"beta": 0.3028, "se": 0.1754},
+         "FHx_yes": {"beta": 0.0896, "se": 0.1786}
+     },
+     "P2S1": {  # Prehypertension → Stage 1 hypertension (β2)
+         "Education_high": {"beta": -0.1105, "se": 0.1079},
+         "BMI_ge25": {"beta": 0.1272, "se": 0.1062},
+         "Waist_ge90": {"beta": -0.0909, "se": 0.1222},
+         "Fasting_glu_high": {"beta": 0.0078, "se": 0.1778},
+         "TC_ge200": {"beta": -0.1197, "se": 0.0961},
+         "UA_high": {"beta": 0.3740, "se": 0.0986},
+         "Smoking_current": {"beta": -0.0505, "se": 0.1017},
+         "Betel_current": {"beta": 0.2878, "se": 0.1433},
+         "Alcohol_current": {"beta": 0.0422, "se": 0.1003},
+         "Exercise_freq": {"beta": 0.0642, "se": 0.1497},
+         "FHx_yes": {"beta": 0.2461, "se": 0.1280}
+     },
+     "S12S2": {  # Stage 1 → Stage 2 hypertension (β3)
+         "Education_high": {"beta": -0.6211, "se": 0.3150},
+         "BMI_ge25": {"beta": -0.6488, "se": 0.3189},
+         "Waist_ge90": {"beta": 0.2272, "se": 0.3577},
+         "Fasting_glu_high": {"beta": 0.3553, "se": 0.4633},
+         "TC_ge200": {"beta": -0.0633, "se": 0.2687},
+         "UA_high": {"beta": 0.0411, "se": 0.2725},
+         "Smoking_current": {"beta": -0.3919, "se": 0.2850},
+         "Betel_current": {"beta": -0.0243, "se": 0.4090},
+         "Alcohol_current": {"beta": 0.6950, "se": 0.2863},
+         "Exercise_freq": {"beta": -0.5746, "se": 0.3871},
+         "FHx_yes": {"beta": -0.2716, "se": 0.4013}
+     },
+     "P2N": {  # Prehypertension → Normal (β4)
+         "Education_high": {"beta": -0.3251, "se": 0.2192},
+         "BMI_ge25": {"beta": 0.0265, "se": 0.1978},
+         "Waist_ge90": {"beta": 0.4057, "se": 0.3004},
+         "Fasting_glu_high": {"beta": -0.3138, "se": 0.4235},
+         "TC_ge200": {"beta": -0.0306, "se": 0.1749},
+         "UA_high": {"beta": -0.3187, "se": 0.1964},
+         "Smoking_current": {"beta": 0.4710, "se": 0.1816},
+         "Betel_current": {"beta": -0.6040, "se": 0.2568},
+         "Alcohol_current": {"beta": -0.5499, "se": 0.1849},
+         "Exercise_freq": {"beta": 0.4304, "se": 0.2314},
+         "FHx_yes": {"beta": -0.0033, "se": 0.2351}
+     }
+ }
+
+ beta_women = {
+     "N2P": {  # Normal → Prehypertension (β1)
+         "Education_high": {"beta": -0.1497, "se": 0.1029},
+         "BMI_ge25": {"beta": 0.3171, "se": 0.1128},
+         "Waist_ge80": {"beta": -0.1668, "se": 0.1167},
+         "Fasting_glu_high": {"beta": 0.5199, "se": 0.2591},
+         "TC_ge200": {"beta": 0.2077, "se": 0.0940},
+         "UA_high": {"beta": 0.0705, "se": 0.1161},
+         "Smoking_current": {"beta": -0.5675, "se": 0.1968},
+         "Alcohol_current": {"beta": -0.1241, "se": 0.1670},
+         "Exercise_freq": {"beta": -0.1400, "se": 0.1177},
+         "FHx_yes": {"beta": -0.0344, "se": 0.1078}
+     },
+     "P2S1": {  # Prehypertension → Stage 1 hypertension (β2)
+         "Education_high": {"beta": -0.0813, "se": 0.1061},
+         "BMI_ge25": {"beta": 0.1029, "se": 0.0982},
+         "Waist_ge80": {"beta": -0.0223, "se": 0.1020},
+         "Fasting_glu_high": {"beta": 0.1663, "se": 0.1416},
+         "TC_ge200": {"beta": -0.0473, "se": 0.0805},
+         "UA_high": {"beta": 0.2912, "se": 0.0957},
+         "Smoking_current": {"beta": -0.4045, "se": 0.2225},
+         "Alcohol_current": {"beta": -0.0472, "se": 0.1685},
+         "Exercise_freq": {"beta": -0.0872, "se": 0.1160},
+         "FHx_yes": {"beta": 0.3253, "se": 0.1075}
+     },
+     "S12S2": {  # Stage 1 → Stage 2 hypertension (β3)
+         "Education_high": {"beta": -0.3908, "se": 0.3162},
+         "BMI_ge25": {"beta": -0.3142, "se": 0.2948},
+         "Waist_ge80": {"beta": 0.1843, "se": 0.2955},
+         "Fasting_glu_high": {"beta": -1.3789, "se": 0.7272},
+         "TC_ge200": {"beta": 0.1766, "se": 0.2380},
+         "UA_high": {"beta": 0.0682, "se": 0.2806},
+         "Smoking_current": {"beta": -8.0955, "se": 21.3033},
+         "Alcohol_current": {"beta": 0.0780, "se": 0.5369},
+         "Exercise_freq": {"beta": 0.2557, "se": 0.4326},
+         "FHx_yes": {"beta": 0.2805, "se": 0.3081}
+     },
+     "P2N": {  # Prehypertension → Normal (β4)
+         "Education_high": {"beta": 0.0195, "se": 0.1214},
+         "BMI_ge25": {"beta": -0.2213, "se": 0.1391},
+         "Waist_ge80": {"beta": -0.4769, "se": 0.1649},
+         "Fasting_glu_high": {"beta": 0.1771, "se": 0.3247},
+         "TC_ge200": {"beta": 0.0501, "se": 0.1156},
+         "UA_high": {"beta": -0.2798, "se": 0.1517},
+         "Smoking_current": {"beta": -0.1689, "se": 0.2330},
+         "Alcohol_current": {"beta": -0.1731, "se": 0.1995},
+         "Exercise_freq": {"beta": -0.0527, "se": 0.1429},
+         "FHx_yes": {"beta": -0.4005, "se": 0.1375}
+     }
+ }
+
+ # -----------------------------------------------------------
+ # 2) Baseline hazards
+ # -----------------------------------------------------------
+ lam10_0, lam20_0, lam30_0, rho0_0 = 0.08, 0.10, 0.12, 0.05
+
+ # -----------------------------------------------------------
+ # 3) Lambda calculation with stochastic option
+ # -----------------------------------------------------------
+ def calc_lambda(betas: dict, features: dict, baseline: float, randomize=False):
+     """Compute λ = λ0 * exp(Xβ), optionally sampling β ~ Normal(mean, SE)"""
+     logHR = 0.0
+     for k, vals in betas.items():
+         beta = vals["beta"]
+         se = vals["se"]
+         if randomize:
+             beta = np.random.normal(beta, se)
+         logHR += beta * features.get(k, 0)
+     return baseline * np.exp(logHR)
+
+
+ def hazards_from_beta(sex: str, features: dict,
+                       lam10, lam20, lam30, rho0, randomize=False):
+     B = beta_men if sex.upper().startswith('M') else beta_women
+     l1 = calc_lambda(B["N2P"], features, lam10, randomize)
+     l2 = calc_lambda(B["P2S1"], features, lam20, randomize)
+     l3 = calc_lambda(B["S12S2"], features, lam30, randomize)
+     r = calc_lambda(B["P2N"], features, rho0, randomize)
+     return l1, l2, l3, r
+
+
+ def Q_matrix(l1, l2, l3, rho):
+     # State order: [N, P, S1, S2]
+     Q = np.zeros((4, 4))
+     Q[0, 1] = l1
+     Q[0, 0] = -l1
+     Q[1, 0] = rho
+     Q[1, 2] = l2
+     Q[1, 1] = -(rho + l2)
+     Q[2, 3] = l3
+     Q[2, 2] = -l3
+     Q[3, 3] = 0.0
+     return Q
+
+
+ def discrete_P(Q, years=1.0):
+     P = expm(Q * years)
+     # Numerical stability
+     P = np.clip(P, 0, 1)
+     P = P / P.sum(axis=1, keepdims=True)
+     return P
+
+
+ # -----------------------------------------------------------
+ # 4) (Optional) Calibration: achieve target 5-year S2 cumulative proportion
+ # -----------------------------------------------------------
+ def calibrate_scale(sex: str, features_ref: dict,
+                     lam10, lam20, lam30, rho0,
+                     target_s2_5y: float,
+                     max_iter=40):
+     lo, hi = 0.2, 5.0
+     for _ in range(max_iter):
+         mid = 0.5 * (lo + hi)
+         l1, l2, l3, r = hazards_from_beta(sex, features_ref,
+                                           lam10 * mid, lam20 * mid, lam30 * mid, rho0)
+         Q = Q_matrix(l1, l2, l3, r)
+         P = discrete_P(Q, 1.0)
+         s = np.array([1, 0, 0, 0], float)
+         for _ in range(5):
+             s = s @ P
+         if s[3] < target_s2_5y:
+             lo = mid
+         else:
+             hi = mid
+     return 0.5 * (lo + hi)
+
+
+ # -----------------------------------------------------------
+ # 5) Markov CEA: cost, utility, discounting, ICER
+ # -----------------------------------------------------------
+ def run_markov(P, C, U, start_dist, cycles=10, discount=0.03):
+     s = start_dist.astype(float)
+     total_cost, total_qaly = 0.0, 0.0
+     trace = [s.copy()]
+     for t in range(cycles):
+         total_cost += float(s @ C) / ((1 + discount) ** t)
+         total_qaly += float(s @ U) / ((1 + discount) ** t)
+         s = s @ P
+         trace.append(s.copy())
+     return total_cost, total_qaly, np.vstack(trace)
+
+
+ def icer(costA, qalyA, costB, qalyB):
+     """Calculate ICER, handling edge cases"""
+     deltaC = costB - costA
+     deltaQ = qalyB - qalyA
+
+     # Handle special cases
+     if abs(deltaQ) < 1e-9:  # QALY difference too small, consider equal
+         return float('inf') if deltaC > 0 else float('-inf'), deltaC, deltaQ
+
+     # Normal case
+     return deltaC / deltaQ, deltaC, deltaQ
+
+
+ # -----------------------------------------------------------
+ # 6) Graphics: CE plane, CEAC curve
+ # -----------------------------------------------------------
+ def plot_ce_plane(deltaQ, deltaC, icer_val, intervention_name="Intervention"):
+     plt.figure(figsize=(10, 7))  # Larger figure size
+
+     # Set up axes and quadrant lines
+     plt.axhline(0, color='gray', linestyle='--', alpha=0.7, linewidth=1)
+     plt.axvline(0, color='gray', linestyle='--', alpha=0.7, linewidth=1)
+
+     # Plot ICER point with larger marker
+     plt.scatter(deltaQ, deltaC, s=200, color='#DC143C', edgecolors='darkred',
+                 linewidths=2, zorder=5, alpha=0.9)
+
+     # Add different explanations based on quadrant
+     if deltaQ > 0 and deltaC > 0:  # Northeast quadrant
+         title_text = f"ICER = ${icer_val:,.1f}/QALY - More expensive but more effective"
+         quadrant = "NE"
+     elif deltaQ < 0 and deltaC > 0:  # Northwest quadrant
+         title_text = f"ICER = ${icer_val:,.1f}/QALY - More expensive and less effective"
+         quadrant = "NW"
+     elif deltaQ < 0 and deltaC < 0:  # Southwest quadrant
+         title_text = f"ICER = ${icer_val:,.1f}/QALY - Less expensive but less effective"
+         quadrant = "SW"
+     else:  # Southeast quadrant
+         title_text = f"ICER = ${icer_val:,.1f}/QALY - Less expensive and more effective"
+         quadrant = "SE"
+
+     plt.title(title_text, fontsize=14, fontweight='bold', pad=20)
+
+     # Add labels
+     plt.xlabel("Effect Difference (QALYs)", fontsize=12, fontweight='bold')
+     plt.ylabel("Cost Difference ($)", fontsize=12, fontweight='bold')
+
+     # Add WTP threshold line ($50,000/QALY)
+     wtp = 50000
+     x_max = max(abs(deltaQ) * 1.3, 0.05)  # Ensure the x-range is wide enough
+     x_range = [-x_max * 0.1, x_max]
+     plt.xlim(x_range)
+
+     # Compute the y-axis range
+     y_max = max(abs(deltaC) * 1.3, wtp * x_max * 0.5)
+     y_range = [-y_max * 0.2, y_max]
+     plt.ylim(y_range)
+
+     # Draw the WTP threshold line
+     plt.plot([0, x_range[1]], [0, x_range[1] * wtp], 'k--', alpha=0.5,
+              linewidth=2, label=f'WTP Threshold ${wtp:,}/QALY')
+
+     # Add annotation with better positioning
+     # Adjust the annotation offset according to where the point lies
+     if deltaQ > 0 and deltaC < 0:
+         # Lower-right quadrant: place the annotation to the upper left
+         xytext = (-80, 40)
+         ha = 'right'
+     elif deltaQ > 0 and deltaC > 0:
+         # Upper-right quadrant: place the annotation to the lower left
+         xytext = (-80, -40)
+         ha = 'right'
+     elif deltaQ < 0 and deltaC < 0:
+         # Lower-left quadrant: place the annotation to the upper right
+         xytext = (80, 40)
+         ha = 'left'
+     else:
+         # Upper-left quadrant: place the annotation to the lower right
+         xytext = (80, -40)
+         ha = 'left'
+
+     plt.annotate(
+         f"{intervention_name}\nΔC=${deltaC:.1f}\nΔQ={deltaQ:.3f}",
+         xy=(deltaQ, deltaC),
+         xytext=xytext,
+         textcoords="offset points",
+         fontsize=11,
+         fontweight='bold',
+         ha=ha,
+         bbox=dict(boxstyle='round,pad=0.5', facecolor='yellow', alpha=0.7, edgecolor='black'),
+         arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=.2",
+                         lw=2, color='black')
+     )
+
+     plt.grid(alpha=0.3, linestyle=':', linewidth=0.5)
+     plt.legend(fontsize=11, loc='upper left', framealpha=0.9)
+
+     # Adjust margins
+     plt.tight_layout()
+
+     # Save figure as base64 format
+     buf = io.BytesIO()
+     plt.savefig(buf, format='png', dpi=150, bbox_inches='tight')  # Higher DPI
+     plt.close()
+     buf.seek(0)
+     img_str = base64.b64encode(buf.read()).decode('utf-8')
+
+     return f"data:image/png;base64,{img_str}"
+
+
+ def generate_psa_samples(costA, qalyA, costB, qalyB, n_samples=1000, cv=0.2):
+     """Generate probabilistic sensitivity analysis samples"""
+     samples = []
+
+     # Assume costs and QALYs follow lognormal distribution
+     for _ in range(n_samples):
+         c_a = np.random.lognormal(np.log(costA), cv)
+         q_a = np.random.lognormal(np.log(qalyA), cv)
+         c_b = np.random.lognormal(np.log(costB), cv)
+         q_b = np.random.lognormal(np.log(qalyB), cv)
+
+         # Calculate increments
+         delta_c = c_b - c_a
+         delta_q = q_b - q_a
+
+         # Handle division by zero
+         if abs(delta_q) < 1e-9:
+             continue
+
+         # Calculate ICER
+         icer_val = delta_c / delta_q
+
+         samples.append((delta_q, delta_c, icer_val))
+
+     return samples
+
+
+ def plot_ceac(costA, qalyA, costB, qalyB, intervention_name="Intervention", n_samples=1000):
+     # Generate PSA samples
+     samples = generate_psa_samples(costA, qalyA, costB, qalyB, n_samples)
+
+     # Set up WTP threshold range
+     wtp_range = np.linspace(0, 100000, 100)
+     prob_B_ce = []
+
+     # Calculate probability of B being cost-effective at each WTP threshold
+     for wtp in wtp_range:
+         count_ce = 0
+         for delta_q, delta_c, _ in samples:
+             # Condition for B being more cost-effective than A:
+             # Either (saves money and improves health) or (incremental cost/effect < WTP)
+             if (delta_c < 0 and delta_q > 0) or (delta_q > 0 and (delta_c / delta_q) < wtp):
+                 count_ce += 1
+         prob_B_ce.append(count_ce / len(samples))
+
+     # Plot CEAC curve
+     plt.figure(figsize=(8, 6))
+     plt.plot(wtp_range, prob_B_ce, 'b-', linewidth=2)
+     plt.axhline(0.5, color='gray', linestyle='--', alpha=0.5)
+     plt.grid(alpha=0.3)
+     plt.xlabel("Willingness-to-Pay Threshold ($/QALY)")
+     plt.ylabel("Probability of Intervention Being Cost-Effective")
+     plt.title(f"{intervention_name} Cost-Effectiveness Acceptability Curve (CEAC)")
+     plt.ylim(0, 1)
+
+     # Save figure as base64 format
+     buf = io.BytesIO()
+     plt.savefig(buf, format='png', dpi=100)
+     plt.close()
+     buf.seek(0)
+     img_str = base64.b64encode(buf.read()).decode('utf-8')
+
+     return f"data:image/png;base64,{img_str}", wtp_range, prob_B_ce
+
+
+ def plot_state_distribution(trace, states, title="Health State Distribution"):
+     plt.figure(figsize=(8, 6))
+     for i, state in enumerate(states):
+         plt.plot(range(len(trace)), trace[:, i], label=state)
+
+     plt.xlabel("Year")
+     plt.ylabel("Proportion")
+     plt.title(title)
+     plt.grid(alpha=0.3)
+     plt.legend()
+
+     # Save figure as base64 format
+     buf = io.BytesIO()
+     plt.savefig(buf, format='png', dpi=100)
+     plt.close()
+     buf.seek(0)
+     img_str = base64.b64encode(buf.read()).decode('utf-8')
+
+     return f"data:image/png;base64,{img_str}"
+
+
+ # -----------------------------------------------------------
+ # 7) Wrapper: integrated function for general intervention analysis
+ # -----------------------------------------------------------
+ def person_P(sex: str, features: dict, alpha=1.0, dt=1.0):
+     """Get a person's annual transition probability matrix"""
+     l1, l2, l3, r = hazards_from_beta(sex, features,
+                                       lam10_0 * alpha, lam20_0 * alpha, lam30_0 * alpha, rho0_0)
+     Q = Q_matrix(l1, l2, l3, r)
+     P = discrete_P(Q, dt)
+     return P, [l1, l2, l3, r]
+
+
+ def run_analysis(sex: str, features: dict, intervention_feature: str,
+                  C_A, C_B, U, cycles=10, discount_rate=0.03, target_s2_5y=0.20):
+     """Single intervention effect analysis"""
+     # Define state names
+     states = ["Normal", "Prehypertension", "Stage 1", "Stage 2"]
+
+     # Create copies of baseline and intervention feature dictionaries
+     features_base = features.copy()
+     features_int = features.copy()
+
+     # Set baseline and intervention values based on intervention type
+     if intervention_feature == "Exercise_freq":
+         # Exercise intervention: 0 -> 1 (increase exercise)
+         features_base[intervention_feature] = 0
+         features_int[intervention_feature] = 1
+         intervention_name = "Increase Exercise"
+
+     elif intervention_feature in ["BMI_ge25", "Waist_ge90", "Waist_ge80",
+                                   "Fasting_glu_high", "TC_ge200", "UA_high",
+                                   "Smoking_current", "Betel_current", "Alcohol_current"]:
+         # These interventions go from 1->0 (reduce risk factor)
+         features_base[intervention_feature] = 1
+         features_int[intervention_feature] = 0
+
+         if intervention_feature == "BMI_ge25":
+             intervention_name = "Reduce BMI to <25"
+         elif intervention_feature in ["Waist_ge90", "Waist_ge80"]:
+             intervention_name = "Reduce Waist Circumference"
+         elif intervention_feature == "Fasting_glu_high":
+             intervention_name = "Lower Fasting Glucose"
+         elif intervention_feature == "TC_ge200":
+             intervention_name = "Lower Cholesterol"
+         elif intervention_feature == "UA_high":
+             intervention_name = "Lower Uric Acid"
+         elif intervention_feature == "Smoking_current":
+             intervention_name = "Quit Smoking"
+         elif intervention_feature == "Betel_current":
+             intervention_name = "Quit Betel Nut"
+         elif intervention_feature == "Alcohol_current":
+             intervention_name = "Quit Drinking"
+
+     elif intervention_feature == "Education_high":
+         # Education intervention: 0 -> 1 (increase education level)
+         features_base[intervention_feature] = 0
+         features_int[intervention_feature] = 1
+         intervention_name = "Improve Education"
+
+     else:
+         # Other cases, default baseline=0, intervention=1
+         features_base[intervention_feature] = 0
+         features_int[intervention_feature] = 1
+         intervention_name = f"Intervention ({intervention_feature})"
+
+     # Print parameter information
+     print(f"Sex: {'Male' if sex.upper().startswith('M') else 'Female'}")
+     print(f"Intervention: {intervention_name}")
+     print(f"Baseline parameter: {features_base[intervention_feature]}")
+     print(f"Post-intervention parameter: {features_int[intervention_feature]}")
+
+     # Set reference features for calibration
+     ref_features = {k: 0 for k in features.keys()}
+
+     # Calibration
+     alpha = calibrate_scale(
+         sex=sex, features_ref=ref_features,
+         lam10=lam10_0, lam20=lam20_0, lam30=lam30_0, rho0=rho0_0,
+         target_s2_5y=target_s2_5y
+     )
+
+     # Get transition matrices
+     P_A, lamA = person_P(sex, features_base, alpha=alpha)
+     P_B, lamB = person_P(sex, features_int, alpha=alpha)
+
+     # Starting distribution
+     start_dist = np.array([1, 0, 0, 0], float)  # Start in Normal state
+
+     # Run Markov model
+     cost_A, qaly_A, trace_A = run_markov(P_A, C_A, U, start_dist, cycles, discount_rate)
+     cost_B, qaly_B, trace_B = run_markov(P_B, C_B, U, start_dist, cycles, discount_rate)
+
+     # Calculate ICER
+     ICER, dC, dQ = icer(cost_A, qaly_A, cost_B, qaly_B)
+
+     # Generate charts
+     ce_plane_img = plot_ce_plane(dQ, dC, ICER, intervention_name)
+     ceac_img, wtp_range, prob_B_ce = plot_ceac(cost_A, qaly_A, cost_B, qaly_B, intervention_name)
+     stateA_img = plot_state_distribution(trace_A, states, "No Intervention")
+     stateB_img = plot_state_distribution(trace_B, states, intervention_name)
+
+     # Prepare results
+     results = {
+         "intervention": intervention_name,
+         "feature_name": intervention_feature,
+         "feature_base_value": features_base[intervention_feature],
+         "feature_int_value": features_int[intervention_feature],
+         "hazards_A": dict(zip(TRANSITION_NAMES_EN, np.round(lamA, 4))),
+         "hazards_B": dict(zip(TRANSITION_NAMES_EN, np.round(lamB, 4))),
+         "transition_matrix_A": pd.DataFrame(P_A, index=states, columns=states).round(4).to_dict(),
+         "transition_matrix_B": pd.DataFrame(P_B, index=states, columns=states).round(4).to_dict(),
+         "cost_A": cost_A,
+         "cost_B": cost_B,
+         "qaly_A": qaly_A,
+         "qaly_B": qaly_B,
+         "delta_cost": dC,
+         "delta_qaly": dQ,
+         "ICER": ICER,
+         "CE_plane_img": ce_plane_img,
+         "CEAC_img": ceac_img,
+         "stateA_img": stateA_img,
+         "stateB_img": stateB_img,
+         "wtp_values": wtp_range.tolist(),
+         "probability_cost_effective": prob_B_ce
+     }
+
+     # Display main results
+     print(f"\n--- {intervention_name} Cost-Effectiveness Analysis Results ---")
+     print(
+         f"{cycles}-year total cost: Baseline=${results['cost_A']:.1f}, Intervention=${results['cost_B']:.1f}, ΔC=${results['delta_cost']:.1f}")
+     print(
+         f"{cycles}-year total QALY: Baseline={results['qaly_A']:.3f}, Intervention={results['qaly_B']:.3f}, ΔQ={results['delta_qaly']:.3f}")
+     print(f"ICER = ${results['ICER']:.1f}/QALY")
+
+     # Intervention effect explanation
+     if dQ > 0:
+         if dC <= 0:
+             print("Conclusion: This intervention both saves money and improves health (Dominant)")
+         elif ICER < 50000:
+             print("Conclusion: This intervention is cost-effective (ICER < $50,000/QALY)")
+         else:
+             print("Conclusion: This intervention is not cost-effective")
+     else:
+         if dC >= 0:
+             print("Conclusion: This intervention both costs more and worsens health (Dominated)")
+         else:
+             print("Conclusion: This intervention saves money but worsens health")
+
+     return results
+
+
+ # -----------------------------------------------------------
+ # 8) Standard simulation main function - test different interventions
+ # -----------------------------------------------------------
+ def main_simulation():
+     # Standard male features
+     male_features = {
+         "Education_high": 0,
+         "BMI_ge25": 1,
+         "Waist_ge90": 1,
+         "Fasting_glu_high": 0,
+         "TC_ge200": 0,
+         "UA_high": 1,
+         "Smoking_current": 1,
+         "Betel_current": 0,
+         "Alcohol_current": 1,
+         "Exercise_freq": 0,
+         "FHx_yes": 1
+     }
+
+     # Standard female features
+     female_features = {
+         "Education_high": 0,
+         "BMI_ge25": 1,
+         "Waist_ge80": 1,
+         "Fasting_glu_high": 0,
+         "TC_ge200": 0,
+         "UA_high": 1,
+         "Smoking_current": 0,
+         "Betel_current": 0,
+         "Alcohol_current": 0,
+         "Exercise_freq": 0,
+         "FHx_yes": 1
+     }
+
+     # Cost and utility
+     C_A = np.array([200, 600, 1200, 2200])  # No intervention
+     U = np.array([1.00, 0.90, 0.70, 0.50])  # Utilities for each state
+
+     # Test weight loss intervention (BMI_ge25)
+     print("\n" + "=" * 50)
+     print("Testing Weight Loss Intervention (BMI_ge25: 1->0)")
+     print("=" * 50)
+
+     # Male weight loss
+     C_B_bmi = np.array([300, 650, 1250, 2250])  # Weight loss increases cost
+     results_bmi_m = run_analysis(
+         sex="M",
+         features=male_features,
+         intervention_feature="BMI_ge25",  # Reduce BMI to <25
+         C_A=C_A,
+         C_B=C_B_bmi,
+         U=U,
+         cycles=10,
+         discount_rate=0.03,
+         target_s2_5y=0.2365
+     )
+
+     # Female weight loss
+     results_bmi_f = run_analysis(
+         sex="F",
+         features=female_features,
+         intervention_feature="BMI_ge25",  # Reduce BMI to <25
+         C_A=C_A,
+         C_B=C_B_bmi,
+         U=U,
+         cycles=10,
+         discount_rate=0.03,
+         target_s2_5y=0.2365
+     )
+
+     # Test exercise intervention (Exercise_freq)
+     print("\n" + "=" * 50)
+     print("Testing Exercise Intervention (Exercise_freq: 0->1)")
+     print("=" * 50)
+
+     # Male exercise
+     C_B_exercise = np.array([250, 600, 1200, 2200])  # Exercise increases cost slightly
+     results_exercise_m = run_analysis(
+         sex="M",
+         features=male_features,
+         intervention_feature="Exercise_freq",  # Increase exercise frequency
+         C_A=C_A,
+         C_B=C_B_exercise,
+         U=U,
+         cycles=10,
+         discount_rate=0.03,
+         target_s2_5y=0.2365
+     )
+
+     # Female exercise
+     results_exercise_f = run_analysis(
+         sex="F",
+         features=female_features,
+         intervention_feature="Exercise_freq",  # Increase exercise frequency
+         C_A=C_A,
+         C_B=C_B_exercise,
+         U=U,
+         cycles=10,
+         discount_rate=0.03,
+         target_s2_5y=0.2365
+     )
+
+     # Test smoking cessation intervention (Smoking_current)
+     print("\n" + "=" * 50)
+     print("Testing Smoking Cessation Intervention (Smoking_current: 1->0)")
+     print("=" * 50)
+
+     # Male smoking cessation
+     C_B_smoking = np.array([220, 600, 1200, 2200])  # Smoking cessation increases cost slightly
+     results_smoking_m = run_analysis(
+         sex="M",
+         features=male_features,
+         intervention_feature="Smoking_current",  # Quit smoking
+         C_A=C_A,
+         C_B=C_B_smoking,
+         U=U,
+         cycles=10,
+         discount_rate=0.03,
+         target_s2_5y=0.2365
+     )
+
+     return {
+         "BMI_male": results_bmi_m,
+         "BMI_female": results_bmi_f,
+         "Exercise_male": results_exercise_m,
+         "Exercise_female": results_exercise_f,
+         "Smoking_male": results_smoking_m
+     }
+
+
+ # Main program
+ if __name__ == "__main__":
697
+ print("=" * 50)
698
+ print("Hypertension Disease Progression and Intervention Effects Model")
699
+ print("=" * 50)
700
+
701
+ # Run main simulation
702
+ results = main_simulation()
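
The `run_analysis` calls above pass per-state costs (`C_A`, `C_B`), utilities (`U`), a cycle count, and a discount rate, and the function ends by printing a conclusion based on the signs of the cost and effect differences. The sketch below illustrates that kind of calculation in plain Python. It is not the implementation in `hypertension_model_fixed2.py`: the helper names `discounted_totals` and `classify`, the transition matrices, the willingness-to-pay threshold, and the choice to leave the first cycle undiscounted are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def step(dist: Sequence[float], P: Matrix) -> List[float]:
    """Advance the cohort one cycle: row vector times transition matrix."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def discounted_totals(P: Matrix, costs: Sequence[float], utilities: Sequence[float],
                      cycles: int = 10, discount_rate: float = 0.03) -> Tuple[float, float]:
    """Accumulate discounted cost and QALYs for a cohort starting in state 0.

    Assumption: cycle t is weighted by 1 / (1 + r)^t, so the first cycle
    is undiscounted. The real model may use a different convention.
    """
    dist = [1.0] + [0.0] * (len(costs) - 1)
    total_cost = 0.0
    total_qaly = 0.0
    for t in range(cycles):
        weight = 1.0 / (1.0 + discount_rate) ** t
        total_cost += weight * sum(d * c for d, c in zip(dist, costs))
        total_qaly += weight * sum(d * u for d, u in zip(dist, utilities))
        dist = step(dist, P)
    return total_cost, total_qaly


def classify(d_cost: float, d_qaly: float, wtp: float) -> str:
    """Place an intervention on the cost-effectiveness plane."""
    if d_qaly > 0:
        if d_cost <= 0:
            return "dominant"  # cheaper and healthier
        return "cost-effective" if d_cost / d_qaly <= wtp else "not cost-effective"
    if d_cost >= 0:
        return "dominated"  # costs more and does not improve health
    return "cheaper but less effective"


# Hypothetical one-year transition matrices (each row sums to 1).
P_A = [[0.90, 0.10, 0.00, 0.00],
       [0.05, 0.85, 0.10, 0.00],
       [0.00, 0.00, 0.88, 0.12],
       [0.00, 0.00, 0.00, 1.00]]
P_B = [[0.93, 0.07, 0.00, 0.00],
       [0.05, 0.87, 0.08, 0.00],
       [0.00, 0.00, 0.90, 0.10],
       [0.00, 0.00, 0.00, 1.00]]

U = [1.00, 0.90, 0.70, 0.50]
cost_a, qaly_a = discounted_totals(P_A, [200, 600, 1200, 2200], U)
cost_b, qaly_b = discounted_totals(P_B, [300, 650, 1250, 2250], U)
# The threshold of 30000 per QALY is an illustrative value only.
print(classify(cost_b - cost_a, qaly_b - qaly_a, wtp=30000.0))
```

The per-state costs and utilities reuse the values from `main_simulation`; everything else is a placeholder. A zero effect difference is classified as in the conclusion branch above: it counts as "dominated" when costs do not fall.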
personalized_ht4.py ADDED
@@ -0,0 +1,1440 @@
1
+ """
2
+ Streamlit App for Hypertension Cost-Effectiveness Analysis with LangChain
3
+ Enhanced with AI Assistant and Personalized Chat using OpenAI
4
+ Added: CSV/Excel upload and patient ID retrieval feature
5
+ """
6
+
7
+ import streamlit as st
8
+ import numpy as np
9
+ import pandas as pd
10
+ import matplotlib.pyplot as plt
11
+ from typing import Dict, List, Any
12
+
13
+ # LangChain imports
14
+ from langchain_openai import ChatOpenAI, OpenAIEmbeddings
15
+ from langchain_community.vectorstores import Chroma
16
+ from langchain_core.documents import Document
17
+
18
+ # Import the hypertension model functions
19
+ from hypertension_model_fixed2 import (
20
+ run_analysis, beta_men, beta_women,
21
+ hazards_from_beta, Q_matrix, discrete_P,
22
+ run_markov, icer
23
+ )
24
+
25
+ # Set page configuration
26
+ st.set_page_config(
27
+ page_title="Hypertension CEA Tool with AI",
28
+ page_icon="❤️",
29
+ layout="wide",
30
+ initial_sidebar_state="expanded",
31
+ )
32
+
33
+ # Initialize session state for chat histories
34
+ if 'assistant_messages' not in st.session_state:
35
+ st.session_state.assistant_messages = []
36
+ if 'recommendation_messages' not in st.session_state:
37
+ st.session_state.recommendation_messages = []
38
+ if 'summary_generated' not in st.session_state:
39
+ st.session_state.summary_generated = False
40
+ if 'patients_df' not in st.session_state:
41
+ st.session_state.patients_df = None
42
+ if 'vectorstore' not in st.session_state:
43
+ st.session_state.vectorstore = None
44
+
45
+ # Header with OpenAI API Key input
46
+ col1, col2 = st.columns([3, 1])
47
+ with col1:
48
+ st.title("❤️ Hypertension Personalized Cost-effectiveness Analysis")
49
+ st.markdown("*AI-powered cost-effectiveness analysis tool with LangChain*")
50
+
51
+ with col2:
52
+ openai_api_key = st.text_input(
53
+ "🔑 OpenAI API Key",
54
+ type="password",
55
+ placeholder="sk-...",
56
+ help="Enter your OpenAI API key to enable AI features"
57
+ )
58
+
59
+ if openai_api_key:
60
+ st.success("✓ API Key set")
61
+ else:
62
+ st.warning("⚠️ Enter API key")
63
+
64
+
65
+ # Build the chat model (returns None when no API key is provided)
66
+ def get_llm():
67
+ """Initialize LangChain LLM with OpenAI"""
68
+ if not openai_api_key:
69
+ return None
70
+
71
+ try:
72
+ llm = ChatOpenAI(
73
+ model="gpt-4o-mini",
74
+ temperature=0.7,
75
+ openai_api_key=openai_api_key
76
+ )
77
+ return llm
78
+ except Exception as e:
79
+ st.error(f"Error initializing OpenAI: {str(e)}")
80
+ return None
81
+
82
+
83
+ # Create vector store from patient data
84
+ def create_patient_vectorstore(patients_df: pd.DataFrame):
85
+ """Create vector store from patient dataframe for RAG retrieval"""
86
+ if not openai_api_key:
87
+ return None
88
+
89
+ try:
90
+ documents = []
91
+ for idx, row in patients_df.iterrows():
92
+ patient_text = f"""Patient ID: {row['patient_id']}
93
+ Sex: {row['sex']}, Age: {row['age']}, Education: {row['education']}
94
+ BMI: {row['bmi']} kg/m², Waist: {row['waist']} cm
95
+ Fasting Glucose: {row['fasting_glucose']} mg/dL
96
+ Total Cholesterol: {row['total_cholesterol']} mg/dL
97
+ Uric Acid: {row['uric_acid']} mg/dL
98
+ Smoking: {row['smoking']}, Alcohol: {row['alcohol']}, Exercise: {row['exercise']}
99
+ Betel: {row.get('betel', 'No')}, Family History: {row['family_history']}"""
100
+
101
+ doc = Document(
102
+ page_content=patient_text,
103
+ metadata={"patient_id": row['patient_id']}
104
+ )
105
+ documents.append(doc)
106
+
107
+ embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
108
+ vectorstore = Chroma.from_documents(documents=documents, embedding=embeddings)
109
+ return vectorstore
110
+
111
+ except Exception as e:
112
+ st.error(f"Error creating vector store: {str(e)}")
113
+ return None
114
+
115
+
116
+ # Retrieve patient by ID
117
+ def retrieve_patient_by_id(patient_id: str):
118
+ """Retrieve patient from dataframe by ID"""
119
+ if st.session_state.patients_df is None:
120
+ return None
121
+
122
+ patient_row = st.session_state.patients_df[
123
+ st.session_state.patients_df['patient_id'].astype(str) == str(patient_id).strip()
124
+ ]
125
+
126
+ if patient_row.empty:
127
+ return None
128
+
129
+ return patient_row.iloc[0].to_dict()
130
+
131
+
132
+ # Sidebar for patient information
133
+ st.sidebar.header("👤 Patient Information")
134
+
135
+ # ===== NEW: Patient Data Upload Section =====
136
+ with st.sidebar.expander("📁 Upload Patient Database (Optional)", expanded=False):
137
+ uploaded_file = st.file_uploader(
138
+ "Upload CSV/Excel with patient data",
139
+ type=['csv', 'xlsx'],
140
+ help="Upload a file with multiple patients to enable quick retrieval by ID"
141
+ )
142
+
143
+ if uploaded_file is not None:
144
+ try:
145
+ if uploaded_file.name.endswith('.csv'):
146
+ df = pd.read_csv(uploaded_file)
147
+ else:
148
+ df = pd.read_excel(uploaded_file)
149
+
150
+ st.session_state.patients_df = df
151
+ st.success(f"✅ Loaded {len(df)} patients")
152
+
153
+ # Optionally create vector store
154
+ if openai_api_key and st.button("🔄 Create Vector Store for Smart Search"):
155
+ with st.spinner("Creating vector store..."):
156
+ vectorstore = create_patient_vectorstore(df)
157
+ if vectorstore:
158
+ st.session_state.vectorstore = vectorstore
159
+ st.success("✅ Vector store created!")
160
+
161
+ except Exception as e:
162
+ st.error(f"Error loading file: {str(e)}")
163
+
164
+ # Patient ID retrieval
165
+ if st.session_state.patients_df is not None:
166
+ st.markdown("---")
167
+ patient_id_input = st.text_input("🔍 Enter Patient ID", placeholder="P001")
168
+
169
+ if st.button("📥 Load Patient Data"):
170
+ if patient_id_input:
171
+ patient_data = retrieve_patient_by_id(patient_id_input)
172
+ if patient_data:
173
+ st.session_state.loaded_patient = patient_data
174
+ st.success(f"✅ Loaded {patient_id_input}")
175
+ st.rerun()
176
+ else:
177
+ st.error(f"Patient {patient_id_input} not found")
178
+
179
+ st.sidebar.markdown("---")
180
+
181
+ # Check if we have loaded patient data
182
+ if 'loaded_patient' in st.session_state:
183
+ # Use loaded patient data
184
+ p = st.session_state.loaded_patient
185
+
186
+ st.sidebar.info(f"📋 Loaded: {p['patient_id']}")
187
+
188
+ sex = st.sidebar.radio("Biological Sex", ["Male", "Female"],
189
+ index=0 if p['sex'] == 'Male' else 1, key="sex_loaded")
190
+
191
+ st.sidebar.markdown("### Demographics")
192
+ age = st.sidebar.number_input("Age", min_value=18, max_value=100, value=int(p['age']), key="age_loaded")
193
+ education_idx = 0 if p['education'] in ['High', 'High (College or above)'] else 1
194
+ education = st.sidebar.selectbox("Education Level",
195
+ ["High (College or above)", "Low (Below college)"],
196
+ index=education_idx, key="edu_loaded")
197
+
198
+ st.sidebar.markdown("### Anthropometrics")
199
+ bmi = st.sidebar.number_input("BMI (kg/m²)", min_value=15.0, max_value=50.0,
200
+ value=float(p['bmi']), format="%.1f", key="bmi_loaded")
201
+ waist = st.sidebar.number_input("Waist Circumference (cm)", min_value=50, max_value=150,
202
+ value=int(p['waist']), key="waist_loaded")
203
+
204
+ st.sidebar.markdown("### Laboratory Values")
205
+ fasting_glucose = st.sidebar.number_input("Fasting Glucose (mg/dL)", min_value=70, max_value=300,
206
+ value=int(p['fasting_glucose']), key="glucose_loaded")
207
+ total_cholesterol = st.sidebar.number_input("Total Cholesterol (mg/dL)", min_value=100, max_value=400,
208
+ value=int(p['total_cholesterol']), key="chol_loaded")
209
+ uric_acid = st.sidebar.number_input("Uric Acid (mg/dL)", min_value=2.0, max_value=15.0,
210
+ value=float(p['uric_acid']), format="%.1f", key="ua_loaded")
211
+
212
+ st.sidebar.markdown("### Lifestyle Factors")
213
+ smoking_idx = 0 if p['smoking'] in ['No', 'Non-smoker'] else 1
214
+ smoking = st.sidebar.selectbox("Smoking Status", ["Non-smoker", "Current smoker"],
215
+ index=smoking_idx, key="smoke_loaded")
216
+
217
+ alcohol_idx = 0 if p['alcohol'] in ['No', 'None/Occasional'] else 1
218
+ alcohol = st.sidebar.selectbox("Alcohol Consumption", ["None/Occasional", "Regular drinker"],
219
+ index=alcohol_idx, key="alcohol_loaded")
220
+
221
+ exercise_idx = 0 if p['exercise'] in ['No', 'Infrequent'] else 1
222
+ exercise = st.sidebar.selectbox("Exercise Frequency", ["Infrequent", "Regular (≥3 times/week)"],
223
+ index=exercise_idx, key="exercise_loaded")
224
+
225
+ if sex == "Male":
226
+ betel_idx = 0 if p.get('betel', 'No') == 'No' else 1
227
+ betel = st.sidebar.selectbox("Betel Nut Chewing", ["No", "Yes"],
228
+ index=betel_idx, key="betel_loaded")
229
+
230
+ family_idx = 0 if p['family_history'] == 'No' else 1
231
+ family_history = st.sidebar.selectbox("Family History of Hypertension", ["No", "Yes"],
232
+ index=family_idx, key="fh_loaded")
233
+
234
+ else:
235
+ # Manual input (original functionality)
236
+ sex = st.sidebar.radio("Biological Sex", ["Male", "Female"])
237
+
238
+ st.sidebar.markdown("### Demographics")
239
+ age = st.sidebar.number_input("Age", min_value=18, max_value=100, value=45)
240
+ education = st.sidebar.selectbox("Education Level", ["High (College or above)", "Low (Below college)"])
241
+
242
+ st.sidebar.markdown("### Anthropometrics")
243
+ bmi = st.sidebar.number_input("BMI (kg/m²)", min_value=15.0, max_value=50.0, value=27.0, format="%.1f")
244
+ if sex == "Male":
245
+ waist = st.sidebar.number_input("Waist Circumference (cm)", min_value=60, max_value=150, value=88)
246
+ else:
247
+ waist = st.sidebar.number_input("Waist Circumference (cm)", min_value=50, max_value=150, value=78)
248
+
249
+ st.sidebar.markdown("### Laboratory Values")
250
+ fasting_glucose = st.sidebar.number_input("Fasting Glucose (mg/dL)", min_value=70, max_value=300, value=100)
251
+ total_cholesterol = st.sidebar.number_input("Total Cholesterol (mg/dL)", min_value=100, max_value=400, value=190)
252
+ if sex == "Male":
253
+ uric_acid = st.sidebar.number_input("Uric Acid (mg/dL)", min_value=2.0, max_value=15.0, value=6.5,
254
+ format="%.1f")
255
+ else:
256
+ uric_acid = st.sidebar.number_input("Uric Acid (mg/dL)", min_value=2.0, max_value=15.0, value=5.5,
257
+ format="%.1f")
258
+
259
+ st.sidebar.markdown("### Lifestyle Factors")
260
+ smoking = st.sidebar.selectbox("Smoking Status", ["Non-smoker", "Current smoker"])
261
+ alcohol = st.sidebar.selectbox("Alcohol Consumption", ["None/Occasional", "Regular drinker"])
262
+ exercise = st.sidebar.selectbox("Exercise Frequency", ["Infrequent", "Regular (≥3 times/week)"])
263
+
264
+ if sex == "Male":
265
+ betel = st.sidebar.selectbox("Betel Nut Chewing", ["No", "Yes"])
266
+
267
+ family_history = st.sidebar.selectbox("Family History of Hypertension", ["No", "Yes"])
268
+
269
+
270
+ # Convert inputs to feature dictionary
271
+ def create_feature_dict():
272
+ features = {}
273
+
274
+ if sex == "Male":
275
+ features["Education_high"] = 1 if education == "High (College or above)" else 0
276
+ features["BMI_ge25"] = 1 if bmi >= 25 else 0
277
+ features["Waist_ge90"] = 1 if waist >= 90 else 0
278
+ features["Fasting_glu_high"] = 1 if fasting_glucose >= 110 else 0
279
+ features["TC_ge200"] = 1 if total_cholesterol >= 200 else 0
280
+ features["UA_high"] = 1 if uric_acid >= 7 else 0
281
+ features["Smoking_current"] = 1 if smoking == "Current smoker" else 0
282
+ features["Betel_current"] = 1 if betel == "Yes" else 0
283
+ features["Alcohol_current"] = 1 if alcohol == "Regular drinker" else 0
284
+ features["Exercise_freq"] = 1 if exercise == "Regular (≥3 times/week)" else 0
285
+ features["FHx_yes"] = 1 if family_history == "Yes" else 0
286
+ else: # Female
287
+ features["Education_high"] = 1 if education == "High (College or above)" else 0
288
+ features["BMI_ge25"] = 1 if bmi >= 25 else 0
289
+ features["Waist_ge80"] = 1 if waist >= 80 else 0
290
+ features["Fasting_glu_high"] = 1 if fasting_glucose >= 110 else 0
291
+ features["TC_ge200"] = 1 if total_cholesterol >= 200 else 0
292
+ features["UA_high"] = 1 if uric_acid >= 6 else 0
293
+ features["Smoking_current"] = 1 if smoking == "Current smoker" else 0
294
+ features["Alcohol_current"] = 1 if alcohol == "Regular drinker" else 0
295
+ features["Exercise_freq"] = 1 if exercise == "Regular (≥3 times/week)" else 0
296
+ features["FHx_yes"] = 1 if family_history == "Yes" else 0
297
+
298
+ return features
299
+
300
+
301
+ # Get patient features and info string
302
+ patient_features = create_feature_dict()
303
+
304
+
305
+ def get_patient_info_string():
306
+ """Generate patient info string for AI context"""
307
+ risk_factors = []
308
+
309
+ if patient_features.get("BMI_ge25", 0) == 1:
310
+ risk_factors.append(f"BMI {bmi:.1f} kg/m² (≥25)")
311
+ if (sex == "Male" and patient_features.get("Waist_ge90", 0) == 1) or \
312
+ (sex == "Female" and patient_features.get("Waist_ge80", 0) == 1):
313
+ risk_factors.append(f"Waist circumference {waist} cm (high)")
314
+ if patient_features.get("Smoking_current", 0) == 1:
315
+ risk_factors.append("Current smoker")
316
+ if patient_features.get("Alcohol_current", 0) == 1:
317
+ risk_factors.append("Regular alcohol consumption")
318
+ if sex == "Male" and patient_features.get("Betel_current", 0) == 1:
319
+ risk_factors.append("Betel nut chewing")
320
+ if patient_features.get("Exercise_freq", 0) == 0:
321
+ risk_factors.append("Insufficient exercise")
322
+ if patient_features.get("FHx_yes", 0) == 1:
323
+ risk_factors.append("Family history of hypertension")
324
+ if patient_features.get("UA_high", 0) == 1:
325
+ risk_factors.append(f"High uric acid ({uric_acid:.1f} mg/dL)")
326
+ if patient_features.get("Fasting_glu_high", 0) == 1:
327
+ risk_factors.append(f"High fasting glucose ({fasting_glucose} mg/dL)")
328
+ if patient_features.get("TC_ge200", 0) == 1:
329
+ risk_factors.append(f"High total cholesterol ({total_cholesterol} mg/dL)")
330
+
331
+ patient_id_str = ""
332
+ if 'loaded_patient' in st.session_state:
333
+ patient_id_str = f"Patient ID: {st.session_state.loaded_patient['patient_id']}\n"
334
+
335
+ info = f"""{patient_id_str}{sex}, Age {age}
336
+ BMI: {bmi:.1f} kg/m²
337
+ Waist: {waist} cm
338
+ Exercise: {exercise}
339
+ Smoking: {smoking}
340
+ Alcohol: {alcohol}
341
+
342
+ Risk Factors Identified:
343
+ """
344
+ if risk_factors:
345
+ info += "\n".join(f"- {rf}" for rf in risk_factors)
346
+ else:
347
+ info += "- No major modifiable risk factors detected"
348
+
349
+ return info
350
+
351
+
352
+ # Create tabs
353
+ tab1, tab2, tab3, tab4, tab5, tab6 = st.tabs([
354
+ "🤖 AI Assistant",
355
+ "📊 Hypertension Progression",
356
+ "📈 Intervention Comparison",
357
+ "💰 CEA (Deterministic)",
358
+ "🎲 CEA (Probabilistic)",
359
+ "💬 Personalized Chat"
360
+ ])
361
+
362
+ # Tab 1: AI Assistant
363
+ with tab1:
364
+ st.subheader("AI Assistant - Ask Questions About This Tool")
365
+
366
+ if not openai_api_key:
367
+ st.warning("⚠️ Please enter your OpenAI API key in the top right corner to use the AI Assistant.")
368
+ else:
369
+ # Display chat messages
370
+ for message in st.session_state.assistant_messages:
371
+ with st.chat_message(message["role"]):
372
+ st.markdown(message["content"])
373
+
374
+ # Chat input
375
+ if prompt := st.chat_input("Ask me anything about hypertension analysis..."):
376
+ # Add user message
377
+ st.session_state.assistant_messages.append({"role": "user", "content": prompt})
378
+ with st.chat_message("user"):
379
+ st.markdown(prompt)
380
+
381
+ # Get AI response
382
+ with st.chat_message("assistant"):
383
+ with st.spinner("Thinking..."):
384
+ try:
385
+ llm = get_llm()
386
+ if llm:
387
+ history_text = ""
388
+ for msg in st.session_state.assistant_messages[-11:-1]:  # last 10 turns, excluding the prompt just appended (it is sent below as the User Question)
389
+ role = "User" if msg["role"] == "user" else "Assistant"
390
+ history_text += f"{role}: {msg['content']}\n\n"
391
+
392
+ full_prompt = f"""You are an expert AI assistant for a Hypertension Cost-Effectiveness Analysis tool.
393
+
394
+ Your role is to help users understand:
395
+ - How to use this analysis tool
396
+ - Markov model methodology (4 states: Normal → Prehypertension → Stage 1 HTN → Stage 2 HTN)
397
+ - Cost-effectiveness metrics (ICER, QALY, CEAC)
398
+ - Risk factor interpretation
399
+ - Available interventions
400
+
401
+ Be clear, concise, and educational. Use examples when helpful.
402
+
403
+ IMPORTANT:
404
+ - Use plain text formatting only (no LaTeX, no \\text{{}} or \\frac{{}}{{}} syntax)
405
+ - Write mathematical formulas in plain text like: ICER = (Cost_B - Cost_A) / (QALY_B - QALY_A)
406
+ - Use simple markdown formatting (**, -, numbers) for emphasis
407
+ - Avoid special characters that may not render correctly
408
+
409
+ Conversation History:
410
+ {history_text}
411
+
412
+ User Question: {prompt}
413
+
414
+ Your Response:"""
415
+
416
+ response = llm.invoke(full_prompt).content
417
+ import re  # stdlib; unwrap \frac{a}{b} and \text{x} without stripping every "}" from the reply
418
+ response = re.sub(r"\\text\{([^{}]*)\}", r"\1", re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"(\1)/(\2)", response))
419
+ st.markdown(response, unsafe_allow_html=False)
420
+ st.session_state.assistant_messages.append({"role": "assistant", "content": response})
421
+ else:
422
+ st.error("Failed to initialize AI. Please check your API key.")
423
+ except Exception as e:
424
+ st.error(f"Error: {str(e)}")
425
+
426
+ # Tab 2: Hypertension Progression
427
+ with tab2:
428
+ st.subheader("Baseline Hypertension Progression Risk")
429
+
430
+ # Create columns for different visualizations
431
+ prog_col1, prog_col2 = st.columns([1, 1])
432
+
433
+ # Prediction settings
434
+ with prog_col1:
435
+ st.write("Progression Projection Settings")
436
+ projection_years = st.slider("Projection Horizon (Years)", min_value=1, max_value=20, value=10)
437
+
438
+ # Calculate the transition rates
439
+ sex_code = "M" if sex == "Male" else "F"
440
+ l1, l2, l3, r = hazards_from_beta(sex_code, patient_features,
441
+ lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05, randomize=True)
442
+
443
+ # Display the annual transition rates
444
+ st.write("Annual Transition Rates:")
445
+ rates_df = pd.DataFrame({
446
+ "Transition": ["Normal → Prehypertension", "Prehypertension → Stage 1",
447
+ "Stage 1 → Stage 2", "Prehypertension → Normal"],
448
+ "Annual Rate (%)": [l1 * 100, l2 * 100, l3 * 100, r * 100]
449
+ })
450
+ st.dataframe(rates_df)
451
+
452
+ # Build the generator matrix Q and the one-year transition matrix P
453
+ Q = Q_matrix(l1, l2, l3, r)
454
+ P = discrete_P(Q, 1.0)
455
+
456
+ # Project the state distribution
457
+ states = ["Normal", "Prehypertension", "Stage 1", "Stage 2"]
458
+ s = np.array([1, 0, 0, 0], float) # Start in Normal state
459
+
460
+ # Project over time
461
+ projections = [s.copy()]
462
+ for _ in range(projection_years):
463
+ s = s @ P
464
+ projections.append(s.copy())
465
+
466
+ proj_df = pd.DataFrame(projections, columns=states)
467
+ proj_df.index.name = "Year"
468
+
469
+ # Stage 2 probability at the end of the projection horizon (a proxy, not a true lifetime risk)
470
+ lifetime_risk_s2 = proj_df["Stage 2"].iloc[-1] * 100
471
+
472
+ # Visualization of progression over time
473
+ with prog_col2:
474
+ # Plot state distribution over time
475
+ fig, ax = plt.subplots(figsize=(8, 5))
476
+ for i, state in enumerate(states):
477
+ ax.plot(range(projection_years + 1), proj_df[state], label=state)
478
+
479
+ ax.set_xlabel("Year")
480
+ ax.set_ylabel("Proportion")
481
+ ax.set_title("Projected Hypertension State Distribution Over Time")
482
+ ax.legend()
483
+ ax.grid(alpha=0.3)
484
+
485
+ st.pyplot(fig)
486
+
487
+ # Summary metrics
488
+ st.subheader("Summary Risk Metrics")
489
+
490
+ risk_cols = st.columns(5)
491
+
492
+ # 5-year risks
493
+ risk_5yr_p = projections[5][1]
494
+ risk_5yr_htn = projections[5][2] + projections[5][3]
495
+ risk_5yr_s2 = projections[5][3]
496
+
497
+ # 10-year risks
498
+ year_10_idx = min(10, projection_years)
499
+ risk_10yr_p = projections[year_10_idx][1]
500
+ risk_10yr_htn = projections[year_10_idx][2] + projections[year_10_idx][3]
501
+ risk_10yr_s2 = projections[year_10_idx][3]
502
+
503
+ # Display risk metrics
504
+ with risk_cols[0]:
505
+ st.metric("5-Year Prehypertension Risk", f"{risk_5yr_p * 100:.1f}%")
506
+
507
+ with risk_cols[1]:
508
+ st.metric("5-Year Any Hypertension Risk", f"{risk_5yr_htn * 100:.1f}%")
509
+
510
+ with risk_cols[2]:
511
+ st.metric("5-Year Stage 2 Risk", f"{risk_5yr_s2 * 100:.1f}%")
512
+
513
+ with risk_cols[3]:
514
+ st.metric("10-Year Any Hypertension Risk", f"{risk_10yr_htn * 100:.1f}%")
515
+
516
+ with risk_cols[4]:
517
+ st.metric("Lifetime Stage 2 Risk", f"{lifetime_risk_s2:.1f}%")
518
+
519
+ # Comparison to population averages
520
+ st.write("#### Risk Comparison to Population Average")
521
+
522
+ avg_5yr_htn = 0.08
523
+ avg_10yr_htn = 0.18
524
+
525
+ risk_ratio_5yr = risk_5yr_htn / avg_5yr_htn if avg_5yr_htn > 0 else 1.0
526
+ risk_ratio_10yr = risk_10yr_htn / avg_10yr_htn if avg_10yr_htn > 0 else 1.0
527
+
528
+ st.write(f"This patient's 5-year risk of hypertension is **{risk_ratio_5yr:.1f}x** the population average.")
529
+ st.write(f"This patient's 10-year risk of hypertension is **{risk_ratio_10yr:.1f}x** the population average.")
530
+
531
+ # Risk factors explanation
532
+ st.write("#### Key Risk Factors")
533
+
534
+ # Identify high risk factors
535
+ risk_factors = []
536
+
537
+ if patient_features.get("BMI_ge25", 0) == 1:
538
+ risk_factors.append(f"BMI ≥ 25 kg/m² (current: {bmi:.1f})")
539
+
540
+ if (sex == "Male" and patient_features.get("Waist_ge90", 0) == 1) or (
541
+ sex == "Female" and patient_features.get("Waist_ge80", 0) == 1):
542
+ risk_factors.append(f"High waist circumference (current: {waist} cm)")
543
+
544
+ if patient_features.get("Smoking_current", 0) == 1:
545
+ risk_factors.append("Current smoker")
546
+
547
+ if patient_features.get("Alcohol_current", 0) == 1:
548
+ risk_factors.append("Regular alcohol consumption")
549
+
550
+ if sex == "Male" and patient_features.get("Betel_current", 0) == 1:
551
+ risk_factors.append("Betel nut chewing")
552
+
553
+ if patient_features.get("Exercise_freq", 0) == 0:
554
+ risk_factors.append("Infrequent exercise")
555
+
556
+ if patient_features.get("FHx_yes", 0) == 1:
557
+ risk_factors.append("Family history of hypertension")
558
+
559
+ if patient_features.get("UA_high", 0) == 1:
560
+ risk_factors.append(f"High uric acid (current: {uric_acid:.1f} mg/dL)")
561
+
562
+ if patient_features.get("Fasting_glu_high", 0) == 1:
563
+ risk_factors.append(f"High fasting glucose (current: {fasting_glucose} mg/dL)")
564
+
565
+ if patient_features.get("TC_ge200", 0) == 1:
566
+ risk_factors.append(f"High total cholesterol (current: {total_cholesterol} mg/dL)")
567
+
568
+ # Display risk factors
569
+ if risk_factors:
570
+ st.write("This patient has the following risk factors:")
571
+ for factor in risk_factors:
572
+ st.write(f"- {factor}")
573
+ else:
574
+ st.write("This patient has no major modifiable risk factors.")
575
+
576
+ # Tab 3: Intervention Comparison
577
+ with tab3:
578
+ st.subheader("Compare Intervention Effects")
579
+
580
+ # Choose interventions to compare
581
+ available_interventions = []
582
+
583
+ if patient_features.get("BMI_ge25", 0) == 1:
584
+ available_interventions.append(("BMI_ge25", "Weight Loss (BMI < 25 kg/m²)"))
585
+
586
+ if (sex == "Male" and patient_features.get("Waist_ge90", 0) == 1) or (
587
+ sex == "Female" and patient_features.get("Waist_ge80", 0) == 1):
588
+ waist_feature = "Waist_ge90" if sex == "Male" else "Waist_ge80"
589
+ available_interventions.append((waist_feature, "Waist Circumference Reduction"))
590
+
591
+ if patient_features.get("Smoking_current", 0) == 1:
592
+ available_interventions.append(("Smoking_current", "Smoking Cessation"))
593
+
594
+ if patient_features.get("Alcohol_current", 0) == 1:
595
+ available_interventions.append(("Alcohol_current", "Alcohol Reduction"))
596
+
597
+ if sex == "Male" and patient_features.get("Betel_current", 0) == 1:
598
+ available_interventions.append(("Betel_current", "Betel Nut Cessation"))
599
+
600
+ if patient_features.get("Exercise_freq", 0) == 0:
601
+ available_interventions.append(("Exercise_freq", "Regular Exercise"))
602
+
603
+ if patient_features.get("UA_high", 0) == 1:
604
+ available_interventions.append(("UA_high", "Uric Acid Reduction"))
605
+
606
+ if patient_features.get("TC_ge200", 0) == 1:
607
+ available_interventions.append(("TC_ge200", "Cholesterol Reduction"))
608
+
609
+ if patient_features.get("Fasting_glu_high", 0) == 1:
610
+ available_interventions.append(("Fasting_glu_high", "Glucose Control"))
611
+
612
+ if not available_interventions:
613
+ st.write("No modifiable risk factors available for intervention.")
614
+ else:
615
+ selected_intervention_names = st.multiselect(
616
+ "Select interventions to compare:",
617
+ [name for _, name in available_interventions],
618
+ max_selections=3
619
+ )
620
+
621
+ selected_interventions = [
622
+ feature for feature, name in available_interventions
623
+ if name in selected_intervention_names
624
+ ]
625
+
626
+ comp_years = st.slider("Comparison Projection (Years)", min_value=1, max_value=20, value=10, key="comp_years")
627
+
628
+ if selected_interventions:
629
+ sex_code = "M" if sex == "Male" else "F"
630
+ l1_base, l2_base, l3_base, r_base = hazards_from_beta(
631
+ sex_code, patient_features,
632
+ lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
633
+ )
634
+ Q_base = Q_matrix(l1_base, l2_base, l3_base, r_base)
635
+ P_base = discrete_P(Q_base, 1.0)
636
+
637
+ states = ["Normal", "Prehypertension", "Stage 1", "Stage 2"]
638
+ s_base = np.array([1, 0, 0, 0], float)
639
+ proj_base = [s_base.copy()]
640
+
641
+ for _ in range(comp_years):
642
+ s_base = s_base @ P_base
643
+ proj_base.append(s_base.copy())
644
+
645
+ proj_base_df = pd.DataFrame(proj_base, columns=states)
646
+ proj_base_df.index.name = "Year"
647
+
648
+ intervention_data = []
649
+
650
+ for feature in selected_interventions:
651
+ int_features = patient_features.copy()
652
+
653
+ if feature == "Exercise_freq":
654
+ int_features[feature] = 1
655
+ else:
656
+ int_features[feature] = 0
657
+
658
+ l1_int, l2_int, l3_int, r_int = hazards_from_beta(
659
+ sex_code, int_features,
660
+ lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
661
+ )
662
+
663
+ Q_int = Q_matrix(l1_int, l2_int, l3_int, r_int)
664
+ P_int = discrete_P(Q_int, 1.0)
665
+
666
+ s_int = np.array([1, 0, 0, 0], float)
667
+ proj_int = [s_int.copy()]
668
+
669
+ for _ in range(comp_years):
670
+ s_int = s_int @ P_int
671
+ proj_int.append(s_int.copy())
672
+
673
+ proj_int_df = pd.DataFrame(proj_int, columns=states)
674
+
675
+ baseline_htn_risk = proj_base_df["Stage 1"].iloc[-1] + proj_base_df["Stage 2"].iloc[-1]
676
+ int_htn_risk = proj_int_df["Stage 1"].iloc[-1] + proj_int_df["Stage 2"].iloc[-1]
677
+
678
+ absolute_risk_reduction = baseline_htn_risk - int_htn_risk
679
+ relative_risk_reduction = absolute_risk_reduction / baseline_htn_risk if baseline_htn_risk > 0 else 0
680
+
681
+ nnt = 1 / absolute_risk_reduction if absolute_risk_reduction > 0 else float('inf')
682
+
683
+ int_name = next(name for feat, name in available_interventions if feat == feature)
684
+
685
+ intervention_data.append({
686
+ "feature": feature,
687
+ "name": int_name,
688
+ "projection": proj_int_df,
689
+ "risk_reduction_abs": absolute_risk_reduction,
690
+ "risk_reduction_rel": relative_risk_reduction,
691
+ "nnt": nnt
692
+ })
693
+
694
+ st.write("#### Hypertension Risk Comparison")
695
+
696
+ fig, ax = plt.subplots(figsize=(10, 6))
697
+
698
+ baseline_htn_risk = [
699
+ proj_base_df["Stage 1"].iloc[i] + proj_base_df["Stage 2"].iloc[i]
700
+ for i in range(len(proj_base_df))
701
+ ]
702
+ ax.plot(range(comp_years + 1), baseline_htn_risk, 'k-', linewidth=2, label="No Intervention")
703
+
704
+ colors = ['b', 'g', 'r', 'c', 'm', 'y']
705
+ for i, int_data in enumerate(intervention_data):
706
+ int_htn_risk = [
707
+ int_data["projection"]["Stage 1"].iloc[j] + int_data["projection"]["Stage 2"].iloc[j]
708
+ for j in range(len(int_data["projection"]))
709
+ ]
710
+ ax.plot(
711
+ range(comp_years + 1),
712
+ int_htn_risk,
713
+ f"{colors[i % len(colors)]}-",
714
+ linewidth=2,
715
+ label=int_data["name"]
716
+ )
717
+
718
+ ax.set_xlabel("Year")
719
+ ax.set_ylabel("Probability of Hypertension (Stage 1 or 2)")
720
+ ax.set_title(f"Effect of Interventions on {comp_years}-Year Hypertension Risk")
721
+ ax.legend()
722
+ ax.grid(alpha=0.3)
723
+
724
+ st.pyplot(fig)
725
+
726
+ st.write("#### Effectiveness Comparison")
727
+
728
+ metrics_data = {
729
+ "Intervention": ["No Intervention"] + [int_data["name"] for int_data in intervention_data],
730
+ f"{comp_years}-Year HTN Risk": [
731
+ baseline_htn_risk[-1] * 100
732
+ ] + [
733
+ (baseline_htn_risk[-1] - int_data["risk_reduction_abs"]) * 100
734
+ for int_data in intervention_data
735
+ ],
736
+ "Absolute Risk Reduction (%)": [
737
+ 0
738
+ ] + [
739
+ int_data["risk_reduction_abs"] * 100
740
+ for int_data in intervention_data
741
+ ],
742
+ "Relative Risk Reduction (%)": [
743
+ 0
744
+ ] + [
745
+ int_data["risk_reduction_rel"] * 100
746
+ for int_data in intervention_data
747
+ ],
748
+ "Number Needed to Treat": [
749
+ "N/A"
750
+ ] + [
751
+ f"{int_data['nnt']:.1f}" if int_data["nnt"] < 100 else "100+"
752
+ for int_data in intervention_data
753
+ ]
754
+ }
755
+
756
+ metrics_df = pd.DataFrame(metrics_data)
757
+ st.table(metrics_df.set_index("Intervention"))
758
+
759
+ if intervention_data:
760
+ most_effective = max(intervention_data, key=lambda x: x["risk_reduction_abs"])
761
+
762
+ st.info(
763
+ f"**Recommendation**: Based on this analysis, "
764
+ f"**{most_effective['name']}** provides the greatest reduction in "
765
+ f"{comp_years}-year hypertension risk "
766
+ f"({most_effective['risk_reduction_abs'] * 100:.1f}% absolute reduction)."
767
+ )
768
+
769
+ else:
770
+ st.write("Please select at least one intervention to compare.")
771
+
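The comparison tab derives absolute risk reduction, relative risk reduction, and number needed to treat from two projected state distributions. The sketch below reproduces that arithmetic in plain Python, assuming two hypothetical 4-state transition matrices (`P_BASE`, `P_INT`) whose values are illustrative only and are not taken from the app's `hazards_from_beta` output.

```python
import math

def project(start, P, years):
    """Return the state distribution at year 0..years (row vector times P each year)."""
    dist = list(start)
    history = [dist]
    for _ in range(years):
        # new[j] = sum_i dist[i] * P[i][j]
        dist = [sum(dist[i] * P[i][j] for i in range(len(dist)))
                for j in range(len(P[0]))]
        history.append(dist)
    return history

def risk_metrics(baseline_risk, intervention_risk):
    """Absolute risk reduction, relative risk reduction, number needed to treat."""
    arr = baseline_risk - intervention_risk
    rrr = arr / baseline_risk if baseline_risk > 0 else 0.0
    nnt = 1.0 / arr if arr > 0 else math.inf  # no benefit means NNT is unbounded
    return arr, rrr, nnt

# Hypothetical annual transition matrices (each row sums to 1).
# State order: Normal, Prehypertension, Stage 1, Stage 2.
P_BASE = [
    [0.90, 0.08, 0.02, 0.00],
    [0.05, 0.80, 0.12, 0.03],
    [0.00, 0.05, 0.80, 0.15],
    [0.00, 0.00, 0.05, 0.95],
]
P_INT = [
    [0.93, 0.06, 0.01, 0.00],
    [0.07, 0.82, 0.09, 0.02],
    [0.00, 0.07, 0.82, 0.11],
    [0.00, 0.00, 0.07, 0.93],
]

start = [1.0, 0.0, 0.0, 0.0]
base_final = project(start, P_BASE, 10)[-1]
int_final = project(start, P_INT, 10)[-1]

# Hypertension risk = probability of being in Stage 1 or Stage 2 at the horizon.
arr, rrr, nnt = risk_metrics(base_final[2] + base_final[3],
                             int_final[2] + int_final[3])
print(f"ARR={arr:.3f}  RRR={rrr:.3f}  NNT={nnt:.1f}")
```

Keeping the metric calculation in a small function like `risk_metrics` makes the zero-benefit case (infinite NNT) easy to test separately from the plotting code.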
772
+ # Tab 4: Cost-Effectiveness Analysis (Deterministic)
774
+ with tab4:
775
+ st.subheader("Cost-Effectiveness Analysis (Deterministic)")
776
+ st.info("📌 This analysis uses **point estimates** (single values) for all parameters")
777
+
778
+ st.write("### Analysis Settings")
779
+
780
+ if available_interventions:
781
+ cea_intervention = st.selectbox(
782
+ "Select intervention to analyze:",
783
+ [name for _, name in available_interventions],
784
+ index=0
785
+ )
786
+
787
+ cea_feature = next(
788
+ feature for feature, name in available_interventions
789
+ if name == cea_intervention
790
+ )
791
+
792
+ param_col1, param_col2, param_col3 = st.columns(3)
793
+
794
+ with param_col1:
795
+ cea_cycles = st.slider("Time Horizon (Years)", min_value=5, max_value=30, value=10)
796
+ discount_rate = st.slider("Discount Rate (%)", min_value=0, max_value=10, value=3) / 100
797
+
798
+ with param_col2:
799
+ st.write("#### Cost Settings ($ per year)")
800
+ cost_normal = st.number_input("Cost - Normal BP", min_value=0, max_value=5000, value=200)
801
+ cost_pre = st.number_input("Cost - Prehypertension", min_value=0, max_value=5000, value=600)
802
+ cost_s1 = st.number_input("Cost - Stage 1 HTN", min_value=0, max_value=5000, value=1200)
803
+ cost_s2 = st.number_input("Cost - Stage 2 HTN", min_value=0, max_value=5000, value=2200)
804
+
805
+ with param_col3:
806
+ st.write("#### Utility Settings (QOL 0-1)")
807
+ util_normal = st.slider("Utility - Normal BP", min_value=0.0, max_value=1.0, value=1.0, step=0.05)
808
+ util_pre = st.slider("Utility - Prehypertension", min_value=0.0, max_value=1.0, value=0.9, step=0.05)
809
+ util_s1 = st.slider("Utility - Stage 1 HTN", min_value=0.0, max_value=1.0, value=0.7, step=0.05)
810
+ util_s2 = st.slider("Utility - Stage 2 HTN", min_value=0.0, max_value=1.0, value=0.5, step=0.05)
811
+
812
+ st.write("#### Intervention Settings")
813
+ int_cost_increase = st.number_input(
814
+ "Additional Intervention Cost ($/year)",
815
+ min_value=0,
816
+ max_value=2000,
817
+ value=500
818
+ )
819
+
820
+ st.markdown("---")
821
+
822
+ C_A = np.array([cost_normal, cost_pre, cost_s1, cost_s2])
823
+ C_B = C_A.copy()
824
+ C_B[0] += int_cost_increase
825
+
826
+ U = np.array([util_normal, util_pre, util_s1, util_s2])
827
+
828
+ sex_code = "M" if sex == "Male" else "F"
829
+
830
+ try:
831
+ # Calculate baseline scenario
832
+ l1_base, l2_base, l3_base, r_base = hazards_from_beta(
833
+ sex_code, patient_features,
834
+ lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
835
+ )
836
+ Q_base = Q_matrix(l1_base, l2_base, l3_base, r_base)
837
+ P_base = discrete_P(Q_base, 1.0)
838
+
839
+ start_dist = np.array([1, 0, 0, 0], float)
840
+ cost_A, qaly_A, _ = run_markov(P_base, C_A, U, start_dist, cea_cycles, discount_rate)
841
+
842
+ # Calculate intervention scenario
843
+ int_features = patient_features.copy()
844
+ if cea_feature == "Exercise_freq":
845
+ int_features[cea_feature] = 1
846
+ else:
847
+ int_features[cea_feature] = 0
848
+
849
+ l1_int, l2_int, l3_int, r_int = hazards_from_beta(
850
+ sex_code, int_features,
851
+ lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
852
+ )
853
+ Q_int = Q_matrix(l1_int, l2_int, l3_int, r_int)
854
+ P_int = discrete_P(Q_int, 1.0)
855
+
856
+ cost_B, qaly_B, _ = run_markov(P_int, C_B, U, start_dist, cea_cycles, discount_rate)
857
+
858
+ # Calculate incremental values
859
+ cost_diff = cost_B - cost_A
860
+ qaly_diff = qaly_B - qaly_A
861
+ icer_val = icer(cost_A, qaly_A, cost_B, qaly_B)
862
+
863
+ # ✅ New: cost-effectiveness plane (single point)
864
+ st.write("### Cost-Effectiveness Plane")
865
+
866
+ fig, ax = plt.subplots(figsize=(10, 8))
867
+
868
+ # Plot the single point
869
+ ax.scatter(qaly_diff, cost_diff, s=300, c='red', marker='*',
870
+ edgecolors='black', linewidths=2, label='Intervention vs Baseline', zorder=5)
871
+
872
+ # Add quadrant lines
873
+ ax.axhline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
874
+ ax.axvline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
875
+
876
+ # Add WTP threshold line
877
+ wtp_threshold = 50000
878
+ xlim = ax.get_xlim()
879
+ ylim = ax.get_ylim()
880
+
881
+ # Extend the line across the plot
882
+ x_range = np.linspace(min(xlim[0], -0.01), max(xlim[1], 0.01), 100)
883
+ ax.plot(x_range, x_range * wtp_threshold, 'k--', alpha=0.5,
884
+ linewidth=2, label=f'WTP ${wtp_threshold:,}/QALY')
885
+
886
+ # Add quadrant labels
887
+ ax.text(0.95, 0.95, 'More Costly\nMore Effective',
888
+ transform=ax.transAxes, ha='right', va='top', fontsize=10, alpha=0.5)
889
+ ax.text(0.05, 0.95, 'More Costly\nLess Effective',
890
+ transform=ax.transAxes, ha='left', va='top', fontsize=10, alpha=0.5)
891
+ ax.text(0.95, 0.05, 'Less Costly\nMore Effective',
892
+ transform=ax.transAxes, ha='right', va='bottom', fontsize=10, alpha=0.5)
893
+ ax.text(0.05, 0.05, 'Less Costly\nLess Effective',
894
+ transform=ax.transAxes, ha='left', va='bottom', fontsize=10, alpha=0.5)
895
+
896
+ ax.set_xlabel("Incremental QALYs", fontsize=12, fontweight='bold')
897
+ ax.set_ylabel("Incremental Cost ($)", fontsize=12, fontweight='bold')
898
+ ax.set_title("Cost-Effectiveness Plane (Deterministic Analysis)",
899
+ fontsize=14, fontweight='bold')
900
+ ax.legend(fontsize=11)
901
+ ax.grid(alpha=0.3)
902
+
903
+ st.pyplot(fig)
904
+
905
+ st.write("### Summary Metrics")
906
+
907
+ metrics_cols = st.columns(3)
908
+
909
+ with metrics_cols[0]:
910
+ st.metric("Incremental Cost", f"${cost_diff:.2f}")
911
+
912
+ with metrics_cols[1]:
913
+ st.metric("Incremental QALYs", f"{qaly_diff:.3f}")
914
+
915
+ with metrics_cols[2]:
916
+ st.metric("ICER ($/QALY)", f"${icer_val:.2f}")
917
+
918
+ # ✅ Changed: decision recommendation (no probabilities used)
919
+ st.write("### Cost-Effectiveness Decision")
920
+
921
+ wtp_threshold = 50000
922
+
923
+ if qaly_diff > 0:
924
+ if cost_diff <= 0:
925
+ st.success("✅ **DOMINANT**: This intervention saves money and improves health.")
926
+ st.info(
927
+ f"💡 The intervention provides {qaly_diff:.3f} additional QALYs while saving ${-cost_diff:.2f}.")
928
+ elif icer_val < wtp_threshold:
929
+ st.success(
930
+ f"✅ **COST-EFFECTIVE**: ICER = ${icer_val:.2f}/QALY is below the ${wtp_threshold:,}/QALY threshold.")
931
+ st.info(
932
+ f"💡 For every additional QALY gained, it costs ${icer_val:.2f}, which is considered acceptable.")
933
+ else:
934
+ st.warning(
935
+ f"⚠️ **NOT COST-EFFECTIVE**: ICER = ${icer_val:.2f}/QALY exceeds the ${wtp_threshold:,}/QALY threshold.")
936
+ st.info(
937
+ f"💡 The incremental cost would need to be ${qaly_diff * wtp_threshold:.2f} or less to be cost-effective at this threshold.")
938
+ else:
939
+ if cost_diff >= 0:
940
+ st.error("❌ **DOMINATED**: This intervention costs more and worsens health outcomes.")
941
+ else:
942
+ st.warning("⚠️ **TRADE-OFF**: This intervention saves money but reduces QALYs.")
943
+ st.info(f"💡 It saves ${-cost_diff:.2f} but loses {-qaly_diff:.3f} QALYs.")
944
+
945
+ # ✅ New: sensitivity analysis table
946
+ st.write("### Sensitivity to WTP Threshold")
947
+ st.caption(
948
+ "This shows whether the intervention would be considered cost-effective at different willingness-to-pay thresholds.")
949
+
950
+ wtp_thresholds = [25000, 50000, 75000, 100000, 150000]
951
+ decision_data = []
952
+
953
+ for wtp in wtp_thresholds:
954
+ if qaly_diff > 0:
955
+ if cost_diff <= 0:
956
+ decision = "✅ Dominant (Cost-Effective)"
957
+ elif icer_val < wtp:
958
+ decision = "✅ Cost-Effective"
959
+ else:
960
+ decision = "❌ Not Cost-Effective"
961
+ else:
962
+ if cost_diff >= 0:
963
+ decision = "❌ Dominated"
964
+ else:
965
+ decision = "⚠️ Saves Money, Loses QALYs"
966
+
967
+ decision_data.append({
968
+ "WTP Threshold": f"${wtp:,}/QALY",
969
+ "Decision": decision
970
+ })
971
+
972
+ decision_df = pd.DataFrame(decision_data)
973
+ st.table(decision_df.set_index("WTP Threshold"))
974
+
975
+ except Exception as e:
976
+ st.error(f"An error occurred while running the cost-effectiveness analysis: {str(e)}")
977
+ st.error("Try different parameters or a different intervention.")
978
+
979
+ else:
980
+ st.write("No modifiable risk factors available for intervention.")
981
+
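The deterministic tab sorts the result into one of the four quadrants of the cost-effectiveness plane and then applies the willingness-to-pay threshold. A minimal sketch of that rule follows; the function names `classify` and `max_acceptable_cost` are hypothetical and do not exist in the app. It compares incremental cost against `wtp * delta_qaly` instead of dividing, which gives the same answer as the ICER test whenever `delta_qaly > 0` and avoids a division by a near-zero QALY difference.

```python
def classify(delta_cost, delta_qaly, wtp):
    """Label an intervention by its position on the cost-effectiveness plane."""
    if delta_qaly > 0:
        if delta_cost <= 0:
            return "dominant"            # cheaper (or equal) and more effective
        # For delta_qaly > 0, ICER < wtp is the same as delta_cost < wtp * delta_qaly.
        if delta_cost < wtp * delta_qaly:
            return "cost-effective"
        return "not cost-effective"
    if delta_cost >= 0:
        return "dominated"               # costs more (or equal), no health gain
    return "trade-off"                   # saves money but loses QALYs

def max_acceptable_cost(delta_qaly, wtp):
    """Largest incremental cost that still meets the threshold."""
    return wtp * delta_qaly

# Same thresholds as the sensitivity table in the tab above.
for wtp in (25000, 50000, 75000, 100000, 150000):
    print(f"${wtp:,}/QALY -> {classify(6000.0, 0.1, wtp)}")
```

Because the label only depends on the two incremental values and the threshold, the same function could serve both the single-threshold decision and the threshold sweep.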
982
+ # Tab 5: Cost-Effectiveness Analysis (Probabilistic)
983
+ with tab5:
984
+ st.subheader("Cost-Effectiveness Analysis (Probabilistic)")
985
+ st.info(
986
+ "📌 This analysis samples cost and utility parameters from **distributions** (mean ± SE) to account for uncertainty; transition probabilities stay at point estimates")
987
+
988
+ st.write("### Analysis Settings")
989
+
990
+ if available_interventions:
991
+ psa_intervention = st.selectbox(
992
+ "Select intervention to analyze:",
993
+ [name for _, name in available_interventions],
994
+ index=0,
995
+ key="psa_intervention"
996
+ )
997
+
998
+ psa_feature = next(
999
+ feature for feature, name in available_interventions
1000
+ if name == psa_intervention
1001
+ )
1002
+
1003
+ psa_col1, psa_col2, psa_col3 = st.columns(3)
1004
+
1005
+ with psa_col1:
1006
+ st.write("#### Simulation Settings")
1007
+ psa_cycles = st.slider("Time Horizon (Years)", min_value=5, max_value=30, value=10, key="psa_cycles")
1008
+ psa_discount_rate = st.slider("Discount Rate (%)", min_value=0, max_value=10, value=3,
1009
+ key="psa_discount") / 100
1010
+ n_simulations = st.slider("Number of Simulations", min_value=100, max_value=10000, value=1000, step=100)
1011
+
1012
+ with psa_col2:
1013
+ st.write("#### Cost Parameters (Mean ± SE)")
1014
+ cost_normal_mean = st.number_input("Cost - Normal BP (Mean)", min_value=0, max_value=5000, value=200,
1015
+ key="psa_cn_mean")
1016
+ cost_normal_se = st.number_input("Cost - Normal BP (SE)", min_value=0, max_value=500, value=20,
1017
+ key="psa_cn_se")
1018
+
1019
+ cost_pre_mean = st.number_input("Cost - Prehypertension (Mean)", min_value=0, max_value=5000, value=600,
1020
+ key="psa_cp_mean")
1021
+ cost_pre_se = st.number_input("Cost - Prehypertension (SE)", min_value=0, max_value=500, value=60,
1022
+ key="psa_cp_se")
1023
+
1024
+ cost_s1_mean = st.number_input("Cost - Stage 1 HTN (Mean)", min_value=0, max_value=5000, value=1200,
1025
+ key="psa_cs1_mean")
1026
+ cost_s1_se = st.number_input("Cost - Stage 1 HTN (SE)", min_value=0, max_value=500, value=120,
1027
+ key="psa_cs1_se")
1028
+
1029
+ cost_s2_mean = st.number_input("Cost - Stage 2 HTN (Mean)", min_value=0, max_value=5000, value=2200,
1030
+ key="psa_cs2_mean")
1031
+ cost_s2_se = st.number_input("Cost - Stage 2 HTN (SE)", min_value=0, max_value=500, value=220,
1032
+ key="psa_cs2_se")
1033
+
1034
+ with psa_col3:
1035
+ st.write("#### Utility Parameters (Mean ± SE)")
1036
+ util_normal_mean = st.slider("Utility - Normal BP (Mean)", 0.0, 1.0, 1.0, 0.01, key="psa_un_mean")
1037
+ util_normal_se = st.slider("Utility - Normal BP (SE)", 0.0, 0.1, 0.01, 0.001, key="psa_un_se")
1038
+
1039
+ util_pre_mean = st.slider("Utility - Prehypertension (Mean)", 0.0, 1.0, 0.9, 0.01, key="psa_up_mean")
1040
+ util_pre_se = st.slider("Utility - Prehypertension (SE)", 0.0, 0.1, 0.02, 0.001, key="psa_up_se")
1041
+
1042
+ util_s1_mean = st.slider("Utility - Stage 1 HTN (Mean)", 0.0, 1.0, 0.7, 0.01, key="psa_us1_mean")
1043
+ util_s1_se = st.slider("Utility - Stage 1 HTN (SE)", 0.0, 0.1, 0.03, 0.001, key="psa_us1_se")
1044
+
1045
+ util_s2_mean = st.slider("Utility - Stage 2 HTN (Mean)", 0.0, 1.0, 0.5, 0.01, key="psa_us2_mean")
1046
+ util_s2_se = st.slider("Utility - Stage 2 HTN (SE)", 0.0, 0.1, 0.05, 0.001, key="psa_us2_se")
1047
+
1048
+ st.write("#### Intervention Settings")
1049
+ psa_int_cost_mean = st.number_input("Additional Intervention Cost (Mean)", 0, 2000, 500, key="psa_int_cost_mean")
1050
+ psa_int_cost_se = st.number_input("Additional Intervention Cost (SE)", 0, 200, 50, key="psa_int_cost_se")
1051
+
1052
+ st.markdown("---")
1053
+
1054
+ if st.button("🎲 Run Probabilistic Analysis", type="primary"):
1055
+ with st.spinner(f"Running {n_simulations} Monte Carlo simulations..."):
1056
+ try:
1057
+ # Storage for simulation results
1058
+ results_cost_A = []
1059
+ results_cost_B = []
1060
+ results_qaly_A = []
1061
+ results_qaly_B = []
1062
+ results_icer = []
1063
+ results_delta_cost = []
1064
+ results_delta_qaly = []
1065
+
1066
+ progress_bar = st.progress(0)
1067
+
1068
+ for sim in range(n_simulations):
1069
+ # Sample each parameter from a distribution defined by its mean and SE:
1070
+ # For costs - use gamma distribution (non-negative)
1071
+ # For utilities - use beta distribution (bounded 0-1)
1072
+
1073
+ # Sample costs (using gamma approximation)
1074
+ def sample_cost(mean, se):
1075
+ if se == 0 or mean == 0:
1076
+ return mean
1077
+ shape = (mean / se) ** 2
1078
+ scale = se ** 2 / mean
1079
+ return np.random.gamma(shape, scale)
1080
+
1081
+
1082
+ # Sample utilities (using beta approximation)
1083
+ def sample_utility(mean, se):
1084
+ if se == 0 or mean == 0 or mean == 1:
1085
+ return np.clip(mean, 0, 1)
1086
+ # Beta distribution parameters
1087
+ alpha = mean * ((mean * (1 - mean) / (se ** 2)) - 1)
1088
+ beta = (1 - mean) * ((mean * (1 - mean) / (se ** 2)) - 1)
1089
+ if alpha > 0 and beta > 0:
1090
+ return np.random.beta(alpha, beta)
1091
+ else:
1092
+ return np.clip(np.random.normal(mean, se), 0, 1)
1093
+
1094
+
1095
+ # Sample parameters for this iteration
1096
+ C_A_sim = np.array([
1097
+ sample_cost(cost_normal_mean, cost_normal_se),
1098
+ sample_cost(cost_pre_mean, cost_pre_se),
1099
+ sample_cost(cost_s1_mean, cost_s1_se),
1100
+ sample_cost(cost_s2_mean, cost_s2_se)
1101
+ ])
1102
+
1103
+ int_cost_add = sample_cost(psa_int_cost_mean, psa_int_cost_se)
1104
+ C_B_sim = C_A_sim.copy()
1105
+ C_B_sim[0] += int_cost_add
1106
+
1107
+ U_sim = np.array([
1108
+ sample_utility(util_normal_mean, util_normal_se),
1109
+ sample_utility(util_pre_mean, util_pre_se),
1110
+ sample_utility(util_s1_mean, util_s1_se),
1111
+ sample_utility(util_s2_mean, util_s2_se)
1112
+ ])
1113
+
1114
+ # Run analysis with sampled parameters
1115
+ sex_code = "M" if sex == "Male" else "F"
1116
+
1117
+ # Get transition matrices (using point estimates for transition probabilities)
1118
+ l1, l2, l3, r = hazards_from_beta(sex_code, patient_features,
1119
+ lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05)
1120
+
1121
+ # Baseline
1122
+ Q_A = Q_matrix(l1, l2, l3, r)
1123
+ P_A = discrete_P(Q_A, 1.0)
1124
+
1125
+ # Intervention
1126
+ int_features = patient_features.copy()
1127
+ if psa_feature == "Exercise_freq":
1128
+ int_features[psa_feature] = 1
1129
+ else:
1130
+ int_features[psa_feature] = 0
1131
+
1132
+ l1_b, l2_b, l3_b, r_b = hazards_from_beta(sex_code, int_features,
1133
+ lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05)
1134
+ Q_B = Q_matrix(l1_b, l2_b, l3_b, r_b)
1135
+ P_B = discrete_P(Q_B, 1.0)
1136
+
1137
+ # Run Markov models
1138
+ start_dist = np.array([1, 0, 0, 0], float)
1139
+ cost_A_sim, qaly_A_sim, _ = run_markov(P_A, C_A_sim, U_sim, start_dist,
1140
+ psa_cycles, psa_discount_rate)
1141
+ cost_B_sim, qaly_B_sim, _ = run_markov(P_B, C_B_sim, U_sim, start_dist,
1142
+ psa_cycles, psa_discount_rate)
1143
+
1144
+ # Calculate incremental values
1145
+ delta_cost = cost_B_sim - cost_A_sim
1146
+ delta_qaly = qaly_B_sim - qaly_A_sim
1147
+
1148
+ # Calculate ICER
1149
+ if abs(delta_qaly) > 1e-9:
1150
+ icer_sim = delta_cost / delta_qaly
1151
+ else:
1152
+ icer_sim = np.inf if delta_cost > 0 else -np.inf
1153
+
1154
+ # Store results
1155
+ results_cost_A.append(cost_A_sim)
1156
+ results_cost_B.append(cost_B_sim)
1157
+ results_qaly_A.append(qaly_A_sim)
1158
+ results_qaly_B.append(qaly_B_sim)
1159
+ results_delta_cost.append(delta_cost)
1160
+ results_delta_qaly.append(delta_qaly)
1161
+ results_icer.append(icer_sim)
1162
+
1163
+ # Update progress
1164
+ progress_bar.progress((sim + 1) / n_simulations)
1165
+
1166
+ progress_bar.empty()
1167
+
1168
+ # Convert to arrays
1169
+ results_delta_cost = np.array(results_delta_cost)
1170
+ results_delta_qaly = np.array(results_delta_qaly)
1171
+ results_icer = np.array(results_icer)
1172
+
1173
+ # Filter out infinite ICERs for display
1174
+ results_icer_finite = results_icer[np.isfinite(results_icer)]
1175
+
1176
+ # Display results
1177
+ st.success(f"✅ Completed {n_simulations} simulations!")
1178
+
1179
+ st.write("### Probabilistic Results Summary")
1180
+
1181
+ summary_cols = st.columns(4)
1182
+
1183
+ with summary_cols[0]:
1184
+ st.metric("Mean ΔCost", f"${np.mean(results_delta_cost):.2f}")
1185
+ st.caption(
1186
+ f"95% CI: [{np.percentile(results_delta_cost, 2.5):.2f}, {np.percentile(results_delta_cost, 97.5):.2f}]")
1187
+
1188
+ with summary_cols[1]:
1189
+ st.metric("Mean ΔQALY", f"{np.mean(results_delta_qaly):.4f}")
1190
+ st.caption(
1191
+ f"95% CI: [{np.percentile(results_delta_qaly, 2.5):.4f}, {np.percentile(results_delta_qaly, 97.5):.4f}]")
1192
+
1193
+ with summary_cols[2]:
1194
+ st.metric("Mean ICER", f"${np.mean(results_icer_finite):.2f}/QALY")
1195
+ st.caption(
1196
+ f"95% CI: [{np.percentile(results_icer_finite, 2.5):.2f}, {np.percentile(results_icer_finite, 97.5):.2f}]")
1197
+
1198
+ with summary_cols[3]:
1199
+ # Calculate probability cost-effective at $50k threshold
1200
+ wtp_threshold = 50000
1201
+ prob_ce = np.mean((results_delta_qaly > 0) & (results_delta_cost < wtp_threshold * results_delta_qaly))
1202
+ st.metric("Prob. Cost-Effective", f"{prob_ce * 100:.1f}%")
1203
+ st.caption(f"at ${wtp_threshold:,}/QALY")
1204
+
1205
+ # Cost-Effectiveness Plane
1206
+ st.write("### Cost-Effectiveness Plane (Scatter Plot)")
1207
+
1208
+ fig, ax = plt.subplots(figsize=(10, 8))
1209
+
1210
+ # Plot scatter points
1211
+ ax.scatter(results_delta_qaly, results_delta_cost, alpha=0.3, s=20, c='blue')
1212
+
1213
+ # Plot mean point
1214
+ ax.scatter(np.mean(results_delta_qaly), np.mean(results_delta_cost),
1215
+ color='red', s=200, marker='*', edgecolors='black', linewidths=2,
1216
+ label='Mean', zorder=5)
1217
+
1218
+ # Add quadrant lines
1219
+ ax.axhline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
1220
+ ax.axvline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
1221
+
1222
+ # Add WTP threshold line
1223
+ xlim = ax.get_xlim()
1224
+ x_range = np.linspace(xlim[0], xlim[1], 100)
1225
+ ax.plot(x_range, x_range * wtp_threshold, 'k--', alpha=0.5,
1226
+ linewidth=2, label=f'WTP ${wtp_threshold:,}/QALY')
1227
+
1228
+ ax.set_xlabel("Incremental QALYs", fontsize=12, fontweight='bold')
1229
+ ax.set_ylabel("Incremental Cost ($)", fontsize=12, fontweight='bold')
1230
+ ax.set_title("Cost-Effectiveness Plane (Probabilistic Sensitivity Analysis)",
1231
+ fontsize=14, fontweight='bold')
1232
+ ax.legend(fontsize=11)
1233
+ ax.grid(alpha=0.3)
1234
+
1235
+ st.pyplot(fig)
1236
+
1237
+ # CEAC Curve
1238
+ st.write("### Cost-Effectiveness Acceptability Curve (CEAC)")
1239
+
1240
+ wtp_range = np.linspace(0, 150000, 100)
1241
+ prob_ce_array = []
1242
+
1243
+ for wtp in wtp_range:
1244
+ # Count simulations where intervention is cost-effective
1245
+ ce_count = 0
1246
+ for i in range(n_simulations):
1247
+ if results_delta_qaly[i] > 0:
1248
+ if results_delta_cost[i] < 0: # Dominant
1249
+ ce_count += 1
1250
+ elif (results_delta_cost[i] / results_delta_qaly[i]) < wtp:
1251
+ ce_count += 1
1252
+ elif results_delta_qaly[i] < 0 and results_delta_cost[i] < 0:
1253
+ # Trade-off: saves money but loses QALYs
1254
+ # Not typically considered cost-effective
1255
+ pass
1256
+
1257
+ prob_ce_array.append(ce_count / n_simulations)
1258
+
1259
+ fig2, ax2 = plt.subplots(figsize=(10, 6))
1260
+ ax2.plot(wtp_range, prob_ce_array, 'b-', linewidth=2)
1261
+ ax2.axhline(0.5, color='gray', linestyle='--', alpha=0.5)
1262
+ ax2.axvline(50000, color='red', linestyle='--', alpha=0.5, label='$50,000/QALY')
1263
+ ax2.set_xlabel("Willingness-to-Pay Threshold ($/QALY)", fontsize=12, fontweight='bold')
1264
+ ax2.set_ylabel("Probability Cost-Effective", fontsize=12, fontweight='bold')
1265
+ ax2.set_title("Cost-Effectiveness Acceptability Curve", fontsize=14, fontweight='bold')
1266
+ ax2.set_ylim(0, 1)
1267
+ ax2.grid(alpha=0.3)
1268
+ ax2.legend()
1269
+
1270
+ st.pyplot(fig2)
1271
+
1272
+ # Distribution histograms
1273
+ st.write("### Distribution of Results")
1274
+
1275
+ hist_col1, hist_col2 = st.columns(2)
1276
+
1277
+ with hist_col1:
1278
+ fig3, ax3 = plt.subplots(figsize=(8, 5))
1279
+ ax3.hist(results_delta_cost, bins=50, alpha=0.7, color='blue', edgecolor='black')
1280
+ ax3.axvline(np.mean(results_delta_cost), color='red', linestyle='--',
1281
+ linewidth=2, label=f'Mean: ${np.mean(results_delta_cost):.2f}')
1282
+ ax3.set_xlabel("Incremental Cost ($)")
1283
+ ax3.set_ylabel("Frequency")
1284
+ ax3.set_title("Distribution of Incremental Costs")
1285
+ ax3.legend()
1286
+ ax3.grid(alpha=0.3)
1287
+ st.pyplot(fig3)
1288
+
1289
+ with hist_col2:
1290
+ fig4, ax4 = plt.subplots(figsize=(8, 5))
1291
+ ax4.hist(results_delta_qaly, bins=50, alpha=0.7, color='green', edgecolor='black')
1292
+ ax4.axvline(np.mean(results_delta_qaly), color='red', linestyle='--',
1293
+ linewidth=2, label=f'Mean: {np.mean(results_delta_qaly):.4f}')
1294
+ ax4.set_xlabel("Incremental QALYs")
1295
+ ax4.set_ylabel("Frequency")
1296
+ ax4.set_title("Distribution of Incremental QALYs")
1297
+ ax4.legend()
1298
+ ax4.grid(alpha=0.3)
1299
+ st.pyplot(fig4)
1300
+
1301
+ # ICER distribution
1302
+ st.write("### ICER Distribution")
1303
+ fig5, ax5 = plt.subplots(figsize=(10, 5))
1304
+ ax5.hist(results_icer_finite, bins=50, alpha=0.7, color='purple', edgecolor='black')
1305
+ ax5.axvline(np.mean(results_icer_finite), color='red', linestyle='--',
1306
+ linewidth=2, label=f'Mean: ${np.mean(results_icer_finite):.2f}/QALY')
1307
+ ax5.axvline(50000, color='orange', linestyle='--', linewidth=2,
1308
+ label='$50,000/QALY threshold')
1309
+ ax5.set_xlabel("ICER ($/QALY)")
1310
+ ax5.set_ylabel("Frequency")
1311
+ ax5.set_title("Distribution of ICER Values")
1312
+ ax5.legend()
1313
+ ax5.grid(alpha=0.3)
1314
+ st.pyplot(fig5)
1315
+
1316
+ except Exception as e:
1317
+ st.error(f"Error running probabilistic analysis: {str(e)}")
1318
+ import traceback
1319
+
1320
+ st.code(traceback.format_exc())
1321
+
1322
+ else:
1323
+ st.write("No modifiable risk factors available for intervention.")
1324
+
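The probabilistic tab turns each mean and standard error into a Gamma (costs) or Beta (utilities) distribution by matching moments, then counts the share of draws that are cost-effective. The sketch below shows that moment matching using only the standard library; the helper names are hypothetical, and `random.gammavariate(shape, scale)` and `random.betavariate(alpha, beta)` stand in for the NumPy samplers used in the app.

```python
import random

def gamma_params(mean, se):
    """Method-of-moments Gamma: mean = shape * scale, variance = shape * scale**2."""
    shape = (mean / se) ** 2
    scale = se ** 2 / mean
    return shape, scale

def beta_params(mean, se):
    """Method-of-moments Beta for a value bounded in (0, 1)."""
    common = mean * (1 - mean) / se ** 2 - 1
    return mean * common, (1 - mean) * common

def sample_cost(mean, se, rng):
    if se == 0 or mean == 0:
        return float(mean)  # degenerate case: nothing to sample, avoids dividing by zero
    shape, scale = gamma_params(mean, se)
    return rng.gammavariate(shape, scale)

def sample_utility(mean, se, rng):
    if se == 0 or mean <= 0 or mean >= 1:
        return min(max(float(mean), 0.0), 1.0)
    alpha, beta = beta_params(mean, se)
    if alpha <= 0 or beta <= 0:
        # SE too large for a valid Beta: fall back to a clipped normal draw.
        return min(max(rng.gauss(mean, se), 0.0), 1.0)
    return rng.betavariate(alpha, beta)

def prob_cost_effective(delta_costs, delta_qalys, wtp):
    """Share of draws with a QALY gain whose incremental cost is below wtp * gain."""
    hits = sum(1 for dc, dq in zip(delta_costs, delta_qalys)
               if dq > 0 and dc < wtp * dq)
    return hits / len(delta_costs)

rng = random.Random(42)  # seeded so repeated runs give the same draws
costs = [sample_cost(1200, 120, rng) for _ in range(5)]
utils = [sample_utility(0.7, 0.03, rng) for _ in range(5)]
print(costs)
print(utils)
```

Testing `dq > 0 and dc < wtp * dq` rather than `dc / dq < wtp` keeps dominated draws (higher cost, fewer QALYs, hence a negative ratio) from being counted as cost-effective, and matches the quadrant logic of the CEAC loop.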
1325
+ # Tab 6: Personalized Chat
1326
+ with tab6:
1327
+ st.subheader("💬 Personalized Health Recommendations")
1328
+
1329
+ if not openai_api_key:
1330
+ st.warning("⚠️ Please enter your OpenAI API key in the top right corner to use Personalized Chat.")
1331
+ else:
1332
+ if not st.session_state.summary_generated:
1333
+ if st.button("✨ Generate Personalized Health Summary", type="primary"):
1334
+ with st.spinner("Analyzing your health profile..."):
1335
+ try:
1336
+ llm = get_llm()
1337
+ if llm:
1338
+ patient_info = get_patient_info_string()
1339
+
1340
+ summary_prompt = f"""Generate a comprehensive health assessment summary for this patient:
1341
+
1342
+ {patient_info}
1343
+
1344
+ Include:
1345
+ 1. Overall risk level assessment
1346
+ 2. Key modifiable risk factors
1347
+ 3. Top 3 priority recommendations
1348
+ 4. Expected health benefits of interventions
1349
+
1350
+ IMPORTANT:
1351
+ - Use plain text formatting only (no LaTeX, no \\text{{}} or \\frac{{}}{{}} syntax)
1352
+ - Write any formulas in plain text
1353
+ - Use simple markdown formatting (**, -, numbers) for emphasis
1354
+ - Avoid special characters that may not render correctly
1355
+
1356
+ Format the response with clear sections and bullet points."""
1357
+
1358
+ response = llm.invoke(summary_prompt).content
1359
+ import re  # local import; fractions must be rewritten before any braces are stripped
1360
+ response = re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"(\1)/(\2)", response)
+ response = re.sub(r"\\text\{([^{}]*)\}", r"\1", response)
1361
+
1362
+ st.session_state.recommendation_messages.append({
1363
+ "role": "assistant",
1364
+ "content": response
1365
+ })
1366
+ st.session_state.summary_generated = True
1367
+ st.rerun()
1368
+ except Exception as e:
1369
+ st.error(f"Error generating summary: {str(e)}")
1370
+
1371
+ for message in st.session_state.recommendation_messages:
1372
+ with st.chat_message(message["role"]):
1373
+ st.markdown(message["content"])
1374
+
1375
+ if st.session_state.summary_generated:
1376
+ if prompt := st.chat_input("Ask about your personalized recommendations..."):
1377
+ st.session_state.recommendation_messages.append({"role": "user", "content": prompt})
1378
+ with st.chat_message("user"):
1379
+ st.markdown(prompt)
1380
+
1381
+ with st.chat_message("assistant"):
1382
+ with st.spinner("Thinking..."):
1383
+ try:
1384
+ llm = get_llm()
1385
+ if llm:
1386
+ patient_info = get_patient_info_string()
1387
+
1388
+ history_text = ""
1389
+ for msg in st.session_state.recommendation_messages[-10:]:
1390
+ role = "Patient" if msg["role"] == "user" else "Health Coach"
1391
+ history_text += f"{role}: {msg['content']}\n\n"
1392
+
1393
+ full_prompt = f"""You are a personalized health coach specializing in hypertension management.
1394
+
1395
+ PATIENT PROFILE:
1396
+ {patient_info}
1397
+
1398
+ Provide evidence-based, actionable recommendations for:
1399
+ - Weight management and DASH diet
1400
+ - Exercise prescriptions
1401
+ - Smoking cessation strategies
1402
+ - Medication adherence
1403
+ - Lifestyle modifications
1404
+ - Stress management
1405
+
1406
+ Be empathetic, practical, and motivating. Cite specific guidelines when relevant.
1407
+
1408
+ IMPORTANT:
1409
+ - Use plain text formatting only (no LaTeX, no \\text{{}} or \\frac{{}}{{}} syntax)
1410
+ - Write formulas in plain text
1411
+ - Use simple markdown formatting (**, -, numbers) for emphasis
1412
+ - Avoid special characters that may not render correctly
1413
+
1414
+ Conversation History:
1415
+ {history_text}
1416
+
1417
+ Patient Question: {prompt}
1418
+
1419
+ Your Personalized Advice:"""
1420
+
1421
+ response = llm.invoke(full_prompt).content
1422
+ import re  # local import; fractions must be rewritten before any braces are stripped
1423
+ response = re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"(\1)/(\2)", response)
+ response = re.sub(r"\\text\{([^{}]*)\}", r"\1", response)
1424
+ st.markdown(response, unsafe_allow_html=False)
1425
+ st.session_state.recommendation_messages.append({
1426
+ "role": "assistant",
1427
+ "content": response
1428
+ })
1429
+ else:
1430
+ st.error("Failed to initialize AI.")
1431
+ except Exception as e:
1432
+ st.error(f"Error: {str(e)}")
1433
+
1434
+ # Footer
1435
+ st.markdown("---")
1436
+ st.markdown("""
1437
+ <div style='text-align: center; color: gray; font-size: 0.8em;'>
1438
+ Powered by LangChain & OpenAI | Hypertension CEA Tool v2.0
1439
+ </div>
1440
+ """, unsafe_allow_html=True)
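Stripping LaTeX from model output is order-sensitive: if every `}` is deleted first, a later rewrite of `}{` can never match, so `\frac{a}{b}` comes out garbled. The sketch below contrasts a chained `str.replace` version with a regex version; `strip_latex` and `naive_strip` are hypothetical helper names, and the regexes only handle non-nested braces.

```python
import re

def naive_strip(text):
    """Chained replaces in an unfortunate order: all '}' vanish before '}{' is handled."""
    text = text.replace("\\text{", "").replace("}", "")
    text = text.replace("\\frac{", "(").replace("}{", ")/(")
    return text

def strip_latex(text):
    """Rewrite \\frac{a}{b} as (a)/(b) and \\text{x} as x (non-nested braces only)."""
    text = re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"(\1)/(\2)", text)
    text = re.sub(r"\\text\{([^{}]*)\}", r"\1", text)
    return text

sample = r"BMI = \frac{weight}{height^2} \text{kg}"
print(naive_strip(sample))
print(strip_latex(sample))
```

Matching each construct as a whole also leaves unrelated braces in the response untouched, which a blanket `replace("}", "")` does not.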
requirements.txt CHANGED
@@ -1,3 +1,9 @@
- altair
+ streamlit
+ numpy
  pandas
- streamlit
+ matplotlib
+ scipy
+ langchain-openai
+ langchain-community
+ langchain-core
+ chromadb