Spaces: Donlagon007/personalized_ht

Runtime error


Donlagon007 committed on Oct 31, 2025

Commit

1ad58f5


1 Parent(s): d4390ce

Upload 4 files


Files changed (4)

README.md +148 -12
hypertension_model_fixed2.py +702 -0
personalized_ht4.py +1440 -0
requirements.txt +8 -2

README.md CHANGED

@@ -1,20 +1,156 @@
 ---
-title: Personalized Ht
-emoji: 🚀
 colorFrom: red
-colorTo: red
-sdk: docker
-app_port: 8501
-tags:
-- streamlit
 pinned: false
-short_description: Streamlit template space
 license: mit
 ---
-# Welcome to Streamlit!
-Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
-If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
-forums](https://discuss.streamlit.io).

 ---
+title: Hypertension Personalized Cost-Effectiveness Analysis
+emoji: ❤️
 colorFrom: red
+colorTo: pink
+sdk: streamlit
+sdk_version: "1.40.0"
+app_file: personalized_ht4.py
 pinned: false
 license: mit
 ---
+# ❤️ Hypertension Personalized Cost-Effectiveness Analysis
+An AI-powered cost-effectiveness analysis tool for hypertension management with personalized recommendations using LangChain and OpenAI.
+## 🌟 Features
+### 1. **AI Assistant**
+- Interactive chatbot to help understand hypertension analysis
+- Answers questions about Markov models, ICER, QALYs, and risk factors
+### 2. **Hypertension Progression Analysis**
+- Visualize disease progression over time
+- 4-state Markov model: Normal → Prehypertension → Stage 1 HTN → Stage 2 HTN
+- Calculate lifetime risk and 5/10-year progression probabilities
+### 3. **Intervention Comparison**
+- Compare multiple interventions side-by-side
+- Available interventions:
+  - Weight loss (BMI reduction)
+  - Waist circumference reduction
+  - Smoking cessation
+  - Alcohol reduction
+  - Regular exercise
+  - Uric acid management
+  - Cholesterol control
+  - Glucose management
+### 4. **Cost-Effectiveness Analysis**
+- **Deterministic CEA**: Point estimates for costs and QALYs
+- **Probabilistic CEA**: Monte Carlo simulations with uncertainty
+- ICER calculations with WTP threshold analysis
+- Cost-effectiveness plane and CEAC curves
+### 5. **Personalized Health Chat**
+- AI health coach with personalized recommendations
+- Evidence-based advice for lifestyle modifications
+- Patient-specific risk factor management
+### 6. **Patient Database Management**
+- Upload CSV/Excel files with patient data
+- Quick retrieval by Patient ID
+- Optional vector store for semantic search
+## Requirements
+### OpenAI API Key (Required)
+This application requires an **OpenAI API key** to enable AI features:
+1. Get your API key from [OpenAI Platform](https://platform.openai.com/api-keys)
+2. Enter it in the text box at the top right of the app
+3. AI Assistant and Personalized Chat features will be enabled
+## 📊 Input Parameters
+### Patient Demographics
+- Sex (Male/Female)
+- Age
+- Education level
+### Anthropometrics
+- BMI (kg/m²)
+- Waist circumference (cm)
+### Laboratory Values
+- Fasting glucose (mg/dL)
+- Total cholesterol (mg/dL)
+- Uric acid (mg/dL)
+### Lifestyle Factors
+- Smoking status
+- Alcohol consumption
+- Exercise frequency
+- Betel nut chewing (for males)
+- Family history of hypertension
+## 🎯 How to Use
+1. **Enter your OpenAI API key** in the top-right corner
+2. **Input patient information** in the left sidebar:
+   - Option A: Upload a CSV/Excel file with patient data
+   - Option B: Manually enter patient characteristics
+3. **Explore different tabs**:
+   - Start with "AI Assistant" to learn about the tool
+   - View "Hypertension Progression" for baseline risk
+   - Compare interventions in "Intervention Comparison"
+   - Run detailed CEA in the analysis tabs
+4. **Get personalized recommendations** in "Personalized Chat"
+## 📈 Cost-Effectiveness Metrics
+- **ICER**: Incremental Cost-Effectiveness Ratio ($/QALY)
+- **QALY**: Quality-Adjusted Life Years
+- **NNT**: Number Needed to Treat
+- **WTP**: Willingness-to-Pay threshold (default: $50,000/QALY)
+## 🔬 Model Details
+### Markov Model Structure
+- **4 Health States**: Normal BP, Prehypertension, Stage 1 HTN, Stage 2 HTN
+- **Transitions**: Based on Cox proportional hazards models
+- **Risk Factors**: Gender-specific beta coefficients with standard errors
+- **Time Horizon**: Configurable (5-30 years)
+- **Discounting**: 3% annual discount rate (adjustable)
+### Interventions
+Each intervention modifies specific risk factors to reduce hypertension progression:
+- Lifestyle modifications (diet, exercise, smoking cessation)
+- Clinical interventions (medication, monitoring)
+- Combined approaches for personalized care
+## 🛠️ Technical Stack
+- **Frontend**: Streamlit
+- **AI/ML**: LangChain, OpenAI GPT-4o-mini
+- **Data Processing**: NumPy, Pandas
+- **Visualization**: Matplotlib
+- **Vector Store**: ChromaDB (for RAG)
+- **Statistical Analysis**: SciPy
+## 📝 Citation
+If you use this tool in your research, please cite:
+```
+Hypertension Personalized Cost-Effectiveness Analysis Tool
+AI-powered CEA with LangChain and OpenAI
+[Year] [Institution]
+```
+## ⚠️ Disclaimer
+This tool is for **educational and research purposes only**. It should not be used as a substitute for professional medical advice, diagnosis, or treatment. Always consult with qualified healthcare providers for medical decisions.
+## 📄 License
+MIT License - See LICENSE file for details
+## 🤝 Support
+For issues or questions:
+- Open an issue on the GitHub repository
+- Contact the development team
+---
+**Powered by LangChain & OpenAI | Hypertension CEA Tool v2.0**
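
The discounting and cost-effectiveness metrics listed in the README can be illustrated with a small sketch. It assumes that costs and QALYs accrue at the start of each annual cycle and are discounted by 1/(1+r)^t with t starting at 0, which is how `run_markov` in `hypertension_model_fixed2.py` accumulates its totals. The function names `discounted_total` and `incremental`, and all per-cycle figures, are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; the numbers below are made up, not model output.

def discounted_total(per_cycle_values, discount_rate=0.03):
    """Sum per-cycle values, discounting cycle t by 1 / (1 + r) ** t (t starts at 0)."""
    return sum(v / (1.0 + discount_rate) ** t for t, v in enumerate(per_cycle_values))


def incremental(cost_a, qaly_a, cost_b, qaly_b, wtp=50_000.0):
    """Return (delta_cost, delta_qaly, icer, net_monetary_benefit) for B versus A.

    icer is None when the QALY difference is numerically zero, because the
    ratio is undefined there; the net monetary benefit is always defined.
    """
    delta_cost = cost_b - cost_a
    delta_qaly = qaly_b - qaly_a
    ratio = None if abs(delta_qaly) < 1e-9 else delta_cost / delta_qaly
    nmb = wtp * delta_qaly - delta_cost
    return delta_cost, delta_qaly, ratio, nmb


# Hypothetical 3-year streams of expected annual cost and utility.
cost_a = discounted_total([600.0, 700.0, 800.0])
cost_b = discounted_total([650.0, 700.0, 750.0])
qaly_a = discounted_total([0.90, 0.88, 0.86])
qaly_b = discounted_total([0.90, 0.89, 0.88])

d_cost, d_qaly, ratio, nmb = incremental(cost_a, qaly_a, cost_b, qaly_b)
print(f"dC={d_cost:.2f} dQ={d_qaly:.4f} ICER={ratio:.0f} NMB={nmb:.0f}")
```

A positive net monetary benefit at the chosen WTP threshold is equivalent to "ICER below WTP" when ΔQ > 0, and it stays well defined when ΔQ is zero or negative.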

hypertension_model_fixed2.py ADDED

	@@ -0,0 +1,702 @@

+# -*- coding: utf-8 -*-
+"""
+Hypertension Disease Progression and Intervention Effects Model
+with Beta (SE) structure and stochastic PSA option
+"""
+import numpy as np
+import pandas as pd
+from scipy.linalg import expm
+import matplotlib.pyplot as plt
+import io
+import base64
+# -----------------------------------------------------------
+# 1) β coefficients (log-HR) + SE (from your provided table)
+# -----------------------------------------------------------
+# Transition names for display
+TRANSITION_NAMES_EN = ["N→P", "P→S1", "S1→S2", "P→N"]
+beta_men = {
+    "N2P": {  # Normal → Prehypertension (β1)
+        "Education_high": {"beta": -0.2950, "se": 0.1721},
+        "BMI_ge25": {"beta": 0.5275, "se": 0.1568},
+        "Waist_ge90": {"beta": 0.5040, "se": 0.2526},
+        "Fasting_glu_high": {"beta": -0.0164, "se": 0.3720},
+        "TC_ge200": {"beta": 0.1328, "se": 0.1385},
+        "UA_high": {"beta": 0.0398, "se": 0.1469},
+        "Smoking_current": {"beta": 0.1014, "se": 0.1455},
+        "Betel_current": {"beta": -0.3524, "se": 0.1871},
+        "Alcohol_current": {"beta": -0.2193, "se": 0.1458},
+        "Exercise_freq": {"beta": 0.3028, "se": 0.1754},
+        "FHx_yes": {"beta": 0.0896, "se": 0.1786}
+    },
+    "P2S1": {  # Prehypertension → Stage 1 hypertension (β2)
+        "Education_high": {"beta": -0.1105, "se": 0.1079},
+        "BMI_ge25": {"beta": 0.1272, "se": 0.1062},
+        "Waist_ge90": {"beta": -0.0909, "se": 0.1222},
+        "Fasting_glu_high": {"beta": 0.0078, "se": 0.1778},
+        "TC_ge200": {"beta": -0.1197, "se": 0.0961},
+        "UA_high": {"beta": 0.3740, "se": 0.0986},
+        "Smoking_current": {"beta": -0.0505, "se": 0.1017},
+        "Betel_current": {"beta": 0.2878, "se": 0.1433},
+        "Alcohol_current": {"beta": 0.0422, "se": 0.1003},
+        "Exercise_freq": {"beta": 0.0642, "se": 0.1497},
+        "FHx_yes": {"beta": 0.2461, "se": 0.1280}
+    },
+    "S12S2": {  # Stage 1 → Stage 2 hypertension (β3)
+        "Education_high": {"beta": -0.6211, "se": 0.3150},
+        "BMI_ge25": {"beta": -0.6488, "se": 0.3189},
+        "Waist_ge90": {"beta": 0.2272, "se": 0.3577},
+        "Fasting_glu_high": {"beta": 0.3553, "se": 0.4633},
+        "TC_ge200": {"beta": -0.0633, "se": 0.2687},
+        "UA_high": {"beta": 0.0411, "se": 0.2725},
+        "Smoking_current": {"beta": -0.3919, "se": 0.2850},
+        "Betel_current": {"beta": -0.0243, "se": 0.4090},
+        "Alcohol_current": {"beta": 0.6950, "se": 0.2863},
+        "Exercise_freq": {"beta": -0.5746, "se": 0.3871},
+        "FHx_yes": {"beta": -0.2716, "se": 0.4013}
+    },
+    "P2N": {  # Prehypertension → Normal (β4)
+        "Education_high": {"beta": -0.3251, "se": 0.2192},
+        "BMI_ge25": {"beta": 0.0265, "se": 0.1978},
+        "Waist_ge90": {"beta": 0.4057, "se": 0.3004},
+        "Fasting_glu_high": {"beta": -0.3138, "se": 0.4235},
+        "TC_ge200": {"beta": -0.0306, "se": 0.1749},
+        "UA_high": {"beta": -0.3187, "se": 0.1964},
+        "Smoking_current": {"beta": 0.4710, "se": 0.1816},
+        "Betel_current": {"beta": -0.6040, "se": 0.2568},
+        "Alcohol_current": {"beta": -0.5499, "se": 0.1849},
+        "Exercise_freq": {"beta": 0.4304, "se": 0.2314},
+        "FHx_yes": {"beta": -0.0033, "se": 0.2351}
+    }
+}
+beta_women = {
+    "N2P": {  # Normal → Prehypertension (β1)
+        "Education_high": {"beta": -0.1497, "se": 0.1029},
+        "BMI_ge25": {"beta": 0.3171, "se": 0.1128},
+        "Waist_ge80": {"beta": -0.1668, "se": 0.1167},
+        "Fasting_glu_high": {"beta": 0.5199, "se": 0.2591},
+        "TC_ge200": {"beta": 0.2077, "se": 0.0940},
+        "UA_high": {"beta": 0.0705, "se": 0.1161},
+        "Smoking_current": {"beta": -0.5675, "se": 0.1968},
+        "Alcohol_current": {"beta": -0.1241, "se": 0.1670},
+        "Exercise_freq": {"beta": -0.1400, "se": 0.1177},
+        "FHx_yes": {"beta": -0.0344, "se": 0.1078}
+    },
+    "P2S1": {  # Prehypertension → Stage 1 hypertension (β2)
+        "Education_high": {"beta": -0.0813, "se": 0.1061},
+        "BMI_ge25": {"beta": 0.1029, "se": 0.0982},
+        "Waist_ge80": {"beta": -0.0223, "se": 0.1020},
+        "Fasting_glu_high": {"beta": 0.1663, "se": 0.1416},
+        "TC_ge200": {"beta": -0.0473, "se": 0.0805},
+        "UA_high": {"beta": 0.2912, "se": 0.0957},
+        "Smoking_current": {"beta": -0.4045, "se": 0.2225},
+        "Alcohol_current": {"beta": -0.0472, "se": 0.1685},
+        "Exercise_freq": {"beta": -0.0872, "se": 0.1160},
+        "FHx_yes": {"beta": 0.3253, "se": 0.1075}
+    },
+    "S12S2": {  # Stage 1 → Stage 2 hypertension (β3)
+        "Education_high": {"beta": -0.3908, "se": 0.3162},
+        "BMI_ge25": {"beta": -0.3142, "se": 0.2948},
+        "Waist_ge80": {"beta": 0.1843, "se": 0.2955},
+        "Fasting_glu_high": {"beta": -1.3789, "se": 0.7272},
+        "TC_ge200": {"beta": 0.1766, "se": 0.2380},
+        "UA_high": {"beta": 0.0682, "se": 0.2806},
+        "Smoking_current": {"beta": -8.0955, "se": 21.3033},
+        "Alcohol_current": {"beta": 0.0780, "se": 0.5369},
+        "Exercise_freq": {"beta": 0.2557, "se": 0.4326},
+        "FHx_yes": {"beta": 0.2805, "se": 0.3081}
+    },
+    "P2N": {  # Prehypertension → Normal (β4)
+        "Education_high": {"beta": 0.0195, "se": 0.1214},
+        "BMI_ge25": {"beta": -0.2213, "se": 0.1391},
+        "Waist_ge80": {"beta": -0.4769, "se": 0.1649},
+        "Fasting_glu_high": {"beta": 0.1771, "se": 0.3247},
+        "TC_ge200": {"beta": 0.0501, "se": 0.1156},
+        "UA_high": {"beta": -0.2798, "se": 0.1517},
+        "Smoking_current": {"beta": -0.1689, "se": 0.2330},
+        "Alcohol_current": {"beta": -0.1731, "se": 0.1995},
+        "Exercise_freq": {"beta": -0.0527, "se": 0.1429},
+        "FHx_yes": {"beta": -0.4005, "se": 0.1375}
+    }
+}
+# -----------------------------------------------------------
+# 2) Baseline hazards
+# -----------------------------------------------------------
+lam10_0, lam20_0, lam30_0, rho0_0 = 0.08, 0.10, 0.12, 0.05
+# -----------------------------------------------------------
+# 3) Lambda calculation with stochastic option
+# -----------------------------------------------------------
+def calc_lambda(betas: dict, features: dict, baseline: float, randomize=False):
+    """Compute λ = λ0 * exp(Xβ), optionally sampling β ~ Normal(mean, SE)"""
+    logHR = 0.0
+    for k, vals in betas.items():
+        beta = vals["beta"]
+        se = vals["se"]
+        if randomize:
+            beta = np.random.normal(beta, se)
+        logHR += beta * features.get(k, 0)
+    return baseline * np.exp(logHR)
+def hazards_from_beta(sex: str, features: dict,
+                      lam10, lam20, lam30, rho0, randomize=False):
+    B = beta_men if sex.upper().startswith('M') else beta_women
+    l1 = calc_lambda(B["N2P"], features, lam10, randomize)
+    l2 = calc_lambda(B["P2S1"], features, lam20, randomize)
+    l3 = calc_lambda(B["S12S2"], features, lam30, randomize)
+    r = calc_lambda(B["P2N"], features, rho0, randomize)
+    return l1, l2, l3, r
+def Q_matrix(l1, l2, l3, rho):
+    # State order: [N, P, S1, S2]
+    Q = np.zeros((4, 4))
+    Q[0, 1] = l1
+    Q[0, 0] = -l1
+    Q[1, 0] = rho
+    Q[1, 2] = l2
+    Q[1, 1] = -(rho + l2)
+    Q[2, 3] = l3
+    Q[2, 2] = -l3
+    Q[3, 3] = 0.0
+    return Q
+def discrete_P(Q, years=1.0):
+    P = expm(Q * years)
+    # Numerical stability
+    P = np.clip(P, 0, 1)
+    P = P / P.sum(axis=1, keepdims=True)
+    return P
+# -----------------------------------------------------------
+# 4) (Optional) Calibration: achieve target 5-year S2 cumulative proportion
+# -----------------------------------------------------------
+def calibrate_scale(sex: str, features_ref: dict,
+                    lam10, lam20, lam30, rho0,
+                    target_s2_5y: float,
+                    max_iter=40):
+    lo, hi = 0.2, 5.0
+    for _ in range(max_iter):
+        mid = 0.5 * (lo + hi)
+        l1, l2, l3, r = hazards_from_beta(sex, features_ref,
+                                          lam10 * mid, lam20 * mid, lam30 * mid, rho0)
+        Q = Q_matrix(l1, l2, l3, r)
+        P = discrete_P(Q, 1.0)
+        s = np.array([1, 0, 0, 0], float)
+        for _ in range(5):
+            s = s @ P
+        if s[3] < target_s2_5y:
+            lo = mid
+        else:
+            hi = mid
+    return 0.5 * (lo + hi)
+# -----------------------------------------------------------
+# 5) Markov CEA: cost, utility, discounting, ICER
+# -----------------------------------------------------------
+def run_markov(P, C, U, start_dist, cycles=10, discount=0.03):
+    s = start_dist.astype(float)
+    total_cost, total_qaly = 0.0, 0.0
+    trace = [s.copy()]
+    for t in range(cycles):
+        total_cost += float(s @ C) / ((1 + discount) ** t)
+        total_qaly += float(s @ U) / ((1 + discount) ** t)
+        s = s @ P
+        trace.append(s.copy())
+    return total_cost, total_qaly, np.vstack(trace)
+def icer(costA, qalyA, costB, qalyB):
+    """Calculate ICER, handling edge cases"""
+    deltaC = costB - costA
+    deltaQ = qalyB - qalyA
+    # Handle special cases
+    if abs(deltaQ) < 1e-9:  # QALY difference too small, consider equal
+        return float('inf') if deltaC > 0 else float('-inf'), deltaC, deltaQ
+    # Normal case
+    return deltaC / deltaQ, deltaC, deltaQ
+# -----------------------------------------------------------
+# 6) Graphics: CE plane, CEAC curve
+# -----------------------------------------------------------
+def plot_ce_plane(deltaQ, deltaC, icer_val, intervention_name="Intervention"):
+    plt.figure(figsize=(10, 7))  # Enlarge the figure
+    # Set up axes and quadrant lines
+    plt.axhline(0, color='gray', linestyle='--', alpha=0.7, linewidth=1)
+    plt.axvline(0, color='gray', linestyle='--', alpha=0.7, linewidth=1)
+    # Plot ICER point with larger marker
+    plt.scatter(deltaQ, deltaC, s=200, color='#DC143C', edgecolors='darkred',
+                linewidths=2, zorder=5, alpha=0.9)
+    # Add different explanations based on quadrant
+    if deltaQ > 0 and deltaC > 0:  # Northeast quadrant
+        title_text = f"ICER = ${icer_val:,.1f}/QALY - More expensive but more effective"
+        quadrant = "NE"
+    elif deltaQ < 0 and deltaC > 0:  # Northwest quadrant
+        title_text = f"ICER = ${icer_val:,.1f}/QALY - More expensive and less effective"
+        quadrant = "NW"
+    elif deltaQ < 0 and deltaC < 0:  # Southwest quadrant
+        title_text = f"ICER = ${icer_val:,.1f}/QALY - Less expensive but less effective"
+        quadrant = "SW"
+    else:  # Southeast quadrant
+        title_text = f"ICER = ${icer_val:,.1f}/QALY - Less expensive and more effective"
+        quadrant = "SE"
+    plt.title(title_text, fontsize=14, fontweight='bold', pad=20)
+    # Add labels
+    plt.xlabel("Effect Difference (QALYs)", fontsize=12, fontweight='bold')
+    plt.ylabel("Cost Difference ($)", fontsize=12, fontweight='bold')
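+    # Add WTP threshold line ($50,000/QALY)
+    wtp = 50000
+    x_max = max(abs(deltaQ) * 1.3, 0.05)  # Ensure a sufficient range
+    # Extend the lower bound so a point with negative ΔQ stays visible
+    x_range = [min(-x_max * 0.1, deltaQ * 1.3), x_max]
+    plt.xlim(x_range)
+    # Compute the y-axis range
+    y_max = max(abs(deltaC) * 1.3, wtp * x_max * 0.5)
+    # Extend the lower bound so a cost-saving point (negative ΔC) stays visible
+    y_range = [min(-y_max * 0.2, deltaC * 1.3), y_max]
+    plt.ylim(y_range)
+    # Draw the WTP threshold line
+    plt.plot([0, x_range[1]], [0, x_range[1] * wtp], 'k--', alpha=0.5,
+             linewidth=2, label=f'WTP Threshold ${wtp:,}/QALY')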
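+    # Add annotation with better positioning
+    # Adjust the annotation offset based on where the point lies
+    if deltaQ > 0 and deltaC < 0:
+        # Lower-right quadrant: place the annotation to the upper left
+        xytext = (-80, 40)
+        ha = 'right'
+    elif deltaQ > 0 and deltaC > 0:
+        # Upper-right quadrant: place the annotation to the lower left
+        xytext = (-80, -40)
+        ha = 'right'
+    elif deltaQ < 0 and deltaC < 0:
+        # Lower-left quadrant: place the annotation to the upper right
+        xytext = (80, 40)
+        ha = 'left'
+    else:
+        # Upper-left quadrant: place the annotation to the lower right
+        xytext = (80, -40)
+        ha = 'left'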
+    plt.annotate(
+        f"{intervention_name}\nΔC=${deltaC:.1f}\nΔQ={deltaQ:.3f}",
+        xy=(deltaQ, deltaC),
+        xytext=xytext,
+        textcoords="offset points",
+        fontsize=11,
+        fontweight='bold',
+        ha=ha,
+        bbox=dict(boxstyle='round,pad=0.5', facecolor='yellow', alpha=0.7, edgecolor='black'),
+        arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=.2",
+                        lw=2, color='black')
+    )
+    plt.grid(alpha=0.3, linestyle=':', linewidth=0.5)
+    plt.legend(fontsize=11, loc='upper left', framealpha=0.9)
+    # Adjust margins
+    plt.tight_layout()
+    # Save figure as base64 format
+    buf = io.BytesIO()
+    plt.savefig(buf, format='png', dpi=150, bbox_inches='tight')  # Higher DPI
+    plt.close()
+    buf.seek(0)
+    img_str = base64.b64encode(buf.read()).decode('utf-8')
+    return f"data:image/png;base64,{img_str}"
+def generate_psa_samples(costA, qalyA, costB, qalyB, n_samples=1000, cv=0.2):
+    """Generate probabilistic sensitivity analysis samples"""
+    samples = []
+    # Assume costs and QALYs follow lognormal distribution
+    for _ in range(n_samples):
+        c_a = np.random.lognormal(np.log(costA), cv)
+        q_a = np.random.lognormal(np.log(qalyA), cv)
+        c_b = np.random.lognormal(np.log(costB), cv)
+        q_b = np.random.lognormal(np.log(qalyB), cv)
+        # Calculate increments
+        delta_c = c_b - c_a
+        delta_q = q_b - q_a
+        # Handle division by zero
+        if abs(delta_q) < 1e-9:
+            continue
+        # Calculate ICER
+        icer_val = delta_c / delta_q
+        samples.append((delta_q, delta_c, icer_val))
+    return samples
+def plot_ceac(costA, qalyA, costB, qalyB, intervention_name="Intervention", n_samples=1000):
+    # Generate PSA samples
+    samples = generate_psa_samples(costA, qalyA, costB, qalyB, n_samples)
+    # Set up WTP threshold range
+    wtp_range = np.linspace(0, 100000, 100)
+    prob_B_ce = []
+    # Calculate probability of B being cost-effective at each WTP threshold
+    for wtp in wtp_range:
+        count_ce = 0
+        for delta_q, delta_c, _ in samples:
+            # Condition for B being more cost-effective than A:
+            # Either (saves money and improves health) or (incremental cost/effect < WTP)
+            if (delta_c < 0 and delta_q > 0) or (delta_q > 0 and (delta_c / delta_q) < wtp):
+                count_ce += 1
+        prob_B_ce.append(count_ce / len(samples))
+    # Plot CEAC curve
+    plt.figure(figsize=(8, 6))
+    plt.plot(wtp_range, prob_B_ce, 'b-', linewidth=2)
+    plt.axhline(0.5, color='gray', linestyle='--', alpha=0.5)
+    plt.grid(alpha=0.3)
+    plt.xlabel("Willingness-to-Pay Threshold ($/QALY)")
+    plt.ylabel("Probability of Intervention Being Cost-Effective")
+    plt.title(f"{intervention_name} Cost-Effectiveness Acceptability Curve (CEAC)")
+    plt.ylim(0, 1)
+    # Save figure as base64 format
+    buf = io.BytesIO()
+    plt.savefig(buf, format='png', dpi=100)
+    plt.close()
+    buf.seek(0)
+    img_str = base64.b64encode(buf.read()).decode('utf-8')
+    return f"data:image/png;base64,{img_str}", wtp_range, prob_B_ce
+def plot_state_distribution(trace, states, title="Health State Distribution"):
+    plt.figure(figsize=(8, 6))
+    for i, state in enumerate(states):
+        plt.plot(range(len(trace)), trace[:, i], label=state)
+    plt.xlabel("Year")
+    plt.ylabel("Proportion")
+    plt.title(title)
+    plt.grid(alpha=0.3)
+    plt.legend()
+    # Save figure as base64 format
+    buf = io.BytesIO()
+    plt.savefig(buf, format='png', dpi=100)
+    plt.close()
+    buf.seek(0)
+    img_str = base64.b64encode(buf.read()).decode('utf-8')
+    return f"data:image/png;base64,{img_str}"
+# -----------------------------------------------------------
+# 7) Wrapper: integrated function for general intervention analysis
+# -----------------------------------------------------------
+def person_P(sex: str, features: dict, alpha=1.0, dt=1.0):
+    """Get a person's annual transition probability matrix"""
+    l1, l2, l3, r = hazards_from_beta(sex, features,
+                                      lam10_0 * alpha, lam20_0 * alpha, lam30_0 * alpha, rho0_0)
+    Q = Q_matrix(l1, l2, l3, r)
+    P = discrete_P(Q, dt)
+    return P, [l1, l2, l3, r]
+def run_analysis(sex: str, features: dict, intervention_feature: str,
+                 C_A, C_B, U, cycles=10, discount_rate=0.03, target_s2_5y=0.20):
+    """Single intervention effect analysis"""
+    # Define state names
+    states = ["Normal", "Prehypertension", "Stage 1", "Stage 2"]
+    # Create copies of baseline and intervention feature dictionaries
+    features_base = features.copy()
+    features_int = features.copy()
+    # Set baseline and intervention values based on intervention type
+    if intervention_feature == "Exercise_freq":
+        # Exercise intervention: 0 -> 1 (increase exercise)
+        features_base[intervention_feature] = 0
+        features_int[intervention_feature] = 1
+        intervention_name = "Increase Exercise"
+    elif intervention_feature in ["BMI_ge25", "Waist_ge90", "Waist_ge80",
+                                  "Fasting_glu_high", "TC_ge200", "UA_high",
+                                  "Smoking_current", "Betel_current", "Alcohol_current"]:
+        # These interventions go from 1->0 (reduce risk factor)
+        features_base[intervention_feature] = 1
+        features_int[intervention_feature] = 0
+        if intervention_feature == "BMI_ge25":
+            intervention_name = "Reduce BMI to <25"
+        elif intervention_feature in ["Waist_ge90", "Waist_ge80"]:
+            intervention_name = "Reduce Waist Circumference"
+        elif intervention_feature == "Fasting_glu_high":
+            intervention_name = "Lower Fasting Glucose"
+        elif intervention_feature == "TC_ge200":
+            intervention_name = "Lower Cholesterol"
+        elif intervention_feature == "UA_high":
+            intervention_name = "Lower Uric Acid"
+        elif intervention_feature == "Smoking_current":
+            intervention_name = "Quit Smoking"
+        elif intervention_feature == "Betel_current":
+            intervention_name = "Quit Betel Nut"
+        elif intervention_feature == "Alcohol_current":
+            intervention_name = "Quit Drinking"
+    elif intervention_feature == "Education_high":
+        # Education intervention: 0 -> 1 (increase education level)
+        features_base[intervention_feature] = 0
+        features_int[intervention_feature] = 1
+        intervention_name = "Improve Education"
+    else:
+        # Other cases, default baseline=0, intervention=1
+        features_base[intervention_feature] = 0
+        features_int[intervention_feature] = 1
+        intervention_name = f"Intervention ({intervention_feature})"
+    # Print parameter information
+    print(f"Sex: {'Male' if sex.upper().startswith('M') else 'Female'}")
+    print(f"Intervention: {intervention_name}")
+    print(f"Baseline parameter: {features_base[intervention_feature]}")
+    print(f"Post-intervention parameter: {features_int[intervention_feature]}")
+    # Set reference features for calibration
+    ref_features = {k: 0 for k in features.keys()}
+    # Calibration
+    alpha = calibrate_scale(
+        sex=sex, features_ref=ref_features,
+        lam10=lam10_0, lam20=lam20_0, lam30=lam30_0, rho0=rho0_0,
+        target_s2_5y=target_s2_5y
+    )
+    # Get transition matrices
+    P_A, lamA = person_P(sex, features_base, alpha=alpha)
+    P_B, lamB = person_P(sex, features_int, alpha=alpha)
+    # Starting distribution
+    start_dist = np.array([1, 0, 0, 0], float)  # Start in Normal state
+    # Run Markov model
+    cost_A, qaly_A, trace_A = run_markov(P_A, C_A, U, start_dist, cycles, discount_rate)
+    cost_B, qaly_B, trace_B = run_markov(P_B, C_B, U, start_dist, cycles, discount_rate)
+    # Calculate ICER
+    ICER, dC, dQ = icer(cost_A, qaly_A, cost_B, qaly_B)
+    # Generate charts
+    ce_plane_img = plot_ce_plane(dQ, dC, ICER, intervention_name)
+    ceac_img, wtp_range, prob_B_ce = plot_ceac(cost_A, qaly_A, cost_B, qaly_B, intervention_name)
+    stateA_img = plot_state_distribution(trace_A, states, "No Intervention")
+    stateB_img = plot_state_distribution(trace_B, states, intervention_name)
+    # Prepare results
+    results = {
+        "intervention": intervention_name,
+        "feature_name": intervention_feature,
+        "feature_base_value": features_base[intervention_feature],
+        "feature_int_value": features_int[intervention_feature],
+        "hazards_A": dict(zip(TRANSITION_NAMES_EN, np.round(lamA, 4))),
+        "hazards_B": dict(zip(TRANSITION_NAMES_EN, np.round(lamB, 4))),
+        "transition_matrix_A": pd.DataFrame(P_A, index=states, columns=states).round(4).to_dict(),
+        "transition_matrix_B": pd.DataFrame(P_B, index=states, columns=states).round(4).to_dict(),
+        "cost_A": cost_A,
+        "cost_B": cost_B,
+        "qaly_A": qaly_A,
+        "qaly_B": qaly_B,
+        "delta_cost": dC,
+        "delta_qaly": dQ,
+        "ICER": ICER,
+        "CE_plane_img": ce_plane_img,
+        "CEAC_img": ceac_img,
+        "stateA_img": stateA_img,
+        "stateB_img": stateB_img,
+        "wtp_values": wtp_range.tolist(),
+        "probability_cost_effective": prob_B_ce
+    }
+    # Display main results
+    print(f"\n--- {intervention_name} Cost-Effectiveness Analysis Results ---")
+    print(
+        f"{cycles}-year total cost: Baseline=${results['cost_A']:.1f}, Intervention=${results['cost_B']:.1f}, ΔC=${results['delta_cost']:.1f}")
+    print(
+        f"{cycles}-year total QALY: Baseline={results['qaly_A']:.3f}, Intervention={results['qaly_B']:.3f}, ΔQ={results['delta_qaly']:.3f}")
+    print(f"ICER = ${results['ICER']:.1f}/QALY")
+    # Intervention effect explanation
+    if dQ > 0:
+        if dC <= 0:
+            print("Conclusion: This intervention both saves money and improves health (Dominant)")
+        elif ICER < 50000:
+            print("Conclusion: This intervention is cost-effective (ICER < $50,000/QALY)")
+        else:
+            print("Conclusion: This intervention is not cost-effective")
+    else:
+        if dC >= 0:
+            print("Conclusion: This intervention both costs more and worsens health (Dominated)")
+        else:
+            print("Conclusion: This intervention saves money but worsens health")
+    return results
+# -----------------------------------------------------------
+# 8) Standard simulation main function - test different interventions
+# -----------------------------------------------------------
+def main_simulation():
+    # Standard male features
+    male_features = {
+        "Education_high": 0,
+        "BMI_ge25": 1,
+        "Waist_ge90": 1,
+        "Fasting_glu_high": 0,
+        "TC_ge200": 0,
+        "UA_high": 1,
+        "Smoking_current": 1,
+        "Betel_current": 0,
+        "Alcohol_current": 1,
+        "Exercise_freq": 0,
+        "FHx_yes": 1
+    }
+    # Standard female features
+    female_features = {
+        "Education_high": 0,
+        "BMI_ge25": 1,
+        "Waist_ge80": 1,
+        "Fasting_glu_high": 0,
+        "TC_ge200": 0,
+        "UA_high": 1,
+        "Smoking_current": 0,
+        "Betel_current": 0,
+        "Alcohol_current": 0,
+        "Exercise_freq": 0,
+        "FHx_yes": 1
+    }
+    # Cost and utility
+    C_A = np.array([200, 600, 1200, 2200])  # No intervention
+    U = np.array([1.00, 0.90, 0.70, 0.50])  # Utilities for each state
+    # Test weight loss intervention (BMI_ge25)
+    print("\n" + "=" * 50)
+    print("Testing Weight Loss Intervention (BMI_ge25: 1->0)")
+    print("=" * 50)
+    # Male weight loss
+    C_B_bmi = np.array([300, 650, 1250, 2250])  # Weight loss increases cost
+    results_bmi_m = run_analysis(
+        sex="M",
+        features=male_features,
+        intervention_feature="BMI_ge25",  # Reduce BMI to <25
+        C_A=C_A,
+        C_B=C_B_bmi,
+        U=U,
+        cycles=10,
+        discount_rate=0.03,
+        target_s2_5y=0.2365
+    )
+    # Female weight loss
+    results_bmi_f = run_analysis(
+        sex="F",
+        features=female_features,
+        intervention_feature="BMI_ge25",  # Reduce BMI to <25
+        C_A=C_A,
+        C_B=C_B_bmi,
+        U=U,
+        cycles=10,
+        discount_rate=0.03,
+        target_s2_5y=0.2365
+    )
+    # Test exercise intervention (Exercise_freq)
+    print("\n" + "=" * 50)
+    print("Testing Exercise Intervention (Exercise_freq: 0->1)")
+    print("=" * 50)
+    # Male exercise
+    C_B_exercise = np.array([250, 600, 1200, 2200])  # Exercise increases cost slightly
+    results_exercise_m = run_analysis(
+        sex="M",
+        features=male_features,
+        intervention_feature="Exercise_freq",  # Increase exercise frequency
+        C_A=C_A,
+        C_B=C_B_exercise,
+        U=U,
+        cycles=10,
+        discount_rate=0.03,
+        target_s2_5y=0.2365
+    )
+    # Female exercise
+    results_exercise_f = run_analysis(
+        sex="F",
+        features=female_features,
+        intervention_feature="Exercise_freq",  # Increase exercise frequency
+        C_A=C_A,
+        C_B=C_B_exercise,
+        U=U,
+        cycles=10,
+        discount_rate=0.03,
+        target_s2_5y=0.2365
+    )
+    # Test smoking cessation intervention (Smoking_current)
+    print("\n" + "=" * 50)
+    print("Testing Smoking Cessation Intervention (Smoking_current: 1->0)")
+    print("=" * 50)
+    # Male smoking cessation
+    C_B_smoking = np.array([220, 600, 1200, 2200])  # Smoking cessation increases cost slightly
+    results_smoking_m = run_analysis(
+        sex="M",
+        features=male_features,
+        intervention_feature="Smoking_current",  # Quit smoking
+        C_A=C_A,
+        C_B=C_B_smoking,
+        U=U,
+        cycles=10,
+        discount_rate=0.03,
+        target_s2_5y=0.2365
+    )
+    return {
+        "BMI_male": results_bmi_m,
+        "BMI_female": results_bmi_f,
+        "Exercise_male": results_exercise_m,
+        "Exercise_female": results_exercise_f,
+        "Smoking_male": results_smoking_m
+    }
+# Main program
+if __name__ == "__main__":
+    print("=" * 50)
+    print("Hypertension Disease Progression and Intervention Effects Model")
+    print("=" * 50)
+    # Run main simulation
+    results = main_simulation()
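
The test calls above pass per-state cost vectors, a utility vector, `cycles=10`, and `discount_rate=0.03` to `run_analysis`. The sketch below shows one common way such inputs turn into discounted totals and an ICER. It is not the implementation in `hypertension_model_fixed2.py`: the helper names `discounted_totals` and `icer_sketch`, the end-of-cycle discounting convention, and both transition matrices are assumptions made for illustration. The cost and utility values mirror the defaults that appear elsewhere in this commit.

```python
import numpy as np


def discounted_totals(P, costs, utilities, cycles=10, discount_rate=0.03):
    """Total discounted cost and QALYs for a cohort that starts in state 0.

    Assumes rewards are collected at the end of each annual cycle.
    """
    state = np.zeros(len(costs))
    state[0] = 1.0  # everyone starts in the first (Normal) state
    total_cost = 0.0
    total_qaly = 0.0
    for t in range(cycles):
        state = state @ P  # advance the cohort by one cycle
        factor = 1.0 / (1.0 + discount_rate) ** (t + 1)
        total_cost += factor * float(state @ costs)
        total_qaly += factor * float(state @ utilities)
    return total_cost, total_qaly


def icer_sketch(cost_a, qaly_a, cost_b, qaly_b):
    """Incremental cost per QALY gained; infinite when there is no QALY difference."""
    delta_qaly = qaly_b - qaly_a
    if np.isclose(delta_qaly, 0.0):
        return float("inf")
    return (cost_b - cost_a) / delta_qaly


# Hypothetical annual transition matrices (each row sums to 1).
P_no_intervention = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.05, 0.83, 0.12, 0.00],
    [0.00, 0.00, 0.88, 0.12],
    [0.00, 0.00, 0.00, 1.00],
])
P_intervention = np.array([
    [0.93, 0.07, 0.00, 0.00],
    [0.05, 0.86, 0.09, 0.00],
    [0.00, 0.00, 0.91, 0.09],
    [0.00, 0.00, 0.00, 1.00],
])

costs_a = np.array([200.0, 600.0, 1200.0, 2200.0])  # no intervention
costs_b = np.array([250.0, 600.0, 1200.0, 2200.0])  # intervention adds cost in the Normal state
utilities = np.array([1.0, 0.9, 0.7, 0.5])

cost_a, qaly_a = discounted_totals(P_no_intervention, costs_a, utilities)
cost_b, qaly_b = discounted_totals(P_intervention, costs_b, utilities)
print(f"ICER: {icer_sketch(cost_a, qaly_a, cost_b, qaly_b):.0f} per QALY gained")
```

A negative ICER with a positive QALY gain would mean the intervention is both cheaper and more effective under these assumed inputs.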

personalized_ht4.py ADDED

	@@ -0,0 +1,1440 @@

+"""
+Streamlit App for Hypertension Cost-Effectiveness Analysis with LangChain
+Enhanced with AI Assistant and Personalized Chat using OpenAI
+Added: CSV upload and patient ID retrieval feature
+"""
+import streamlit as st
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+from typing import Dict, List, Any
+# LangChain imports
+from langchain_openai import ChatOpenAI, OpenAIEmbeddings
+from langchain_community.vectorstores import Chroma
+from langchain_core.documents import Document
+# Import the hypertension model functions
+from hypertension_model_fixed2 import (
+    run_analysis, beta_men, beta_women,
+    hazards_from_beta, Q_matrix, discrete_P,
+    run_markov, icer
+)
+# Set page configuration
+st.set_page_config(
+    page_title="Hypertension CEA Tool with AI",
+    page_icon="❤️",
+    layout="wide",
+    initial_sidebar_state="expanded",
+)
+# Initialize session state for chat histories
+if 'assistant_messages' not in st.session_state:
+    st.session_state.assistant_messages = []
+if 'recommendation_messages' not in st.session_state:
+    st.session_state.recommendation_messages = []
+if 'summary_generated' not in st.session_state:
+    st.session_state.summary_generated = False
+if 'patients_df' not in st.session_state:
+    st.session_state.patients_df = None
+if 'vectorstore' not in st.session_state:
+    st.session_state.vectorstore = None
+# Header with OpenAI API Key input
+col1, col2 = st.columns([3, 1])
+with col1:
+    st.title("❤️ Hypertension Personalized Cost-effectiveness Analysis")
+    st.markdown("*AI-powered cost-effectiveness analysis tool with LangChain*")
+with col2:
+    openai_api_key = st.text_input(
+        "🔑 OpenAI API Key",
+        type="password",
+        placeholder="sk-...",
+        help="Enter your OpenAI API key to enable AI features"
+    )
+    if openai_api_key:
+        st.success("✓ API Key set")
+    else:
+        st.warning("⚠️ Enter API key")
+# Build the LangChain chat model (returns None when no API key is set)
+def get_llm():
+    """Initialize LangChain LLM with OpenAI"""
+    if not openai_api_key:
+        return None
+    try:
+        llm = ChatOpenAI(
+            model="gpt-4o-mini",
+            temperature=0.7,
+            openai_api_key=openai_api_key
+        )
+        return llm
+    except Exception as e:
+        st.error(f"Error initializing OpenAI: {str(e)}")
+        return None
+# Create vector store from patient data
+def create_patient_vectorstore(patients_df: pd.DataFrame):
+    """Create vector store from patient dataframe for RAG retrieval"""
+    if not openai_api_key:
+        return None
+    try:
+        documents = []
+        for idx, row in patients_df.iterrows():
+            patient_text = f"""Patient ID: {row['patient_id']}
+Sex: {row['sex']}, Age: {row['age']}, Education: {row['education']}
+BMI: {row['bmi']} kg/m², Waist: {row['waist']} cm
+Fasting Glucose: {row['fasting_glucose']} mg/dL
+Total Cholesterol: {row['total_cholesterol']} mg/dL
+Uric Acid: {row['uric_acid']} mg/dL
+Smoking: {row['smoking']}, Alcohol: {row['alcohol']}, Exercise: {row['exercise']}
+Betel: {row.get('betel', 'No')}, Family History: {row['family_history']}"""
+            doc = Document(
+                page_content=patient_text,
+                metadata={"patient_id": row['patient_id']}
+            )
+            documents.append(doc)
+        embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
+        vectorstore = Chroma.from_documents(documents=documents, embedding=embeddings)
+        return vectorstore
+    except Exception as e:
+        st.error(f"Error creating vector store: {str(e)}")
+        return None
+# Retrieve patient by ID
+def retrieve_patient_by_id(patient_id: str):
+    """Retrieve patient from dataframe by ID"""
+    if st.session_state.patients_df is None:
+        return None
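+    # Compare as strings so numeric IDs read from CSV/Excel still match the text input
+    patient_row = st.session_state.patients_df[
+        st.session_state.patients_df['patient_id'].astype(str).str.strip() == str(patient_id).strip()
+    ]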
+    if patient_row.empty:
+        return None
+    return patient_row.iloc[0].to_dict()
+# Sidebar for patient information
+st.sidebar.header("👤 Patient Information")
+# ===== NEW: Patient Data Upload Section =====
+with st.sidebar.expander("📁 Upload Patient Database (Optional)", expanded=False):
+    uploaded_file = st.file_uploader(
+        "Upload CSV/Excel with patient data",
+        type=['csv', 'xlsx'],
+        help="Upload a file with multiple patients to enable quick retrieval by ID"
+    )
+    if uploaded_file is not None:
+        try:
+            if uploaded_file.name.endswith('.csv'):
+                df = pd.read_csv(uploaded_file)
+            else:
+                df = pd.read_excel(uploaded_file)
+            st.session_state.patients_df = df
+            st.success(f"✅ Loaded {len(df)} patients")
+            # Optionally create vector store
+            if openai_api_key and st.button("🔄 Create Vector Store for Smart Search"):
+                with st.spinner("Creating vector store..."):
+                    vectorstore = create_patient_vectorstore(df)
+                    if vectorstore:
+                        st.session_state.vectorstore = vectorstore
+                        st.success("✅ Vector store created!")
+        except Exception as e:
+            st.error(f"Error loading file: {str(e)}")
+    # Patient ID retrieval
+    if st.session_state.patients_df is not None:
+        st.markdown("---")
+        patient_id_input = st.text_input("🔍 Enter Patient ID", placeholder="P001")
+        if st.button("📥 Load Patient Data"):
+            if patient_id_input:
+                patient_data = retrieve_patient_by_id(patient_id_input)
+                if patient_data:
+                    st.session_state.loaded_patient = patient_data
+                    st.success(f"✅ Loaded {patient_id_input}")
+                    st.rerun()
+                else:
+                    st.error(f"Patient {patient_id_input} not found")
+st.sidebar.markdown("---")
+# Check if we have loaded patient data
+if 'loaded_patient' in st.session_state:
+    # Use loaded patient data
+    p = st.session_state.loaded_patient
+    st.sidebar.info(f"📋 Loaded: {p['patient_id']}")
+    sex = st.sidebar.radio("Biological Sex", ["Male", "Female"],
+                           index=0 if p['sex'] == 'Male' else 1, key="sex_loaded")
+    st.sidebar.markdown("### Demographics")
+    age = st.sidebar.number_input("Age", min_value=18, max_value=100, value=int(p['age']), key="age_loaded")
+    education_idx = 0 if p['education'] in ['High', 'High (College or above)'] else 1
+    education = st.sidebar.selectbox("Education Level",
+                                     ["High (College or above)", "Low (Below college)"],
+                                     index=education_idx, key="edu_loaded")
+    st.sidebar.markdown("### Anthropometrics")
+    bmi = st.sidebar.number_input("BMI (kg/m²)", min_value=15.0, max_value=50.0,
+                                  value=float(p['bmi']), format="%.1f", key="bmi_loaded")
+    waist = st.sidebar.number_input("Waist Circumference (cm)", min_value=50, max_value=150,
+                                    value=int(p['waist']), key="waist_loaded")
+    st.sidebar.markdown("### Laboratory Values")
+    fasting_glucose = st.sidebar.number_input("Fasting Glucose (mg/dL)", min_value=70, max_value=300,
+                                              value=int(p['fasting_glucose']), key="glucose_loaded")
+    total_cholesterol = st.sidebar.number_input("Total Cholesterol (mg/dL)", min_value=100, max_value=400,
+                                                value=int(p['total_cholesterol']), key="chol_loaded")
+    uric_acid = st.sidebar.number_input("Uric Acid (mg/dL)", min_value=2.0, max_value=15.0,
+                                        value=float(p['uric_acid']), format="%.1f", key="ua_loaded")
+    st.sidebar.markdown("### Lifestyle Factors")
+    smoking_idx = 0 if p['smoking'] in ['No', 'Non-smoker'] else 1
+    smoking = st.sidebar.selectbox("Smoking Status", ["Non-smoker", "Current smoker"],
+                                   index=smoking_idx, key="smoke_loaded")
+    alcohol_idx = 0 if p['alcohol'] in ['No', 'None/Occasional'] else 1
+    alcohol = st.sidebar.selectbox("Alcohol Consumption", ["None/Occasional", "Regular drinker"],
+                                   index=alcohol_idx, key="alcohol_loaded")
+    exercise_idx = 0 if p['exercise'] in ['No', 'Infrequent'] else 1
+    exercise = st.sidebar.selectbox("Exercise Frequency", ["Infrequent", "Regular (≥3 times/week)"],
+                                    index=exercise_idx, key="exercise_loaded")
+    if sex == "Male":
+        betel_idx = 0 if p.get('betel', 'No') == 'No' else 1
+        betel = st.sidebar.selectbox("Betel Nut Chewing", ["No", "Yes"],
+                                     index=betel_idx, key="betel_loaded")
+    family_idx = 0 if p['family_history'] == 'No' else 1
+    family_history = st.sidebar.selectbox("Family History of Hypertension", ["No", "Yes"],
+                                          index=family_idx, key="fh_loaded")
+else:
+    # Manual input (original functionality)
+    sex = st.sidebar.radio("Biological Sex", ["Male", "Female"])
+    st.sidebar.markdown("### Demographics")
+    age = st.sidebar.number_input("Age", min_value=18, max_value=100, value=45)
+    education = st.sidebar.selectbox("Education Level", ["High (College or above)", "Low (Below college)"])
+    st.sidebar.markdown("### Anthropometrics")
+    bmi = st.sidebar.number_input("BMI (kg/m²)", min_value=15.0, max_value=50.0, value=27.0, format="%.1f")
+    if sex == "Male":
+        waist = st.sidebar.number_input("Waist Circumference (cm)", min_value=60, max_value=150, value=88)
+    else:
+        waist = st.sidebar.number_input("Waist Circumference (cm)", min_value=50, max_value=150, value=78)
+    st.sidebar.markdown("### Laboratory Values")
+    fasting_glucose = st.sidebar.number_input("Fasting Glucose (mg/dL)", min_value=70, max_value=300, value=100)
+    total_cholesterol = st.sidebar.number_input("Total Cholesterol (mg/dL)", min_value=100, max_value=400, value=190)
+    if sex == "Male":
+        uric_acid = st.sidebar.number_input("Uric Acid (mg/dL)", min_value=2.0, max_value=15.0, value=6.5,
+                                            format="%.1f")
+    else:
+        uric_acid = st.sidebar.number_input("Uric Acid (mg/dL)", min_value=2.0, max_value=15.0, value=5.5,
+                                            format="%.1f")
+    st.sidebar.markdown("### Lifestyle Factors")
+    smoking = st.sidebar.selectbox("Smoking Status", ["Non-smoker", "Current smoker"])
+    alcohol = st.sidebar.selectbox("Alcohol Consumption", ["None/Occasional", "Regular drinker"])
+    exercise = st.sidebar.selectbox("Exercise Frequency", ["Infrequent", "Regular (≥3 times/week)"])
+    if sex == "Male":
+        betel = st.sidebar.selectbox("Betel Nut Chewing", ["No", "Yes"])
+    family_history = st.sidebar.selectbox("Family History of Hypertension", ["No", "Yes"])
+# Convert inputs to feature dictionary
+def create_feature_dict():
+    features = {}
+    if sex == "Male":
+        features["Education_high"] = 1 if education == "High (College or above)" else 0
+        features["BMI_ge25"] = 1 if bmi >= 25 else 0
+        features["Waist_ge90"] = 1 if waist >= 90 else 0
+        features["Fasting_glu_high"] = 1 if fasting_glucose >= 110 else 0
+        features["TC_ge200"] = 1 if total_cholesterol >= 200 else 0
+        features["UA_high"] = 1 if uric_acid >= 7 else 0
+        features["Smoking_current"] = 1 if smoking == "Current smoker" else 0
+        features["Betel_current"] = 1 if betel == "Yes" else 0
+        features["Alcohol_current"] = 1 if alcohol == "Regular drinker" else 0
+        features["Exercise_freq"] = 1 if exercise == "Regular (≥3 times/week)" else 0
+        features["FHx_yes"] = 1 if family_history == "Yes" else 0
+    else:  # Female
+        features["Education_high"] = 1 if education == "High (College or above)" else 0
+        features["BMI_ge25"] = 1 if bmi >= 25 else 0
+        features["Waist_ge80"] = 1 if waist >= 80 else 0
+        features["Fasting_glu_high"] = 1 if fasting_glucose >= 110 else 0
+        features["TC_ge200"] = 1 if total_cholesterol >= 200 else 0
+        features["UA_high"] = 1 if uric_acid >= 6 else 0
+        features["Smoking_current"] = 1 if smoking == "Current smoker" else 0
+        features["Alcohol_current"] = 1 if alcohol == "Regular drinker" else 0
+        features["Exercise_freq"] = 1 if exercise == "Regular (≥3 times/week)" else 0
+        features["FHx_yes"] = 1 if family_history == "Yes" else 0
+    return features
+# Get patient features and info string
+patient_features = create_feature_dict()
+def get_patient_info_string():
+    """Generate patient info string for AI context"""
+    risk_factors = []
+    if patient_features.get("BMI_ge25", 0) == 1:
+        risk_factors.append(f"BMI {bmi:.1f} kg/m² (≥25)")
+    if (sex == "Male" and patient_features.get("Waist_ge90", 0) == 1) or \
+            (sex == "Female" and patient_features.get("Waist_ge80", 0) == 1):
+        risk_factors.append(f"Waist circumference {waist} cm (high)")
+    if patient_features.get("Smoking_current", 0) == 1:
+        risk_factors.append("Current smoker")
+    if patient_features.get("Alcohol_current", 0) == 1:
+        risk_factors.append("Regular alcohol consumption")
+    if sex == "Male" and patient_features.get("Betel_current", 0) == 1:
+        risk_factors.append("Betel nut chewing")
+    if patient_features.get("Exercise_freq", 0) == 0:
+        risk_factors.append("Insufficient exercise")
+    if patient_features.get("FHx_yes", 0) == 1:
+        risk_factors.append("Family history of hypertension")
+    if patient_features.get("UA_high", 0) == 1:
+        risk_factors.append(f"High uric acid ({uric_acid:.1f} mg/dL)")
+    if patient_features.get("Fasting_glu_high", 0) == 1:
+        risk_factors.append(f"High fasting glucose ({fasting_glucose} mg/dL)")
+    if patient_features.get("TC_ge200", 0) == 1:
+        risk_factors.append(f"High total cholesterol ({total_cholesterol} mg/dL)")
+    patient_id_str = ""
+    if 'loaded_patient' in st.session_state:
+        patient_id_str = f"Patient ID: {st.session_state.loaded_patient['patient_id']}\n"
+    info = f"""{patient_id_str}{sex}, Age {age}
+BMI: {bmi:.1f} kg/m²
+Waist: {waist} cm
+Exercise: {exercise}
+Smoking: {smoking}
+Alcohol: {alcohol}
+Risk Factors Identified:
+"""
+    if risk_factors:
+        info += "\n".join(f"- {rf}" for rf in risk_factors)
+    else:
+        info += "- No major modifiable risk factors detected"
+    return info
+# Create tabs
+tab1, tab2, tab3, tab4, tab5, tab6 = st.tabs([
+    "🤖 AI Assistant",
+    "📊 Hypertension Progression",
+    "📈 Intervention Comparison",
+    "💰 CEA (Deterministic)",
+    "🎲 CEA (Probabilistic)",
+    "💬 Personalized Chat"
+])
+# Tab 1: AI Assistant
+with tab1:
+    st.subheader("AI Assistant - Ask Questions About This Tool")
+    if not openai_api_key:
+        st.warning("⚠️ Please enter your OpenAI API key in the top right corner to use the AI Assistant.")
+    else:
+        # Display chat messages
+        for message in st.session_state.assistant_messages:
+            with st.chat_message(message["role"]):
+                st.markdown(message["content"])
+        # Chat input
+        if prompt := st.chat_input("Ask me anything about hypertension analysis..."):
+            # Add user message
+            st.session_state.assistant_messages.append({"role": "user", "content": prompt})
+            with st.chat_message("user"):
+                st.markdown(prompt)
+            # Get AI response
+            with st.chat_message("assistant"):
+                with st.spinner("Thinking..."):
+                    try:
+                        llm = get_llm()
+                        if llm:
+                            history_text = ""
+                            # Skip the prompt appended above; it is passed separately as "User Question"
+                            for msg in st.session_state.assistant_messages[-11:-1]:
+                                role = "User" if msg["role"] == "user" else "Assistant"
+                                history_text += f"{role}: {msg['content']}\n\n"
+                            full_prompt = f"""You are an expert AI assistant for a Hypertension Cost-Effectiveness Analysis tool.
+Your role is to help users understand:
+- How to use this analysis tool
+- Markov model methodology (4 states: Normal → Prehypertension → Stage 1 HTN → Stage 2 HTN)
+- Cost-effectiveness metrics (ICER, QALY, CEAC)
+- Risk factor interpretation
+- Available interventions
+Be clear, concise, and educational. Use examples when helpful.
+IMPORTANT:
+- Use plain text formatting only (no LaTeX, no \\text{{}} or \\frac{{}}{{}} syntax)
+- Write mathematical formulas in plain text like: ICER = (Cost_B - Cost_A) / (QALY_B - QALY_A)
+- Use simple markdown formatting (**, -, numbers) for emphasis
+- Avoid special characters that may not render correctly
+Conversation History:
+{history_text}
+User Question: {prompt}
+Your Response:"""
+                            response = llm.invoke(full_prompt).content
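+                            # Convert stray LaTeX to plain text without stripping unrelated braces.
+                            # \text{...} is unwrapped first so that \frac{\text{a}}{b} is handled too.
+                            import re  # local import keeps this cleanup self-contained
+                            response = re.sub(r"\\text\{([^{}]*)\}", r"\1", response)
+                            response = re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"(\1)/(\2)", response)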
+                            st.markdown(response, unsafe_allow_html=False)
+                            st.session_state.assistant_messages.append({"role": "assistant", "content": response})
+                        else:
+                            st.error("Failed to initialize AI. Please check your API key.")
+                    except Exception as e:
+                        st.error(f"Error: {str(e)}")
+# Tab 2: Hypertension Progression
+with tab2:
+    st.subheader("Baseline Hypertension Progression Risk")
+    # Create columns for different visualizations
+    prog_col1, prog_col2 = st.columns([1, 1])
+    # Prediction settings
+    with prog_col1:
+        st.write("Progression Projection Settings")
+        projection_years = st.slider("Projection Horizon (Years)", min_value=1, max_value=20, value=10)
+        # Calculate the transition rates
+        sex_code = "M" if sex == "Male" else "F"
+        l1, l2, l3, r = hazards_from_beta(sex_code, patient_features,
+                                          lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05, randomize=True)
+        # Display the annual transition rates
+        st.write("Annual Transition Rates:")
+        rates_df = pd.DataFrame({
+            "Transition": ["Normal → Prehypertension", "Prehypertension → Stage 1",
+                           "Stage 1 → Stage 2", "Prehypertension → Normal"],
+            "Annual Rate (%)": [l1 * 100, l2 * 100, l3 * 100, r * 100]
+        })
+        st.dataframe(rates_df)
+        # Build the transition-rate matrix Q and the one-year transition matrix P
+        Q = Q_matrix(l1, l2, l3, r)
+        P = discrete_P(Q, 1.0)
+        # Project the state distribution
+        states = ["Normal", "Prehypertension", "Stage 1", "Stage 2"]
+        s = np.array([1, 0, 0, 0], float)  # Start in Normal state
+        # Project over time
+        projections = [s.copy()]
+        for _ in range(projection_years):
+            s = s @ P
+            projections.append(s.copy())
+        proj_df = pd.DataFrame(projections, columns=states)
+        proj_df.index.name = "Year"
+        # Stage 2 probability at the end of the projection horizon (reported below as "lifetime" risk)
+        lifetime_risk_s2 = proj_df["Stage 2"].iloc[-1] * 100
+    # Visualization of progression over time
+    with prog_col2:
+        # Plot state distribution over time
+        fig, ax = plt.subplots(figsize=(8, 5))
+        for state in states:
+            ax.plot(range(projection_years + 1), proj_df[state], label=state)
+        ax.set_xlabel("Year")
+        ax.set_ylabel("Proportion")
+        ax.set_title("Projected Hypertension State Distribution Over Time")
+        ax.legend()
+        ax.grid(alpha=0.3)
+        st.pyplot(fig)
+    # Summary metrics
+    st.subheader("Summary Risk Metrics")
+    risk_cols = st.columns(5)
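+    # 5-year risks (index clamped to the horizon to avoid an IndexError when fewer than 5 years are projected)
+    year_5_idx = min(5, projection_years)
+    risk_5yr_p = projections[year_5_idx][1]
+    risk_5yr_htn = projections[year_5_idx][2] + projections[year_5_idx][3]
+    risk_5yr_s2 = projections[year_5_idx][3]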
+    # 10-year risks
+    year_10_idx = min(10, projection_years)
+    risk_10yr_p = projections[year_10_idx][1]
+    risk_10yr_htn = projections[year_10_idx][2] + projections[year_10_idx][3]
+    risk_10yr_s2 = projections[year_10_idx][3]
+    # Display risk metrics
+    with risk_cols[0]:
+        st.metric("5-Year Prehypertension Risk", f"{risk_5yr_p * 100:.1f}%")
+    with risk_cols[1]:
+        st.metric("5-Year Any Hypertension Risk", f"{risk_5yr_htn * 100:.1f}%")
+    with risk_cols[2]:
+        st.metric("5-Year Stage 2 Risk", f"{risk_5yr_s2 * 100:.1f}%")
+    with risk_cols[3]:
+        st.metric("10-Year Any Hypertension Risk", f"{risk_10yr_htn * 100:.1f}%")
+    with risk_cols[4]:
+        st.metric("Lifetime Stage 2 Risk", f"{lifetime_risk_s2:.1f}%")
+    # Comparison to population averages
+    st.write("#### Risk Comparison to Population Average")
+    avg_5yr_htn = 0.08
+    avg_10yr_htn = 0.18
+    risk_ratio_5yr = risk_5yr_htn / avg_5yr_htn if avg_5yr_htn > 0 else 1.0
+    risk_ratio_10yr = risk_10yr_htn / avg_10yr_htn if avg_10yr_htn > 0 else 1.0
+    st.write(f"This patient's 5-year risk of hypertension is **{risk_ratio_5yr:.1f}x** the population average.")
+    st.write(f"This patient's 10-year risk of hypertension is **{risk_ratio_10yr:.1f}x** the population average.")
+    # Risk factors explanation
+    st.write("#### Key Risk Factors")
+    # Identify high risk factors
+    risk_factors = []
+    if patient_features.get("BMI_ge25", 0) == 1:
+        risk_factors.append(f"BMI ≥ 25 kg/m² (current: {bmi:.1f})")
+    if (sex == "Male" and patient_features.get("Waist_ge90", 0) == 1) or (
+            sex == "Female" and patient_features.get("Waist_ge80", 0) == 1):
+        risk_factors.append(f"High waist circumference (current: {waist} cm)")
+    if patient_features.get("Smoking_current", 0) == 1:
+        risk_factors.append("Current smoker")
+    if patient_features.get("Alcohol_current", 0) == 1:
+        risk_factors.append("Regular alcohol consumption")
+    if sex == "Male" and patient_features.get("Betel_current", 0) == 1:
+        risk_factors.append("Betel nut chewing")
+    if patient_features.get("Exercise_freq", 0) == 0:
+        risk_factors.append("Infrequent exercise")
+    if patient_features.get("FHx_yes", 0) == 1:
+        risk_factors.append("Family history of hypertension")
+    if patient_features.get("UA_high", 0) == 1:
+        risk_factors.append(f"High uric acid (current: {uric_acid:.1f} mg/dL)")
+    if patient_features.get("Fasting_glu_high", 0) == 1:
+        risk_factors.append(f"High fasting glucose (current: {fasting_glucose} mg/dL)")
+    if patient_features.get("TC_ge200", 0) == 1:
+        risk_factors.append(f"High total cholesterol (current: {total_cholesterol} mg/dL)")
+    # Display risk factors
+    if risk_factors:
+        st.write("This patient has the following risk factors:")
+        for factor in risk_factors:
+            st.write(f"- {factor}")
+    else:
+        st.write("This patient has no major modifiable risk factors.")
+# Tab 3: Intervention Comparison
+with tab3:
+    st.subheader("Compare Intervention Effects")
+    # Choose interventions to compare
+    available_interventions = []
+    if patient_features.get("BMI_ge25", 0) == 1:
+        available_interventions.append(("BMI_ge25", "Weight Loss (BMI < 25 kg/m²)"))
+    if (sex == "Male" and patient_features.get("Waist_ge90", 0) == 1) or (
+            sex == "Female" and patient_features.get("Waist_ge80", 0) == 1):
+        waist_feature = "Waist_ge90" if sex == "Male" else "Waist_ge80"
+        available_interventions.append((waist_feature, "Waist Circumference Reduction"))
+    if patient_features.get("Smoking_current", 0) == 1:
+        available_interventions.append(("Smoking_current", "Smoking Cessation"))
+    if patient_features.get("Alcohol_current", 0) == 1:
+        available_interventions.append(("Alcohol_current", "Alcohol Reduction"))
+    if sex == "Male" and patient_features.get("Betel_current", 0) == 1:
+        available_interventions.append(("Betel_current", "Betel Nut Cessation"))
+    if patient_features.get("Exercise_freq", 0) == 0:
+        available_interventions.append(("Exercise_freq", "Regular Exercise"))
+    if patient_features.get("UA_high", 0) == 1:
+        available_interventions.append(("UA_high", "Uric Acid Reduction"))
+    if patient_features.get("TC_ge200", 0) == 1:
+        available_interventions.append(("TC_ge200", "Cholesterol Reduction"))
+    if patient_features.get("Fasting_glu_high", 0) == 1:
+        available_interventions.append(("Fasting_glu_high", "Glucose Control"))
+    if not available_interventions:
+        st.write("No modifiable risk factors available for intervention.")
+    else:
+        selected_intervention_names = st.multiselect(
+            "Select interventions to compare:",
+            [name for _, name in available_interventions],
+            max_selections=3
+        )
+        selected_interventions = [
+            feature for feature, name in available_interventions
+            if name in selected_intervention_names
+        ]
+        comp_years = st.slider("Comparison Projection (Years)", min_value=1, max_value=20, value=10, key="comp_years")
+        if selected_interventions:
+            sex_code = "M" if sex == "Male" else "F"
+            l1_base, l2_base, l3_base, r_base = hazards_from_beta(
+                sex_code, patient_features,
+                lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
+            )
+            Q_base = Q_matrix(l1_base, l2_base, l3_base, r_base)
+            P_base = discrete_P(Q_base, 1.0)
+            states = ["Normal", "Prehypertension", "Stage 1", "Stage 2"]
+            s_base = np.array([1, 0, 0, 0], float)
+            proj_base = [s_base.copy()]
+            for _ in range(comp_years):
+                s_base = s_base @ P_base
+                proj_base.append(s_base.copy())
+            proj_base_df = pd.DataFrame(proj_base, columns=states)
+            proj_base_df.index.name = "Year"
+            intervention_data = []
+            for feature in selected_interventions:
+                int_features = patient_features.copy()
+                if feature == "Exercise_freq":
+                    int_features[feature] = 1
+                else:
+                    int_features[feature] = 0
+                l1_int, l2_int, l3_int, r_int = hazards_from_beta(
+                    sex_code, int_features,
+                    lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
+                )
+                Q_int = Q_matrix(l1_int, l2_int, l3_int, r_int)
+                P_int = discrete_P(Q_int, 1.0)
+                s_int = np.array([1, 0, 0, 0], float)
+                proj_int = [s_int.copy()]
+                for _ in range(comp_years):
+                    s_int = s_int @ P_int
+                    proj_int.append(s_int.copy())
+                proj_int_df = pd.DataFrame(proj_int, columns=states)
+                baseline_htn_risk = proj_base_df["Stage 1"].iloc[-1] + proj_base_df["Stage 2"].iloc[-1]
+                int_htn_risk = proj_int_df["Stage 1"].iloc[-1] + proj_int_df["Stage 2"].iloc[-1]
+                absolute_risk_reduction = baseline_htn_risk - int_htn_risk
+                relative_risk_reduction = absolute_risk_reduction / baseline_htn_risk if baseline_htn_risk > 0 else 0
+                nnt = 1 / absolute_risk_reduction if absolute_risk_reduction > 0 else float('inf')
+                int_name = next(name for feat, name in available_interventions if feat == feature)
+                intervention_data.append({
+                    "feature": feature,
+                    "name": int_name,
+                    "projection": proj_int_df,
+                    "risk_reduction_abs": absolute_risk_reduction,
+                    "risk_reduction_rel": relative_risk_reduction,
+                    "nnt": nnt
+                })
+            st.write("#### Hypertension Risk Comparison")
+            fig, ax = plt.subplots(figsize=(10, 6))
+            baseline_htn_risk = [
+                proj_base_df["Stage 1"].iloc[i] + proj_base_df["Stage 2"].iloc[i]
+                for i in range(len(proj_base_df))
+            ]
+            ax.plot(range(comp_years + 1), baseline_htn_risk, 'k-', linewidth=2, label="No Intervention")
+            colors = ['b', 'g', 'r', 'c', 'm', 'y']
+            for i, int_data in enumerate(intervention_data):
+                int_htn_risk = [
+                    int_data["projection"]["Stage 1"].iloc[j] + int_data["projection"]["Stage 2"].iloc[j]
+                    for j in range(len(int_data["projection"]))
+                ]
+                ax.plot(
+                    range(comp_years + 1),
+                    int_htn_risk,
+                    f"{colors[i % len(colors)]}-",
+                    linewidth=2,
+                    label=int_data["name"]
+                )
+            ax.set_xlabel("Year")
+            ax.set_ylabel("Probability of Hypertension (Stage 1 or 2)")
+            ax.set_title(f"Effect of Interventions on {comp_years}-Year Hypertension Risk")
+            ax.legend()
+            ax.grid(alpha=0.3)
+            st.pyplot(fig)
+            st.write("#### Effectiveness Comparison")
+            metrics_data = {
+                "Intervention": ["No Intervention"] + [int_data["name"] for int_data in intervention_data],
+                f"{comp_years}-Year HTN Risk": [
+                                                   baseline_htn_risk[-1] * 100
+                                               ] + [
+                                                   (baseline_htn_risk[-1] - int_data["risk_reduction_abs"]) * 100
+                                                   for int_data in intervention_data
+                                               ],
+                "Absolute Risk Reduction (%)": [
+                                                   0
+                                               ] + [
+                                                   int_data["risk_reduction_abs"] * 100
+                                                   for int_data in intervention_data
+                                               ],
+                "Relative Risk Reduction (%)": [
+                                                   0
+                                               ] + [
+                                                   int_data["risk_reduction_rel"] * 100
+                                                   for int_data in intervention_data
+                                               ],
+                "Number Needed to Treat": [
+                                              "N/A"
+                                          ] + [
+                                              f"{int_data['nnt']:.1f}" if int_data["nnt"] < 100 else "100+"
+                                              for int_data in intervention_data
+                                          ]
+            }
+            metrics_df = pd.DataFrame(metrics_data)
+            st.table(metrics_df.set_index("Intervention"))
+            if intervention_data:
+                most_effective = max(intervention_data, key=lambda x: x["risk_reduction_abs"])
+                st.info(
+                    f"**Recommendation**: Based on this analysis, "
+                    f"**{most_effective['name']}** provides the greatest reduction in "
+                    f"{comp_years}-year hypertension risk "
+                    f"({most_effective['risk_reduction_abs'] * 100:.1f}% absolute reduction)."
+                )
+        else:
+            st.write("Please select at least one intervention to compare.")
+# Tab 4: Cost-Effectiveness Analysis (Deterministic)
+with tab4:
+    st.subheader("Cost-Effectiveness Analysis (Deterministic)")
+    st.info("📌 This analysis uses **point estimates** (single values) for all parameters")
+    st.write("### Analysis Settings")
+    if available_interventions:
+        cea_intervention = st.selectbox(
+            "Select intervention to analyze:",
+            [name for _, name in available_interventions],
+            index=0
+        )
+        cea_feature = next(
+            feature for feature, name in available_interventions
+            if name == cea_intervention
+        )
+        param_col1, param_col2, param_col3 = st.columns(3)
+        with param_col1:
+            cea_cycles = st.slider("Time Horizon (Years)", min_value=5, max_value=30, value=10)
+            discount_rate = st.slider("Discount Rate (%)", min_value=0, max_value=10, value=3) / 100
+        with param_col2:
+            st.write("#### Cost Settings ($ per year)")
+            cost_normal = st.number_input("Cost - Normal BP", min_value=0, max_value=5000, value=200)
+            cost_pre = st.number_input("Cost - Prehypertension", min_value=0, max_value=5000, value=600)
+            cost_s1 = st.number_input("Cost - Stage 1 HTN", min_value=0, max_value=5000, value=1200)
+            cost_s2 = st.number_input("Cost - Stage 2 HTN", min_value=0, max_value=5000, value=2200)
+        with param_col3:
+            st.write("#### Utility Settings (QOL 0-1)")
+            util_normal = st.slider("Utility - Normal BP", min_value=0.0, max_value=1.0, value=1.0, step=0.05)
+            util_pre = st.slider("Utility - Prehypertension", min_value=0.0, max_value=1.0, value=0.9, step=0.05)
+            util_s1 = st.slider("Utility - Stage 1 HTN", min_value=0.0, max_value=1.0, value=0.7, step=0.05)
+            util_s2 = st.slider("Utility - Stage 2 HTN", min_value=0.0, max_value=1.0, value=0.5, step=0.05)
+        st.write("#### Intervention Settings")
+        int_cost_increase = st.number_input(
+            "Additional Intervention Cost ($/year)",
+            min_value=0,
+            max_value=2000,
+            value=500
+        )
+        st.markdown("---")
+        C_A = np.array([cost_normal, cost_pre, cost_s1, cost_s2])
+        C_B = C_A.copy()
+        C_B[0] += int_cost_increase
+        U = np.array([util_normal, util_pre, util_s1, util_s2])
+        sex_code = "M" if sex == "Male" else "F"
+        try:
+            # Calculate baseline scenario
+            l1_base, l2_base, l3_base, r_base = hazards_from_beta(
+                sex_code, patient_features,
+                lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
+            )
+            Q_base = Q_matrix(l1_base, l2_base, l3_base, r_base)
+            P_base = discrete_P(Q_base, 1.0)
+            start_dist = np.array([1, 0, 0, 0], float)
+            cost_A, qaly_A, _ = run_markov(P_base, C_A, U, start_dist, cea_cycles, discount_rate)
+            # Calculate intervention scenario
+            int_features = patient_features.copy()
+            if cea_feature == "Exercise_freq":
+                int_features[cea_feature] = 1
+            else:
+                int_features[cea_feature] = 0
+            l1_int, l2_int, l3_int, r_int = hazards_from_beta(
+                sex_code, int_features,
+                lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05
+            )
+            Q_int = Q_matrix(l1_int, l2_int, l3_int, r_int)
+            P_int = discrete_P(Q_int, 1.0)
+            cost_B, qaly_B, _ = run_markov(P_int, C_B, U, start_dist, cea_cycles, discount_rate)
+            # Calculate incremental values
+            cost_diff = cost_B - cost_A
+            qaly_diff = qaly_B - qaly_A
+            icer_val = icer(cost_A, qaly_A, cost_B, qaly_B)
+            # ✅ Added: cost-effectiveness plane (single point)
+            st.write("### Cost-Effectiveness Plane")
+            fig, ax = plt.subplots(figsize=(10, 8))
+            # Plot the single point
+            ax.scatter(qaly_diff, cost_diff, s=300, c='red', marker='*',
+                       edgecolors='black', linewidths=2, label='Intervention vs Baseline', zorder=5)
+            # Add quadrant lines
+            ax.axhline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
+            ax.axvline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
+            # Add WTP threshold line
+            wtp_threshold = 50000
+            xlim = ax.get_xlim()
+            ylim = ax.get_ylim()
+            # Extend the line across the plot
+            x_range = np.linspace(min(xlim[0], -0.01), max(xlim[1], 0.01), 100)
+            ax.plot(x_range, x_range * wtp_threshold, 'k--', alpha=0.5,
+                    linewidth=2, label=f'WTP ${wtp_threshold:,}/QALY')
+            # Add quadrant labels
+            ax.text(0.95, 0.95, 'More Costly\nMore Effective',
+                    transform=ax.transAxes, ha='right', va='top', fontsize=10, alpha=0.5)
+            ax.text(0.05, 0.95, 'More Costly\nLess Effective',
+                    transform=ax.transAxes, ha='left', va='top', fontsize=10, alpha=0.5)
+            ax.text(0.95, 0.05, 'Less Costly\nMore Effective',
+                    transform=ax.transAxes, ha='right', va='bottom', fontsize=10, alpha=0.5)
+            ax.text(0.05, 0.05, 'Less Costly\nLess Effective',
+                    transform=ax.transAxes, ha='left', va='bottom', fontsize=10, alpha=0.5)
+            ax.set_xlabel("Incremental QALYs", fontsize=12, fontweight='bold')
+            ax.set_ylabel("Incremental Cost ($)", fontsize=12, fontweight='bold')
+            ax.set_title("Cost-Effectiveness Plane (Deterministic Analysis)",
+                         fontsize=14, fontweight='bold')
+            ax.legend(fontsize=11)
+            ax.grid(alpha=0.3)
+            st.pyplot(fig)
+            st.write("### Summary Metrics")
+            metrics_cols = st.columns(3)
+            with metrics_cols[0]:
+                st.metric("Incremental Cost", f"${cost_diff:.2f}")
+            with metrics_cols[1]:
+                st.metric("Incremental QALYs", f"{qaly_diff:.3f}")
+            with metrics_cols[2]:
+                st.metric("ICER ($/QALY)", f"${icer_val:.2f}")
+            # ✅ Changed: decision recommendation (no probabilities used)
+            st.write("### Cost-Effectiveness Decision")
+            wtp_threshold = 50000
+            if qaly_diff > 0:
+                if cost_diff <= 0:
+                    st.success("✅ **DOMINANT**: This intervention saves money and improves health.")
+                    st.info(
+                        f"💡 The intervention provides {qaly_diff:.3f} additional QALYs while saving ${-cost_diff:.2f}.")
+                elif icer_val < wtp_threshold:
+                    st.success(
+                        f"✅ **COST-EFFECTIVE**: ICER = ${icer_val:.2f}/QALY is below the ${wtp_threshold:,}/QALY threshold.")
+                    st.info(
+                        f"💡 For every additional QALY gained, it costs ${icer_val:.2f}, which is considered acceptable.")
+                else:
+                    st.warning(
+                        f"⚠️ **NOT COST-EFFECTIVE**: ICER = ${icer_val:.2f}/QALY exceeds the ${wtp_threshold:,}/QALY threshold.")
+                    st.info(
+                        f"💡 The incremental cost would need to be ${qaly_diff * wtp_threshold:.2f} or less for the intervention to be cost-effective at this threshold.")
+            else:
+                if cost_diff >= 0:
+                    st.error("❌ **DOMINATED**: This intervention costs more and worsens health outcomes.")
+                else:
+                    st.warning("⚠️ **TRADE-OFF**: This intervention saves money but reduces QALYs.")
+                    st.info(f"💡 It saves ${-cost_diff:.2f} but loses {-qaly_diff:.3f} QALYs.")
+            # ✅ Added: sensitivity analysis table
+            st.write("### Sensitivity to WTP Threshold")
+            st.caption(
+                "This shows whether the intervention would be considered cost-effective at different willingness-to-pay thresholds.")
+            wtp_thresholds = [25000, 50000, 75000, 100000, 150000]
+            decision_data = []
+            for wtp in wtp_thresholds:
+                if qaly_diff > 0:
+                    if cost_diff <= 0:
+                        decision = "✅ Dominant (Cost-Effective)"
+                    elif icer_val < wtp:
+                        decision = "✅ Cost-Effective"
+                    else:
+                        decision = "❌ Not Cost-Effective"
+                else:
+                    if cost_diff >= 0:
+                        decision = "❌ Dominated"
+                    else:
+                        decision = "⚠️ Saves Money, Loses QALYs"
+                decision_data.append({
+                    "WTP Threshold": f"${wtp:,}/QALY",
+                    "Decision": decision
+                })
+            decision_df = pd.DataFrame(decision_data)
+            st.table(decision_df.set_index("WTP Threshold"))
+        except Exception as e:
+            st.error(f"An error occurred while running the cost-effectiveness analysis: {str(e)}")
+            st.error("Try different parameters or a different intervention.")
+    else:
+        st.write("No modifiable risk factors available for intervention.")
+# Tab 5: Cost-Effectiveness Analysis (Probabilistic)
+with tab5:
+    st.subheader("Cost-Effectiveness Analysis (Probabilistic)")
+    st.info(
+        "📌 This analysis samples cost and utility parameters from **distributions** (defined by a mean and standard error) to account for uncertainty; transition hazards stay at their point estimates")
+    st.write("### Analysis Settings")
+    if available_interventions:
+        psa_intervention = st.selectbox(
+            "Select intervention to analyze:",
+            [name for _, name in available_interventions],
+            index=0,
+            key="psa_intervention"
+        )
+        psa_feature = next(
+            feature for feature, name in available_interventions
+            if name == psa_intervention
+        )
+        psa_col1, psa_col2, psa_col3 = st.columns(3)
+        with psa_col1:
+            st.write("#### Simulation Settings")
+            psa_cycles = st.slider("Time Horizon (Years)", min_value=5, max_value=30, value=10, key="psa_cycles")
+            psa_discount_rate = st.slider("Discount Rate (%)", min_value=0, max_value=10, value=3,
+                                          key="psa_discount") / 100
+            n_simulations = st.slider("Number of Simulations", min_value=100, max_value=10000, value=1000, step=100)
+        with psa_col2:
+            st.write("#### Cost Parameters (Mean ± SE)")
+            cost_normal_mean = st.number_input("Cost - Normal BP (Mean)", min_value=0, max_value=5000, value=200,
+                                               key="psa_cn_mean")
+            cost_normal_se = st.number_input("Cost - Normal BP (SE)", min_value=0, max_value=500, value=20,
+                                             key="psa_cn_se")
+            cost_pre_mean = st.number_input("Cost - Prehypertension (Mean)", min_value=0, max_value=5000, value=600,
+                                            key="psa_cp_mean")
+            cost_pre_se = st.number_input("Cost - Prehypertension (SE)", min_value=0, max_value=500, value=60,
+                                          key="psa_cp_se")
+            cost_s1_mean = st.number_input("Cost - Stage 1 HTN (Mean)", min_value=0, max_value=5000, value=1200,
+                                           key="psa_cs1_mean")
+            cost_s1_se = st.number_input("Cost - Stage 1 HTN (SE)", min_value=0, max_value=500, value=120,
+                                         key="psa_cs1_se")
+            cost_s2_mean = st.number_input("Cost - Stage 2 HTN (Mean)", min_value=0, max_value=5000, value=2200,
+                                           key="psa_cs2_mean")
+            cost_s2_se = st.number_input("Cost - Stage 2 HTN (SE)", min_value=0, max_value=500, value=220,
+                                         key="psa_cs2_se")
+        with psa_col3:
+            st.write("#### Utility Parameters (Mean ± SE)")
+            util_normal_mean = st.slider("Utility - Normal BP (Mean)", 0.0, 1.0, 1.0, 0.01, key="psa_un_mean")
+            util_normal_se = st.slider("Utility - Normal BP (SE)", 0.0, 0.1, 0.01, 0.001, key="psa_un_se")
+            util_pre_mean = st.slider("Utility - Prehypertension (Mean)", 0.0, 1.0, 0.9, 0.01, key="psa_up_mean")
+            util_pre_se = st.slider("Utility - Prehypertension (SE)", 0.0, 0.1, 0.02, 0.001, key="psa_up_se")
+            util_s1_mean = st.slider("Utility - Stage 1 HTN (Mean)", 0.0, 1.0, 0.7, 0.01, key="psa_us1_mean")
+            util_s1_se = st.slider("Utility - Stage 1 HTN (SE)", 0.0, 0.1, 0.03, 0.001, key="psa_us1_se")
+            util_s2_mean = st.slider("Utility - Stage 2 HTN (Mean)", 0.0, 1.0, 0.5, 0.01, key="psa_us2_mean")
+            util_s2_se = st.slider("Utility - Stage 2 HTN (SE)", 0.0, 0.1, 0.05, 0.001, key="psa_us2_se")
+        st.write("#### Intervention Settings")
+        psa_int_cost_mean = st.number_input("Additional Intervention Cost (Mean)", 0, 2000, 500, key="psa_int_cost_mean")
+        psa_int_cost_se = st.number_input("Additional Intervention Cost (SE)", 0, 200, 50, key="psa_int_cost_se")
+        st.markdown("---")
+        if st.button("🎲 Run Probabilistic Analysis", type="primary"):
+            with st.spinner(f"Running {n_simulations} Monte Carlo simulations..."):
+                try:
+                    # Storage for simulation results
+                    results_cost_A = []
+                    results_cost_B = []
+                    results_qaly_A = []
+                    results_qaly_B = []
+                    results_icer = []
+                    results_delta_cost = []
+                    results_delta_qaly = []
+                    progress_bar = st.progress(0)
+                    for sim in range(n_simulations):
+                        # Sample each parameter from a distribution matched to its mean and SE:
+                        #   costs     -> gamma (non-negative)
+                        #   utilities -> beta (bounded 0-1), with a clipped normal as fallback
+                        # Sample costs (gamma, method-of-moments fit)
+                        def sample_cost(mean, se):
+                            # A zero mean or zero SE leaves nothing to sample (and would divide by zero)
+                            if se == 0 or mean == 0:
+                                return mean
+                            shape = (mean / se) ** 2
+                            scale = se ** 2 / mean
+                            return np.random.gamma(shape, scale)
+                        # Sample utilities (using beta approximation)
+                        def sample_utility(mean, se):
+                            if se == 0 or mean == 0 or mean == 1:
+                                return np.clip(mean, 0, 1)
+                            # Beta distribution parameters
+                            alpha = mean * ((mean * (1 - mean) / (se ** 2)) - 1)
+                            beta = (1 - mean) * ((mean * (1 - mean) / (se ** 2)) - 1)
+                            if alpha > 0 and beta > 0:
+                                return np.random.beta(alpha, beta)
+                            else:
+                                return np.clip(np.random.normal(mean, se), 0, 1)
+                        # Sample parameters for this iteration
+                        C_A_sim = np.array([
+                            sample_cost(cost_normal_mean, cost_normal_se),
+                            sample_cost(cost_pre_mean, cost_pre_se),
+                            sample_cost(cost_s1_mean, cost_s1_se),
+                            sample_cost(cost_s2_mean, cost_s2_se)
+                        ])
+                        int_cost_add = sample_cost(psa_int_cost_mean, psa_int_cost_se)
+                        C_B_sim = C_A_sim.copy()
+                        C_B_sim[0] += int_cost_add
+                        U_sim = np.array([
+                            sample_utility(util_normal_mean, util_normal_se),
+                            sample_utility(util_pre_mean, util_pre_se),
+                            sample_utility(util_s1_mean, util_s1_se),
+                            sample_utility(util_s2_mean, util_s2_se)
+                        ])
+                        # Run analysis with sampled parameters
+                        sex_code = "M" if sex == "Male" else "F"
+                        # Get transition matrices (using point estimates for transition probabilities)
+                        l1, l2, l3, r = hazards_from_beta(sex_code, patient_features,
+                                                          lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05)
+                        # Baseline
+                        Q_A = Q_matrix(l1, l2, l3, r)
+                        P_A = discrete_P(Q_A, 1.0)
+                        # Intervention
+                        int_features = patient_features.copy()
+                        if psa_feature == "Exercise_freq":
+                            int_features[psa_feature] = 1
+                        else:
+                            int_features[psa_feature] = 0
+                        l1_b, l2_b, l3_b, r_b = hazards_from_beta(sex_code, int_features,
+                                                                  lam10=0.08, lam20=0.10, lam30=0.12, rho0=0.05)
+                        Q_B = Q_matrix(l1_b, l2_b, l3_b, r_b)
+                        P_B = discrete_P(Q_B, 1.0)
+                        # Run Markov models
+                        start_dist = np.array([1, 0, 0, 0], float)
+                        cost_A_sim, qaly_A_sim, _ = run_markov(P_A, C_A_sim, U_sim, start_dist,
+                                                               psa_cycles, psa_discount_rate)
+                        cost_B_sim, qaly_B_sim, _ = run_markov(P_B, C_B_sim, U_sim, start_dist,
+                                                               psa_cycles, psa_discount_rate)
+                        # Calculate incremental values
+                        delta_cost = cost_B_sim - cost_A_sim
+                        delta_qaly = qaly_B_sim - qaly_A_sim
+                        # Calculate ICER
+                        if abs(delta_qaly) > 1e-9:
+                            icer_sim = delta_cost / delta_qaly
+                        else:
+                            icer_sim = np.inf if delta_cost > 0 else -np.inf
+                        # Store results
+                        results_cost_A.append(cost_A_sim)
+                        results_cost_B.append(cost_B_sim)
+                        results_qaly_A.append(qaly_A_sim)
+                        results_qaly_B.append(qaly_B_sim)
+                        results_delta_cost.append(delta_cost)
+                        results_delta_qaly.append(delta_qaly)
+                        results_icer.append(icer_sim)
+                        # Update progress
+                        progress_bar.progress((sim + 1) / n_simulations)
+                    progress_bar.empty()
+                    # Convert to arrays
+                    results_delta_cost = np.array(results_delta_cost)
+                    results_delta_qaly = np.array(results_delta_qaly)
+                    results_icer = np.array(results_icer)
+                    # Filter out infinite ICERs for display
+                    results_icer_finite = results_icer[np.isfinite(results_icer)]
+                    # Display results
+                    st.success(f"✅ Completed {n_simulations} simulations!")
+                    st.write("### Probabilistic Results Summary")
+                    summary_cols = st.columns(4)
+                    with summary_cols[0]:
+                        st.metric("Mean ΔCost", f"${np.mean(results_delta_cost):.2f}")
+                        st.caption(
+                            f"95% CI: [{np.percentile(results_delta_cost, 2.5):.2f}, {np.percentile(results_delta_cost, 97.5):.2f}]")
+                    with summary_cols[1]:
+                        st.metric("Mean ΔQALY", f"{np.mean(results_delta_qaly):.4f}")
+                        st.caption(
+                            f"95% CI: [{np.percentile(results_delta_qaly, 2.5):.4f}, {np.percentile(results_delta_qaly, 97.5):.4f}]")
+                    with summary_cols[2]:
+                        st.metric("Mean ICER", f"${np.mean(results_icer_finite):.2f}/QALY")
+                        st.caption(
+                            f"95% CI: [{np.percentile(results_icer_finite, 2.5):.2f}, {np.percentile(results_icer_finite, 97.5):.2f}]")
+                    with summary_cols[3]:
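+                        # Probability cost-effective at the $50k threshold: QALYs are gained and
+                        # the incremental cost stays below WTP x incremental QALYs (same rule as the CEAC).
+                        # A raw ICER comparison would wrongly count dominated draws (negative ICER).
+                        wtp_threshold = 50000
+                        prob_ce = np.mean((results_delta_qaly > 0)
+                                          & (results_delta_cost < wtp_threshold * results_delta_qaly))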
+                        st.metric("Prob. Cost-Effective", f"{prob_ce * 100:.1f}%")
+                        st.caption(f"at ${wtp_threshold:,}/QALY")
+                    # Cost-Effectiveness Plane
+                    st.write("### Cost-Effectiveness Plane (Scatter Plot)")
+                    fig, ax = plt.subplots(figsize=(10, 8))
+                    # Plot scatter points
+                    ax.scatter(results_delta_qaly, results_delta_cost, alpha=0.3, s=20, c='blue')
+                    # Plot mean point
+                    ax.scatter(np.mean(results_delta_qaly), np.mean(results_delta_cost),
+                               color='red', s=200, marker='*', edgecolors='black', linewidths=2,
+                               label='Mean', zorder=5)
+                    # Add quadrant lines
+                    ax.axhline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
+                    ax.axvline(0, color='gray', linestyle='--', alpha=0.5, linewidth=1)
+                    # Add WTP threshold line
+                    xlim = ax.get_xlim()
+                    x_range = np.linspace(xlim[0], xlim[1], 100)
+                    ax.plot(x_range, x_range * wtp_threshold, 'k--', alpha=0.5,
+                            linewidth=2, label=f'WTP ${wtp_threshold:,}/QALY')
+                    ax.set_xlabel("Incremental QALYs", fontsize=12, fontweight='bold')
+                    ax.set_ylabel("Incremental Cost ($)", fontsize=12, fontweight='bold')
+                    ax.set_title("Cost-Effectiveness Plane (Probabilistic Sensitivity Analysis)",
+                                 fontsize=14, fontweight='bold')
+                    ax.legend(fontsize=11)
+                    ax.grid(alpha=0.3)
+                    st.pyplot(fig)
+                    # CEAC Curve
+                    st.write("### Cost-Effectiveness Acceptability Curve (CEAC)")
+                    wtp_range = np.linspace(0, 150000, 100)
+                    prob_ce_array = []
+                    for wtp in wtp_range:
+                        # A draw counts as cost-effective when it gains QALYs and its incremental
+                        # cost is below WTP x incremental QALYs (this includes dominant draws).
+                        # Draws that save money but lose QALYs are not counted.
+                        is_ce = (results_delta_qaly > 0) & (results_delta_cost < wtp * results_delta_qaly)
+                        prob_ce_array.append(is_ce.mean())
+                    fig2, ax2 = plt.subplots(figsize=(10, 6))
+                    ax2.plot(wtp_range, prob_ce_array, 'b-', linewidth=2)
+                    ax2.axhline(0.5, color='gray', linestyle='--', alpha=0.5)
+                    ax2.axvline(50000, color='red', linestyle='--', alpha=0.5, label='$50,000/QALY')
+                    ax2.set_xlabel("Willingness-to-Pay Threshold ($/QALY)", fontsize=12, fontweight='bold')
+                    ax2.set_ylabel("Probability Cost-Effective", fontsize=12, fontweight='bold')
+                    ax2.set_title("Cost-Effectiveness Acceptability Curve", fontsize=14, fontweight='bold')
+                    ax2.set_ylim(0, 1)
+                    ax2.grid(alpha=0.3)
+                    ax2.legend()
+                    st.pyplot(fig2)
+                    # Distribution histograms
+                    st.write("### Distribution of Results")
+                    hist_col1, hist_col2 = st.columns(2)
+                    with hist_col1:
+                        fig3, ax3 = plt.subplots(figsize=(8, 5))
+                        ax3.hist(results_delta_cost, bins=50, alpha=0.7, color='blue', edgecolor='black')
+                        ax3.axvline(np.mean(results_delta_cost), color='red', linestyle='--',
+                                    linewidth=2, label=f'Mean: ${np.mean(results_delta_cost):.2f}')
+                        ax3.set_xlabel("Incremental Cost ($)")
+                        ax3.set_ylabel("Frequency")
+                        ax3.set_title("Distribution of Incremental Costs")
+                        ax3.legend()
+                        ax3.grid(alpha=0.3)
+                        st.pyplot(fig3)
+                    with hist_col2:
+                        fig4, ax4 = plt.subplots(figsize=(8, 5))
+                        ax4.hist(results_delta_qaly, bins=50, alpha=0.7, color='green', edgecolor='black')
+                        ax4.axvline(np.mean(results_delta_qaly), color='red', linestyle='--',
+                                    linewidth=2, label=f'Mean: {np.mean(results_delta_qaly):.4f}')
+                        ax4.set_xlabel("Incremental QALYs")
+                        ax4.set_ylabel("Frequency")
+                        ax4.set_title("Distribution of Incremental QALYs")
+                        ax4.legend()
+                        ax4.grid(alpha=0.3)
+                        st.pyplot(fig4)
+                    # ICER distribution
+                    st.write("### ICER Distribution")
+                    fig5, ax5 = plt.subplots(figsize=(10, 5))
+                    ax5.hist(results_icer_finite, bins=50, alpha=0.7, color='purple', edgecolor='black')
+                    ax5.axvline(np.mean(results_icer_finite), color='red', linestyle='--',
+                                linewidth=2, label=f'Mean: ${np.mean(results_icer_finite):.2f}/QALY')
+                    ax5.axvline(50000, color='orange', linestyle='--', linewidth=2,
+                                label='$50,000/QALY threshold')
+                    ax5.set_xlabel("ICER ($/QALY)")
+                    ax5.set_ylabel("Frequency")
+                    ax5.set_title("Distribution of ICER Values")
+                    ax5.legend()
+                    ax5.grid(alpha=0.3)
+                    st.pyplot(fig5)
+                except Exception as e:
+                    st.error(f"Error running probabilistic analysis: {str(e)}")
+                    import traceback
+                    st.code(traceback.format_exc())
+    else:
+        st.write("No modifiable risk factors available for intervention.")
+# Tab 6: Personalized Chat
+with tab6:
+    st.subheader("💬 Personalized Health Recommendations")
+    if not openai_api_key:
+        st.warning("⚠️ Please enter your OpenAI API key in the top right corner to use Personalized Chat.")
+    else:
+        if not st.session_state.summary_generated:
+            if st.button("✨ Generate Personalized Health Summary", type="primary"):
+                with st.spinner("Analyzing your health profile..."):
+                    try:
+                        llm = get_llm()
+                        if llm:
+                            patient_info = get_patient_info_string()
+                            summary_prompt = f"""Generate a comprehensive health assessment summary for this patient:
+{patient_info}
+Include:
+1. Overall risk level assessment
+2. Key modifiable risk factors
+3. Top 3 priority recommendations
+4. Expected health benefits of interventions
+IMPORTANT:
+- Use plain text formatting only (no LaTeX, no \\text{{}} or \\frac{{}}{{}} syntax)
+- Write any formulas in plain text
+- Use simple markdown formatting (**, -, numbers) for emphasis
+- Avoid special characters that may not render correctly
+Format the response with clear sections and bullet points."""
+                            response = llm.invoke(summary_prompt).content
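+                            # Strip stray LaTeX: unwrap \text{...}, then rewrite \frac{a}{b} as (a)/(b)
+                            import re
+                            response = re.sub(r"\\text\{([^{}]*)\}", r"\1", response)
+                            response = re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"(\1)/(\2)", response)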
+                            st.session_state.recommendation_messages.append({
+                                "role": "assistant",
+                                "content": response
+                            })
+                            st.session_state.summary_generated = True
+                            st.rerun()
+                    except Exception as e:
+                        st.error(f"Error generating summary: {str(e)}")
+        for message in st.session_state.recommendation_messages:
+            with st.chat_message(message["role"]):
+                st.markdown(message["content"])
+        if st.session_state.summary_generated:
+            if prompt := st.chat_input("Ask about your personalized recommendations..."):
+                st.session_state.recommendation_messages.append({"role": "user", "content": prompt})
+                with st.chat_message("user"):
+                    st.markdown(prompt)
+                with st.chat_message("assistant"):
+                    with st.spinner("Thinking..."):
+                        try:
+                            llm = get_llm()
+                            if llm:
+                                patient_info = get_patient_info_string()
+                                history_text = ""
+                                # Last 10 prior messages; skip the just-appended question, which is passed separately below
+                                for msg in st.session_state.recommendation_messages[-11:-1]:
+                                    role = "Patient" if msg["role"] == "user" else "Health Coach"
+                                    history_text += f"{role}: {msg['content']}\n\n"
+                                full_prompt = f"""You are a personalized health coach specializing in hypertension management.
+PATIENT PROFILE:
+{patient_info}
+Provide evidence-based, actionable recommendations for:
+- Weight management and DASH diet
+- Exercise prescriptions
+- Smoking cessation strategies
+- Medication adherence
+- Lifestyle modifications
+- Stress management
+Be empathetic, practical, and motivating. Cite specific guidelines when relevant.
+IMPORTANT:
+- Use plain text formatting only (no LaTeX, no \\text{{}} or \\frac{{}}{{}} syntax)
+- Write formulas in plain text
+- Use simple markdown formatting (**, -, numbers) for emphasis
+- Avoid special characters that may not render correctly
+Conversation History:
+{history_text}
+Patient Question: {prompt}
+Your Personalized Advice:"""
+                                response = llm.invoke(full_prompt).content
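+                                # Strip stray LaTeX: unwrap \text{...}, then rewrite \frac{a}{b} as (a)/(b)
+                                import re
+                                response = re.sub(r"\\text\{([^{}]*)\}", r"\1", response)
+                                response = re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"(\1)/(\2)", response)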
+                                st.markdown(response, unsafe_allow_html=False)
+                                st.session_state.recommendation_messages.append({
+                                    "role": "assistant",
+                                    "content": response
+                                })
+                            else:
+                                st.error("Failed to initialize AI.")
+                        except Exception as e:
+                            st.error(f"Error: {str(e)}")
+# Footer
+st.markdown("---")
+st.markdown("""
+<div style='text-align: center; color: gray; font-size: 0.8em;'>
+    Powered by LangChain & OpenAI | Hypertension CEA Tool v2.0
+</div>
+""", unsafe_allow_html=True)

requirements.txt CHANGED Viewed

@@ -1,3 +1,9 @@
-altair
 pandas
-streamlit

+streamlit
+numpy
 pandas
+matplotlib
+scipy
+langchain-openai
+langchain-community
+langchain-core
+chromadb