techatcreated committed on
Commit 9aa4b61 · verified · 1 Parent(s): 67dcdf4

Upload app.py

Files changed (1)
  1. app.py +1983 -0
app.py ADDED
@@ -0,0 +1,1983 @@
+ """
+ SWAN Menopause Stage Prediction & Forecasting — Gradio UI
+ Hugging Face Spaces deployment-ready.
+
+ Run locally: python app.py
+ Deploy: Push to a HF Space with SDK=gradio
+
+ Output structure (per execution):
+     swan_ml_output/
+         <YYYYMMDD_HHMMSS>/
+             charts/       ← PNG visualizations
+             predictions/  ← CSV result files
+             reports/      ← TXT summary reports
+ """
+
+ import os
+ import json
+ import warnings
+ from datetime import datetime
+ from pathlib import Path
+ from typing import Optional
+
+ import numpy as np
+ import pandas as pd
+ import matplotlib
+ matplotlib.use("Agg")
+ import matplotlib.pyplot as plt
+
+ warnings.filterwarnings("ignore")
+
+ # ── Gradio ────────────────────────────────────────────────────────────────────
+ import gradio as gr
+
+ # ── Local ML module ───────────────────────────────────────────────────────────
+ try:
+     from menopause import (
+         MenopauseForecast,
+         SymptomCycleForecaster,
+         load_forecast_model,
+     )
+     _MODULE_AVAILABLE = True
+ except ImportError:
+     _MODULE_AVAILABLE = False
+
+ # ── Model loading ─────────────────────────────────────────────────────────────
+ FORECAST_DIR = os.environ.get("FORECAST_DIR", "swan_ml_output")
+ OUTPUT_BASE = Path(FORECAST_DIR)
+
+ # Quoted annotation: MenopauseForecast is undefined when the import above fails,
+ # and an unquoted module-level annotation would raise NameError at import time.
+ _forecast: Optional["MenopauseForecast"] = None  # type: ignore[type-arg]
+ _metadata: dict = {}
+
+
+ def _load_models():
+     """Attempt to load saved joblib pipelines. Returns (success, message)."""
+     global _forecast, _metadata
+
+     if not _MODULE_AVAILABLE:
+         return False, "menopause.py not found. Make sure it is in the same directory."
+
+     meta_path = Path(FORECAST_DIR) / "forecast_metadata.json"
+     rf_path = Path(FORECAST_DIR) / "rf_pipeline.pkl"
+     lr_path = Path(FORECAST_DIR) / "lr_pipeline.pkl"
+
+     if not all(p.exists() for p in (meta_path, rf_path, lr_path)):
+         return (
+             False,
+             f"Model artifacts not found in '{FORECAST_DIR}'. "
+             "Run `python menopause.py` to train and save the models first.",
+         )
+
+     try:
+         _forecast = load_forecast_model(FORECAST_DIR)
+         with open(meta_path, encoding="utf-8") as fh:
+             _metadata = json.load(fh)
+         return True, f"✅ Models loaded — {len(_metadata.get('feature_names', []))} features"
+     except Exception as exc:
+         return False, f"Error loading models: {exc}"
+
+
+ _MODEL_OK, _MODEL_MSG = _load_models()
+
+
+ # ── Output directory management ───────────────────────────────────────────────
+
+ def _make_run_dir() -> Path:
+     """
+     Create and return a timestamped run directory under OUTPUT_BASE
+     (swan_ml_output/ by default). Runs started within the same second
+     share one directory.
+     """
+     ts = datetime.now().strftime("%Y%m%d_%H%M%S")
+     run_dir = OUTPUT_BASE / ts
+     (run_dir / "charts").mkdir(parents=True, exist_ok=True)
+     (run_dir / "predictions").mkdir(parents=True, exist_ok=True)
+     (run_dir / "reports").mkdir(parents=True, exist_ok=True)
+     return run_dir
+
+
+ def _get_file_path(file_obj) -> Optional[str]:
+     """
+     Safely extract a file-system path from a Gradio file component value.
+
+     Gradio ≤ 3.x → returns a file-like object with a .name attribute.
+     Gradio 4.x → returns a str path (or NamedString subclass).
+     This helper handles both.
+     """
+     if file_obj is None:
+         return None
+     if hasattr(file_obj, "name"):
+         return file_obj.name
+     return str(file_obj)
+
+
+ # ── Constants & helpers ───────────────────────────────────────────────────────
+
+ STAGE_COLORS = {"pre": "#16a34a", "peri": "#d97706", "post": "#7c3aed"}
+ STAGE_EMOJI = {"pre": "🟢", "peri": "🟡", "post": "🟣"}
+ STAGE_LABELS = {
+     "pre": "Pre-Menopausal",
+     "peri": "Peri-Menopausal",
+     "post": "Post-Menopausal",
+ }
+
+ STAGE_INFO = {
+     "pre": {
+         "title": "Pre-Menopausal",
+         "description": "Regular menstrual cycles with typical hormonal fluctuations. Ovarian function is normal.",
+         "symptoms": ["Regular periods", "Normal hormone levels", "Potential mild PMS"],
+         "guidance": "Maintain regular check-ups. Track your cycle and note any changes.",
+     },
+     "peri": {
+         "title": "Peri-Menopausal (Transition)",
+         "description": "Hormonal changes begin — estrogen and progesterone levels fluctuate. Cycles become irregular.",
+         "symptoms": ["Irregular periods", "Hot flashes", "Sleep disturbances", "Mood changes", "Night sweats"],
+         "guidance": "Consult your healthcare provider. Lifestyle adjustments (diet, exercise, sleep) can help.",
+     },
+     "post": {
+         "title": "Post-Menopausal",
+         "description": "12+ months since last menstrual period. Estrogen remains at consistently lower levels.",
+         "symptoms": ["No periods", "Possible continued hot flashes", "Vaginal dryness", "Bone density changes"],
+         "guidance": "Focus on bone health, cardiovascular health, and regular screenings. Discuss HRT options.",
+     },
+ }
+
+ # Feature descriptions keyed by the model's canonical feature names
+ FEATURE_DESCRIPTIONS = {
+     "PAIN17": "Pain indicator (visit-specific)",
+     "PAINTW17": "Pain two-week indicator",
+     "PAIN27": "Secondary pain indicator",
+     "PAINTW27": "Secondary pain two-week indicator",
+     "SLEEP17": "Sleep disturbance pattern 1",
+     "SLEEP27": "Sleep disturbance pattern 2",
+     "BCOHOTH7": "Birth control — other method",
+     "EXERCIS7": "General exercise indicator",
+     "EXERHAR7": "Vigorous exercise",
+     "EXEROST7": "Osteoporosis exercise",
+     "EXERMEN7": "Exercise — mental health",
+     "EXERLOO7": "Exercise lookalike",
+     "EXERMEM7": "Exercise — memory",
+     "EXERPER7": "Exercise perception",
+     "EXERGEN7": "General exercise type",
+     "EXERWGH7": "Weight exercise",
+     "EXERADV7": "Exercise advice indicator",
+     "EXEROTH7": "Other exercise",
+     "EXERSPE7": "Specific exercise",
+     "ABBLEED7": "Abnormal bleeding (0=no, 1=yes)",  # ← correct feature name
+     "BLEEDNG7": "Bleeding pattern",
+     "LMPDAY7": "Last menstrual period day",
+     "DEPRESS7": "Depression indicator",
+     "SEX17": "Sexual activity indicator 1",
+     "SEX27": "Sexual activity indicator 2",
+     "SEX37": "Sexual activity indicator 3",
+     "SEX47": "Sexual activity indicator 4",
+     "SEX57": "Sexual activity indicator 5",
+     "SEX67": "Sexual activity indicator 6",
+     "SEX77": "Sexual activity indicator 7",
+     "SEX87": "Sexual activity indicator 8",
+     "SEX97": "Sexual activity indicator 9",
+     "SEX107": "Sexual activity indicator 10",
+     "SEX117": "Sexual activity indicator 11",
+     "SEX127": "Sexual activity indicator 12",
+     "SMOKERE7": "Smoking status",
+     "HOTFLAS7": "Hot flash severity (1=none, 5=very severe)",
+     "NUMHOTF7": "Number of hot flashes per week",
+     "BOTHOTF7": "How bothersome are hot flashes",
+     "IRRITAB7": "Irritability level",
+     "VAGINDR7": "Vaginal dryness",
+     "MOODCHG7": "Mood change frequency",
+     "SLEEPQL7": "Sleep quality score",
+     "PHYSILL7": "Physical illness indicators",
+     "HOTHEAD7": "Hot flashes with headache",
+     "EXER12H7": "Exercise in last 12 hours",
+     "ALCO24H7": "Alcohol in last 24h",
+     "AGE7": "Age (years)",
+     "RACE": "Race (1=White, 2=Black, 3=Chinese, 4=Japanese, 5=Hispanic)",
+     "LANGINT7": "Interview language indicator",
+ }
+
+
+ def _confidence_color(conf: float) -> str:
+     """Map a confidence in [0, 1] to a hex color: green at >= 0.8, amber at >= 0.6, else red."""
+     if conf >= 0.8:
+         return "#16a34a"
+     elif conf >= 0.6:
+         return "#d97706"
+     return "#dc2626"
+
+
+ # ── Chart builders ────────────────────────────────────────────────────────────
+
+ def _make_proba_chart(
+     probabilities: dict,
+     predicted_stage: str,
+     save_path: Optional[Path] = None,
+ ) -> plt.Figure:
+     """Horizontal bar chart for stage probabilities. Optionally saves PNG."""
+     fig, ax = plt.subplots(figsize=(6, 3.5))
+     fig.patch.set_facecolor("#1a1a2e")
+     ax.set_facecolor("#16213e")
+
+     stages = list(probabilities.keys())
+     probs = [probabilities[s] * 100 for s in stages]
+     colors = [STAGE_COLORS.get(s, "#607d8b") for s in stages]
+     edge_colors = ["white" if s == predicted_stage else "none" for s in stages]
+     lws = [2.5 if s == predicted_stage else 0 for s in stages]
+
+     bars = ax.barh(stages, probs, color=colors, edgecolor=edge_colors,
+                    linewidth=lws, height=0.5, zorder=3)
+
+     for bar, prob in zip(bars, probs):
+         ax.text(
+             min(prob + 1, 98), bar.get_y() + bar.get_height() / 2,
+             f"{prob:.1f}%",
+             va="center", ha="left", color="white", fontsize=11, fontweight="bold",
+         )
+
+     labels = [STAGE_LABELS.get(s, s) for s in stages]
+     ax.set_yticks(range(len(stages)))
+     ax.set_yticklabels(labels, color="white", fontsize=10)
+     ax.set_xlim(0, 105)
+     ax.tick_params(colors="white", labelsize=11)
+     ax.spines[["top", "right", "left", "bottom"]].set_visible(False)
+     ax.xaxis.set_visible(False)
+     for spine in ax.spines.values():
+         spine.set_color("#333")
+     ax.set_title("Stage Probabilities", color="white", fontsize=12,
+                  pad=10, fontweight="bold")
+     ax.grid(axis="x", color="#333", linestyle="--", linewidth=0.5, zorder=0)
+     fig.tight_layout()
+
+     if save_path:
+         fig.savefig(save_path, dpi=150, bbox_inches="tight",
+                     facecolor=fig.get_facecolor())
+     return fig
+
+
+ def _make_cycle_chart(
+     cycle_day: Optional[int],
+     cycle_length: int = 28,
+     hot_prob: Optional[float] = None,
+     mood_prob: Optional[float] = None,
+     save_path: Optional[Path] = None,
+ ) -> plt.Figure:
+     """Circular cycle-day visualization. Optionally saves PNG."""
+     fig, ax = plt.subplots(figsize=(5, 5), subplot_kw=dict(polar=True))
+     fig.patch.set_facecolor("#1a1a2e")
+     ax.set_facecolor("#16213e")
+
+     days = np.linspace(0, 2 * np.pi, cycle_length, endpoint=False)
+     for i, d in enumerate(days):
+         phase = i / cycle_length
+         color = plt.cm.RdYlGn(1 - phase)
+         ax.bar(d, 1, width=2 * np.pi / cycle_length * 0.9,
+                bottom=0.5, color=color, alpha=0.4, zorder=1)
+
+     if cycle_day is not None:
+         angle = (cycle_day - 1) / cycle_length * 2 * np.pi
+         ax.scatter([angle], [1.05], s=200, color="#ff6b6b", zorder=5, linewidths=2)
+         ax.annotate(
+             f"Day {cycle_day}",
+             xy=(angle, 1.05), xytext=(0, 0),
+             textcoords="offset points", ha="center", va="center",
+             color="white", fontsize=12, fontweight="bold",
+         )
+
+     ax.set_rticks([])
+     ax.set_xticks([i * 2 * np.pi / 4 for i in range(4)])
+     ax.set_xticklabels(["Day 1", "Day 7", "Day 14", "Day 21"],
+                        color="#aaa", fontsize=9)
+     ax.set_yticklabels([])
+     ax.spines["polar"].set_color("#333")
+     ax.grid(color="#333", linewidth=0.5)
+
+     title = "Cycle Position"
+     # Both probabilities are formatted below, so require both to be present.
+     if hot_prob is not None and mood_prob is not None:
+         title += f"\n🔥 {hot_prob:.0%} 😤 {mood_prob:.0%}"
+     ax.set_title(title, color="white", fontsize=11, pad=20, fontweight="bold")
+     fig.tight_layout()
+
+     if save_path:
+         fig.savefig(save_path, dpi=150, bbox_inches="tight",
+                     facecolor=fig.get_facecolor())
+     return fig
+
+
+ def _make_batch_summary_chart(results_df: pd.DataFrame,
+                               save_path: Optional[Path] = None) -> None:
+     """Stage distribution + confidence histogram for batch runs. Saves PNG."""
+     fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
+     fig.patch.set_facecolor("#1a1a2e")
+
+     # Stage distribution pie
+     stage_counts = results_df["predicted_stage"].value_counts()
+     colors = [STAGE_COLORS.get(s, "#607d8b") for s in stage_counts.index]
+     ax1.set_facecolor("#16213e")
+     wedges, texts, autotexts = ax1.pie(
+         stage_counts.values, labels=stage_counts.index,
+         colors=colors, autopct="%1.0f%%",
+         textprops={"color": "white", "fontsize": 10},
+     )
+     for at in autotexts:
+         at.set_color("white")
+     ax1.set_title("Stage Distribution", color="white", fontsize=11, fontweight="bold")
+
+     # Confidence histogram
+     ax2.set_facecolor("#16213e")
+     if "confidence" in results_df.columns:
+         conf = results_df["confidence"].dropna()
+         # At least one bin, so an all-NaN confidence column does not pass bins=0.
+         ax2.hist(conf, bins=max(1, min(10, len(conf))), color="#3B82F6",
+                  edgecolor="#1a1a2e", alpha=0.8)
+         ax2.axvline(0.8, color="#4CAF50", linestyle="--",
+                     linewidth=1.5, label="High (0.80)")
+         ax2.axvline(0.6, color="#FF9800", linestyle="--",
+                     linewidth=1.5, label="Med (0.60)")
+         ax2.legend(fontsize=8, labelcolor="white", facecolor="#0d0d1a")
+     ax2.set_xlabel("Confidence", color="#aaa", fontsize=9)
+     ax2.set_ylabel("Count", color="#aaa", fontsize=9)
+     ax2.tick_params(colors="white", labelsize=9)
+     for sp in ["top", "right"]:
+         ax2.spines[sp].set_visible(False)
+     for sp in ["left", "bottom"]:
+         ax2.spines[sp].set_color("#333")
+     ax2.set_title("Confidence Distribution", color="white",
+                   fontsize=11, fontweight="bold")
+
+     fig.tight_layout()
+     if save_path:
+         fig.savefig(save_path, dpi=150, bbox_inches="tight",
+                     facecolor=fig.get_facecolor())
+     plt.close(fig)
+
+
+ # ── Text report writers ───────────────────────────────────────────────────────
+
+ def _write_single_stage_report(
+     path: Path,
+     stage: str,
+     confidence: float,
+     probabilities: dict,
+     model: str,
+     comparison: dict,
+     input_features: dict,
+ ):
+     lines = [
+         "=" * 60,
+         "SWAN MENOPAUSE STAGE PREDICTION REPORT",
+         f"Generated : {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
+         "=" * 60,
+         "",
+         f"Predicted Stage : {STAGE_LABELS.get(stage, stage)}",
+         f"Model : {model}",
+         f"Confidence : {confidence:.1%}",
+         "",
+         "Stage Probabilities:",
+     ]
+     for s, p in probabilities.items():
+         bar = "█" * int(p * 20)
+         lines.append(f" {s:<6} : {p:.4f} {bar}")
+     lines += [
+         "",
+         "Model Comparison:",
+         f" RandomForest → {comparison['RandomForest']['stage']}"
+         f" ({comparison['RandomForest'].get('confidence', 0):.1%})",
+         f" LogisticRegression → {comparison['LogisticRegression']['stage']}"
+         f" ({comparison['LogisticRegression'].get('confidence', 0):.1%})",
+         "",
+         "Input Features (non-NaN):",
+     ]
+     for k, v in input_features.items():
+         if v is not None and not (isinstance(v, float) and np.isnan(v)):
+             lines.append(f" {k:<12} = {v}")
+     lines += [
+         "",
+         "⚠️ For research/educational use only. Not a clinical diagnosis.",
+         "=" * 60,
+     ]
+     path.write_text("\n".join(lines), encoding="utf-8")
+
+
+ def _write_batch_report(
+     path: Path,
+     results: pd.DataFrame,
+     model: str,
+     run_dir: Path,
+ ):
+     total = len(results)
+     dist = results["predicted_stage"].value_counts().to_dict() \
+         if "predicted_stage" in results.columns else {}
+     if "confidence" in results.columns:
+         conf = results["confidence"]
+         mean_c = conf.mean()
+         min_c = conf.min()
+         max_c = conf.max()
+         high = int((conf > 0.8).sum())
+         medium = int(((conf > 0.6) & (conf <= 0.8)).sum())
+         low = int((conf <= 0.6).sum())
+     else:
+         mean_c = min_c = max_c = high = medium = low = 0
+
+     lines = [
+         "=" * 60,
+         "SWAN BATCH STAGE PREDICTION REPORT",
+         f"Generated : {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
+         f"Model : {model}",
+         "=" * 60,
+         "",
+         f"Total Individuals : {total}",
+         "",
+         "Stage Distribution:",
+     ]
+     for stage in ["pre", "peri", "post"]:
+         count = dist.get(stage, 0)
+         pct = count / total * 100 if total else 0
+         lines.append(f" {stage:<6} : {count} ({pct:.1f}%)")
+     lines += [
+         "",
+         "Confidence Scores:",
+         f" Mean : {mean_c:.4f}",
+         f" Min : {min_c:.4f}",
+         f" Max : {max_c:.4f}",
+         "",
+         "Confidence Distribution:",
+         f" High (>0.80) : {high}/{total} ({high/total*100:.1f}%)" if total else " N/A",
+         f" Medium (0.60-0.80) : {medium}/{total} ({medium/total*100:.1f}%)" if total else " N/A",
+         f" Low (≤0.60) : {low}/{total} ({low/total*100:.1f}%)" if total else " N/A",
+         "",
+         f"Output Directory : {run_dir}",
+         "",
+         "⚠️ For research/educational use only. Not a clinical diagnosis.",
+         "=" * 60,
+     ]
+     path.write_text("\n".join(lines), encoding="utf-8")
+
+
+ def _write_symptom_report(
+     path: Path,
+     individual_id: str,
+     lmp: str,
+     target_date: str,
+     cycle_day: int,
+     cycle_length: int,
+     hot_prob: float,
+     hot_pred: bool,
+     mood_prob: float,
+     mood_pred: bool,
+ ):
+     hp = float(hot_prob) if (hot_prob is not None and not np.isnan(hot_prob)) else 0.0
+     mp = float(mood_prob) if (mood_prob is not None and not np.isnan(mood_prob)) else 0.0
+     lines = [
+         "=" * 60,
+         "SWAN SYMPTOM CYCLE FORECAST REPORT",
+         f"Generated : {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
+         "=" * 60,
+         "",
+         f"Individual : {individual_id or 'N/A'}",
+         f"LMP : {lmp}",
+         f"Target Date : {target_date or 'Today'}",
+         f"Cycle Length : {cycle_length} days",
+         f"Cycle Day : {cycle_day}",
+         "",
+         "Symptom Probabilities:",
+         f" Hot Flash : {hp:.4f} {'[ELEVATED RISK]' if hot_pred else '[LOW RISK]'}",
+         f" Mood Change : {mp:.4f} {'[ELEVATED RISK]' if mood_pred else '[LOW RISK]'}",
+         "",
+         "⚠️ For research/educational use only. Not a clinical diagnosis.",
+         "=" * 60,
+     ]
+     path.write_text("\n".join(lines), encoding="utf-8")
+
+
+ def _write_batch_symptom_report(
+     path: Path,
+     results: pd.DataFrame,
+     cycle_length: int,
+     run_dir: Path,
+ ):
+     total = len(results)
+     hot_flags = int(results["hotflash_pred"].sum()) \
+         if "hotflash_pred" in results.columns else 0
+     mood_flags = int(results["mood_pred"].sum()) \
+         if "mood_pred" in results.columns else 0
+     mean_hot = float(results["hotflash_prob"].mean()) \
+         if "hotflash_prob" in results.columns else 0.0
+     mean_mood = float(results["mood_prob"].mean()) \
+         if "mood_prob" in results.columns else 0.0
+     lines = [
+         "=" * 60,
+         "SWAN BATCH SYMPTOM FORECAST REPORT",
+         f"Generated : {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
+         f"Cycle Length : {cycle_length} days",
+         "=" * 60,
+         "",
+         f"Total Individuals : {total}",
+         f"Hot Flash Risk : {hot_flags}/{total} elevated",
+         f"Mood Change Risk : {mood_flags}/{total} elevated",
+         f"Avg Hot Flash Prob : {mean_hot:.4f}",
+         f"Avg Mood Prob : {mean_mood:.4f}",
+         "",
+         f"Output Directory : {run_dir}",
+         "",
+         "⚠️ For research/educational use only. Not a clinical diagnosis.",
+         "=" * 60,
+     ]
+     path.write_text("\n".join(lines), encoding="utf-8")
+
+
+ # ── Core prediction functions ─────────────────────────────────────────────────
+
+ def predict_single_stage(
+     age, race, langint,
+     hot_flash, num_hot_flash, bothersome_hf,
+     sleep_quality, depression_indicator, mood_change, irritability,
+     pain_indicator, abbleed, vaginal_dryness, lmp_day,
+     model_choice,
+ ):
+     """
+     Single-person stage prediction.
+
+     Returns (stage_html, chart_fig, conf_note, compare_html, csv_download_path).
+     """
+     if not _MODEL_OK:
+         return f"⚠️ {_MODEL_MSG}", None, "Models unavailable.", "", None
+
+     # Build feature dict using the model's canonical feature names
+     def _v(x):
+         return float(x) if x is not None else np.nan
+
+     feature_dict = {
+         "AGE7": _v(age),
+         "RACE": _v(race),
+         "LANGINT7": _v(langint),
+         "HOTFLAS7": _v(hot_flash),
+         "NUMHOTF7": _v(num_hot_flash),
+         "BOTHOTF7": _v(bothersome_hf),
+         "SLEEPQL7": _v(sleep_quality),
+         "DEPRESS7": _v(depression_indicator),
+         "MOODCHG7": _v(mood_change),
+         "IRRITAB7": _v(irritability),
+         "PAIN17": _v(pain_indicator),
+         "ABBLEED7": _v(abbleed),  # ← correct feature name (was ABLEED7)
+         "VAGINDR7": _v(vaginal_dryness),
+         "LMPDAY7": _v(lmp_day) if lmp_day else np.nan,
+     }
+
+     try:
+         result = _forecast.predict_single(feature_dict, model=model_choice, return_proba=True)
+         stage = result["stage"]
+         confidence = result.get("confidence") or 0.0
+         proba = result.get("probabilities") or {}
+
+         # ── Create timestamped run directory ──────────────────────────────────
+         run_dir = _make_run_dir()
+
+         # ── Save probability chart (PNG) ──────────────────────────────────────
+         chart_path = run_dir / "charts" / "stage_probabilities.png"
+         chart_fig = _make_proba_chart(proba, stage, save_path=chart_path) if proba else None
+
+         # ── Save prediction CSV ───────────────────────────────────────────────
+         pred_row = {
+             "predicted_stage": stage,
+             "model": model_choice,
+             "confidence": round(confidence, 4),
+             **{f"prob_{k}": round(v, 4) for k, v in proba.items()},
+             "timestamp": datetime.now().isoformat(),
+         }
+         csv_path = run_dir / "predictions" / "stage_prediction.csv"
+         pd.DataFrame([pred_row]).to_csv(csv_path, index=False)
+
+         # ── Model comparison ──────────────────────────────────────────────────
+         comparison = _forecast.compare_models(feature_dict)
+         rf_stage = comparison["RandomForest"]["stage"]
+         lr_stage = comparison["LogisticRegression"]["stage"]
+         agree = rf_stage == lr_stage
+
+         # ── Save text report ──────────────────────────────────────────────────
+         txt_path = run_dir / "reports" / "prediction_summary.txt"
+         _write_single_stage_report(
+             txt_path, stage, confidence, proba,
+             model_choice, comparison, feature_dict,
+         )
+
+         # ── Build result card HTML ────────────────────────────────────────────
+         info = STAGE_INFO.get(stage, {})
+         emoji = STAGE_EMOJI.get(stage, "⚪")
+         color = STAGE_COLORS.get(stage, "#607d8b")
+         conf_color = _confidence_color(confidence)
+
+         symptom_tags = "".join(
+             f'<span style="background:{color}14;color:{color};padding:4px 10px;'
+             f'border-radius:20px;border:1px solid {color}44;font-size:12px;'
+             f'font-weight:500">{s}</span>'
+             for s in info.get("symptoms", [])
+         )
+
+         stage_html = f"""
+         <div class="result-card" style="border-left:4px solid {color}">
+           <div style="display:flex;align-items:center;gap:12px;margin-bottom:16px;flex-wrap:wrap">
+             <span style="font-size:40px;flex-shrink:0">{emoji}</span>
+             <div style="flex:1;min-width:140px">
+               <div style="color:#6b7280;font-size:12px;text-transform:uppercase;letter-spacing:2px">
+                 Predicted Stage
+               </div>
+               <div style="color:{color};font-size:26px;font-weight:700">
+                 {STAGE_LABELS.get(stage, stage)}
+               </div>
+             </div>
+             <div style="text-align:right;flex-shrink:0">
+               <div style="color:#6b7280;font-size:11px">Confidence</div>
+               <div style="color:{conf_color};font-size:28px;font-weight:700">
+                 {confidence:.0%}
+               </div>
+             </div>
+           </div>
+           <hr style="border:none;border-top:1px solid #e2e8f0;margin:12px 0">
+           <p style="color:#374151;font-size:14px;margin:8px 0">
+             {info.get('description', '')}
+           </p>
+           <div style="margin-top:12px">
+             <div style="color:#6b7280;font-size:11px;text-transform:uppercase;
+                         letter-spacing:1px;margin-bottom:6px">Common Symptoms</div>
+             <div style="display:flex;flex-wrap:wrap;gap:6px">{symptom_tags}</div>
+           </div>
+           <div style="background:{color}0d;border-left:3px solid {color};
+                       padding:10px 14px;margin-top:14px;border-radius:0 8px 8px 0">
+             <span style="color:{color};font-size:12px;font-weight:600">💡 Guidance: </span>
+             <span style="color:#374151;font-size:13px">{info.get('guidance', '')}</span>
+           </div>
+           <div style="color:#9ca3af;font-size:11px;margin-top:12px">
+             Model: {model_choice} · {datetime.now().strftime('%Y-%m-%d %H:%M')}
+           </div>
+         </div>
+         """
+
+         # Confidence note
+         if confidence >= 0.8:
+             conf_note = "✅ High confidence — the model is quite certain about this stage."
+         elif confidence >= 0.6:
+             conf_note = ("⚠️ Moderate confidence — consider providing more feature values "
+                          "or consulting a clinician.")
+         else:
+             conf_note = ("🔴 Low confidence — prediction is uncertain; "
+                          "clinical consultation is strongly recommended.")
+
+         # Model comparison panel + run-dir info
+         compare_html = f"""
+         <div class="result-card" style="margin-top:0">
+           <div style="color:#6b7280;font-size:11px;text-transform:uppercase;
+                       letter-spacing:1px;margin-bottom:10px;font-weight:600">
+             Model Comparison
+           </div>
+           <div class="stat-grid-2">
+             <div class="stat-item" style="border-top:3px solid #16a34a">
+               <div style="color:#16a34a;font-size:11px;font-weight:600">Random Forest</div>
+               <div style="color:#111827;font-size:17px;margin-top:4px">
+                 {STAGE_EMOJI.get(rf_stage,'')} {STAGE_LABELS.get(rf_stage, rf_stage)}
+               </div>
+               <div style="color:#6b7280;font-size:12px">
+                 {comparison['RandomForest'].get('confidence', 0):.0%} confidence
+               </div>
+             </div>
+             <div class="stat-item" style="border-top:3px solid #2563eb">
+               <div style="color:#2563eb;font-size:11px;font-weight:600">
+                 Logistic Regression
+               </div>
+               <div style="color:#111827;font-size:17px;margin-top:4px">
+                 {STAGE_EMOJI.get(lr_stage,'')} {STAGE_LABELS.get(lr_stage, lr_stage)}
+               </div>
+               <div style="color:#6b7280;font-size:12px">
+                 {comparison['LogisticRegression'].get('confidence', 0):.0%} confidence
+               </div>
+             </div>
+           </div>
+           <div style="margin-top:10px;padding:8px;border-radius:8px;
+                       background:{'#d1fae5' if agree else '#fef2f2'};
+                       color:{'#065f46' if agree else '#9f1239'};
+                       font-size:13px;text-align:center;font-weight:500">
+             {"✅ Both models agree — prediction is robust"
+              if agree else
+              "⚠️ Models disagree — interpret with caution"}
+           </div>
+           <div class="output-path-box">
+             <div class="output-path-title">📁 Outputs saved to:</div>
+             <div class="output-path-dir">{run_dir}/</div>
+             <div class="output-path-files">
+               charts/stage_probabilities.png<br>
+               predictions/stage_prediction.csv<br>
+               reports/prediction_summary.txt
+             </div>
+           </div>
+         </div>
+         """
+
+         return stage_html, chart_fig, conf_note, compare_html, str(csv_path)
+
+     except Exception as exc:
+         return f"❌ Prediction error: {exc}", None, "", "", None
+
+
712
+ def predict_batch_stage(file, model_choice):
+     """
+     Batch stage prediction from uploaded CSV.
+
+     Returns (csv_download_path, summary_html, preview_df).
+     """
+     if not _MODEL_OK:
+         return None, f"⚠️ {_MODEL_MSG}", None
+
+     if file is None:
+         return None, "Please upload a CSV file.", None
+
+     file_path = _get_file_path(file)
+     try:
+         df = pd.read_csv(file_path)
+     except Exception as exc:
+         return None, f"Could not read CSV: {exc}", None
+
+     if df.empty:
+         return None, "Uploaded CSV is empty.", None
+
+     # Identify ID column
+     id_col_candidates = ["individual", "Individual", "ID", "id",
+                          "SWANID", "subject", "Subject"]
+     id_col = next((c for c in id_col_candidates if c in df.columns), None)
+
+     # Validate features
+     feature_names = _metadata.get("feature_names", [])
+     matching = [c for c in df.columns if c in feature_names]
+     missing_pct = 1 - len(matching) / max(len(feature_names), 1)
+
+     warnings_list = []
+     if not matching:
+         return None, (
+             "❌ No matching feature columns found. "
+             "Please include columns from the training feature set "
+             "(see 'Feature Reference' tab)."
+         ), None
+     if missing_pct > 0.5:
+         warnings_list.append(
+             f"⚠️ {missing_pct:.0%} of training features are missing — "
+             "prediction accuracy may be reduced."
+         )
+
+     try:
+         results = _forecast.predict_batch(df, model=model_choice, return_proba=True)
+
+         # Insert individual ID
+         if id_col:
+             results.insert(0, "individual", df[id_col].values)
+         else:
+             results.insert(0, "individual",
+                            [f"Row_{i+1}" for i in range(len(results))])
+
+         results["model"] = model_choice
+         results["notes"] = ""
+         if "confidence" in results.columns:
+             low_mask = results["confidence"] < 0.6
+             results.loc[low_mask, "notes"] = "Low confidence — review manually"
+
+         # ── Create timestamped run directory ──────────────────────────────────
+         run_dir = _make_run_dir()
+
+         # ── Save predictions CSV ──────────────────────────────────────────────
+         csv_path = run_dir / "predictions" / "batch_stage_predictions.csv"
+         results.to_csv(csv_path, index=False)
+
+         # ── Save confidence/distribution chart (PNG) ──────────────────────────
+         chart_path = run_dir / "charts" / "batch_summary_chart.png"
+         _make_batch_summary_chart(results, save_path=chart_path)
+
+         # ── Save text report ──────────────────────────────────────────────────
+         txt_path = run_dir / "reports" / "batch_summary.txt"
+         _write_batch_report(txt_path, results, model_choice, run_dir)
+
+         # ── Build summary HTML ────────────────────────────────────────────────
+         total = len(results)
+         dist = results["predicted_stage"].value_counts().to_dict()
+         mean_conf = results["confidence"].mean() \
+             if "confidence" in results.columns else 0.0
+         high_conf = int((results["confidence"] > 0.8).sum()) \
+             if "confidence" in results.columns else 0
+
+         dist_bars = ""
+         for stage in ["pre", "peri", "post"]:
+             count = dist.get(stage, 0)
+             pct = count / total * 100
+             dist_bars += f"""
+             <div style="margin:6px 0">
+                 <div style="display:flex;justify-content:space-between;margin-bottom:2px">
+                     <span style="color:#374151;font-size:13px">
+                         {STAGE_EMOJI.get(stage,'')} {STAGE_LABELS.get(stage, stage)}
+                     </span>
+                     <span style="color:#6b7280;font-size:12px">{count} ({pct:.0f}%)</span>
+                 </div>
+                 <div style="background:#e2e8f0;border-radius:4px;height:8px">
+                     <div style="background:{STAGE_COLORS.get(stage,'#6b7280')};
+                                 width:{pct}%;height:8px;border-radius:4px"></div>
+                 </div>
+             </div>"""
+
+         warn_html = "".join(
+             f'<div style="color:#d97706;font-size:12px;margin-top:4px">{w}</div>'
+             for w in warnings_list
+         )
+
+         summary_html = f"""
+         <div class="result-card">
+             <div style="color:#111827;font-size:16px;font-weight:700;margin-bottom:14px">
+                 📊 Batch Results — {total} individuals
+             </div>
+             {warn_html}
+             <div class="stat-grid-3">
+                 <div class="stat-item">
+                     <div class="stat-label">Total</div>
+                     <div class="stat-value">{total}</div>
+                 </div>
+                 <div class="stat-item">
+                     <div class="stat-label">Avg Confidence</div>
+                     <div class="stat-value" style="color:{_confidence_color(mean_conf)}">
+                         {mean_conf:.0%}
+                     </div>
+                 </div>
+                 <div class="stat-item">
+                     <div class="stat-label">High Conf (&gt;80%)</div>
+                     <div class="stat-value" style="color:#16a34a">{high_conf}/{total}</div>
+                 </div>
+             </div>
+             <div style="margin-top:12px">{dist_bars}</div>
+             <div class="output-path-box">
+                 <div class="output-path-title">📁 Outputs saved to:</div>
+                 <div class="output-path-dir">{run_dir}/</div>
+                 <div class="output-path-files">
+                     predictions/batch_stage_predictions.csv<br>
+                     charts/batch_summary_chart.png<br>
+                     reports/batch_summary.txt
+                 </div>
+             </div>
+         </div>
+         """
+
+         return str(csv_path), summary_html, results.head(20)
+
+     except Exception as exc:
+         return None, f"❌ Batch prediction error: {exc}", None
+
+
+ def predict_symptoms(individual_id, lmp_input, target_date_input, cycle_length):
+     """
+     Cycle-based symptom forecasting (single person).
+
+     Returns (result_html, chart_fig, csv_download_path).
+     """
+     if not lmp_input:
+         return "Please enter your Last Menstrual Period date.", None, None
+
+     try:
+         cycle_length = int(cycle_length) if cycle_length else 28
+         fore = SymptomCycleForecaster(cycle_length=cycle_length)
+         target_date = target_date_input if target_date_input else None
+         result = fore.predict_single(lmp=lmp_input, target_date=target_date)
+
+         cycle_day = result.get("cycle_day")
+         hot_prob = result.get("hotflash_prob", 0)
+         hot_pred = result.get("hotflash_pred", False)
+         mood_prob = result.get("mood_prob", 0)
+         mood_pred = result.get("mood_pred", False)
+
+         # Safe float helpers
+         hp = float(hot_prob) if (hot_prob is not None and not np.isnan(hot_prob)) else 0.0
+         mp = float(mood_prob) if (mood_prob is not None and not np.isnan(mood_prob)) else 0.0
+
+         # ── Create timestamped run directory ──────────────────────────────────
+         run_dir = _make_run_dir()
+
+         # ── Save cycle chart (PNG) ────────────────────────────────────────────
+         chart_path = run_dir / "charts" / "cycle_position.png"
+         chart_fig = _make_cycle_chart(
+             cycle_day, cycle_length, hp, mp, save_path=chart_path
+         )
+
+         # ── Save forecast CSV ─────────────────────────────────────────────────
+         csv_path = run_dir / "predictions" / "symptom_forecast.csv"
+         lmp_note = ""
+         try:
+             int(str(lmp_input).strip())
+             lmp_note = "LMP inferred as day-of-month; interpret with caution"
+         except (ValueError, TypeError):
+             pass
+         pd.DataFrame([{
+             "individual": individual_id or "N/A",
+             "LMP": lmp_input,
+             "date": target_date_input or datetime.now().strftime("%Y-%m-%d"),
+             "cycle_day": cycle_day,
+             "hotflash_prob": round(hp, 6),
+             "hotflash_pred": bool(hot_pred),
+             "mood_prob": round(mp, 6),
+             "mood_pred": bool(mood_pred),
+             "notes": lmp_note,
+         }]).to_csv(csv_path, index=False)
+
+         # ── Save text report ──────────────────────────────────────────────────
+         txt_path = run_dir / "reports" / "symptom_summary.txt"
+         _write_symptom_report(
+             txt_path, individual_id, lmp_input, target_date_input,
+             cycle_day, cycle_length, hp, hot_pred, mp, mood_pred,
+         )
+
+         # ── Build result HTML ─────────────────────────────────────────────────
+         def _prob_bar(prob, label, color):
+             pct = min(prob * 100, 100)
+             return f"""
+             <div style="margin:10px 0">
+                 <div style="display:flex;justify-content:space-between;margin-bottom:4px">
+                     <span style="color:#374151;font-size:14px">{label}</span>
+                     <span style="color:{color};font-size:16px;font-weight:700">{pct:.0f}%</span>
+                 </div>
+                 <div style="background:#e2e8f0;border-radius:6px;height:10px">
+                     <div style="background:{color};width:{pct}%;height:10px;
+                                 border-radius:6px;transition:width 0.5s"></div>
+                 </div>
+             </div>"""
+
+         hot_alert = "🔴 Elevated risk" if hot_pred else "🟢 Low risk"
+         mood_alert = "🔴 Elevated risk" if mood_pred else "🟢 Low risk"
+
+         html = f"""
+         <div class="result-card">
+             <div style="color:#111827;font-size:18px;font-weight:700;margin-bottom:4px">
+                 {individual_id or 'Forecast'} — Cycle Day {cycle_day or '?'}
+             </div>
+             <div style="color:#6b7280;font-size:13px;margin-bottom:20px">
+                 LMP: {lmp_input} | Target: {target_date_input or 'Today'}
+                 | Cycle: {cycle_length} days
+             </div>
+             {_prob_bar(hp, '🔥 Hot Flash Probability', '#ef4444')}
+             <div style="color:#6b7280;font-size:12px;margin:-6px 0 10px 2px">{hot_alert}</div>
+             {_prob_bar(mp, '😤 Mood Change Probability', '#7c3aed')}
+             <div style="color:#6b7280;font-size:12px;margin:-6px 0 10px 2px">{mood_alert}</div>
+             <div style="background:#f8fafc;border:1px solid #e2e8f0;border-radius:8px;
+                         padding:12px;margin-top:14px;font-size:12px;color:#6b7280">
+                 ℹ️ Probabilities are computed from a cycle-phase model (Gaussian heuristic).
+                 They represent symptom likelihood based on cycle day, not a clinical diagnosis.
+             </div>
+             <div class="output-path-box">
+                 <div class="output-path-title">📁 Outputs saved to:</div>
+                 <div class="output-path-dir">{run_dir}/</div>
+                 <div class="output-path-files">
+                     charts/cycle_position.png<br>
+                     predictions/symptom_forecast.csv<br>
+                     reports/symptom_summary.txt
+                 </div>
+             </div>
+         </div>
+         """
+
+         return html, chart_fig, str(csv_path)
+
+     except Exception as exc:
+         return f"❌ Error: {exc}", None, None
+
+
+ def predict_symptoms_batch(file, lmp_col_name, date_col_name, cycle_length):
+     """
+     Batch symptom forecasting from CSV.
+
+     Returns (csv_download_path, summary_html, preview_df).
+     """
+     if file is None:
+         return None, "Please upload a CSV file.", None
+
+     file_path = _get_file_path(file)
+     try:
+         df = pd.read_csv(file_path)
+     except Exception as exc:
+         return None, f"Could not read CSV: {exc}", None
+
+     if lmp_col_name not in df.columns:
+         return None, (
+             f"LMP column '{lmp_col_name}' not found in CSV. "
+             f"Columns present: {list(df.columns)}"
+         ), None
+
+     try:
+         cycle_length = int(cycle_length) if cycle_length else 28
+         fore = SymptomCycleForecaster(cycle_length=cycle_length)
+         date_col = date_col_name \
+             if (date_col_name and date_col_name in df.columns) else None
+         results = fore.predict_df(df, lmp_col=lmp_col_name, date_col=date_col)
+
+         # ── Add notes column (flag day-of-month LMP rows) ─────────────────────
+         def _lmp_note(val):
+             try:
+                 int(str(val).strip())
+                 return "LMP inferred as day-of-month; interpret with caution"
+             except (ValueError, TypeError):
+                 return ""
+         results["notes"] = df[lmp_col_name].apply(_lmp_note)
+
+         # ── Create timestamped run directory ──────────────────────────────────
+         run_dir = _make_run_dir()
+
+         # ── Save predictions CSV ──────────────────────────────────────────────
+         csv_path = run_dir / "predictions" / "batch_symptom_forecast.csv"
+         results.to_csv(csv_path, index=False)
+
+         # ── Save text report ──────────────────────────────────────────────────
+         txt_path = run_dir / "reports" / "batch_symptom_summary.txt"
+         _write_batch_symptom_report(txt_path, results, cycle_length, run_dir)
+
+         # ── Build summary HTML ────────────────────────────────────────────────
+         total = len(results)
+         hot_flags = int(results["hotflash_pred"].sum()) \
+             if "hotflash_pred" in results.columns else 0
+         mood_flags = int(results["mood_pred"].sum()) \
+             if "mood_pred" in results.columns else 0
+         mean_hot = float(results["hotflash_prob"].mean()) \
+             if "hotflash_prob" in results.columns else 0.0
+         mean_mood = float(results["mood_prob"].mean()) \
+             if "mood_prob" in results.columns else 0.0
+
+         summary_html = f"""
+         <div class="result-card">
+             <div style="color:#111827;font-size:16px;font-weight:700;margin-bottom:14px">
+                 🌊 Symptom Forecast — {total} individuals
+             </div>
+             <div class="stat-grid-3">
+                 <div class="stat-item">
+                     <div class="stat-label">Total</div>
+                     <div class="stat-value">{total}</div>
+                 </div>
+                 <div class="stat-item">
+                     <div class="stat-label">🔥 Hot Flash Risk</div>
+                     <div class="stat-value" style="color:#ef4444">{hot_flags}</div>
+                 </div>
+                 <div class="stat-item">
+                     <div class="stat-label">😤 Mood Risk</div>
+                     <div class="stat-value" style="color:#7c3aed">{mood_flags}</div>
+                 </div>
+             </div>
+             <div class="stat-grid-2">
+                 <div class="stat-item">
+                     <div class="stat-label">Avg Hot Flash Prob</div>
+                     <div class="stat-value" style="color:#ef4444;font-size:18px">
+                         {mean_hot:.1%}
+                     </div>
+                 </div>
+                 <div class="stat-item">
+                     <div class="stat-label">Avg Mood Prob</div>
+                     <div class="stat-value" style="color:#7c3aed;font-size:18px">
+                         {mean_mood:.1%}
+                     </div>
+                 </div>
+             </div>
+             <div class="output-path-box">
+                 <div class="output-path-title">📁 Outputs saved to:</div>
+                 <div class="output-path-dir">{run_dir}/</div>
+                 <div class="output-path-files">
+                     predictions/batch_symptom_forecast.csv<br>
+                     reports/batch_symptom_summary.txt
+                 </div>
+             </div>
+         </div>
+         """
+
+         return str(csv_path), summary_html, results
+
+     except Exception as exc:
+         return None, f"❌ Error: {exc}", None
+
+
+ # ── Feature reference & model status ─────────────────────────────────────────
+
+ def get_feature_reference() -> str:
+     """Return an HTML table listing the first 60 training features with descriptions."""
+     feature_names = _metadata.get("feature_names", list(FEATURE_DESCRIPTIONS.keys()))
+
+     rows = ""
+     for i, f in enumerate(feature_names[:60]):
+         desc = FEATURE_DESCRIPTIONS.get(f, f.split("_")[0])
+         rows += f"""
+         <tr>
+             <td class="feature-num">{i + 1}</td>
+             <td class="feature-code">{f}</td>
+             <td class="feature-desc">{desc}</td>
+         </tr>"""
+
+     remaining = len(feature_names) - 60
+     if remaining > 0:
+         rows += f"""
+         <tr>
+             <td colspan="3" style="padding:8px;color:#9ca3af;font-size:12px;text-align:center">
+                 … and {remaining} more features (one-hot encoded categories)
+             </td>
+         </tr>"""
+
+     return f"""
+     <div class="feature-table-wrap">
+         <div style="color:#111827;font-size:16px;font-weight:700;margin-bottom:14px">
+             📋 Training Features ({len(feature_names)} total after encoding)
+         </div>
+         <table>
+             <thead>
+                 <tr>
+                     <th>#</th>
+                     <th>Feature</th>
+                     <th>Description</th>
+                 </tr>
+             </thead>
+             <tbody>{rows}</tbody>
+         </table>
+     </div>
+     """
+
+
+ def get_model_status() -> str:
+     """Return an HTML status card: a model summary if loaded, setup instructions otherwise."""
+     if _MODEL_OK:
+         fc = len(_metadata.get("feature_names", []))
+         sc = _metadata.get("stage_classes", ["pre", "peri", "post"])
+         badges = "".join(
+             f'<span style="background:{STAGE_COLORS.get(s,"#607d8b")}18;'
+             f'color:{STAGE_COLORS.get(s,"#555")};padding:4px 12px;'
+             f'border-radius:20px;border:1px solid {STAGE_COLORS.get(s,"#607d8b")}44;'
+             f'font-size:13px;font-weight:600">{STAGE_EMOJI.get(s,"")} {s}</span>'
+             for s in sc
+         )
+         return f"""
+         <div class="status-card">
+             <div style="display:flex;align-items:center;gap:10px;margin-bottom:14px">
+                 <span style="font-size:24px">✅</span>
+                 <div>
+                     <div style="color:#059669;font-size:16px;font-weight:700">
+                         Models Loaded Successfully
+                     </div>
+                     <div style="color:#6b7280;font-size:12px">Ready for predictions</div>
+                 </div>
+             </div>
+             <div class="stat-grid-3">
+                 <div class="stat-item">
+                     <div class="stat-label">Features</div>
+                     <div class="stat-value">{fc}</div>
+                 </div>
+                 <div class="stat-item">
+                     <div class="stat-label">Models</div>
+                     <div class="stat-value">2</div>
+                 </div>
+                 <div class="stat-item">
+                     <div class="stat-label">Stages</div>
+                     <div class="stat-value">{len(sc)}</div>
+                 </div>
+             </div>
+             <div style="margin-top:14px">
+                 <div style="color:#6b7280;font-size:11px;text-transform:uppercase;
+                             letter-spacing:0.5px;margin-bottom:6px">Available Stages</div>
+                 <div style="display:flex;gap:8px;flex-wrap:wrap">{badges}</div>
+             </div>
+         </div>
+         """
+     return f"""
+     <div class="status-card">
+         <div style="display:flex;align-items:center;gap:10px;margin-bottom:10px">
+             <span style="font-size:24px">⚠️</span>
+             <div>
+                 <div style="color:#dc2626;font-size:16px;font-weight:700">
+                     Models Not Loaded
+                 </div>
+                 <div style="color:#6b7280;font-size:12px">{_MODEL_MSG}</div>
+             </div>
+         </div>
+         <div style="background:#fef2f2;border:1px solid #fecaca;border-radius:8px;
+                     padding:12px;color:#9f1239;font-size:13px">
+             To train and save models:<br>
+             <code style="background:#1e293b;color:#a3e635;padding:4px 8px;border-radius:4px;
+                          margin-top:6px;display:inline-block">python menopause.py</code>
+             <br><br>
+             This generates <code style="background:#e2e8f0;padding:2px 5px;border-radius:3px;
+                                         color:#1e293b">swan_ml_output/rf_pipeline.pkl</code>,
+             <code style="background:#e2e8f0;padding:2px 5px;border-radius:3px;
+                          color:#1e293b">lr_pipeline.pkl</code>, and
+             <code style="background:#e2e8f0;padding:2px 5px;border-radius:3px;
+                          color:#1e293b">forecast_metadata.json</code>.
+         </div>
+     </div>
+     """
+
+
+ # ── Education content ─────────────────────────────────────────────────────────
+ EDUCATION_HTML = """
+ <div class="edu-card">
+     <h2>🌸 Understanding Menopause</h2>
+     <p>Menopause is a natural biological process marking the end of menstrual cycles.
+     It is officially diagnosed after 12 consecutive months without a menstrual period
+     and typically occurs in women in their late 40s to early 50s.</p>
+
+     <h3>Three Stages</h3>
+     <div class="stage-cards-grid">
+         <div class="stage-card-pre">
+             <div style="color:#16a34a;font-weight:700;margin-bottom:8px">🟢 Pre-Menopause</div>
+             <p style="font-size:13px;margin:0;color:#374151">Regular ovarian function. Periods are predictable.
+             Hormones (estrogen, progesterone) follow a consistent monthly pattern.</p>
+         </div>
+         <div class="stage-card-peri">
+             <div style="color:#d97706;font-weight:700;margin-bottom:8px">🟡 Peri-Menopause</div>
+             <p style="font-size:13px;margin:0;color:#374151">Transition phase — usually begins in the mid-40s.
+             Hormone levels fluctuate. Periods become irregular.
+             Hot flashes and sleep issues may begin.</p>
+         </div>
+         <div class="stage-card-post">
+             <div style="color:#7c3aed;font-weight:700;margin-bottom:8px">🟣 Post-Menopause</div>
+             <p style="font-size:13px;margin:0;color:#374151">12+ months after the last period.
+             Lower estrogen levels. Risk factors for osteoporosis and
+             cardiovascular disease increase.</p>
+         </div>
+     </div>
+
+     <h3>Common Symptoms by Stage</h3>
+     <table style="width:100%;border-collapse:collapse;font-size:13px">
+         <thead>
+             <tr style="background:#f8fafc">
+                 <th style="padding:8px;text-align:left;color:#6b7280;font-weight:600">Symptom</th>
+                 <th style="padding:8px;text-align:center;color:#16a34a;font-weight:600">Pre</th>
+                 <th style="padding:8px;text-align:center;color:#d97706;font-weight:600">Peri</th>
+                 <th style="padding:8px;text-align:center;color:#7c3aed;font-weight:600">Post</th>
+             </tr>
+         </thead>
+         <tbody>
+             <tr style="border-bottom:1px solid #e2e8f0">
+                 <td style="padding:8px;color:#374151">Hot flashes</td>
+                 <td style="text-align:center;color:#9ca3af">–</td>
+                 <td style="text-align:center">✅</td>
+                 <td style="text-align:center">✅</td>
+             </tr>
+             <tr style="border-bottom:1px solid #e2e8f0">
+                 <td style="padding:8px;color:#374151">Irregular periods</td>
+                 <td style="text-align:center;color:#9ca3af">–</td>
+                 <td style="text-align:center">✅</td>
+                 <td style="text-align:center;color:#9ca3af">N/A</td>
+             </tr>
+             <tr style="border-bottom:1px solid #e2e8f0">
+                 <td style="padding:8px;color:#374151">Sleep disturbances</td>
+                 <td style="text-align:center;color:#6b7280">Mild</td>
+                 <td style="text-align:center">✅</td>
+                 <td style="text-align:center">✅</td>
+             </tr>
+             <tr style="border-bottom:1px solid #e2e8f0">
+                 <td style="padding:8px;color:#374151">Mood changes</td>
+                 <td style="text-align:center;color:#6b7280">PMS</td>
+                 <td style="text-align:center">✅</td>
+                 <td style="text-align:center;color:#6b7280">Possible</td>
+             </tr>
+             <tr style="border-bottom:1px solid #e2e8f0">
+                 <td style="padding:8px;color:#374151">Vaginal dryness</td>
+                 <td style="text-align:center;color:#9ca3af">–</td>
+                 <td style="text-align:center;color:#6b7280">Possible</td>
+                 <td style="text-align:center">✅</td>
+             </tr>
+             <tr>
+                 <td style="padding:8px;color:#374151">Bone density changes</td>
+                 <td style="text-align:center;color:#9ca3af">–</td>
+                 <td style="text-align:center;color:#6b7280">Begins</td>
+                 <td style="text-align:center">✅</td>
+             </tr>
+         </tbody>
+     </table>
+
+     <h3>About This Tool</h3>
+     <p style="font-size:13px">This application uses machine learning models trained on the
+     SWAN (Study of Women's Health Across the Nation) dataset — a landmark multisite,
+     multiethnic longitudinal study. The models were trained on self-reported symptom and
+     behavioral data to predict menopausal stage.</p>
+     <div class="disclaimer-box">
+         ⚠️ <strong style="color:#d97706">Disclaimer:</strong>
+         This tool is for educational and research purposes only.
+         Predictions should not substitute clinical diagnosis.
+         Always consult a qualified healthcare provider for medical advice.
+     </div>
+ </div>
+ """
+
+
+ # ── Gradio UI ─────────────────────────────────────────────────────────────────
+ CUSTOM_CSS = """
+ /* ── Force light mode — disable Gradio dark theme entirely ───────────── */
+ :root {
+     color-scheme: light only !important;
+ }
+ /* Fallback: if Gradio somehow sets .dark, override every key variable */
+ body.dark,
+ body.dark .gradio-container {
+     --body-background-fill: #f0f4f8 !important;
+     --background-fill-primary: #ffffff !important;
+     --background-fill-secondary: #f8fafc !important;
+     --border-color-primary: #e2e8f0 !important;
+     --border-color-accent: #3b82f6 !important;
+     --color-accent: #2563eb !important;
+     --color-accent-soft: #eff6ff !important;
+     --input-background-fill: #ffffff !important;
+     --input-border-color: #d1d5db !important;
+     --label-text-color: #374151 !important;
+     --block-label-text-color: #374151 !important;
+     --block-title-text-color: #111827 !important;
+     --body-text-color: #111827 !important;
+     --body-text-color-subdued: #6b7280 !important;
+     --link-text-color: #2563eb !important;
+     --button-primary-background-fill: #2563eb !important;
+     --button-primary-text-color: #ffffff !important;
+     --button-secondary-background-fill: #ffffff !important;
+     --button-secondary-text-color: #374151 !important;
+     --tab-text-color: #374151 !important;
+     --tab-text-color-selected: #2563eb !important;
+     color: #111827 !important;
+     background-color: #f0f4f8 !important;
+ }
+
+ /* ── Core ────────────────────────────────────────────────────────────── */
+ .gradio-container {
+     max-width: 1200px !important;
+     margin: 0 auto !important;
+     font-family: 'Segoe UI', system-ui, -apple-system, sans-serif !important;
+     background: #f0f4f8 !important;
+ }
+
+ /* ── Header banner ──────────────────────────────────────────────────── */
+ .header-banner {
+     background: linear-gradient(135deg, #faf5ff 0%, #fff0f9 50%, #eff6ff 100%);
+     border: 1px solid #e9d5ff;
+     border-radius: 16px;
+     padding: 28px 32px;
+     margin-bottom: 20px;
+     box-shadow: 0 2px 8px rgba(139,92,246,0.08);
+     position: relative;
+     overflow: hidden;
+ }
+ .header-banner::before {
+     content: '';
+     position: absolute;
+     top: -40%; right: -5%;
+     width: 280px; height: 280px;
+     background: radial-gradient(circle, rgba(139,92,246,0.08) 0%, transparent 70%);
+     pointer-events: none;
+ }
+
+ /* ── Reusable info boxes ─────────────────────────────────────────────── */
+ .info-box {
+     background: #f8fafc;
+     border: 1px solid #e2e8f0;
+     border-left: 3px solid #3b82f6;
+     border-radius: 8px;
+     padding: 12px 16px;
+     color: #475569;
+     font-size: 13px;
+     margin-bottom: 16px;
+     line-height: 1.5;
+ }
+ .info-box code {
+     background: #e2e8f0;
+     color: #1e293b;
+     padding: 1px 5px;
+     border-radius: 3px;
+     font-family: monospace;
+     font-size: 0.9em;
+ }
+ .section-label {
+     color: #2563eb;
+     font-size: 12px;
+     font-weight: 700;
+     text-transform: uppercase;
+     letter-spacing: 0.6px;
+     margin-bottom: 10px;
+     margin-top: 10px;
+ }
+ .format-hint {
+     background: #f8fafc;
+     border: 1px solid #e2e8f0;
+     border-radius: 8px;
+     padding: 14px;
+     margin-top: 10px;
+     font-size: 12px;
+     color: #475569;
+ }
+ .format-hint-title { color: #2563eb; font-weight: 600; margin-bottom: 6px; }
+ .format-hint pre { color: #475569; margin: 0; font-size: 11px; white-space: pre-wrap; }
+ .format-hint-note { color: #94a3b8; font-size: 11px; margin-top: 8px; }
+ .placeholder-msg { color: #9ca3af; text-align: center; padding: 40px; font-size: 14px; }
+ .section-divider { border: none; border-top: 1px solid #e2e8f0; margin: 24px 0; }
+ .batch-section-label { color: #2563eb; font-size: 14px; font-weight: 600; margin-bottom: 12px; }
+
+ /* ── Result & summary cards ─────────────────────────────────────────── */
+ .result-card {
+     background: #ffffff;
+     border: 1px solid #e2e8f0;
+     border-radius: 16px;
+     padding: 24px;
+     box-shadow: 0 1px 4px rgba(0,0,0,0.06);
+     font-family: 'Segoe UI', system-ui, sans-serif;
+ }
+ .stat-grid-3 { display:grid; grid-template-columns:repeat(3,1fr); gap:12px; margin:14px 0; }
+ .stat-grid-2 { display:grid; grid-template-columns:1fr 1fr; gap:10px; margin-top:10px; }
+ .stat-item { background:#f8fafc; border:1px solid #e2e8f0; padding:12px; border-radius:8px; text-align:center; }
+ .stat-label { color:#6b7280; font-size:11px; text-transform:uppercase; letter-spacing:0.4px; }
+ .stat-value { color:#111827; font-size:22px; font-weight:700; line-height:1.2; margin-top:2px; }
+ .output-path-box { background:#f0fdf4; border:1px solid #bbf7d0; border-radius:8px; padding:10px 14px; margin-top:12px; font-family:monospace; }
+ .output-path-title { color:#059669; font-size:12px; font-weight:600; }
+ .output-path-dir { color:#065f46; font-size:11px; margin-top:4px; }
+ .output-path-files { color:#6b7280; font-size:10px; margin-top:4px; line-height:1.6; }
+
+ /* ── Code blocks ────────────────────────────────────────────────────── */
+ .code-block {
+     background: #1e293b;
+     color: #a3e635;
+     border-radius: 8px;
+     padding: 12px;
+     font-size: 12px;
+     font-family: monospace;
+     white-space: pre;
+     overflow-x: auto;
+ }
+
+ /* ── Setup instructions card ─────────────────────────────────────────── */
+ .setup-card { background:#ffffff; border:1px solid #e2e8f0; border-radius:12px; padding:20px; margin-top:16px; font-family:'Segoe UI',system-ui,sans-serif; }
+ .setup-title { color:#111827; font-size:15px; font-weight:700; margin-bottom:12px; }
+ .setup-step { color:#374151; font-size:13px; line-height:1.8; }
+ .setup-step strong { color:#2563eb; }
+
+ /* ── Education ──────────────────────────────────────────────────────── */
+ .edu-card { background:#ffffff; border:1px solid #e2e8f0; border-radius:16px; padding:28px; font-family:'Segoe UI',system-ui,sans-serif; color:#374151; line-height:1.7; }
+ .edu-card h2 { color:#111827; font-size:22px; margin-top:0; }
+ .edu-card h3 { color:#7c3aed; font-size:16px; margin-top:20px; }
+ .stage-cards-grid { display:grid; grid-template-columns:repeat(3,1fr); gap:16px; margin:14px 0; }
+ .stage-card-pre { background:#f0fdf4; border-top:4px solid #16a34a; padding:16px; border-radius:10px; }
+ .stage-card-peri { background:#fffbeb; border-top:4px solid #d97706; padding:16px; border-radius:10px; }
+ .stage-card-post { background:#faf5ff; border-top:4px solid #7c3aed; padding:16px; border-radius:10px; }
+ .disclaimer-box { background:#fffbeb; border-left:3px solid #d97706; padding:12px 16px; border-radius:0 8px 8px 0; margin-top:14px; font-size:13px; color:#374151; }
+
+ /* ── Feature reference table ────────────────────────────────────────── */
+ .feature-table-wrap { background:#ffffff; border:1px solid #e2e8f0; border-radius:12px; padding:20px; max-height:500px; overflow-y:auto; font-family:'Segoe UI',system-ui,sans-serif; }
+ .feature-table-wrap table { width:100%; border-collapse:collapse; }
+ .feature-table-wrap thead tr { background:#f8fafc; }
+ .feature-table-wrap th { padding:8px; color:#6b7280; font-size:11px; text-align:left; text-transform:uppercase; letter-spacing:0.4px; }
+ .feature-table-wrap tr { border-bottom:1px solid #e2e8f0; }
+ .feature-table-wrap td { padding:8px; }
+ .feature-code { color:#2563eb; font-family:monospace; font-size:13px; }
+ .feature-desc { color:#374151; font-size:12px; }
+ .feature-num { color:#9ca3af; font-size:12px; }
+
+ /* ── Model status card ──────────────────────────────────────────────── */
+ .status-card { background:#ffffff; border:1px solid #e2e8f0; border-radius:12px; padding:20px; font-family:'Segoe UI',system-ui,sans-serif; }
+
+ /* ── Footer ─────────────────────────────────────────────────────────── */
+ .app-footer { text-align:center; color:#9ca3af; font-size:11px; margin-top:24px; padding:16px; border-top:1px solid #e2e8f0; }
+ .app-footer a { color:#2563eb; text-decoration:none; }
+
+ /* ── Responsive — Tablet (≤ 768 px) ────────────────────────────────── */
+ @media (max-width: 768px) {
+     .gradio-container { padding: 8px !important; }
+     .header-banner { padding: 16px 20px !important; margin-bottom: 12px !important; }
+     .header-status-badge { display: none !important; }
+     .stat-grid-3 { grid-template-columns: 1fr !important; }
+     .stat-grid-2 { grid-template-columns: 1fr !important; }
+     .stage-cards-grid { grid-template-columns: 1fr !important; }
+ }
+
+ /* ── Responsive — Mobile (≤ 480 px) ────────────────────────────────── */
+ @media (max-width: 480px) {
+     .header-banner h1 { font-size: 18px !important; }
+     .result-card { padding: 16px !important; }
+     .edu-card { padding: 16px !important; }
+     .setup-card { padding: 14px !important; }
+ }
+ """
+
+ HEADER_HTML = """
+ <div class="header-banner">
+     <div style="display:flex;align-items:center;gap:16px;flex-wrap:wrap">
+         <div style="font-size:48px;flex-shrink:0">🌸</div>
+         <div style="flex:1;min-width:200px">
+             <h1 style="margin:0;font-size:26px;font-weight:800;
+                        background:linear-gradient(135deg,#7c3aed,#db2777);
+                        -webkit-background-clip:text;-webkit-text-fill-color:transparent">
+                 SWAN Menopause Prediction
+             </h1>
+             <p style="margin:4px 0 0;color:#6b7280;font-size:13px">
+                 AI-powered menopausal stage prediction &amp; symptom forecasting ·
+                 Based on the SWAN dataset
+             </p>
+         </div>
+         <div class="header-status-badge" style="text-align:right;flex-shrink:0">
+             <div style="background:#ffffff;border:1px solid #e2e8f0;border-radius:8px;
+                         padding:8px 16px;display:inline-block;box-shadow:0 1px 3px rgba(0,0,0,0.06)">
+                 <div style="color:#9ca3af;font-size:10px;text-transform:uppercase;letter-spacing:1px">
+                     Status
+                 </div>
+                 <div style="color:{color};font-size:13px;font-weight:600">{status}</div>
+             </div>
+         </div>
+     </div>
+ </div>
+ """.format(
+     color = "#059669" if _MODEL_OK else "#dc2626",
+     status = "Models Ready ✅" if _MODEL_OK else "Models Needed ⚠️",
+ )
+
+
+ # ── Force-light-mode JS (runs on every page load) ─────────────────────────────
1515
+ # Removes the .dark class from <body>, sets localStorage "theme" to "light", and uses a
1516
+ # MutationObserver to strip the class again whenever it is re-added. Intended to work on
1517
+ # Hugging Face Spaces regardless of the user's OS/browser dark-mode setting.
1518
+ FORCE_LIGHT_JS = """
1519
+ function() {
1520
+ const forceLightMode = () => {
1521
+ if (document.body.classList.contains('dark')) {
1522
+ document.body.classList.remove('dark');
1523
+ }
1524
+ };
1525
+ // Apply immediately
1526
+ forceLightMode();
1527
+ // Lock Gradio's stored preference
1528
+ try { localStorage.setItem('theme', 'light'); } catch(e) {}
1529
+ // Watch for .dark being re-added to <body> and remove it again
1530
+ new MutationObserver(function(mutations) {
1531
+ mutations.forEach(function(m) {
1532
+ if (m.attributeName === 'class') forceLightMode();
1533
+ });
1534
+ }).observe(document.body, { attributes: true, attributeFilter: ['class'] });
1535
+ }
1536
+ """
1537
+
1538
+
1539
+ # ── App builder ───────────────────────────────────────────────────────────────
1540
+
1541
+ def build_app():
1542
+ with gr.Blocks(
1543
+ css = CUSTOM_CSS,
1544
+ js = FORCE_LIGHT_JS,
1545
+ title = "SWAN Menopause Prediction",
1546
+ theme = gr.themes.Soft(
1547
+ primary_hue = "blue",
1548
+ neutral_hue = "slate",
1549
+ ).set(
1550
+ # ── Body ──────────────────────────────────────────────────────
1551
+ body_background_fill = "#f0f4f8",
1552
+ body_background_fill_dark = "#f0f4f8",
1553
+ body_text_color = "#111827",
1554
+ body_text_color_dark = "#111827",
1555
+ body_text_color_subdued = "#6b7280",
1556
+ body_text_color_subdued_dark = "#6b7280",
1557
+ # ── Panel / block backgrounds ──────────────────────────────────
1558
+ background_fill_primary = "#ffffff",
1559
+ background_fill_primary_dark = "#ffffff",
1560
+ background_fill_secondary = "#f8fafc",
1561
+ background_fill_secondary_dark = "#f8fafc",
1562
+ block_background_fill = "#ffffff",
1563
+ block_background_fill_dark = "#ffffff",
1564
+ block_border_color = "#e2e8f0",
1565
+ block_border_color_dark = "#e2e8f0",
1566
+ block_label_background_fill = "#f8fafc",
1567
+ block_label_background_fill_dark= "#f8fafc",
1568
+ block_label_text_color = "#374151",
1569
+ block_label_text_color_dark = "#374151",
1570
+ block_title_text_color = "#111827",
1571
+ block_title_text_color_dark = "#111827",
1572
+ # ── Inputs ────────────────────────────────────────────────────
1573
+ input_background_fill = "#ffffff",
1574
+ input_background_fill_dark = "#ffffff",
1575
+ input_background_fill_focus = "#ffffff",
1576
+ input_background_fill_focus_dark= "#ffffff",
1577
+ input_border_color = "#d1d5db",
1578
+ input_border_color_dark = "#d1d5db",
1579
+ input_border_color_focus = "#3b82f6",
1580
+ input_border_color_focus_dark = "#3b82f6",
1581
+ input_placeholder_color = "#9ca3af",
1582
+ input_placeholder_color_dark = "#9ca3af",
1583
+ # ── Borders ────────────────────────────────────────────────────
1584
+ border_color_primary = "#e2e8f0",
1585
+ border_color_primary_dark = "#e2e8f0",
1586
+ border_color_accent = "#3b82f6",
1587
+ border_color_accent_dark = "#3b82f6",
1588
+ # ── Buttons ────────────────────────────────────────────────────
1589
+ button_primary_background_fill = "#2563eb",
1590
+ button_primary_background_fill_dark = "#2563eb",
1591
+ button_primary_background_fill_hover = "#1d4ed8",
1592
+ button_primary_background_fill_hover_dark = "#1d4ed8",
1593
+ button_primary_text_color = "#ffffff",
1594
+ button_primary_text_color_dark = "#ffffff",
1595
+ button_secondary_background_fill = "#ffffff",
1596
+ button_secondary_background_fill_dark = "#ffffff",
1597
+ button_secondary_background_fill_hover = "#f1f5f9",
1598
+ button_secondary_background_fill_hover_dark="#f1f5f9",
1599
+ button_secondary_text_color = "#374151",
1600
+ button_secondary_text_color_dark = "#374151",
1601
+ button_secondary_border_color = "#e2e8f0",
1602
+ button_secondary_border_color_dark = "#e2e8f0",
1603
+ # ── Checkbox / Radio ──────────────────────────────────────────
1604
+ checkbox_background_color = "#ffffff",
1605
+ checkbox_background_color_dark = "#ffffff",
1606
+ checkbox_background_color_selected = "#2563eb",
1607
+ checkbox_background_color_selected_dark = "#2563eb",
1608
+ checkbox_border_color = "#d1d5db",
1609
+ checkbox_border_color_dark = "#d1d5db",
1610
+ checkbox_border_color_focus = "#3b82f6",
1611
+ checkbox_border_color_focus_dark = "#3b82f6",
1612
+ # ── Slider ────────────────────────────────────────────────────
1613
+ slider_color = "#2563eb",
1614
+ slider_color_dark = "#2563eb",
1615
+ # ── Table ─────────────────────────────────────────────────────
1616
+ table_odd_background_fill = "#f8fafc",
1617
+ table_odd_background_fill_dark = "#f8fafc",
1618
+ table_even_background_fill = "#ffffff",
1619
+ table_even_background_fill_dark = "#ffffff",
1620
+ table_border_color = "#e2e8f0",
1621
+ table_border_color_dark = "#e2e8f0",
1622
+ # ── Links ─────────────────────────────────────────────────────
1623
+ link_text_color = "#2563eb",
1624
+ link_text_color_dark = "#2563eb",
1625
+ link_text_color_hover = "#1d4ed8",
1626
+ link_text_color_hover_dark = "#1d4ed8",
1627
+ link_text_color_visited = "#7c3aed",
1628
+ link_text_color_visited_dark = "#7c3aed",
1629
+ # ── Accent ────────────────────────────────────────────────────
1630
+ color_accent_soft = "#eff6ff",
1631
+ color_accent_soft_dark = "#eff6ff",
1632
+ ),
1633
+ ) as app:
1634
+
1635
+ gr.HTML(HEADER_HTML)
1636
+
1637
+ with gr.Tabs():
1638
+
1639
+ # ── TAB 1: Single Stage Prediction ────────────────────────────────
1640
+ with gr.Tab("🔮 Stage Prediction"):
1641
+ gr.HTML("""
1642
+ <div class="info-box">
1643
+ Fill in the fields below to predict menopausal stage for a single individual.
1644
+ All fields are optional — the pipeline handles missing values automatically.
1645
+ A timestamped output folder is created in
1646
+ <code>swan_ml_output/</code> for every run.
1647
+ </div>""")
1648
+
1649
+ with gr.Row():
1650
+ # ── Input column ──────────────────────────────────────────
1651
+ with gr.Column(scale=2):
1652
+
1653
+ with gr.Group():
1654
+ gr.HTML('<div class="section-label">Demographics</div>')
1655
+ with gr.Row():
1656
+ age = gr.Slider(
1657
+ minimum=35, maximum=75, value=48, step=1,
1658
+ label="Age (AGE7)",
1659
+ )
1660
+ race = gr.Dropdown(
1661
+ choices=[1, 2, 3, 4, 5], value=1,
1662
+ label="Race (RACE)",
1663
+ info="1=White, 2=Black, 3=Chinese, 4=Japanese, 5=Hispanic",
1664
+ )
1665
+ langint = gr.Dropdown(
1666
+ choices=[1, 2, 3], value=1,
1667
+ label="Interview Language (LANGINT7)",
1668
+ info="1=English, 2=Spanish, 3=Other",
1669
+ )
1670
+
1671
+ with gr.Group():
1672
+ gr.HTML('<div class="section-label">Vasomotor Symptoms</div>')
1673
+ with gr.Row():
1674
+ hot_flash = gr.Slider(
1675
+ minimum=1, maximum=5, value=1, step=1,
1676
+ label="Hot Flash Severity (HOTFLAS7)",
1677
+ info="1=None, 5=Very severe",
1678
+ )
1679
+ num_hot_flash = gr.Slider(
1680
+ minimum=0, maximum=15, value=0, step=1,
1681
+ label="# Hot Flashes/Week (NUMHOTF7)",
1682
+ )
1683
+ bothersome_hf = gr.Slider(
1684
+ minimum=1, maximum=4, value=1, step=1,
1685
+ label="How Bothersome (BOTHOTF7)",
1686
+ info="1=Not at all, 4=Extremely",
1687
+ )
1688
+
1689
+ with gr.Group():
1690
+ gr.HTML('<div class="section-label">Sleep &amp; Mood</div>')
1691
+ with gr.Row():
1692
+ sleep_quality = gr.Slider(
1693
+ minimum=1, maximum=5, value=2, step=1,
1694
+ label="Sleep Quality (SLEEPQL7)",
1695
+ info="1=Very good, 5=Very poor",
1696
+ )
1697
+ depression = gr.Slider(
1698
+ minimum=0, maximum=4, value=0, step=1,
1699
+ label="Depression Indicator (DEPRESS7)",
1700
+ info="0=No, higher=more severe",
1701
+ )
1702
+ with gr.Row():
1703
+ mood_change = gr.Slider(
1704
+ minimum=1, maximum=5, value=1, step=1,
1705
+ label="Mood Changes (MOODCHG7)",
1706
+ info="1=None, 5=Severe",
1707
+ )
1708
+ irritability = gr.Slider(
1709
+ minimum=1, maximum=5, value=1, step=1,
1710
+ label="Irritability (IRRITAB7)",
1711
+ )
1712
+
1713
+ with gr.Group():
1714
+ gr.HTML('<div class="section-label">Physical &amp; Gynaecological</div>')
1715
+ with gr.Row():
1716
+ pain = gr.Slider(
1717
+ minimum=0, maximum=5, value=0, step=1,
1718
+ label="Pain Indicator (PAIN17)",
1719
+ )
1720
+ abbleed = gr.Dropdown(
1721
+ choices=[0, 1, 2], value=0,
1722
+ label="Abnormal Bleeding (ABBLEED7)",
1723
+ info="0=No, 1=Yes, 2=Unsure",
1724
+ )
1725
+ with gr.Row():
1726
+ vaginal_dryness = gr.Slider(
1727
+ minimum=0, maximum=5, value=0, step=1,
1728
+ label="Vaginal Dryness (VAGINDR7)",
1729
+ )
1730
+ lmp_day = gr.Number(
1731
+ value=None,
1732
+ label="LMP Day (LMPDAY7)",
1733
+ info="Day of last menstrual period (optional)",
1734
+ )
1735
+
1736
+ model_choice = gr.Radio(
1737
+ choices=["RandomForest", "LogisticRegression"],
1738
+ value="RandomForest",
1739
+ label="Model",
1740
+ info="RandomForest: higher accuracy | "
1741
+ "LogisticRegression: more interpretable",
1742
+ )
1743
+ predict_btn = gr.Button(
1744
+ "🔮 Predict Stage", variant="primary", size="lg"
1745
+ )
1746
+
1747
+ # ── Output column ─────────────────────────────────────────
1748
+ with gr.Column(scale=3):
1749
+ result_html = gr.HTML(
1750
+ '<div class="placeholder-msg">Fill in the form and click Predict Stage</div>'
1751
+ )
1752
+ result_chart = gr.Plot(label="Stage Probabilities")
1753
+ confidence_note = gr.Textbox(
1754
+ label="Confidence Note", interactive=False, lines=2
1755
+ )
1756
+ compare_html = gr.HTML()
1757
+ stage_download = gr.File(
1758
+ label="Download Prediction CSV", interactive=False
1759
+ )
1760
+
1761
+ predict_btn.click(
1762
+ fn = predict_single_stage,
1763
+ inputs = [
1764
+ age, race, langint,
1765
+ hot_flash, num_hot_flash, bothersome_hf,
1766
+ sleep_quality, depression, mood_change, irritability,
1767
+ pain, abbleed, vaginal_dryness, lmp_day,
1768
+ model_choice,
1769
+ ],
1770
+ outputs = [
1771
+ result_html, result_chart, confidence_note,
1772
+ compare_html, stage_download,
1773
+ ],
1774
+ )
1775
+
1776
+ # ── TAB 2: Batch Stage Prediction ─────────────────────────────────
1777
+ with gr.Tab("📁 Batch Stage Prediction"):
1778
+ gr.HTML("""
1779
+ <div class="info-box">
1780
+ Upload a CSV file with individual feature values for batch prediction.
1781
+ Results, charts, and a summary report are saved to a timestamped folder
1782
+ inside <code>swan_ml_output/</code>.
1783
+ </div>""")
1784
+
1785
+ with gr.Row():
1786
+ with gr.Column(scale=1):
1787
+ batch_file = gr.File(
1788
+ label="Upload stage_input.csv",
1789
+ file_types=[".csv"],
1790
+ )
1791
+ batch_model = gr.Radio(
1792
+ choices=["RandomForest", "LogisticRegression"],
1793
+ value="RandomForest",
1794
+ label="Model",
1795
+ )
1796
+ gr.HTML("""
1797
+ <div class="format-hint">
1798
+ <div class="format-hint-title">Expected CSV Format</div>
1799
+ <pre>individual,AGE7,RACE,HOTFLAS7,...
1800
+ Person_001,48,1,2,...
1801
+ Person_002,52,2,1,...</pre>
1802
+ <div class="format-hint-note">
1803
+ See the test-csv/ folder for an approved example.
1804
+ </div>
1805
+ </div>""")
1806
+ batch_predict_btn = gr.Button(
1807
+ "🚀 Run Batch Prediction", variant="primary"
1808
+ )
1809
+
1810
+ with gr.Column(scale=2):
1811
+ batch_summary_html = gr.HTML(
1812
+ '<div class="placeholder-msg">Upload a CSV to begin</div>'
1813
+ )
1814
+ batch_download = gr.File(
1815
+ label="Download Predictions CSV", interactive=False
1816
+ )
1817
+ batch_results_df = gr.DataFrame(
1818
+ label="Results Preview (first 20 rows)",
1819
+ interactive=False,
1820
+ )
1821
+
1822
+ batch_predict_btn.click(
1823
+ fn = predict_batch_stage,
1824
+ inputs = [batch_file, batch_model],
1825
+ outputs = [batch_download, batch_summary_html, batch_results_df],
1826
+ )
1827
+
1828
+ # ── TAB 3: Symptom Forecast ───────────────────────────────────────
1829
+ with gr.Tab("🌊 Symptom Forecast"):
1830
+ gr.HTML("""
1831
+ <div class="info-box">
1832
+ Predict hot flash and mood change probability based on cycle day
1833
+ (calculated from Last Menstrual Period date).
1834
+ All outputs are saved to a timestamped folder inside
1835
+ <code>swan_ml_output/</code>.
1836
+ </div>""")
1837
+
1838
+ with gr.Row():
1839
+ with gr.Column(scale=1):
1840
+ sym_individual = gr.Textbox(
1841
+ label="Individual ID (optional)",
1842
+ placeholder="e.g., Patient_001",
1843
+ )
1844
+ sym_lmp = gr.Textbox(
1845
+ label="Last Menstrual Period (LMP)",
1846
+ placeholder="2026-01-15 or 15 (day of month)",
1847
+ info="Full date (YYYY-MM-DD) or day-of-month integer",
1848
+ )
1849
+ sym_date = gr.Textbox(
1850
+ label="Target Date (optional)",
1851
+ placeholder="2026-02-27 (defaults to today)",
1852
+ info="Date to forecast for (YYYY-MM-DD)",
1853
+ )
1854
+ sym_cycle = gr.Slider(
1855
+ minimum=21, maximum=40, value=28, step=1,
1856
+ label="Cycle Length (days)",
1857
+ )
1858
+ sym_predict_btn = gr.Button(
1859
+ "🌊 Forecast Symptoms", variant="primary"
1860
+ )
1861
+
1862
+ with gr.Column(scale=2):
1863
+ sym_result_html = gr.HTML(
1864
+ '<div class="placeholder-msg">Enter LMP date and click Forecast</div>'
1865
+ )
1866
+ sym_chart = gr.Plot(label="Cycle Position")
1867
+ sym_download = gr.File(
1868
+ label="Download Forecast CSV", interactive=False
1869
+ )
1870
+
1871
+ sym_predict_btn.click(
1872
+ fn = predict_symptoms,
1873
+ inputs = [sym_individual, sym_lmp, sym_date, sym_cycle],
1874
+ outputs = [sym_result_html, sym_chart, sym_download],
1875
+ )
1876
+
1877
+ gr.HTML('<hr class="section-divider">')
1878
+ gr.HTML('<div class="batch-section-label">📁 Batch Symptom Forecasting</div>')
1879
+
1880
+ with gr.Row():
1881
+ with gr.Column(scale=1):
1882
+ sym_batch_file = gr.File(
1883
+ label="Upload symptoms_input.csv",
1884
+ file_types=[".csv"],
1885
+ )
1886
+ sym_lmp_col = gr.Textbox(
1887
+ label="LMP Column Name", value="LMP"
1888
+ )
1889
+ sym_date_col = gr.Textbox(
1890
+ label="Date Column Name (optional)", value="date"
1891
+ )
1892
+ sym_cycle_batch = gr.Slider(
1893
+ minimum=21, maximum=40, value=28, step=1,
1894
+ label="Default Cycle Length",
1895
+ )
1896
+ sym_batch_btn = gr.Button(
1897
+ "🌊 Run Batch Forecast", variant="primary"
1898
+ )
1899
+
1900
+ with gr.Column(scale=2):
1901
+ sym_batch_summary = gr.HTML(
1902
+ '<div class="placeholder-msg">Upload a CSV to begin</div>'
1903
+ )
1904
+ sym_batch_download = gr.File(
1905
+ label="Download Symptom Forecast CSV", interactive=False
1906
+ )
1907
+ sym_batch_df = gr.DataFrame(
1908
+ label="Results Preview",
1909
+ interactive=False,
1910
+ )
1911
+
1912
+ sym_batch_btn.click(
1913
+ fn = predict_symptoms_batch,
1914
+ inputs = [
1915
+ sym_batch_file, sym_lmp_col,
1916
+ sym_date_col, sym_cycle_batch,
1917
+ ],
1918
+ outputs = [sym_batch_download, sym_batch_summary, sym_batch_df],
1919
+ )
1920
+
1921
+ # ── TAB 4: Education ──────────────────────────────────────────────
1922
+ with gr.Tab("📚 Menopause Education"):
1923
+ gr.HTML(EDUCATION_HTML)
1924
+
1925
+ # ── TAB 5: Feature Reference ──────────────────────────────────────
1926
+ with gr.Tab("🔬 Feature Reference"):
1927
+ gr.HTML("""
1928
+ <div class="info-box">
1929
+ Canonical list of features used by the trained models
1930
+ (from <code>forecast_metadata.json</code>).
1931
+ For batch CSV uploads, column names must match these feature names.
1932
+ </div>""")
1933
+ gr.HTML(get_feature_reference())
1934
+
1935
+ # ── TAB 6: Model Status ───────────────────────────────────────────
1936
+ with gr.Tab("⚙️ Model Status"):
1937
+ gr.HTML(get_model_status())
1938
+ gr.HTML("""
1939
+ <div class="setup-card">
1940
+ <div class="setup-title">🚀 Setup Instructions</div>
1941
+ <div class="setup-step">
1942
+ <p><strong>Step 1 — Train models:</strong></p>
1943
+ <pre class="code-block">python menopause.py</pre>
1944
+ <p><strong>Step 2 — Verify artifacts:</strong></p>
1945
+ <pre class="code-block">ls swan_ml_output/
1946
+ # rf_pipeline.pkl lr_pipeline.pkl forecast_metadata.json</pre>
1947
+ <p><strong>Step 3 — Run this app:</strong></p>
1948
+ <pre class="code-block">python app.py</pre>
1949
+ <p><strong>Step 4 — Deploy on Hugging Face Spaces:</strong></p>
1950
+ <pre class="code-block">git lfs install
1951
+ git lfs track "*.pkl"
1952
+ git add .
1953
+ git commit -m "SWAN menopause prediction app"
1954
+ git push</pre>
1955
+ <p><strong>Output folder structure (per run):</strong></p>
1956
+ <pre class="code-block">swan_ml_output/
1957
+ &lt;YYYYMMDD_HHMMSS&gt;/
1958
+ charts/ &larr; PNG visualizations
1959
+ predictions/ &larr; CSV result files
1960
+ reports/ &larr; TXT summary reports</pre>
1961
+ </div>
1962
+ </div>
1963
+ """)
1964
+
1965
+ gr.HTML("""
1966
+ <div class="app-footer">
1967
+ SWAN Menopause Prediction App · Built with Gradio ·
1968
+ For research &amp; educational use only · Not for clinical diagnosis ·
1969
+ <a href="https://www.swanstudy.org/" target="_blank">SWAN Study</a>
1970
+ </div>""")
1971
+
1972
+ return app
1973
+
1974
+
1975
+ # ── Entry point ───────────────────────────────────────────────────────────────
1976
+ if __name__ == "__main__":
1977
+ demo = build_app()
1978
+ demo.launch(
1979
+ server_name = "0.0.0.0",
1980
+ server_port = int(os.environ.get("PORT", 7860)),
1981
+ share = False,
1982
+ show_error = True,
1983
+ )
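
The theme block in `build_app()` writes every colour twice, once for the light key and once for its `_dark` twin. A minimal sketch of generating the `_dark` entries from a single mapping follows. `LIGHT_PALETTE`, `mirror_dark`, and `THEME_OVERRIDES` are hypothetical names that do not appear in app.py, and only a handful of the keys from the diff are shown.

```python
# Hypothetical helper, not part of app.py: derive the `_dark` theme overrides
# from one light palette so each colour is written only once.

# A subset of the keys used in build_app(); values copied from the diff.
LIGHT_PALETTE = {
    "body_background_fill": "#f0f4f8",
    "body_text_color": "#111827",
    "block_background_fill": "#ffffff",
    "input_border_color": "#d1d5db",
    "button_primary_background_fill": "#2563eb",
}


def mirror_dark(palette: dict[str, str]) -> dict[str, str]:
    """Return the palette plus a `<key>_dark` copy of every entry."""
    mirrored = dict(palette)
    for key, value in palette.items():
        mirrored[f"{key}_dark"] = value
    return mirrored


THEME_OVERRIDES = mirror_dark(LIGHT_PALETTE)

# Assumed usage inside build_app(), where `gr` is the gradio module:
#   theme = gr.themes.Soft(primary_hue="blue", neutral_hue="slate").set(**THEME_OVERRIDES)
```

This only works for keys that have a `_dark` counterpart in the theme, which holds for every key the diff sets. Keeping the explicit list, as the commit does, is still a reasonable choice when the two modes might diverge later.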