emp-admin committed on
Commit 5f98f88 · verified · 1 Parent(s): 3ea4062

Upload 9 files

Files changed (9)
  1. Dockerfile +6 -13
  2. README.md +81 -5
  3. advice_model.pkl +2 -2
  4. app.py +225 -182
  5. generate_data.py +197 -0
  6. metadata.json +24 -0
  7. requirements.txt +6 -7
  8. risk_model.pkl +2 -2
  9. train.py +145 -0
Dockerfile CHANGED
@@ -1,14 +1,7 @@
- FROM python:3.9
-
- WORKDIR /code
-
- COPY ./requirements.txt /code/requirements.txt
-
- RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
-
+ FROM python:3.11-slim
+ WORKDIR /app
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
  COPY . .
-
- # Grant permissions to models
- RUN chmod 777 risk_model.pkl advice_model.pkl
-
- CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
+ EXPOSE 7860
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,86 @@
  ---
- title: Bioweather
- emoji: 📊
- colorFrom: gray
- colorTo: gray
+ title: Phoebe Bioweather API v2
+ emoji: 🌀️
+ colorFrom: blue
+ colorTo: green
  sdk: docker
  pinned: false
+ license: mit
+ app_port: 7860
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🌀️ Phoebe Bioweather API v2.0
+
+ **Weather-driven headache risk scoring** for the [Phoebe](https://empedoclabs.com) iOS app by **EmpedocLabs**.
+
+ ## What It Does
+
+ Takes 7 weather parameters → returns a 0-100 risk score, one of 15 biometeo conditions, and personalized actionable advice with 3 severity tiers.
+
+ ## Endpoints
+
+ | Method | Path | Description |
+ |---|---|---|
+ | `GET` | `/` | Status |
+ | `GET` | `/health` | Model health |
+ | `POST` | `/predict` | Risk score + condition + actions |
+
+ ## Request
+
+ ```json
+ {
+   "temp_c": 28.5,
+   "pressure_hpa": 1005.3,
+   "humidity": 88,
+   "wind_kph": 12,
+   "uv_index": 7,
+   "pressure_drop": -7.2,
+   "temp_change": 3.5
+ }
+ ```
+
+ ## Response
+
+ ```json
+ {
+   "risk_score": 72,
+   "risk_level": "High",
+   "condition": {
+     "id": 1,
+     "title": "Rapid Pressure Drop",
+     "emoji": "📉",
+     "text": "A sharp pressure drop is one of today's main headache drivers...",
+     "actions": [
+       "Reduce stimulation for the next few hours...",
+       "Lower sensory load: dim lights, shorter screen blocks...",
+       "Keep hydration steady and avoid skipped meals."
+     ]
+   }
+ }
+ ```
+
+ ## 15 Biometeo Conditions
+
+ | ID | Condition | Primary Trigger |
+ |---|---|---|
+ | 0 | Clear Skies | No weather trigger |
+ | 1 | Rapid Pressure Drop | Barometric drop > 5 hPa |
+ | 2 | Pressure Squeeze | Barometric rise > 5 hPa |
+ | 3 | Sauna Effect | Heat + humidity |
+ | 4 | High Wind | Wind > 35 km/h |
+ | 5 | High UV Glare | UV index ≥ 7 |
+ | 6 | Bitter Cold | Temperature < 0°C |
+ | 7 | Drastic Temp Drop | 24h temp change < -7°C |
+ | 8 | Heat Shock | 24h temp change > +7°C |
+ | 9 | Heavy Dampness | Humidity > 88% + calm |
+ | 10 | Mild Barometric Dip | Pressure drop 2-5 hPa |
+ | 11 | Mild Pressure Squeeze | Pressure rise 2-5 hPa |
+ | 12 | Breezy Pollen Risk | Moderate wind + warm |
+ | 13 | Dry Air Warning | Humidity < 30% |
+ | 14 | Stagnant & Gloomy | Low UV + high humidity + calm |
+
+ ## Model Details
+
+ - Risk regressor: HistGradientBoosting, MAE=2.52, R²=0.977
+ - Advice classifier: HistGradientBoosting, Accuracy=98.6%, F1=0.971
+ - Rule-based coherence layer ensures physically-impossible outputs never reach the user
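Note (illustrative, not part of the commit): a minimal sketch of how a client might prepare a call to the `POST /predict` endpoint documented in the README. The base URL `http://localhost:7860` and the helper name `build_predict_request` are assumptions made for the example; only the path, the port, and the seven field names come from the diff.

```python
import json
import urllib.request

# Field names taken from the README request example in this commit.
REQUIRED_FIELDS = (
    "temp_c", "pressure_hpa", "humidity", "wind_kph",
    "uv_index", "pressure_drop", "temp_change",
)


def build_predict_request(payload: dict,
                          base_url: str = "http://localhost:7860") -> urllib.request.Request:
    """Build (but do not send) a POST /predict request.

    base_url is a hypothetical local deployment, not a published endpoint.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "temp_c": 28.5, "pressure_hpa": 1005.3, "humidity": 88,
    "wind_kph": 12, "uv_index": 7, "pressure_drop": -7.2, "temp_change": 3.5,
}
request = build_predict_request(payload)
# To send it against a running container:
#   with urllib.request.urlopen(request) as resp:
#       print(json.load(resp))
```

Building the request separately from sending it keeps the payload check testable without a live server.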
advice_model.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:263d9593bab57d91a586d9adadad063eb77f1dd017684154ed5568d3dc77c783
- size 3006513
+ oid sha256:bc520fad21be7a84e2f10c835fb57bb7e1afc2fcebfb5246c25c528489cf3a5c
+ size 5642415
app.py CHANGED
@@ -1,20 +1,87 @@
1
- from fastapi import FastAPI
2
- from pydantic import BaseModel
 
 
 
 
 
 
 
 
 
 
 
 
 
 
3
  import pickle
 
4
  import pandas as pd
 
5
 
6
- # Load Models
7
- with open("risk_model.pkl", "rb") as f:
8
- risk_model = pickle.load(f)
9
 
10
- with open("advice_model.pkl", "rb") as f:
11
- advice_model = pickle.load(f)
12
 
13
- app = FastAPI()
14
 
15
- # Deterministic, risk-aware advice library.
16
- # The classifier predicts the condition ID.
17
- # The risk model determines which text severity to use.
18
  ADVICE_LIBRARY = {
19
  0: {
20
  "title": "Clear Skies, Clear Head",
@@ -36,7 +103,7 @@ ADVICE_LIBRARY = {
36
  "texts": {
37
  "Low": "Pressure is dipping, but the signal is still mild. Very sensitive users may notice slight heaviness behind the eyes or a drop in energy.",
38
  "Moderate": "Pressure is falling fast enough to lower your migraine threshold. This is a day to reduce other triggers and keep hydration steady.",
39
- "High": "A sharp pressure drop is one of today’s main headache drivers. Lower sensory load, keep rescue medication accessible if prescribed, and avoid overexertion."
40
  },
41
  "actions": [
42
  "Lower sensory load: dim lights, shorter screen blocks, less noise.",
@@ -190,7 +257,7 @@ ADVICE_LIBRARY = {
190
  "texts": {
191
  "Low": "Air movement may be stirring light environmental irritation, especially if you already have mild allergy sensitivity.",
192
  "Moderate": "Breezy conditions can carry pollen and dust that push sinus and histamine-related headaches. Keep indoor air cleaner and limit exposure if needed.",
193
- "High": "Wind-driven allergen exposure is likely one of today’s main triggers. Protect your airways, keep windows controlled, and manage the histamine load early."
194
  },
195
  "actions": [
196
  "Keep windows closed if pollen is a usual issue.",
@@ -225,225 +292,201 @@ ADVICE_LIBRARY = {
225
  "Improve indoor lighting if screens feel heavy on the eyes.",
226
  "Watch posture and avoid collapsing into the desk."
227
  ]
228
- }
229
  }
230
 
231
 
 
 
 
 
232
  class WeatherInput(BaseModel):
233
- temp_c: float
234
- pressure_hpa: float
235
- humidity: float
236
- wind_kph: float
237
- uv_index: int
238
- pressure_drop: float
239
- temp_change: float
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
240
 
 
 
 
241
 
242
- def clamp_risk_score(value) -> int:
243
  try:
244
  return int(max(0, min(100, round(float(value)))))
245
  except Exception:
246
  return 0
247
 
248
 
249
- def get_risk_level(risk_score: int) -> str:
250
- if risk_score > 55:
251
- return "High"
252
- if risk_score > 30:
253
- return "Moderate"
254
  return "Low"
255
 
256
 
257
- def infer_rule_based_condition(row: dict) -> tuple[int, int]:
258
- """
259
- Rule-based coherence layer.
260
- This does NOT replace the ML classifier.
261
- It only corrects obviously weak/misaligned condition picks.
262
- Returns: (condition_id, strength)
263
- """
264
- temp_c = float(row["temp_c"])
265
- humidity = float(row["humidity"])
266
- wind_kph = float(row["wind_kph"])
267
- uv_index = int(row["uv_index"])
268
- pressure_delta = float(row["pressure_drop"])
269
- temp_change = float(row["temp_change"])
270
-
271
- candidates = []
272
-
273
- def add(condition_id: int, strength: int):
274
- candidates.append((condition_id, strength))
275
-
276
- # Pressure movement
277
- if pressure_delta <= -8:
278
- add(1, 95) # Rapid drop
279
- elif pressure_delta <= -4:
280
- add(10, 72) # Mild dip
281
-
282
- if pressure_delta >= 8:
283
- add(2, 95) # Rapid rise
284
- elif pressure_delta >= 4:
285
- add(11, 72) # Mild rise
286
-
287
- # Heat / humidity
288
- if temp_c >= 29 and humidity >= 70:
289
- add(3, 92)
290
-
291
- # Wind / pollen
292
- if wind_kph >= 40:
293
- add(4, 90)
294
- elif wind_kph >= 20:
295
- add(12, 66)
296
-
297
- # UV
298
- if uv_index >= 8:
299
- add(5, 88)
300
-
301
- # Cold
302
- if temp_c <= 0:
303
- add(6, 84)
304
-
305
- # Temp shock
306
- if temp_change <= -7:
307
- add(7, 89)
308
- elif temp_change >= 7:
309
- add(8, 89)
310
-
311
- # Dampness / stagnation
312
- if humidity >= 92 and wind_kph <= 12:
313
- add(9, 76)
314
-
315
- # Dryness
316
- if humidity <= 30:
317
- add(13, 78)
318
-
319
- # Gloomy / stagnant
320
- if uv_index <= 2 and humidity >= 75 and wind_kph <= 10:
321
- add(14, 64)
322
-
323
- if not candidates:
324
  return 0, 0
 
325
 
326
- return max(candidates, key=lambda x: x[1])
327
 
 
 
 
328
 
329
- def select_condition_id(model_condition_id: int, row: dict, risk_score: int) -> int:
330
- """
331
- Use ML first, then correct obvious nonsense deterministically.
332
- Example: model says 'clear skies' while risk is high and pressure is crashing.
333
- """
334
- rule_condition_id, rule_strength = infer_rule_based_condition(row)
 
335
 
336
- if model_condition_id not in ADVICE_LIBRARY:
337
- return rule_condition_id if rule_strength > 0 else 0
338
 
339
- # If the rule signal is very strong, trust the weather pattern.
340
- if rule_strength >= 90 and model_condition_id != rule_condition_id:
341
- return rule_condition_id
342
 
343
- # If model says "all clear" but risk is elevated and rules see a meaningful trigger, override.
344
- if model_condition_id == 0 and risk_score >= 45 and rule_strength >= 65:
345
- return rule_condition_id
346
-
347
- return model_condition_id
348
-
349
-
350
- def dedupe_keep_order(items: list[str]) -> list[str]:
351
- seen = set()
352
- result = []
353
- for item in items:
354
- if item and item not in seen:
355
- seen.add(item)
356
- result.append(item)
357
- return result
358
-
359
-
360
- def build_actions(condition_id: int, risk_score: int, row: dict) -> list[str]:
361
- risk_level = get_risk_level(risk_score)
362
- actions = []
363
-
364
- # Risk-level actions first
365
- if risk_level == "High":
366
- actions.extend([
367
  "Reduce stimulation for the next few hours: dim lights, lower audio, and shorten screen sessions.",
368
  "Keep hydration, food intake, and routine stable today.",
369
- "If you have a clinician-approved rescue plan, keep it accessible."
370
  ])
371
- elif risk_level == "Moderate":
372
- actions.extend([
373
  "Protect the basics early: hydration, meals, and shorter screen blocks.",
374
- "Avoid stacking other triggers like dehydration, long fasting, or poor posture."
375
  ])
376
  else:
377
- actions.extend([
378
- "No need to overreact, but stay consistent with hydration and meals.",
379
- ])
380
 
381
- # Condition-specific actions
382
- actions.extend(ADVICE_LIBRARY[condition_id]["actions"])
383
 
384
- # Feature-derived actions
385
  if row["uv_index"] >= 7:
386
- actions.append("Use sunglasses outdoors and reduce glare indoors.")
387
  if row["humidity"] >= 70 and row["temp_c"] >= 27:
388
- actions.append("Prioritize electrolytes and cooler environments.")
389
  if row["humidity"] <= 30:
390
- actions.append("Support dry sinuses with humidity or saline if needed.")
391
  if row["wind_kph"] >= 25:
392
- actions.append("Protect your ears and neck when outside.")
393
  if abs(row["temp_change"]) >= 7:
394
- actions.append("Avoid abrupt indoor/outdoor temperature swings; transition gradually.")
395
  if abs(row["pressure_drop"]) >= 4:
396
- actions.append("Keep the rest of the day trigger-light: no skipped meals, no dehydration, no unnecessary strain.")
 
 
 
 
 
 
 
 
 
397
 
398
- actions = dedupe_keep_order(actions)
399
- return actions[:5]
400
 
 
 
 
401
 
402
  @app.get("/")
403
  def home():
404
- return {"status": "Biometeorology AI is Active"}
 
 
 
 
 
405
 
406
 
407
- @app.post("/predict")
408
- def predict(input_data: WeatherInput):
409
- data_dict = input_data.model_dump() if hasattr(input_data, "model_dump") else input_data.dict()
410
- df = pd.DataFrame([data_dict])
 
 
 
 
411
 
412
- if hasattr(risk_model, "feature_names_in_"):
413
- expected_cols = list(risk_model.feature_names_in_)
414
- missing = set(expected_cols) - set(df.columns)
415
- if missing:
416
- return {"error": f"Missing features required by model: {sorted(missing)}"}
417
- df = df[expected_cols]
418
 
419
- row = df.iloc[0].to_dict()
 
420
 
421
- # 1. Predict risk
422
  risk_pred = risk_model.predict(df)[0]
423
- risk_score = clamp_risk_score(risk_pred)
424
  risk_level = get_risk_level(risk_score)
425
 
426
- # 2. Predict condition from ML
427
- model_condition_id = int(advice_model.predict(df)[0])
428
 
429
- # 3. Deterministic coherence layer
430
- final_condition_id = select_condition_id(model_condition_id, row, risk_score)
431
- content = ADVICE_LIBRARY.get(final_condition_id, ADVICE_LIBRARY[0])
432
 
433
- # 4. Deterministic text selection by risk level
434
  text = content["texts"][risk_level]
435
 
436
- # 5. Deterministic action list
437
- actions = build_actions(final_condition_id, risk_score, row)
438
-
439
- return {
440
- "risk_score": risk_score,
441
- "risk_level": risk_level,
442
- "condition": {
443
- "id": final_condition_id,
444
- "title": content["title"],
445
- "emoji": content["emoji"],
446
- "text": text,
447
- "actions": actions
448
- }
449
- }
 
 
 
+ """
+ ═══════════════════════════════════════════════════════════════════════
+ Phoebe Bioweather API v2.0
+ EmpedocLabs © 2025
+
+ Weather-driven headache risk scoring + actionable clinical advice.
+ Designed for the Phoebe iOS app.
+
+ GET / → Status
+ GET /health → Model status
+ POST /predict → Risk score + condition + personalized actions
+ ═══════════════════════════════════════════════════════════════════════
+ """
+
+ import logging
+ import os
  import pickle
+ import numpy as np
  import pandas as pd
+ from typing import List
 
+ from fastapi import FastAPI, HTTPException
+ from fastapi.middleware.cors import CORSMiddleware
+ from pydantic import BaseModel, Field
 
+ # ── Logging ──────────────────────────────────────────────────────────
 
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s | %(levelname)s | %(message)s")
+ logger = logging.getLogger("bioweather")
+
+ # ── App ──────────────────────────────────────────────────────────────
+
+ app = FastAPI(
+     title="Phoebe Bioweather API",
+     version="2.0.0",
+     description="Weather-driven headache risk scoring for the Phoebe iOS app by EmpedocLabs.",
+ )
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ # ── Models ───────────────────────────────────────────────────────────
+
+ risk_model = None
+ advice_model = None
+
+ FEATURE_COLS = ["temp_c", "pressure_hpa", "humidity", "wind_kph",
+                 "uv_index", "pressure_drop", "temp_change"]
+
+
+ @app.on_event("startup")
+ async def load_models():
+     global risk_model, advice_model
+
+     for name, filename in [("risk", "risk_model.pkl"), ("advice", "advice_model.pkl")]:
+         path = filename
+         if not os.path.exists(path):
+             path = os.path.join("model", filename)
+         if not os.path.exists(path):
+             path = os.path.join(os.path.dirname(__file__), filename)
+
+         try:
+             with open(path, "rb") as f:
+                 if name == "risk":
+                     risk_model = pickle.load(f)
+                 else:
+                     advice_model = pickle.load(f)
+             logger.info(f"✅ {name}_model loaded from {path}")
+         except Exception as e:
+             logger.error(f"❌ Failed to load {name}_model: {e}")
+
+     if risk_model and advice_model:
+         logger.info("✅ Bioweather v2.0 ready")
+
+
+ # ═══════════════════════════════════════════════════════════════════════
+ # ADVICE LIBRARY — 15 biometeo conditions with 3 severity tiers each
+ # ═══════════════════════════════════════════════════════════════════════
 
85
  ADVICE_LIBRARY = {
86
  0: {
87
  "title": "Clear Skies, Clear Head",
 
103
  "texts": {
104
  "Low": "Pressure is dipping, but the signal is still mild. Very sensitive users may notice slight heaviness behind the eyes or a drop in energy.",
105
  "Moderate": "Pressure is falling fast enough to lower your migraine threshold. This is a day to reduce other triggers and keep hydration steady.",
106
+ "High": "A sharp pressure drop is one of today's main headache drivers. Lower sensory load, keep rescue medication accessible if prescribed, and avoid overexertion."
107
  },
108
  "actions": [
109
  "Lower sensory load: dim lights, shorter screen blocks, less noise.",
 
257
  "texts": {
258
  "Low": "Air movement may be stirring light environmental irritation, especially if you already have mild allergy sensitivity.",
259
  "Moderate": "Breezy conditions can carry pollen and dust that push sinus and histamine-related headaches. Keep indoor air cleaner and limit exposure if needed.",
260
+ "High": "Wind-driven allergen exposure is likely one of today's main triggers. Protect your airways, keep windows controlled, and manage the histamine load early."
261
  },
262
  "actions": [
263
  "Keep windows closed if pollen is a usual issue.",
 
292
  "Improve indoor lighting if screens feel heavy on the eyes.",
293
  "Watch posture and avoid collapsing into the desk."
294
  ]
295
+ },
296
  }
297
 
298
 
+ # ═══════════════════════════════════════════════════════════════════════
+ # REQUEST / RESPONSE
+ # ═══════════════════════════════════════════════════════════════════════
+
  class WeatherInput(BaseModel):
+     temp_c: float = Field(..., description="Temperature in Celsius")
+     pressure_hpa: float = Field(..., description="Barometric pressure in hPa/mbar")
+     humidity: float = Field(..., description="Relative humidity %")
+     wind_kph: float = Field(..., description="Wind speed km/h")
+     uv_index: int = Field(..., ge=0, le=11, description="UV index 0-11")
+     pressure_drop: float = Field(..., description="24h pressure change in hPa (negative = drop)")
+     temp_change: float = Field(..., description="24h temperature change in °C")
+
+
+ class ConditionResponse(BaseModel):
+     id: int
+     title: str
+     emoji: str
+     text: str
+     actions: List[str]
+
+
+ class PredictResponse(BaseModel):
+     risk_score: int
+     risk_level: str
+     condition: ConditionResponse
+
 
+ # ═══════════════════════════════════════════════════════════════════════
+ # LOGIC
+ # ═══════════════════════════════════════════════════════════════════════
 
+ def clamp_risk(value) -> int:
      try:
          return int(max(0, min(100, round(float(value)))))
      except Exception:
          return 0
 
 
+ def get_risk_level(score: int) -> str:
+     if score > 55: return "High"
+     if score > 30: return "Moderate"
      return "Low"
 
 
+ def infer_rule_condition(row: dict) -> tuple:
+     """Rule-based coherence — corrects ML when physics is obvious."""
+     temp = float(row["temp_c"])
+     hum = float(row["humidity"])
+     wind = float(row["wind_kph"])
+     uv = int(row["uv_index"])
+     pd_ = float(row["pressure_drop"])
+     tc = float(row["temp_change"])
+
+     cands = []
+
+     if pd_ <= -8: cands.append((1, 95))
+     elif pd_ <= -4: cands.append((10, 72))
+     if pd_ >= 8: cands.append((2, 95))
+     elif pd_ >= 4: cands.append((11, 72))
+     if temp >= 29 and hum >= 70: cands.append((3, 92))
+     if wind >= 40: cands.append((4, 90))
+     elif wind >= 20: cands.append((12, 66))
+     if uv >= 8: cands.append((5, 88))
+     if temp <= 0: cands.append((6, 84))
+     if tc <= -7: cands.append((7, 89))
+     elif tc >= 7: cands.append((8, 89))
+     if hum >= 92 and wind <= 12: cands.append((9, 76))
+     if hum <= 30: cands.append((13, 78))
+     if uv <= 2 and hum >= 75 and wind <= 10: cands.append((14, 64))
+
+     if not cands:
          return 0, 0
+     return max(cands, key=lambda x: x[1])
 
 
+ def select_condition(ml_id: int, row: dict, risk: int) -> int:
+     """ML first, rules correct obvious mismatches."""
+     rule_id, strength = infer_rule_condition(row)
 
+     if ml_id not in ADVICE_LIBRARY:
+         return rule_id if strength > 0 else 0
+     if strength >= 90 and ml_id != rule_id:
+         return rule_id
+     if ml_id == 0 and risk >= 45 and strength >= 65:
+         return rule_id
+     return ml_id
 
 
+ def build_actions(cond_id: int, risk: int, row: dict) -> List[str]:
+     level = get_risk_level(risk)
+     acts = []
 
+     if level == "High":
+         acts.extend([
              "Reduce stimulation for the next few hours: dim lights, lower audio, and shorten screen sessions.",
              "Keep hydration, food intake, and routine stable today.",
+             "If you have a clinician-approved rescue plan, keep it accessible.",
          ])
+     elif level == "Moderate":
+         acts.extend([
              "Protect the basics early: hydration, meals, and shorter screen blocks.",
+             "Avoid stacking other triggers like dehydration, long fasting, or poor posture.",
          ])
      else:
+         acts.append("No need to overreact, but stay consistent with hydration and meals.")
 
+     acts.extend(ADVICE_LIBRARY[cond_id]["actions"])
 
      if row["uv_index"] >= 7:
+         acts.append("Use sunglasses outdoors and reduce glare indoors.")
      if row["humidity"] >= 70 and row["temp_c"] >= 27:
+         acts.append("Prioritize electrolytes and cooler environments.")
      if row["humidity"] <= 30:
+         acts.append("Support dry sinuses with humidity or saline if needed.")
      if row["wind_kph"] >= 25:
+         acts.append("Protect your ears and neck when outside.")
      if abs(row["temp_change"]) >= 7:
+         acts.append("Avoid abrupt indoor/outdoor temperature swings; transition gradually.")
      if abs(row["pressure_drop"]) >= 4:
+         acts.append("Keep the rest of the day trigger-light: no skipped meals, no dehydration, no unnecessary strain.")
+
+     # Dedupe keeping order
+     seen = set()
+     unique = []
+     for a in acts:
+         if a not in seen:
+             seen.add(a)
+             unique.append(a)
+     return unique[:6]
 
 
+ # ═══════════════════════════════════════════════════════════════════════
+ # ENDPOINTS
+ # ═══════════════════════════════════════════════════════════════════════
 
  @app.get("/")
  def home():
+     return {
+         "service": "Phoebe Bioweather API",
+         "version": "2.0.0",
+         "by": "EmpedocLabs",
+         "status": "running" if risk_model and advice_model else "models_not_loaded",
+     }
 
 
+ @app.get("/health")
+ def health():
+     return {
+         "status": "healthy" if risk_model and advice_model else "degraded",
+         "risk_model_loaded": risk_model is not None,
+         "advice_model_loaded": advice_model is not None,
+     }
+
 
+ @app.post("/predict", response_model=PredictResponse)
+ def predict(input_data: WeatherInput):
+     if not risk_model or not advice_model:
+         raise HTTPException(503, "Models not loaded")
 
+     row = input_data.model_dump()
+     df = pd.DataFrame([row])[FEATURE_COLS]
 
+     # 1. Risk score
      risk_pred = risk_model.predict(df)[0]
+     risk_score = clamp_risk(risk_pred)
      risk_level = get_risk_level(risk_score)
 
+     # 2. Condition from ML
+     ml_condition = int(advice_model.predict(df)[0])
 
+     # 3. Deterministic coherence
+     condition_id = select_condition(ml_condition, row, risk_score)
+     content = ADVICE_LIBRARY.get(condition_id, ADVICE_LIBRARY[0])
 
+     # 4. Text by risk level
      text = content["texts"][risk_level]
 
+     # 5. Actions
+     actions = build_actions(condition_id, risk_score, row)
+
+     logger.info(f"Predict: risk={risk_score} ({risk_level}), cond={condition_id} ({content['title']})")
+
+     return PredictResponse(
+         risk_score=risk_score,
+         risk_level=risk_level,
+         condition=ConditionResponse(
+             id=condition_id,
+             title=content["title"],
+             emoji=content["emoji"],
+             text=text,
+             actions=actions,
+         ),
+     )
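Note (illustrative, not part of the commit): a condensed sketch of how the coherence layer in this diff resolves a disagreement between the classifier and the rules. Only the pressure-drop and wind rules are reproduced, and `KNOWN_IDS` is a stand-in for the keys of `ADVICE_LIBRARY`, so treat it as an illustration of the override order rather than the full function.

```python
KNOWN_IDS = set(range(15))  # assumption: stands in for ADVICE_LIBRARY's keys


def infer_rule_condition(row: dict) -> tuple:
    """Subset of the diff's rules; returns (condition_id, strength)."""
    cands = []
    pressure_delta = float(row["pressure_drop"])
    wind = float(row["wind_kph"])
    if pressure_delta <= -8:
        cands.append((1, 95))    # rapid pressure drop
    elif pressure_delta <= -4:
        cands.append((10, 72))   # mild barometric dip
    if wind >= 40:
        cands.append((4, 90))    # high wind
    elif wind >= 20:
        cands.append((12, 66))   # breezy pollen risk
    if not cands:
        return 0, 0
    return max(cands, key=lambda c: c[1])


def select_condition(ml_id: int, row: dict, risk: int) -> int:
    """Classifier label first; rules override only in the three cases below."""
    rule_id, strength = infer_rule_condition(row)
    if ml_id not in KNOWN_IDS:                        # unknown label from the model
        return rule_id if strength > 0 else 0
    if strength >= 90 and ml_id != rule_id:           # very strong weather signal
        return rule_id
    if ml_id == 0 and risk >= 45 and strength >= 65:  # "clear" label but elevated risk
        return rule_id
    return ml_id


front = {"pressure_drop": -9.0, "wind_kph": 10.0}
breeze = {"pressure_drop": 0.0, "wind_kph": 22.0}

strong_override = select_condition(3, front, 60)   # rule strength 95 wins over label 3
clear_high = select_condition(0, breeze, 50)       # label 0 replaced because risk >= 45
clear_low = select_condition(0, breeze, 20)        # label 0 kept because risk is low
unknown = select_condition(99, breeze, 20)         # unknown label falls back to the rule
```

The ordering matters: a rule with strength below 90 never displaces a non-zero classifier label, which keeps the classifier in charge for ambiguous weather.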
generate_data.py ADDED
@@ -0,0 +1,197 @@
+ """
+ Bioweather Production Data Generator v2.0
+ EmpedocLabs © 2025
+
+ Generates clinically-plausible weather → headache risk data with:
+ - 15 distinct biometeo conditions
+ - Seasonal/geographic variation
+ - Multi-trigger overlap scoring
+ - Graded risk (not just if/else buckets)
+ - 20,000+ samples for robust training
+ """
+
+ import numpy as np
+ import pandas as pd
+
+
+ def generate_production_data(n: int = 25000, seed: int = 42) -> pd.DataFrame:
+     rng = np.random.default_rng(seed)
+     rows = []
+
+     for _ in range(n):
+         # ── Base weather with seasonal coherence ─────────────────────
+         season = rng.choice(["winter", "spring", "summer", "autumn"],
+                             p=[0.25, 0.25, 0.25, 0.25])
+
+         if season == "winter":
+             temp = rng.normal(-2, 8)
+             humidity = rng.normal(70, 15)
+             uv = rng.integers(0, 4)
+             wind = abs(rng.normal(15, 12))
+         elif season == "spring":
+             temp = rng.normal(14, 7)
+             humidity = rng.normal(55, 18)
+             uv = rng.integers(2, 8)
+             wind = abs(rng.normal(18, 10))
+         elif season == "summer":
+             temp = rng.normal(28, 6)
+             humidity = rng.normal(55, 20)
+             uv = rng.integers(5, 11)
+             wind = abs(rng.normal(12, 8))
+         else: # autumn
+             temp = rng.normal(12, 8)
+             humidity = rng.normal(65, 15)
+             uv = rng.integers(1, 6)
+             wind = abs(rng.normal(16, 10))
+
+         temp = np.clip(temp, -15, 45)
+         humidity = np.clip(humidity, 8, 99)
+         uv = int(np.clip(uv, 0, 11))
+         wind = np.clip(wind, 0, 70)
+
+         pressure = rng.normal(1013, 12)
+         pressure = np.clip(pressure, 970, 1050)
+
+         # Pressure change: occasional fronts
+         if rng.random() < 0.10:
+             p_drop = rng.normal(-8, 3) # cold front
+         elif rng.random() < 0.08:
+             p_drop = rng.normal(7, 2.5) # high pressure ridge
+         else:
+             p_drop = rng.normal(0, 2.5)
+         p_drop = np.clip(p_drop, -15, 15)
+
+         # Temp change: some days have big swings
+         if rng.random() < 0.07:
+             t_change = rng.choice([-1, 1]) * abs(rng.normal(10, 3))
+         else:
+             t_change = rng.normal(0, 3)
+         t_change = np.clip(t_change, -15, 15)
+
+         # ── Additive risk scoring (multiple triggers stack) ──────────
+         risk = 5.0 # baseline
+         condition_scores = {} # condition_id → contribution
+
+         # 1. Pressure drop (strongest weather trigger per literature)
+         if p_drop <= -8:
+             contribution = 35 + abs(p_drop) * 1.5
+             condition_scores[1] = contribution
+             risk += contribution
+         elif p_drop <= -4:
+             contribution = 15 + abs(p_drop) * 1.2
+             condition_scores[10] = contribution
+             risk += contribution
+         elif p_drop <= -2:
+             contribution = 8 + abs(p_drop) * 0.8
+             condition_scores[10] = contribution
+             risk += contribution
+
+         # 2. Pressure rise
+         if p_drop >= 8:
+             contribution = 25 + p_drop * 1.0
+             condition_scores[2] = contribution
+             risk += contribution
+         elif p_drop >= 4:
+             contribution = 12 + p_drop * 0.7
+             condition_scores[11] = contribution
+             risk += contribution
+         elif p_drop >= 2:
+             contribution = 6 + p_drop * 0.5
+             condition_scores[11] = contribution
+             risk += contribution
+
+         # 3. Sauna effect (heat + humidity)
+         if temp >= 28 and humidity >= 65:
+             strength = (temp - 28) * 2 + (humidity - 65) * 0.5
+             condition_scores[3] = strength
+             risk += strength
+
+         # 4. Wind
+         if wind >= 40:
+             condition_scores[4] = 25 + (wind - 40) * 0.8
+             risk += condition_scores[4]
+         elif wind >= 20:
+             condition_scores[12] = 10 + (wind - 20) * 0.3
+             risk += condition_scores[12]
+
+         # 5. UV glare
+         if uv >= 8:
+             condition_scores[5] = 20 + (uv - 8) * 3
+             risk += condition_scores[5]
+         elif uv >= 6 and temp > 15:
+             condition_scores[5] = 8 + (uv - 6) * 2
+             risk += condition_scores[5]
+
+         # 6. Bitter cold
+         if temp <= -5:
+             condition_scores[6] = 25 + abs(temp + 5) * 2
+             risk += condition_scores[6]
+         elif temp <= 2:
+             condition_scores[6] = 10 + abs(temp - 2) * 1.5
+             risk += condition_scores[6]
+
+         # 7. Drastic temp drop
+         if t_change <= -8:
+             condition_scores[7] = 30 + abs(t_change) * 1.5
+             risk += condition_scores[7]
+         elif t_change <= -5:
+             condition_scores[7] = 12 + abs(t_change) * 0.8
+             risk += condition_scores[7]
+
+         # 8. Heat shock
+         if t_change >= 8:
+             condition_scores[8] = 28 + t_change * 1.2
+             risk += condition_scores[8]
+         elif t_change >= 5:
+             condition_scores[8] = 10 + t_change * 0.7
+             risk += condition_scores[8]
+
+         # 9. Heavy dampness
+         if humidity >= 88 and wind <= 12:
+             condition_scores[9] = 15 + (humidity - 88) * 0.8
+             risk += condition_scores[9]
+
+         # 13. Dry air
+         if humidity <= 25:
+             condition_scores[13] = 18 + (25 - humidity) * 0.8
+             risk += condition_scores[13]
+         elif humidity <= 32:
+             condition_scores[13] = 8 + (32 - humidity) * 0.5
+             risk += condition_scores[13]
+
+         # 14. Stagnant & gloomy
+         if uv <= 2 and humidity >= 72 and wind <= 10 and temp < 18:
+             condition_scores[14] = 10 + (humidity - 72) * 0.3
+             risk += condition_scores[14]
+
+         # ── Determine primary condition ──────────────────────────────
+         if condition_scores:
+             label = max(condition_scores, key=condition_scores.get)
+         else:
+             label = 0 # clear skies
+
+         # ── Add realistic noise ──────────────────────────────────────
+         risk += rng.normal(0, 2.5)
+         risk = int(np.clip(round(risk), 0, 100))
+
+         rows.append([
+             round(temp, 1), round(pressure, 1), round(humidity, 1),
+             round(wind, 1), uv, round(p_drop, 2), round(t_change, 2),
+             risk, label,
+         ])
+
+     df = pd.DataFrame(rows, columns=[
+         "temp_c", "pressure_hpa", "humidity", "wind_kph", "uv_index",
+         "pressure_drop", "temp_change", "risk_score", "advice_label",
+     ])
+
+     print(f"✅ Generated {len(df):,} samples")
+     print(f" Risk: mean={df['risk_score'].mean():.1f}, std={df['risk_score'].std():.1f}")
+     print(f" Conditions: {df['advice_label'].value_counts().sort_index().to_dict()}")
+     return df
+
+
+ if __name__ == "__main__":
+     df = generate_production_data()
+     df.to_csv("smart_weather_data.csv", index=False)
+     print(f"💾 Saved → smart_weather_data.csv")
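Note (illustrative, not part of the commit): a worked example of the additive scoring used by the generator, restricted to the pressure-drop and wind tiers. The coefficients are copied from `generate_data.py`; the function name `score_triggers` is hypothetical, and the Gaussian noise term and all other triggers are deliberately left out so the arithmetic can be followed by hand.

```python
def score_triggers(p_drop: float, wind: float, baseline: float = 5.0):
    """Return (risk, label) from the pressure-drop and wind tiers only."""
    contributions = {}  # condition_id -> points added to the risk score

    if p_drop <= -8:
        contributions[1] = 35 + abs(p_drop) * 1.5    # rapid pressure drop
    elif p_drop <= -4:
        contributions[10] = 15 + abs(p_drop) * 1.2   # mild dip, stronger tier
    elif p_drop <= -2:
        contributions[10] = 8 + abs(p_drop) * 0.8    # mild dip, weaker tier

    if wind >= 40:
        contributions[4] = 25 + (wind - 40) * 0.8    # high wind
    elif wind >= 20:
        contributions[12] = 10 + (wind - 20) * 0.3   # breezy pollen risk

    # Triggers stack: every active contribution is added to the baseline.
    risk = baseline + sum(contributions.values())
    # The label is the single largest contributor, or 0 when nothing fired.
    label = max(contributions, key=contributions.get) if contributions else 0
    return min(100.0, max(0.0, risk)), label


# p_drop = -9 contributes 35 + 9 * 1.5 = 48.5; wind = 25 contributes 10 + 5 * 0.3 = 11.5
risk, label = score_triggers(p_drop=-9.0, wind=25.0)
calm_risk, calm_label = score_triggers(p_drop=0.0, wind=5.0)
```

Because contributions stack while the label keeps only the largest one, two moderate triggers can produce a high score under a "mild" label, which is the case the coherence layer in `app.py` has to tolerate.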
metadata.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "version": "2.0.0",
+   "trained_at": "2026-03-12T12:07:07.026616",
+   "training_samples": 21250,
+   "features": [
+     "temp_c",
+     "pressure_hpa",
+     "humidity",
+     "wind_kph",
+     "uv_index",
+     "pressure_drop",
+     "temp_change"
+   ],
+   "num_conditions": 15,
+   "risk_metrics": {
+     "mae": 2.52,
+     "rmse": 3.28,
+     "r2": 0.9773
+   },
+   "advice_metrics": {
+     "accuracy": 0.9859,
+     "f1_macro": 0.9714
+   }
+ }
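Note (illustrative, not part of the commit): a small check that the figures quoted in the README's "Model Details" section are the rounded values from `metadata.json`. The dictionary below is copied from the diff. The last line only observes that 21,250 is 85% of the generator's default `n=25000`, which would fit a 15% hold-out; the actual split is not visible in this excerpt, so that reading is an assumption.

```python
# Values copied from metadata.json in this commit.
metadata = {
    "training_samples": 21250,
    "risk_metrics": {"mae": 2.52, "rmse": 3.28, "r2": 0.9773},
    "advice_metrics": {"accuracy": 0.9859, "f1_macro": 0.9714},
}

risk = metadata["risk_metrics"]
advice = metadata["advice_metrics"]

# Recreate the two README bullet fragments from the raw metrics.
readme_risk = f"MAE={risk['mae']:.2f}, R²={risk['r2']:.3f}"
readme_advice = f"Accuracy={advice['accuracy']:.1%}, F1={advice['f1_macro']:.3f}"

# Assumption: 25000 is the default sample count in generate_data.py.
train_fraction = metadata["training_samples"] / 25000
```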
requirements.txt CHANGED
@@ -1,7 +1,6 @@
- fastapi
- uvicorn
- pydantic
- pandas
- numpy
- xgboost
- scikit-learn
+ fastapi>=0.104.0
+ uvicorn[standard]>=0.24.0
+ pydantic>=2.5.0
+ numpy>=1.24.0
+ pandas>=2.0.0
+ scikit-learn>=1.3.0
risk_model.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e7c9b2874de4ad95960419b3f234f6637708840f8cafccb4c49c76591a68631b
- size 1122479
+ oid sha256:b916115815b51272b6e47b185ba99bfbc2dcb2f9c0c456c6cdec11ad0150e44b
+ size 1398959
train.py ADDED
@@ -0,0 +1,145 @@
+ """
+ Bioweather Model Training v2.0
+ EmpedocLabs © 2025
+
+ Trains:
+ 1. Risk regressor (0-100 score)
+ 2. Advice classifier (15 weather conditions)
+
+ Both use HistGradientBoosting (sklearn) - no XGBoost dependency needed.
+ """
+
+ import os
+ import pickle
+ import json
+ import numpy as np
+ import pandas as pd
+ from datetime import datetime
+
+ from sklearn.ensemble import HistGradientBoostingRegressor, HistGradientBoostingClassifier
+ from sklearn.model_selection import train_test_split
+ from sklearn.metrics import (
+     mean_absolute_error, mean_squared_error, r2_score,
+     classification_report, accuracy_score, f1_score,
+ )
+
+ from generate_data import generate_production_data
+
+ FEATURE_COLS = [
+     "temp_c", "pressure_hpa", "humidity", "wind_kph",
+     "uv_index", "pressure_drop", "temp_change",
+ ]
+
+ CONDITION_NAMES = {
+     0: "Clear Skies", 1: "Rapid Pressure Drop", 2: "Pressure Squeeze",
+     3: "Sauna Effect", 4: "High Wind", 5: "High UV Glare",
+     6: "Bitter Cold", 7: "Drastic Temp Drop", 8: "Heat Shock",
+     9: "Heavy Dampness", 10: "Mild Pressure Dip", 11: "Mild Pressure Rise",
+     12: "Breezy Pollen", 13: "Dry Air", 14: "Stagnant & Gloomy",
+ }
+
+
+ def main():
+     print("=" * 60)
+     print(" BIOWEATHER v2.0 - Production Training")
+     print(" EmpedocLabs")
+     print("=" * 60)
+
+     # ── 1. Generate data ─────────────────────────────────────────
+     print("\n📊 Generating training data...")
+     df = generate_production_data(n=25000, seed=42)
+
+     X = df[FEATURE_COLS].values
+     y_risk = df["risk_score"].values
+     y_advice = df["advice_label"].values
+
+     # ── 2. Split ─────────────────────────────────────────────────
+     X_train, X_test, yr_train, yr_test, ya_train, ya_test = train_test_split(
+         X, y_risk, y_advice, test_size=0.15, random_state=42,
+     )
+     print(f"\n📂 Split: Train={len(X_train):,} Test={len(X_test):,}")
+
+     # ── 3. Train risk regressor ──────────────────────────────────
+     print("\n🚀 Training risk regressor...")
+     risk_model = HistGradientBoostingRegressor(
+         max_iter=400,
+         max_depth=6,
+         learning_rate=0.05,
+         min_samples_leaf=15,
+         l2_regularization=0.5,
+         early_stopping=True,
+         validation_fraction=0.1,
+         n_iter_no_change=30,
+         random_state=42,
+     )
+     risk_model.fit(X_train, yr_train)
+     print(f" Iterations: {risk_model.n_iter_}")
+
+     yr_pred = risk_model.predict(X_test)
+     yr_pred = np.clip(yr_pred, 0, 100)
+     mae = mean_absolute_error(yr_test, yr_pred)
+     rmse = np.sqrt(mean_squared_error(yr_test, yr_pred))
+     r2 = r2_score(yr_test, yr_pred)
+     print(f" MAE: {mae:.2f}")
+     print(f" RMSE: {rmse:.2f}")
+     print(f" R²: {r2:.4f}")
+
+     # ── 4. Train advice classifier ───────────────────────────────
+     print("\n🚀 Training advice classifier (15 conditions)...")
+     advice_model = HistGradientBoostingClassifier(
+         max_iter=400,
+         max_depth=6,
+         learning_rate=0.05,
+         min_samples_leaf=10,
+         l2_regularization=0.3,
+         early_stopping=True,
+         validation_fraction=0.1,
+         n_iter_no_change=30,
+         random_state=42,
+     )
+     advice_model.fit(X_train, ya_train)
+     print(f" Iterations: {advice_model.n_iter_}")
+
+     ya_pred = advice_model.predict(X_test)
+     acc = accuracy_score(ya_test, ya_pred)
+     f1_macro = f1_score(ya_test, ya_pred, average="macro", zero_division=0)
+     print(f" Accuracy: {acc:.4f}")
+     print(f" F1 macro: {f1_macro:.4f}")
+
+     print("\n Per-condition report:")
+     target_names = [CONDITION_NAMES.get(i, f"Cond_{i}") for i in sorted(set(ya_test) | set(ya_pred))]
+     print(classification_report(ya_test, ya_pred, target_names=target_names, zero_division=0))
+
+     # ── 5. Save models ───────────────────────────────────────────
+     os.makedirs("model", exist_ok=True)
+
+     with open("model/risk_model.pkl", "wb") as f:
+         pickle.dump(risk_model, f)
+     with open("model/advice_model.pkl", "wb") as f:
+         pickle.dump(advice_model, f)
+
+     metadata = {
+         "version": "2.0.0",
+         "trained_at": datetime.now().isoformat(),
+         "training_samples": len(X_train),
+         "features": FEATURE_COLS,
+         "num_conditions": 15,
+         "risk_metrics": {"mae": round(mae, 2), "rmse": round(rmse, 2), "r2": round(r2, 4)},
+         "advice_metrics": {"accuracy": round(acc, 4), "f1_macro": round(f1_macro, 4)},
+     }
+     with open("model/metadata.json", "w") as f:
+         json.dump(metadata, f, indent=2)
+
+     print(f"\n💾 model/risk_model.pkl ({os.path.getsize('model/risk_model.pkl') // 1024} KB)")
+     print(f"💾 model/advice_model.pkl ({os.path.getsize('model/advice_model.pkl') // 1024} KB)")
+     print(f"📋 model/metadata.json")
+
+     print(f"\n{'=' * 60}")
+     print(f" ✅ BIOWEATHER v2.0 READY")
+     print(f" Risk: MAE={mae:.2f}, R²={r2:.4f}")
+     print(f" Advice: Acc={acc:.4f}, F1={f1_macro:.4f}")
+     print(f"{'=' * 60}")
+
+
+ if __name__ == "__main__":
+     main()
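For readers who want to see what the three risk metrics reported by `train.py` measure, here is a minimal pure-Python sketch of MAE, RMSE, and R². It is an illustration only: `regression_metrics` is a hypothetical helper and the sample values are invented, whereas the training script relies on the `sklearn.metrics` functions and clips predictions to 0-100 before scoring.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (mae, rmse, r2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variance around the mean
    ss_res = sum(e * e for e in errors)  # variance left unexplained by the predictions
    r2 = 1.0 - ss_res / ss_tot  # undefined when y_true is constant
    return mae, math.sqrt(mse), r2


# Invented example values, not taken from the training run.
mae, rmse, r2 = regression_metrics([10, 20, 30], [12, 18, 33])
print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.4f}")
```

MAE and RMSE are in the same units as the risk score, so the values in `metadata.json` read directly as points on the 0-100 scale.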