jtlevine Claude Opus 4.7 (1M context) committed on
Commit 9b0be4c · 1 Parent(s): d5f2ccd

Zone-specific trigger thresholds behind THRESHOLD_MODE env var

Under THRESHOLD_MODE=zone_specific, each zone gets its own (alert_c,
payout_c) calibrated to the P90/P97 of that zone's 20-year ERA5-Land ×
UHI-corrected WBGT distribution. This matches the actuarial pattern
used by ARC, CCRIF, and the SEWA heat pilot, and fixes the "Jangwani $0
premium" problem that surfaced under UHI_MODEL=lst with the global
35.1°C threshold: zones get equitable ~10% alert / ~3% payout trigger
frequency regardless of absolute temperature distribution.
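
As a rough illustration of the calibration described above, the sketch below derives P90/P97 thresholds from a daily WBGT series and checks the share of days above each one. The two series are synthetic placeholders, not ERA5-Land data, and `calibrate` and `exceedance_rate` are hypothetical helper names. The share of days above a threshold is only a proxy for trigger frequency, since the pipeline also applies consecutive-day rules.

```python
import math


def percentile(values: list[float], p: float) -> float:
    """Linearly interpolated percentile for p in [0, 1]."""
    if not values:
        raise ValueError("cannot take a percentile of an empty series")
    s = sorted(values)
    idx = p * (len(s) - 1)
    lo, hi = math.floor(idx), math.ceil(idx)
    return s[lo] + (s[hi] - s[lo]) * (idx - lo)


def calibrate(daily_wbgt_c: list[float]) -> tuple[float, float]:
    """Return (alert_c, payout_c) as the P90 and P97 of one zone's series."""
    return percentile(daily_wbgt_c, 0.90), percentile(daily_wbgt_c, 0.97)


def exceedance_rate(daily_wbgt_c: list[float], threshold_c: float) -> float:
    """Fraction of days strictly above the threshold."""
    return sum(1 for w in daily_wbgt_c if w > threshold_c) / len(daily_wbgt_c)


# Synthetic series (assumption): a cooler zone and a zone 4 degrees hotter.
cool_zone = [28.0 + 0.005 * i for i in range(1000)]
hot_zone = [w + 4.0 for w in cool_zone]

for name, series in (("cool", cool_zone), ("hot", hot_zone)):
    alert_c, payout_c = calibrate(series)
    print(
        f"{name}: alert={alert_c:.2f} payout={payout_c:.2f} "
        f"own-threshold rate={exceedance_rate(series, alert_c):.2f} "
        f"global-35.1 rate={exceedance_rate(series, 35.1):.2f}"
    )
```

Under its own P90, each synthetic zone sits above the alert threshold on about one day in ten, while the single 35.1°C threshold is never reached by the cooler series.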

Under UHI_MODEL=lst + THRESHOLD_MODE=zone_specific:
  Jangwani:   alert 31.79°C / payout 32.54°C (vs. the global 35.1/36.0,
              which it never reached, producing $0 premium)
  Tandale:    alert 36.10°C / payout 36.93°C (hotter than global; LST
              confirmed this zone really is a heat hotspot)
  Vingunguti: alert 36.73°C / payout 37.55°C (hottest zone)

Default remains THRESHOLD_MODE=global so production behavior is
unchanged until the HF Space secret is set.
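
A minimal sketch of the mode switch, assuming a precomputed per-zone table. `resolve_thresholds` and the `"jangwani"` key are hypothetical stand-ins for the real lookup. Any value other than `zone_specific` resolves to the global pair, which is why leaving the variable unset keeps production unchanged.

```python
import os
from typing import Mapping, Optional

GLOBAL_THRESHOLDS = (35.1, 36.0)  # (alert_c, payout_c) from the commit message


def resolve_thresholds(
    zone_id: str,
    zone_table: Mapping[str, tuple[float, float]],
    env: Optional[Mapping[str, str]] = None,
) -> tuple[float, float]:
    """Pick (alert_c, payout_c) for a zone based on THRESHOLD_MODE."""
    env = os.environ if env is None else env
    mode = env.get("THRESHOLD_MODE", "global").lower()
    if mode != "zone_specific":
        return GLOBAL_THRESHOLDS
    # Zones missing from the table fall back to the global pair.
    return zone_table.get(zone_id, GLOBAL_THRESHOLDS)


# Hypothetical table keyed by an illustrative zone id.
table = {"jangwani": (31.79, 32.54)}
print(resolve_thresholds("jangwani", table, env={}))
print(resolve_thresholds("jangwani", table, env={"THRESHOLD_MODE": "zone_specific"}))
```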

- src/pricing/zone_thresholds.py: new module. Computes per-zone
P90/P97 from ERA5-Land × active UHI model, caches to
data/zone_thresholds.json on first call (gitignored; delete the cache
to force a recompute after a UHI_MODEL change).
- src/pricing/burn_analysis.py::burn_for_zone(): passes zone-specific
threshold into compute_burn() so actuarial pricing reflects zone
return periods.
- src/pipeline.py: trigger call site and observed-WBGT fallback both
now use get_zone_thresholds(zone). Drops the last reference to the
global WBGT_THRESHOLD_C constant in the per-zone loop.
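
Each cache entry records the `uhi_model` it was computed under. The sketch below shows one way a loader could use that field to detect a stale cache instead of relying on manual deletion. `cache_is_fresh` and the `"zone-a"` key are hypothetical and not part of this commit; the temporary directory stands in for `data/`.

```python
import json
import tempfile
from pathlib import Path


def cache_is_fresh(cache_path: Path, active_uhi_model: str) -> bool:
    """True only if the cache exists, parses, and matches the active UHI model."""
    if not cache_path.exists():
        return False
    try:
        cached = json.loads(cache_path.read_text())
    except json.JSONDecodeError:
        return False
    if not isinstance(cached, dict) or not cached:
        return False
    return all(
        isinstance(entry, dict) and entry.get("uhi_model") == active_uhi_model
        for entry in cached.values()
    )


def demo() -> tuple[bool, bool, bool]:
    """Check a missing cache, a matching cache, and a cache from another model."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "zone_thresholds.json"
        missing = cache_is_fresh(path, "lst")
        entry = {"alert_c": 31.79, "payout_c": 32.54, "uhi_model": "lst"}
        path.write_text(json.dumps({"zone-a": entry}))
        same_model = cache_is_fresh(path, "lst")
        other_model = cache_is_fresh(path, "synthetic")
    return missing, same_model, other_model


print(demo())
```

A caller could recompute whenever `cache_is_fresh` returns `False`, which would cover both a deleted cache and a changed `UHI_MODEL`.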

Caveat worth flagging in the preprint: zone-specific thresholds make
coverage relative to local climatology, not absolute WBGT health risk
(ISO 7243). Product narrative needs to align with that choice —
workers in Jangwani are protected against "unusual heat for Jangwani,"
which is not the same as "WBGT ≥ 28°C absolute health threshold."
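
To make the caveat concrete, the sketch below classifies a day's WBGT against both a zone's alert threshold and the 28°C absolute figure quoted above. The zone thresholds are the UHI_MODEL=lst values listed earlier in this message; `classify_day` and its labels are hypothetical.

```python
ABSOLUTE_REFERENCE_C = 28.0  # absolute figure quoted in the caveat

# Alert thresholds reported in this commit under UHI_MODEL=lst.
ZONE_ALERT_C = {"Jangwani": 31.79, "Tandale": 36.10, "Vingunguti": 36.73}


def classify_day(zone: str, wbgt_c: float) -> str:
    """Label a day relative to the zone trigger and the absolute reference."""
    if wbgt_c > ZONE_ALERT_C[zone]:
        return "above local trigger"
    if wbgt_c >= ABSOLUTE_REFERENCE_C:
        return "above absolute reference, below local trigger"
    return "below both"


for zone in ZONE_ALERT_C:
    print(zone, "->", classify_day(zone, 32.0))
```

The same 32.0°C day counts as unusual heat in Jangwani but falls below the local trigger in Tandale and Vingunguti, even though it exceeds the absolute reference in all three.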

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

.gitignore CHANGED
@@ -14,3 +14,4 @@ dist/
 data/nasa_power_cache/
 data/era5_cache/
 data/era5land_cache/
+data/zone_thresholds.json
CLAUDE.md CHANGED
@@ -110,6 +110,17 @@ Data under `data/landsat_lst/`:
 - `zone_features.json` — per-zone climatology features + hot-season anomaly
 - `city_climatology.json` — city-mean LST across covered zones
 
+## Trigger threshold mode
+
+`THRESHOLD_MODE` env var picks between global (Dar-wide) and zone-specific trigger thresholds.
+
+- `THRESHOLD_MODE=global` (default, current production) — every zone uses the same 35.1°C alert / 36.0°C payout thresholds. Trigger frequency varies wildly across zones because UHI delta shifts the effective threshold per zone (under LST UHI, Jangwani's effective threshold is unreachable → $0 premium).
+- `THRESHOLD_MODE=zone_specific` — each zone gets its own (alert_c, payout_c) calibrated to P90/P97 of that zone's own 20-year ERA5-Land × UHI-corrected WBGT distribution. Trigger frequency normalizes to ~10% alert / ~3% payout per year across all zones. Standard parametric-insurance actuarial pattern (ARC, CCRIF, SEWA).
+
+Thresholds are computed at first pipeline invocation and cached to `data/zone_thresholds.json`. Delete the cache to force recompute after any UHI model change or panel rebuild. The cache depends on the active `UHI_MODEL` — changing the UHI model invalidates the threshold cache.
+
+To flip production fully data-anchored: set both `UHI_MODEL=lst` and `THRESHOLD_MODE=zone_specific` as HF Space secrets.
+
 ## Things to know
 
 - **HF Space runs on A100 GPU** — needed for GraphCast inference (~5-8s per forecast). Space wakes, runs pipeline, pauses. Cost: ~$0.50/week.
src/pipeline.py CHANGED
@@ -638,24 +638,30 @@ class HeatRiskPipeline:
 
             if gc_wbgt is not None:
                 zone_wbgt = [w + mean_uhi for w in gc_wbgt]
+                # Zone-specific trigger thresholds (P90/P97 of each zone's
+                # own corrected WBGT climatology under THRESHOLD_MODE=
+                # zone_specific; otherwise fall back to the global
+                # 35.1°C / 36.0°C values).
+                from src.pricing.zone_thresholds import get_zone_thresholds
+                zone_alert_c, zone_payout_c = get_zone_thresholds(zone)
                 action = forecast_trigger_decision(
                     zone_wbgt,
                     alert_duration_days=ALERT_CONSECUTIVE_DAYS,
                     payout_duration_days=PAYOUT_CONSECUTIVE_DAYS,
-                    window_threshold_c=WBGT_THRESHOLD_C,
-                    payout_severity_c=30.7,
+                    window_threshold_c=zone_alert_c,
+                    payout_severity_c=zone_payout_c,
                 )
                 max_wbgt = max(zone_wbgt) if zone_wbgt else 0
                 # Max consecutive-run length above threshold in the forecast
                 consec = 0
                 run_length = 0
                 for w in zone_wbgt:
-                    if w > WBGT_THRESHOLD_C:
+                    if w > zone_alert_c:
                         run_length += 1
                         consec = max(consec, run_length)
                     else:
                         run_length = 0
-                total_above = sum(1 for w in zone_wbgt if w > WBGT_THRESHOLD_C)
+                total_above = sum(1 for w in zone_wbgt if w > zone_alert_c)
 
                 if action == "alert_cash":
                     all_triggers.append(_make_trigger(
@@ -669,14 +675,16 @@ class HeatRiskPipeline:
                 # Fallback: use recent observed WBGT
                 recent_wbgt = wbgts[-7:] if len(wbgts) >= 7 else wbgts
                 if recent_wbgt:
+                    from src.pricing.zone_thresholds import get_zone_thresholds
+                    zone_alert_c, _ = get_zone_thresholds(zone)
                     max_wbgt = max(recent_wbgt)
                     consec = 0
                     for w in reversed(recent_wbgt):
-                        if w > WBGT_THRESHOLD_C:
+                        if w > zone_alert_c:
                             consec += 1
                         else:
                             break
-                    total_above = sum(1 for w in recent_wbgt if w > WBGT_THRESHOLD_C)
+                    total_above = sum(1 for w in recent_wbgt if w > zone_alert_c)
 
                     if consec >= PAYOUT_CONSECUTIVE_DAYS:
                         all_triggers.append(_make_trigger(
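
The two hunks above count runs of days over the alert threshold in two different ways: the forecast branch takes the longest run anywhere in the window, while the observed-WBGT fallback counts only the run ending on the most recent day. A standalone sketch of both, with hypothetical function names and an illustrative series:

```python
def max_run_above(series: list[float], threshold_c: float) -> int:
    """Longest consecutive run of values strictly above the threshold."""
    best = run = 0
    for w in series:
        if w > threshold_c:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def trailing_run_above(series: list[float], threshold_c: float) -> int:
    """Length of the run above the threshold that ends at the last value."""
    run = 0
    for w in reversed(series):
        if w > threshold_c:
            run += 1
        else:
            break
    return run


# Illustrative values only; 36.10 is the Tandale alert threshold from the commit message.
sample = [36.2, 36.5, 36.6, 35.0, 36.3, 36.4]
print(max_run_above(sample, 36.10), trailing_run_above(sample, 36.10))
```

The distinction matters for the fallback: a long hot spell early in the seven-day window does not count once a cooler day has interrupted it.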
src/pricing/burn_analysis.py CHANGED
@@ -200,7 +200,12 @@ class BurnAnalysisPricer:
         uhi_lo, uhi_hi = get_zone_uhi_range(zone)
         mean_uhi = (uhi_lo + uhi_hi) / 2.0
 
-        result = compute_burn(records, mean_uhi)
+        # Zone-specific trigger threshold (THRESHOLD_MODE env var selects
+        # global=35.1°C vs zone_specific=per-zone P90 from local climatology).
+        from src.pricing.zone_thresholds import get_zone_thresholds
+        alert_c, _payout_peak_c = get_zone_thresholds(zone)
+
+        result = compute_burn(records, mean_uhi, threshold_c=alert_c)
         result.zone_id = zone.zone_id
         result.basis_risk_score = _basis_risk_for_zone(zone, mean_uhi)
 
src/pricing/zone_thresholds.py ADDED
@@ -0,0 +1,148 @@
+"""Zone-specific trigger threshold calibration.
+
+Computes (alert_wbgt_c, payout_peak_wbgt_c) per zone from each zone's own
+20-year ERA5-Land WBGT distribution, with UHI delta applied. Replaces the
+global 35.1°C / 36.0°C thresholds used in Phase 1 with zone-relative
+thresholds calibrated to the same percentiles (P90 alert, P97 payout) —
+the actuarial pattern used by ARC, CCRIF, SEWA heat pilot.
+
+Activated by THRESHOLD_MODE env var:
+    THRESHOLD_MODE=global          (default) use WBGT_THRESHOLD_C / PAYOUT_PEAK
+    THRESHOLD_MODE=zone_specific   per-zone P90/P97 calibrated from local history
+
+Cached to data/zone_thresholds.json for reuse; safe to delete to force
+recompute after any UHI model change.
+"""
+from __future__ import annotations
+
+import json
+import math
+import os
+from pathlib import Path
+from typing import Tuple
+
+_REPO_ROOT = Path(__file__).resolve().parents[2]
+ERA5_PATH = _REPO_ROOT / "data" / "era5land_dar_es_salaam.json"
+CACHE_PATH = _REPO_ROOT / "data" / "zone_thresholds.json"
+
+ALERT_PERCENTILE = 0.90   # P90 for alert-tier trigger (matches grid-cell
+                          # 35.1°C historical origin on raw ERA5-Land)
+PAYOUT_PERCENTILE = 0.97  # P97 for payout-tier peak severity
+
+
+def _calculate_wbgt(temp_c: float, humidity_pct: float) -> float:
+    """Liljegren simplified outdoor — matches CRE src.indexing.heat_index."""
+    es = 6.112 * math.exp((17.67 * temp_c) / (temp_c + 243.5))
+    e = es * (humidity_pct / 100.0)
+    return 0.567 * temp_c + 0.393 * e + 3.94
+
+
+def _percentile(values: list[float], p: float) -> float:
+    if not values:
+        return 0.0
+    s = sorted(values)
+    idx = p * (len(s) - 1)
+    lo, hi = int(math.floor(idx)), int(math.ceil(idx))
+    if lo == hi:
+        return s[lo]
+    return s[lo] + (s[hi] - s[lo]) * (idx - lo)
+
+
+def _threshold_mode() -> str:
+    return os.environ.get("THRESHOLD_MODE", "global").lower()
+
+
+def compute_zone_thresholds(use_cache: bool = True) -> dict[str, dict[str, float]]:
+    """Return {zone_id: {alert_c, payout_c, n_days, mean_wbgt_c}} for Dar zones.
+
+    Iterates each Dar zone, applies its UHI delta (from the active UHI model)
+    to the 20-year ERA5-Land DAR-JAN grid-cell series, computes per-day WBGT,
+    and extracts percentiles.
+
+    Cached to ``CACHE_PATH`` to avoid re-computing on every pipeline run.
+    """
+    if use_cache and CACHE_PATH.exists():
+        return json.loads(CACHE_PATH.read_text())
+
+    from config import ZONES
+    from src.downscaling import get_uhi_corrector
+    from datetime import datetime
+
+    era5 = json.loads(ERA5_PATH.read_text())
+    grid_rows = era5["DAR-JAN"]  # all 15 Dar zones resolve to this grid cell
+    corrector = get_uhi_corrector()
+
+    out: dict[str, dict[str, float]] = {}
+    for z in ZONES:
+        if z.city != "Dar es Salaam":
+            continue
+        wbgts = []
+        for r in grid_rows:
+            t = r.get("temp_max_c")
+            h = r.get("humidity_pct")
+            if t is None or h is None:
+                continue
+            month = int(r["date"][5:7])
+            # Apply UHI correction at this zone for this month (mid-day)
+            corrected_t, _, _ = corrector.correct_temperature(z, float(t), hour=14, month=month)
+            wbgts.append(_calculate_wbgt(corrected_t, float(h)))
+        if not wbgts:
+            continue
+        out[z.zone_id] = {
+            "alert_c": round(_percentile(wbgts, ALERT_PERCENTILE), 2),
+            "payout_c": round(_percentile(wbgts, PAYOUT_PERCENTILE), 2),
+            "n_days": len(wbgts),
+            "mean_wbgt_c": round(sum(wbgts) / len(wbgts), 2),
+            "uhi_model": os.environ.get("UHI_MODEL", "synthetic").lower(),
+        }
+    CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
+    CACHE_PATH.write_text(json.dumps(out, indent=2))
+    return out
+
+
+# Global-mode fallback constants (imported by callers for back-compat)
+GLOBAL_ALERT_C = 35.1
+GLOBAL_PAYOUT_PEAK_C = 36.0
+
+
+def get_zone_thresholds(zone) -> Tuple[float, float]:
+    """Return (alert_c, payout_peak_c) for a zone.
+
+    THRESHOLD_MODE=zone_specific: zone's own P90/P97 from historical series.
+    THRESHOLD_MODE=global (default): (35.1, 36.0) regardless of zone.
+
+    Zones without pre-computed thresholds (non-Dar zones, or first-ever run
+    before cache exists) fall back to global.
+    """
+    if _threshold_mode() != "zone_specific":
+        return GLOBAL_ALERT_C, GLOBAL_PAYOUT_PEAK_C
+    try:
+        thresholds = compute_zone_thresholds(use_cache=True)
+    except Exception:
+        return GLOBAL_ALERT_C, GLOBAL_PAYOUT_PEAK_C
+    zt = thresholds.get(zone.zone_id)
+    if zt is None:
+        return GLOBAL_ALERT_C, GLOBAL_PAYOUT_PEAK_C
+    return float(zt["alert_c"]), float(zt["payout_c"])
+
+
+def current_mode() -> str:
+    return _threshold_mode()
+
+
+if __name__ == "__main__":
+    import sys
+    sys.path.insert(0, str(_REPO_ROOT))
+    # Force recompute
+    if CACHE_PATH.exists():
+        CACHE_PATH.unlink()
+    import os as _os
+    _os.environ.setdefault("UHI_MODEL", "lst")
+    print(f"Computing zone thresholds under UHI_MODEL={_os.environ['UHI_MODEL']}...")
+    t = compute_zone_thresholds(use_cache=False)
+    print(f"\n{'zone':9s} {'alert':>7s} {'payout':>7s} {'mean':>7s} {'n':>6s}")
+    for zid, v in sorted(t.items()):
+        print(f"{zid:9s} {v['alert_c']:>7.2f} {v['payout_c']:>7.2f} "
+              f"{v['mean_wbgt_c']:>7.2f} {v['n_days']:>6d}")
+    print(f"\nGlobal reference (THRESHOLD_MODE=global): alert={GLOBAL_ALERT_C} payout={GLOBAL_PAYOUT_PEAK_C}")
+    print(f"\nCached at: {CACHE_PATH}")
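
The module converts daily maximum temperature and relative humidity to WBGT with a closed-form approximation before taking percentiles. The sketch below reproduces that formula in isolation to show how a UHI temperature delta moves a zone's WBGT. The input values and the 1.5°C delta are illustrative assumptions, not outputs of the UHI corrector, and humidity is held fixed for simplicity.

```python
import math


def wbgt_from_temp_humidity(temp_c: float, humidity_pct: float) -> float:
    """Simplified WBGT from air temperature (°C) and relative humidity (%)."""
    # Saturation vapour pressure in hPa, then actual vapour pressure.
    es = 6.112 * math.exp((17.67 * temp_c) / (temp_c + 243.5))
    e = es * (humidity_pct / 100.0)
    return 0.567 * temp_c + 0.393 * e + 3.94


base_temp_c = 33.0    # illustrative grid-cell daily maximum
humidity_pct = 70.0   # illustrative relative humidity
uhi_delta_c = 1.5     # hypothetical zone UHI correction

grid_wbgt = wbgt_from_temp_humidity(base_temp_c, humidity_pct)
zone_wbgt = wbgt_from_temp_humidity(base_temp_c + uhi_delta_c, humidity_pct)
uhi_shift = zone_wbgt - grid_wbgt

print(f"grid-cell WBGT: {grid_wbgt:.2f}")
print(f"zone WBGT:      {zone_wbgt:.2f}")
print(f"shift from UHI: {uhi_shift:.2f}")
```

Because the vapour-pressure term also grows with temperature, the WBGT shift is larger than 0.567 times the temperature delta, which is part of why zones with different UHI corrections end up with noticeably different P90/P97 values.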