KSvend Claude Opus 4.6 (1M context) committed on
Commit ffb57c8 · 1 Parent(s): e968fb9

fix: data integrity — built-up persistence, status gating, SAR drift detection, direction-aware narrative

Second-pass fixes based on the Nyala South (Sudan) test report showing
AMBER flags driven by statistical artifacts rather than real signals.

1. Built-up areas: persistence-based classification
- Replace monthly NDBI>0 classification with a persistence mask:
pixel is built-up only if NDBI>0 AND NDVI<0.15 in ≥60% of months
across the analysis period. Removes the seasonal cycling artifact
where bare soil got classified as built-up in dry seasons only.
- Tighten NDVI threshold from 0.2 → 0.15 (sparse-veg regions).
- Replace monthly z-score with direct percentage change between
current and baseline persistent masks. Change thresholds:
|change|≥15% → RED, ≥5% → AMBER, else GREEN.
- Chart now shows baseline vs current as two bars instead of an
oscillating monthly time series that misrepresents reality.
- Shared _analyze_persistence helper used by both batch and
non-batch harvest paths (eliminates duplication).
- Smoke-tested: synthetic seasonal pixel correctly rejected,
persistent pixel correctly kept.
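
The persistence rule and the percentage-change thresholds in item 1 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code in `buildup.py`: the names `persistent_mask` and `classify_change` are hypothetical, inputs are assumed to be `(months, rows, cols)` stacks with NaN marking invalid observations, and status is returned as a plain string rather than a `StatusLevel`.

```python
import numpy as np

NDBI_THRESHOLD = 0.0
NDVI_BUILDUP_MAX = 0.15
PERSISTENCE_MIN_FRAC = 0.6
CHANGE_AMBER_PCT = 5.0
CHANGE_RED_PCT = 15.0


def persistent_mask(ndbi: np.ndarray, ndvi: np.ndarray) -> np.ndarray:
    """Return a boolean (rows, cols) mask of persistently built-up pixels.

    Both inputs have shape (months, rows, cols); NaN marks an invalid
    observation (an assumption of this sketch).
    """
    valid = ~np.isnan(ndbi) & ~np.isnan(ndvi)
    built = valid & (ndbi > NDBI_THRESHOLD) & (ndvi < NDVI_BUILDUP_MAX)
    obs = valid.sum(axis=0)    # months with a valid observation, per pixel
    hits = built.sum(axis=0)   # months meeting the built-up condition
    frac = np.divide(hits, obs, out=np.zeros(obs.shape, dtype=float), where=obs > 0)
    return (obs > 0) & (frac >= PERSISTENCE_MIN_FRAC)


def classify_change(baseline_frac: float, current_frac: float) -> tuple[float, str]:
    """Percentage change between persistent fractions, plus a status string."""
    if baseline_frac <= 1e-6:
        return 0.0, "GREEN"    # no usable baseline: report no change
    change_pct = (current_frac - baseline_frac) / baseline_frac * 100.0
    if abs(change_pct) >= CHANGE_RED_PCT:
        return change_pct, "RED"
    if abs(change_pct) >= CHANGE_AMBER_PCT:
        return change_pct, "AMBER"
    return change_pct, "GREEN"


# Two synthetic pixels over 12 months, both with low NDVI.
months = 12
ndvi = np.full((months, 1, 2), 0.05)
ndbi = np.full((months, 1, 2), 0.1)
ndbi[5:, 0, 0] = -0.1          # pixel (0, 0) meets the condition in only 5 of 12 months
mask = persistent_mask(ndbi, ndvi)
```

Pixel (0, 0) behaves like seasonally exposed bare soil and falls below the 60% cutoff, while pixel (0, 1) meets the condition in every month and is kept.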

2. Status gating (_classify_zscore)
- Aggregate z-score alone is no longer sufficient to trigger RED.
Moderate z (>1) reaches AMBER only with supporting evidence: multiple
anomaly months OR hotspot > 5%. Strong z (>2) without any supporting
evidence caps at AMBER, not RED.
- New min_coverage_pct gate: when water coverage < 0.5% of AOI,
the indicator cannot leave GREEN (prevents noise-driven alerts
on near-dry landscapes).
- A water indicator at 0.1% coverage, previously flagged AMBER, now stays GREEN.
- NDVI z=+1.9 with 2/24 anomaly months stays AMBER; same z with
0 anomaly months and no hotspots drops to GREEN.
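
The gating order in item 2 can be traced with a small standalone function. This sketch mirrors the branch order of the new `_classify_zscore` shown in the diff below, but it is an illustration only: the name `classify` is hypothetical, status is a plain string, and `z_threshold=2.0` is an assumption inferred from "Strong z (>2)" because the actual `ZSCORE_THRESHOLD` value lives in `app.config` and is not shown here.

```python
def classify(z, hotspot_pct, anomaly_months=0, coverage_pct=None, z_threshold=2.0):
    """Return "GREEN", "AMBER" or "RED" using the evidence gates."""
    # Gate 1: too little coverage to trust any statistic.
    if coverage_pct is not None and coverage_pct < 0.5:
        return "GREEN"

    strong_z = abs(z) > z_threshold   # assumed value of ZSCORE_THRESHOLD
    moderate_z = abs(z) > 1.0

    # RED: widespread hotspots alone, or strong z backed by any evidence.
    if hotspot_pct > 25:
        return "RED"
    if strong_z and (anomaly_months >= 1 or hotspot_pct > 10):
        return "RED"

    # AMBER: moderate hotspots, or moderate z backed by supporting evidence.
    if hotspot_pct > 10:
        return "AMBER"
    if moderate_z and (anomaly_months >= 2 or hotspot_pct > 5):
        return "AMBER"

    # Strong z with no supporting evidence is capped at AMBER.
    if strong_z:
        return "AMBER"
    return "GREEN"
```

With these assumptions, `classify(1.9, 0.0, anomaly_months=2)` gives AMBER and `classify(1.9, 0.0, anomaly_months=0)` gives GREEN, matching the NDVI cases listed above.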

3. SAR baseline drift detection
- When >40% of months are flagged anomalous, the finding is more
likely a Sentinel-1 IPF version change, orbit geometry shift, or
regime shift than a per-month anomaly pattern. Flag as AMBER with
"baseline may be unreliable" headline and add a limitation noting
the drift check.
- Applied to both batch and non-batch harvest paths.
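
The drift check in item 3 reduces to a ratio test over the monthly z-scores. The sketch below is a hedged illustration: the function name `baseline_drift_suspected` and both default parameters are hypothetical, and the per-month anomaly cutoff of 2.0 is an assumption, since the SAR constants added by this commit are not visible in the diff.

```python
def baseline_drift_suspected(monthly_zscores, z_threshold=2.0, max_anomalous_frac=0.40):
    """Return True when more than 40% of months are flagged anomalous.

    Such a pattern is treated as a sign that the baseline itself may be
    unreliable rather than as a set of independent monthly anomalies.
    """
    if not monthly_zscores:
        return False
    anomalous = sum(1 for z in monthly_zscores if abs(z) > z_threshold)
    return anomalous / len(monthly_zscores) > max_anomalous_frac
```

A caller would then force the status to AMBER and attach the "baseline may be unreliable" headline whenever the function returns True.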

4. Direction-aware compound signals and narrative
- Cross-pattern matcher now checks z-score direction (up/down/stable),
not just status level. Fixes the bug where a greening + contraction
result produced "vegetation loss coincides with settlement expansion"
in the situation narrative.
- New rule for conflict-context greening + built-up contraction:
"Vegetation recovery coincides with apparent built-up contraction —
in conflict or displacement contexts this can reflect abandoned
land returning to vegetation".
- Direction threshold is product-aware: buildup uses 5% change (its
AMBER cutoff); other indicators use z=1.0.
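
The direction-aware matching in item 4 can be sketched as below. Everything here is illustrative: `direction` and `compound_narrative` are hypothetical names, the strict `>` comparison at the threshold is an assumption, and the returned sentences are shortened forms of the narrative strings quoted above.

```python
def direction(value: float, product: str) -> str:
    """Map a signed indicator value to "up", "down" or "stable".

    Built-up change is a percentage and uses its 5% AMBER cutoff;
    other indicators are z-scores and use 1.0.
    """
    threshold = 5.0 if product == "buildup" else 1.0
    if value > threshold:
        return "up"
    if value < -threshold:
        return "down"
    return "stable"


def compound_narrative(ndvi_z: float, buildup_change_pct: float):
    """Pick a compound sentence from the two directions, or None."""
    veg = direction(ndvi_z, "ndvi")
    built = direction(buildup_change_pct, "buildup")
    if veg == "down" and built == "up":
        return "Vegetation loss coincides with settlement expansion"
    if veg == "up" and built == "down":
        return "Vegetation recovery coincides with apparent built-up contraction"
    return None
```

Checking direction rather than status level alone is what keeps a greening plus contraction result from being reported as vegetation loss with expansion.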

5. Formatting fixes
- Narrative no longer produces ".." — strip trailing period from
each sentence before re-adding exactly one.
- Overview map date range moved from below the axes (where it
collided with x-axis ticks) into the title block as a second line.
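
The period fix in item 5 amounts to normalising each sentence before joining. A minimal sketch, assuming the narrative is assembled from a list of sentence strings (the helper name `join_sentences` is hypothetical):

```python
def join_sentences(sentences):
    """Join sentences so that each one ends with exactly one period."""
    cleaned = [s.strip().rstrip(".") for s in sentences if s and s.strip()]
    return " ".join(f"{s}." for s in cleaned if s)


narrative = join_sentences(["Water extent is stable.", "Vegetation is greening"])
```

Note that `rstrip(".")` removes every trailing period, so a sentence that deliberately ends in an ellipsis would also be reduced to a single period.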

Smoke-tested: all gating cases, direction matching, persistence mask,
and a Nyala-like synthetic scenario confirm no regressions and the
specific bugs in the review report are fixed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

app/eo_products/base.py CHANGED
@@ -95,14 +95,66 @@ class BaseProduct(abc.ABC):
95
  return float(valid / total) if total > 0 else 0.0
96
 
97
  @staticmethod
98
- def _classify_zscore(z_score: float, hotspot_pct: float) -> "StatusLevel":
99
- """Classify status using z-score and hotspot percentage."""
100
  from app.models import StatusLevel
101
  from app.config import ZSCORE_THRESHOLD
102
- if abs(z_score) > ZSCORE_THRESHOLD or hotspot_pct > 25:
103
  return StatusLevel.RED
104
- if abs(z_score) > 1.0 or hotspot_pct > 10:
105
  return StatusLevel.AMBER
106
  return StatusLevel.GREEN
107
 
108
  @staticmethod
 
95
  return float(valid / total) if total > 0 else 0.0
96
 
97
  @staticmethod
98
+ def _classify_zscore(
99
+ z_score: float,
100
+ hotspot_pct: float,
101
+ *,
102
+ anomaly_months: int = 0,
103
+ total_months: int = 0,
104
+ min_coverage_pct: float | None = None,
105
+ ) -> "StatusLevel":
106
+ """Classify status using z-score, hotspot %, and evidence gates.
107
+
108
+ Evidence rules (all apply):
109
+
110
+ 1. **Minimum coverage gate.** If the caller passes ``min_coverage_pct``
111
+ (the observed coverage, in percent) and it is below 0.5, the indicator cannot go above GREEN.
112
+ Used by water: when AOI water fraction is <0.5%, a large z-score
113
+ is dominated by noise from a handful of pixels.
114
+
115
+ 2. **Aggregate-z-score alone is not enough.** A z-score between 1.0
116
+ and 2.0 only raises status to AMBER if supported by *at least one*
117
+ of: (a) two or more monthly anomalies, or (b) hotspot coverage > 5%.
118
+ This prevents single-number z-scores from driving false alarms on
119
+ otherwise stable time series.
120
+
121
+ 3. **RED requires either strong aggregate evidence or widespread
122
+ hotspots.** (|z| > ZSCORE_THRESHOLD AND (≥1 anomaly month OR hotspot > 10%)) OR hotspot > 25%.
123
+ """
124
  from app.models import StatusLevel
125
  from app.config import ZSCORE_THRESHOLD
126
+
127
+ z = safe_float(z_score)
128
+ hot = safe_float(hotspot_pct)
129
+
130
+ # Gate 1: minimum coverage
131
+ if min_coverage_pct is not None and min_coverage_pct < 0.5:
132
+ return StatusLevel.GREEN
133
+
134
+ strong_z = abs(z) > ZSCORE_THRESHOLD
135
+ moderate_z = abs(z) > 1.0
136
+ any_monthly_anomaly = anomaly_months >= 1
137
+ multiple_monthly_anomalies = anomaly_months >= 2
138
+ strong_hot = hot > 25
139
+ moderate_hot = hot > 10
140
+ minor_hot = hot > 5
141
+
142
+ # RED: widespread hotspots alone, OR strong z backed by any evidence
143
+ if strong_hot:
144
+ return StatusLevel.RED
145
+ if strong_z and (any_monthly_anomaly or moderate_hot):
146
  return StatusLevel.RED
147
+
148
+ # AMBER: moderate z backed by supporting evidence, or moderate hotspots
149
+ if moderate_hot:
150
+ return StatusLevel.AMBER
151
+ if moderate_z and (multiple_monthly_anomalies or minor_hot):
152
  return StatusLevel.AMBER
153
+
154
+ # Strong z without any supporting evidence → only AMBER, not RED
155
+ if strong_z:
156
+ return StatusLevel.AMBER
157
+
158
  return StatusLevel.GREEN
159
 
160
  @staticmethod
app/eo_products/buildup.py CHANGED
@@ -46,7 +46,12 @@ logger = logging.getLogger(__name__)
46
 
47
  BASELINE_YEARS = 5
48
  NDBI_THRESHOLD = 0.0 # NDBI > 0 = potential built-up
49
- NDVI_BUILDUP_MAX = 0.2 # NDVI < 0.2 required to exclude vegetation (combined with NDBI threshold)
50
 
51
 
52
  class BuiltupProduct(BaseProduct):
@@ -152,143 +157,13 @@ class BuiltupProduct(BaseProduct):
152
  spatial_completeness = self._compute_spatial_completeness(current_path)
153
 
154
  if baseline_path:
155
- seasonal_stats = compute_seasonal_stats_aoi(baseline_path, n_years=BASELINE_YEARS)
156
- baseline_stats = self._compute_stats(baseline_path)
157
- baseline_frac = baseline_stats["overall_buildup_fraction"]
158
- baseline_ha = baseline_frac * aoi_ha
159
-
160
- start_month = time_range.start.month
161
- most_recent_month = ((start_month + n_current_bands - 2) % 12) + 1
162
-
163
- # Z-score for overall current mean NDBI vs seasonal baseline
164
- if most_recent_month in seasonal_stats and seasonal_stats[most_recent_month]["n_years"] > 0:
165
- s = seasonal_stats[most_recent_month]
166
- z_current = safe_float(compute_zscore(current_mean, s["mean"], s["std"], MIN_STD_BUILDUP))
167
- else:
168
- z_current = 0.0
169
-
170
- # Per-month z-scores and anomaly count
171
- anomaly_months = 0
172
- monthly_zscores = []
173
- for i, val in enumerate(current_stats["monthly_means"]):
174
- cal_month = ((start_month + i - 1) % 12) + 1
175
- if cal_month in seasonal_stats and seasonal_stats[cal_month]["n_years"] > 0:
176
- z = safe_float(compute_zscore(val, seasonal_stats[cal_month]["mean"],
177
- seasonal_stats[cal_month]["std"], MIN_STD_BUILDUP))
178
- monthly_zscores.append(z)
179
- if abs(z) > ZSCORE_THRESHOLD:
180
- anomaly_months += 1
181
- else:
182
- monthly_zscores.append(0.0)
183
-
184
- # Pixel-level hotspot detection
185
- month_map = group_bands_by_calendar_month(baseline_stats["valid_months_total"], BASELINE_YEARS)
186
- hotspot_pct = 0.0
187
- self._zscore_raster = None
188
- self._hotspot_mask = None
189
- if most_recent_month in month_map and len(month_map[most_recent_month]) > 0:
190
- pixel_stats = compute_seasonal_stats_pixel(baseline_path, month_map[most_recent_month])
191
- with rasterio.open(current_path) as src:
192
- current_band_idx = min(n_current_bands, src.count)
193
- current_data = src.read(current_band_idx).astype(np.float32)
194
- if src.nodata is not None:
195
- current_data[current_data == src.nodata] = np.nan
196
-
197
- z_raster = compute_zscore_raster(current_data, pixel_stats["mean"],
198
- pixel_stats["std"], MIN_STD_BUILDUP)
199
- hotspot_mask, hotspot_pct = detect_hotspots(z_raster, ZSCORE_THRESHOLD)
200
- self._zscore_raster = z_raster
201
- self._hotspot_mask = hotspot_mask
202
-
203
- # Four-factor confidence scoring
204
- baseline_depth = sum(1 for m in range(1, 13)
205
- if m in seasonal_stats and seasonal_stats[m]["n_years"] > 0)
206
- mean_baseline_years = (sum(seasonal_stats[m]["n_years"] for m in range(1, 13)
207
- if m in seasonal_stats) / max(baseline_depth, 1))
208
- conf = compute_confidence(
209
- valid_months=n_current_bands,
210
-
211
- baseline_years_with_data=int(mean_baseline_years),
212
  spatial_completeness=spatial_completeness,
213
  )
214
- confidence = conf["level"]
215
- confidence_factors = conf["factors"]
216
-
217
- status = self._classify_zscore(z_current, hotspot_pct)
218
- trend = self._compute_trend_zscore(monthly_zscores)
219
-
220
- baseline_buildup_fractions = self._build_seasonal_buildup_fractions(
221
- baseline_stats["monthly_buildup_fractions"], BASELINE_YEARS,
222
- )
223
- chart_data = self._build_seasonal_chart_data(
224
- current_stats["monthly_buildup_fractions"], baseline_buildup_fractions,
225
- time_range, monthly_zscores, aoi_ha,
226
- )
227
-
228
- headline = self._generate_headline(
229
- status=status,
230
- z_current=z_current,
231
- hotspot_pct=hotspot_pct,
232
- anomaly_months=anomaly_months,
233
- total_months=n_current_bands,
234
- value_phrase=f"{current_ha:.0f} ha built-up",
235
- indicator_label="Built-up areas",
236
- direction_up="expansion",
237
- direction_down="contraction",
238
- )
239
-
240
- # Write change raster for map rendering
241
- change_map_path = os.path.join(results_dir, "buildup_change.tif")
242
- self._write_change_raster(current_path, baseline_path, change_map_path)
243
-
244
- self._spatial_data = SpatialData(
245
- map_type="raster",
246
- label="Built-up Change",
247
- colormap="PiYG",
248
- vmin=-1,
249
- vmax=1,
250
- )
251
- self._product_raster_path = change_map_path
252
- self._render_band = 1
253
-
254
- return ProductResult(
255
- product_id=self.id,
256
- headline=headline,
257
- status=status,
258
- trend=trend,
259
- confidence=confidence,
260
- map_layer_path=change_map_path,
261
- chart_data=chart_data,
262
- data_source="satellite",
263
- anomaly_months=anomaly_months,
264
- z_score_current=round(z_current, 2),
265
- hotspot_pct=round(hotspot_pct, 1),
266
- confidence_factors=confidence_factors,
267
- summary=(
268
- f"Built-up area covers {current_frac*100:.1f}% of the AOI "
269
- f"({current_ha:.0f} ha), mean NDBI {current_mean:.3f} "
270
- f"(z-score {z_current:+.1f} vs seasonal baseline). "
271
- f"{anomaly_months} of {n_current_bands} months show significant anomalies. "
272
- f"{hotspot_pct:.0f}% of AOI has statistically significant change. "
273
- f"Pixel-level NDBI analysis at {BUILDUP_RESOLUTION_M}m resolution."
274
- ),
275
- methodology=(
276
- f"Sentinel-2 L2A pixel-level NDBI = (B11 \u2212 B08) / (B11 + B08). "
277
- f"Built-up classified as NDBI > {NDBI_THRESHOLD}. "
278
- f"Cloud-masked using SCL band. "
279
- f"Monthly median composites at {BUILDUP_RESOLUTION_M}m native resolution. "
280
- f"Baseline: {BASELINE_YEARS}-year seasonal baselines (per calendar month). "
281
- f"Anomaly detection via z-scores (threshold: \u00b1{ZSCORE_THRESHOLD}). "
282
- f"Processed via CDSE openEO batch jobs."
283
- ),
284
- limitations=[
285
- f"Resampled to {BUILDUP_RESOLUTION_M}m \u2014 detects settlement extent, not individual buildings.",
286
- "NDBI may confuse bare rock/sand with built-up in arid landscapes.",
287
- "Seasonal vegetation cycles can cause false positives at settlement fringes.",
288
- "For building-level analysis, the SR4S pipeline (GPU-dependent) would be needed.",
289
- "Z-score anomalies assume baseline is representative of normal conditions.",
290
- ],
291
- )
292
  else:
293
  # Degraded mode — no baseline
294
  z_current = 0.0
@@ -415,105 +290,108 @@ class BuiltupProduct(BaseProduct):
415
  )
416
 
417
  self._true_color_path = true_color_path
 
418
 
419
- # --- Seasonal baseline analysis ---
420
- current_stats = self._compute_stats(current_path)
421
- baseline_stats = self._compute_stats(baseline_path)
422
- current_mean = current_stats["overall_mean"]
423
- current_frac = current_stats["overall_buildup_fraction"]
424
- n_current_bands = current_stats["valid_months"]
425
  aoi_ha = aoi.area_km2 * 100 # km² → hectares
426
- current_ha = current_frac * aoi_ha
427
- baseline_frac = baseline_stats["overall_buildup_fraction"]
428
- baseline_ha = baseline_frac * aoi_ha
429
 
430
- spatial_completeness = self._compute_spatial_completeness(current_path)
431
 
432
- seasonal_stats = compute_seasonal_stats_aoi(baseline_path, n_years=BASELINE_YEARS)
433
- start_month = time_range.start.month
434
- most_recent_month = ((start_month + n_current_bands - 2) % 12) + 1
435
 
436
- # Z-score for overall current mean NDBI vs seasonal baseline
437
- if most_recent_month in seasonal_stats and seasonal_stats[most_recent_month]["n_years"] > 0:
438
- s = seasonal_stats[most_recent_month]
439
- z_current = compute_zscore(current_mean, s["mean"], s["std"], MIN_STD_BUILDUP)
 
440
  else:
441
- z_current = 0.0
442
 
443
- # Per-month z-scores and anomaly count
444
- anomaly_months = 0
445
- monthly_zscores = []
446
- for i, val in enumerate(current_stats["monthly_means"]):
447
- cal_month = ((start_month + i - 1) % 12) + 1
448
- if cal_month in seasonal_stats and seasonal_stats[cal_month]["n_years"] > 0:
449
- z = compute_zscore(val, seasonal_stats[cal_month]["mean"],
450
- seasonal_stats[cal_month]["std"], MIN_STD_BUILDUP)
451
- monthly_zscores.append(z)
452
- if abs(z) > ZSCORE_THRESHOLD:
453
- anomaly_months += 1
454
- else:
455
- monthly_zscores.append(0.0)
456
-
457
- # Pixel-level hotspot detection
458
- month_map = group_bands_by_calendar_month(baseline_stats["valid_months_total"], BASELINE_YEARS)
459
- hotspot_pct = 0.0
460
- self._zscore_raster = None
461
- self._hotspot_mask = None
462
- if most_recent_month in month_map and len(month_map[most_recent_month]) > 0:
463
- pixel_stats = compute_seasonal_stats_pixel(baseline_path, month_map[most_recent_month])
464
- with rasterio.open(current_path) as src:
465
- current_band_idx = min(n_current_bands, src.count)
466
- current_data = src.read(current_band_idx).astype(np.float32)
467
- if src.nodata is not None:
468
- current_data[current_data == src.nodata] = np.nan
469
- z_raster = compute_zscore_raster(current_data, pixel_stats["mean"],
470
- pixel_stats["std"], MIN_STD_BUILDUP)
471
- hotspot_mask, hotspot_pct = detect_hotspots(z_raster, ZSCORE_THRESHOLD)
472
- self._zscore_raster = z_raster
473
- self._hotspot_mask = hotspot_mask
474
-
475
- # Four-factor confidence scoring
476
- baseline_depth = sum(1 for m in range(1, 13)
477
- if m in seasonal_stats and seasonal_stats[m]["n_years"] > 0)
478
- mean_baseline_years = (sum(seasonal_stats[m]["n_years"] for m in range(1, 13)
479
- if m in seasonal_stats) / max(baseline_depth, 1))
480
- conf = compute_confidence(
481
- valid_months=n_current_bands,
482
- baseline_years_with_data=int(mean_baseline_years),
483
- spatial_completeness=spatial_completeness,
484
  )
485
- confidence = conf["level"]
486
- confidence_factors = conf["factors"]
487
 
488
- status = self._classify_zscore(z_current, hotspot_pct)
489
- trend = self._compute_trend_zscore(monthly_zscores)
490
- baseline_buildup_fractions = self._build_seasonal_buildup_fractions(
491
- baseline_stats["monthly_buildup_fractions"], BASELINE_YEARS,
492
  )
493
- chart_data = self._build_seasonal_chart_data(
494
- current_stats["monthly_buildup_fractions"], baseline_buildup_fractions,
495
- time_range, monthly_zscores, aoi_ha,
496
  )
497
 
498
- headline = self._generate_headline(
499
- status=status,
500
- z_current=z_current,
501
- hotspot_pct=hotspot_pct,
502
- anomaly_months=anomaly_months,
503
- total_months=n_current_bands,
504
- value_phrase=f"{current_ha:.0f} ha built-up",
505
- indicator_label="Built-up areas",
506
- direction_up="expansion",
507
- direction_down="contraction",
508
  )
509
 
510
- # Write change raster for map rendering
511
  change_map_path = os.path.join(results_dir, "buildup_change.tif")
512
- self._write_change_raster(current_path, baseline_path, change_map_path)
513
 
514
  self._spatial_data = SpatialData(
515
  map_type="raster",
516
- label="Built-up Change",
517
  colormap="PiYG",
518
  vmin=-1,
519
  vmax=1,
@@ -526,40 +404,150 @@ class BuiltupProduct(BaseProduct):
526
  headline=headline,
527
  status=status,
528
  trend=trend,
529
- confidence=confidence,
530
  map_layer_path=change_map_path,
531
  chart_data=chart_data,
532
  data_source="satellite",
533
- anomaly_months=anomaly_months,
534
- z_score_current=round(z_current, 2),
535
  hotspot_pct=round(hotspot_pct, 1),
536
- confidence_factors=confidence_factors,
537
  summary=(
538
- f"Built-up area covers {current_frac*100:.1f}% of the AOI "
539
- f"({current_ha:.0f} ha), mean NDBI {current_mean:.3f} "
540
- f"(z-score {z_current:+.1f} vs seasonal baseline). "
541
- f"{anomaly_months} of {n_current_bands} months show significant anomalies. "
542
- f"{hotspot_pct:.0f}% of AOI has statistically significant change. "
543
- f"Pixel-level NDBI analysis at {BUILDUP_RESOLUTION_M}m resolution."
 
 
544
  ),
545
  methodology=(
546
- f"Sentinel-2 L2A pixel-level NDBI = (B11 \u2212 B08) / (B11 + B08). "
547
- f"Built-up classified as NDBI > {NDBI_THRESHOLD}. "
548
  f"Cloud-masked using SCL band. "
549
- f"Monthly median composites at {BUILDUP_RESOLUTION_M}m native resolution. "
550
- f"Baseline: {BASELINE_YEARS}-year seasonal baselines (per calendar month). "
551
- f"Anomaly detection via z-scores (threshold: \u00b1{ZSCORE_THRESHOLD}). "
552
- f"Processed server-side via CDSE openEO."
553
  ),
554
  limitations=[
555
- f"Resampled to {BUILDUP_RESOLUTION_M}m \u2014 detects settlement extent, not individual buildings.",
556
- "NDBI may confuse bare rock/sand with built-up in arid landscapes.",
557
- "Seasonal vegetation cycles can cause false positives at settlement fringes.",
558
- "For building-level analysis, the SR4S pipeline (GPU-dependent) would be needed.",
559
- "Z-score anomalies assume baseline is representative of normal conditions.",
560
  ],
561
  )
562
 
563
  @staticmethod
564
  def _compute_stats(tif_path: str) -> dict[str, Any]:
565
  """Extract monthly built-up fraction and raw NDBI stats from GeoTIFF.
 
46
 
47
  BASELINE_YEARS = 5
48
  NDBI_THRESHOLD = 0.0 # NDBI > 0 = potential built-up
49
+ NDVI_BUILDUP_MAX = 0.15 # NDVI < 0.15 to exclude even sparse vegetation (tighter than generic 0.2)
50
+ PERSISTENCE_MIN_FRAC = 0.6 # Pixel must meet built-up condition in ≥60% of months to be "persistent" (excludes dry-season-only false positives)
51
+
52
+ # Change thresholds for status classification (persistent fraction, not z-score)
53
+ CHANGE_AMBER_PCT = 5.0 # |change| ≥ 5% → AMBER
54
+ CHANGE_RED_PCT = 15.0 # |change| ≥ 15% → RED
55
 
56
 
57
  class BuiltupProduct(BaseProduct):
 
157
  spatial_completeness = self._compute_spatial_completeness(current_path)
158
 
159
  if baseline_path:
160
+ return self._analyze_persistence(
161
+ current_path=current_path,
162
+ baseline_path=baseline_path,
163
+ aoi=aoi,
164
+ results_dir=results_dir,
165
  spatial_completeness=spatial_completeness,
166
  )
167
  else:
168
  # Degraded mode — no baseline
169
  z_current = 0.0
 
290
  )
291
 
292
  self._true_color_path = true_color_path
293
+ spatial_completeness = self._compute_spatial_completeness(current_path)
294
 
295
+ return self._analyze_persistence(
296
+ current_path=current_path,
297
+ baseline_path=baseline_path,
298
+ aoi=aoi,
299
+ results_dir=results_dir,
300
+ spatial_completeness=spatial_completeness,
301
+ )
302
+
303
+ def _analyze_persistence(
304
+ self,
305
+ current_path: str,
306
+ baseline_path: str,
307
+ aoi: AOI,
308
+ results_dir: str,
309
+ spatial_completeness: float,
310
+ ) -> ProductResult:
311
+ """Shared persistence-based analysis for built-up change.
312
+
313
+ Used by both the batch harvest path and the non-batch process path.
314
+ Removes the monthly z-score approach (which tracks seasonal
315
+ vegetation cycles) and replaces it with a persistent-mask
316
+ comparison between the current and baseline periods.
317
+ """
318
  aoi_ha = aoi.area_km2 * 100 # km² → hectares
319
 
320
+ current_mask, current_persist_frac, n_current_months, _ = (
321
+ self._compute_persistent_buildup_mask(current_path)
322
+ )
323
+ baseline_mask, baseline_persist_frac, n_baseline_months, _ = (
324
+ self._compute_persistent_buildup_mask(baseline_path)
325
+ )
326
 
327
+ current_ha = current_persist_frac * aoi_ha
328
+ baseline_ha = baseline_persist_frac * aoi_ha
 
329
 
330
+ if baseline_persist_frac > 1e-6:
331
+ change_pct = safe_float(
332
+ (current_persist_frac - baseline_persist_frac)
333
+ / baseline_persist_frac * 100.0
334
+ )
335
  else:
336
+ change_pct = 0.0
337
 
338
+ status = self._classify_buildup_change(change_pct)
339
+ trend = (
340
+ TrendDirection.DETERIORATING if change_pct > CHANGE_AMBER_PCT
341
+ else (TrendDirection.IMPROVING if change_pct < -CHANGE_AMBER_PCT
342
+ else TrendDirection.STABLE)
343
  )
 
 
344
 
345
+ change_raster = current_mask - baseline_mask
346
+ newly_built_pct = safe_float(
347
+ np.sum(change_raster > 0) / change_raster.size * 100.0
 
348
  )
349
+ lost_built_pct = safe_float(
350
+ np.sum(change_raster < 0) / change_raster.size * 100.0
 
351
  )
352
+ hotspot_pct = newly_built_pct + lost_built_pct
353
+ self._zscore_raster = change_raster.astype(np.float32) * 3.0
354
+ self._hotspot_mask = np.abs(change_raster) > 0.5
355
 
356
+ conf = compute_confidence(
357
+ valid_months=n_current_months,
358
+ baseline_years_with_data=max(1, n_baseline_months // 12),
359
+ spatial_completeness=spatial_completeness,
360
  )
361
 
362
+ chart_data = {
363
+ "dates": ["Baseline period", "Current period"],
364
+ "values": [round(baseline_ha, 1), round(current_ha, 1)],
365
+ "baseline_mean": [round(baseline_ha, 1), round(baseline_ha, 1)],
366
+ "baseline_min": [round(baseline_ha, 1), round(baseline_ha, 1)],
367
+ "baseline_max": [round(baseline_ha, 1), round(baseline_ha, 1)],
368
+ "anomaly_flags": [False, abs(change_pct) >= CHANGE_AMBER_PCT],
369
+ "label": "Persistent built-up area (hectares)",
370
+ }
371
+
372
+ if status == StatusLevel.GREEN:
373
+ headline = (
374
+ f"Built-up areas stable ({current_ha:.0f} ha, "
375
+ f"{change_pct:+.1f}% vs baseline)."
376
+ )
377
+ else:
378
+ severity = "Major" if status == StatusLevel.RED else "Moderate"
379
+ direction = "expansion" if change_pct > 0 else "contraction"
380
+ headline = (
381
+ f"Built-up areas: {severity.lower()} {direction} "
382
+ f"({change_pct:+.1f}%, now {current_ha:.0f} ha)."
383
+ )
384
+
385
  change_map_path = os.path.join(results_dir, "buildup_change.tif")
386
+ with rasterio.open(current_path) as src:
387
+ profile = src.profile.copy()
388
+ profile.update(count=1, dtype="float32")
389
+ with rasterio.open(change_map_path, "w", **profile) as dst:
390
+ dst.write(change_raster.astype(np.float32), 1)
391
 
392
  self._spatial_data = SpatialData(
393
  map_type="raster",
394
+ label="Built-up change (new / lost)",
395
  colormap="PiYG",
396
  vmin=-1,
397
  vmax=1,
 
404
  headline=headline,
405
  status=status,
406
  trend=trend,
407
+ confidence=conf["level"],
408
  map_layer_path=change_map_path,
409
  chart_data=chart_data,
410
  data_source="satellite",
411
+ anomaly_months=0,
412
+ z_score_current=round(change_pct, 2),
413
  hotspot_pct=round(hotspot_pct, 1),
414
+ confidence_factors=conf["factors"],
415
  summary=(
416
+ f"Persistent built-up area: {current_ha:.0f} ha "
417
+ f"({current_persist_frac*100:.1f}% of AOI). "
418
+ f"Baseline ({BASELINE_YEARS}-year average): {baseline_ha:.0f} ha "
419
+ f"({baseline_persist_frac*100:.1f}%). "
420
+ f"Change: {change_pct:+.1f}% "
421
+ f"({newly_built_pct:.1f}% newly built-up, {lost_built_pct:.1f}% lost). "
422
+ f"Classification uses persistence filter: pixel must be "
423
+ f"built-up in ≥{int(PERSISTENCE_MIN_FRAC*100)}% of months."
424
  ),
425
  methodology=(
426
+ f"Sentinel-2 L2A: NDBI = (B11 \u2212 B08) / (B11 + B08); "
427
+ f"NDVI = (B08 \u2212 B04) / (B08 + B04). "
428
+ f"A pixel is classified as persistently built-up if "
429
+ f"NDBI > {NDBI_THRESHOLD} AND NDVI < {NDVI_BUILDUP_MAX} "
430
+ f"in at least {int(PERSISTENCE_MIN_FRAC*100)}% of valid months "
431
+ f"across the analysis period. This removes seasonal "
432
+ f"false-positives from bare soil in dry seasons. "
433
+ f"Baseline: {BASELINE_YEARS} years preceding the current period. "
434
+ f"Status thresholds: |change|≥{CHANGE_RED_PCT}% → RED, "
435
+ f"≥{CHANGE_AMBER_PCT}% → AMBER. "
436
  f"Cloud-masked using SCL band. "
437
+ f"{BUILDUP_RESOLUTION_M}m native resolution. "
438
+ f"Processed via CDSE openEO batch jobs."
 
 
439
  ),
440
  limitations=[
441
+ f"Resampled to {BUILDUP_RESOLUTION_M}m \u2014 detects settlement extent, not individual buildings.",
442
+ "NDBI-based classification can confuse bare rock/sand with built-up; the NDVI mask and persistence filter reduce but do not eliminate this.",
443
+ "For sparse, low-density or newly-built structures, detection is unreliable below a few hundred m².",
444
+ "Persistent-mask approach cannot capture sub-annual growth; use annual windows for tighter monitoring.",
445
+ "For building-level analysis, a dedicated footprint extraction pipeline (e.g., GHSL, WSF, SR4S) would be needed.",
446
  ],
447
  )
448
 
449
+ @staticmethod
450
+ def _classify_buildup_change(change_pct: float) -> StatusLevel:
451
+ """Classify built-up change using absolute % change, not z-score.
452
+
453
+ Z-scores are unstable for built-up because monthly NDBI
454
+ classification cycles with vegetation. A direct percentage
455
+ comparison against the baseline persistence mask is more honest.
456
+ """
457
+ abs_change = abs(safe_float(change_pct))
458
+ if abs_change >= CHANGE_RED_PCT:
459
+ return StatusLevel.RED
460
+ if abs_change >= CHANGE_AMBER_PCT:
461
+ return StatusLevel.AMBER
462
+ return StatusLevel.GREEN
463
+
464
+ @staticmethod
465
+ def _compute_persistent_buildup_mask(tif_path: str) -> tuple[np.ndarray, float, int, dict]:
466
+ """Compute a persistence-based built-up mask from paired NDBI/NDVI bands.
467
+
468
+ A pixel is classified as *persistently* built-up only if it satisfies
469
+ the combined condition (NDBI > NDBI_THRESHOLD AND NDVI < NDVI_BUILDUP_MAX)
470
+ in at least PERSISTENCE_MIN_FRAC of the valid months in the stack.
471
+
472
+ This removes the seasonal cycling artifact where bare soil is
473
+ classified as built-up in the dry season but not in the wet season.
474
+
475
+ Returns
476
+ -------
477
+ mask : np.ndarray
478
+ Binary 2D mask (1 = persistently built-up, 0 = not).
479
+ fraction : float
480
+ Fraction of non-nodata pixels classified as persistent built-up.
481
+ valid_months : int
482
+ Number of valid monthly observations found in the stack.
483
+ profile : dict
484
+ Rasterio profile of the source TIF (for downstream writes).
485
+ """
486
+ with rasterio.open(tif_path) as src:
487
+ count = src.count
488
+ profile = src.profile.copy()
489
+ paired = count >= 2 and count % 2 == 0
490
+ n_months = count // 2 if paired else count
491
+
492
+ # Accumulate: number of months each pixel satisfies built-up condition
493
+ # and number of months the pixel had any valid observation at all.
494
+ built_counts: np.ndarray | None = None
495
+ obs_counts: np.ndarray | None = None
496
+ nodata = src.nodata
497
+
498
+ for m in range(n_months):
499
+ if paired:
500
+ ndbi = src.read(m * 2 + 1).astype(np.float32)
501
+ ndvi = src.read(m * 2 + 2).astype(np.float32)
502
+ else:
503
+ ndbi = src.read(m + 1).astype(np.float32)
504
+ ndvi = None
505
+
506
+ if nodata is not None:
507
+ valid = ndbi != nodata
508
+ else:
509
+ valid = ~np.isnan(ndbi)
510
+
511
+ if paired and ndvi is not None:
512
+ cond = (ndbi > NDBI_THRESHOLD) & (ndvi < NDVI_BUILDUP_MAX) & valid
513
+ else:
514
+ # Legacy single-band fallback: NDBI only
515
+ cond = (ndbi > NDBI_THRESHOLD) & valid
516
+
517
+ if built_counts is None:
518
+ built_counts = cond.astype(np.uint16)
519
+ obs_counts = valid.astype(np.uint16)
520
+ else:
521
+ built_counts += cond.astype(np.uint16)
522
+ obs_counts += valid.astype(np.uint16)
523
+
524
+ if built_counts is None or obs_counts is None:
525
+ return np.zeros((1, 1), dtype=np.float32), 0.0, 0, profile
526
+
527
+ # A pixel counts as persistently built-up if:
528
+ # - it has ≥1 valid observation, AND
529
+ # - built_count / obs_count ≥ PERSISTENCE_MIN_FRAC
530
+ with np.errstate(divide="ignore", invalid="ignore"):
531
+ persistent_frac_per_pixel = np.where(
532
+ obs_counts > 0,
533
+ built_counts.astype(np.float32) / obs_counts.astype(np.float32),
534
+ 0.0,
535
+ )
536
+ mask = (
537
+ (obs_counts >= 1)
538
+ & (persistent_frac_per_pixel >= PERSISTENCE_MIN_FRAC)
539
+ ).astype(np.float32)
540
+
541
+ # Area fraction: persistent built-up pixels / all pixels with any valid data
542
+ any_valid = obs_counts >= 1
543
+ total_valid = int(np.sum(any_valid))
544
+ if total_valid > 0:
545
+ fraction = safe_float(np.sum(mask) / total_valid)
546
+ else:
547
+ fraction = 0.0
548
+
549
+ return mask, fraction, n_months, profile
550
+
551
  @staticmethod
552
  def _compute_stats(tif_path: str) -> dict[str, Any]:
553
  """Extract monthly built-up fraction and raw NDBI stats from GeoTIFF.
app/eo_products/ndvi.py CHANGED
@@ -200,7 +200,11 @@ class NdviProduct(BaseProduct):
200
  confidence = conf["level"]
201
  confidence_factors = conf["factors"]
202
 
203
- status = self._classify_zscore(z_current, hotspot_pct)
204
  trend = self._compute_trend_zscore(monthly_zscores)
205
 
206
  chart_data = self._build_seasonal_chart_data(
@@ -397,7 +401,11 @@ class NdviProduct(BaseProduct):
397
  confidence = conf["level"]
398
  confidence_factors = conf["factors"]
399
 
400
- status = self._classify_zscore(z_current, hotspot_pct)
401
  trend = self._compute_trend_zscore(monthly_zscores)
402
  chart_data = self._build_seasonal_chart_data(
403
  current_stats["monthly_means"], seasonal_stats, time_range, monthly_zscores,
 
200
  confidence = conf["level"]
201
  confidence_factors = conf["factors"]
202
 
203
+ status = self._classify_zscore(
204
+ z_current, hotspot_pct,
205
+ anomaly_months=anomaly_months,
206
+ total_months=n_current_bands,
207
+ )
208
  trend = self._compute_trend_zscore(monthly_zscores)
209
 
210
  chart_data = self._build_seasonal_chart_data(
 
401
  confidence = conf["level"]
402
  confidence_factors = conf["factors"]
403
 
404
+ status = self._classify_zscore(
405
+ z_current, hotspot_pct,
406
+ anomaly_months=anomaly_months,
407
+ total_months=n_current_bands,
408
+ )
409
  trend = self._compute_trend_zscore(monthly_zscores)
410
  chart_data = self._build_seasonal_chart_data(
411
  current_stats["monthly_means"], seasonal_stats, time_range, monthly_zscores,
app/eo_products/sar.py CHANGED
@@ -43,6 +43,7 @@ logger = logging.getLogger(__name__)
43
  BASELINE_YEARS = 5
44
  CHANGE_THRESHOLD_DB = 3.0 # dB change considered significant
45
  FLOOD_SIGMA = 2.0 # Standard deviations below baseline mean
 
46
 
47
 
48
  class SarProduct(BaseProduct):
@@ -226,25 +227,45 @@ class SarProduct(BaseProduct):
226
  ))
227
  hotspot_pct = safe_float(hotspot_pct)
228
 
229
- status = self._classify_zscore(z_current, hotspot_pct)
230
- trend = self._compute_trend_zscore(monthly_zscores)
231
 
232
  chart_data = self._build_seasonal_chart_data(
233
  current_stats["monthly_vv_means"], seasonal_stats, time_range, monthly_zscores,
234
  )
235
 
236
- headline = self._generate_headline(
237
- status=status,
238
- z_current=z_current,
239
- hotspot_pct=hotspot_pct,
240
- anomaly_months=anomaly_months,
241
- total_months=n_current_bands,
242
- value_phrase=f"backscatter {current_mean:.1f} dB",
243
- indicator_label="Ground surface",
244
- direction_up="brightening (drying or new structures)",
245
- direction_down="darkening (possible flooding or moisture)",
246
- )
247
-
248
  change_map_path = os.path.join(results_dir, "sar_change.tif")
249
  self._write_change_raster(current_path, baseline_path, change_map_path)
250
 
@@ -259,16 +280,30 @@ class SarProduct(BaseProduct):
259
  self._render_band = 1
260
  map_layer_path = change_map_path
261
 
262
- summary = (
263
- f"Mean VV backscatter: {current_mean:.1f} dB (z-score {z_current:+.1f} vs seasonal baseline). "
264
- f"{anomaly_months} of {n_current_bands} months show significant anomalies. "
265
- f"{hotspot_pct:.0f}% of AOI has statistically significant change. "
266
- f"Mean VV change: {change_db:+.1f} dB. "
267
- f"{change_pct:.1f}% area with >{CHANGE_THRESHOLD_DB} dB change. "
268
- f"{flood_months} month(s) with potential flood signals. "
269
- f"Pixel-level analysis at {SAR_RESOLUTION_M}m resolution."
270
- )
271
- extra_limitations: list[str] = []
272
  else:
273
  # Degraded mode — no baseline
274
  z_current = 0.0
@@ -332,6 +367,8 @@ class SarProduct(BaseProduct):
332
  f"Anomaly detection via z-scores (threshold: ±{ZSCORE_THRESHOLD}). "
333
  f"Change detection: >{CHANGE_THRESHOLD_DB} dB difference vs baseline. "
334
  f"Flood mapping: VV < baseline_mean − {FLOOD_SIGMA}σ. "
 
 
335
  f"Processed via CDSE openEO batch jobs."
336
  ),
337
  limitations=[
@@ -494,25 +531,41 @@ class SarProduct(BaseProduct):
494
  ))
495
  hotspot_pct = safe_float(hotspot_pct)
496
 
497
- status = self._classify_zscore(z_current, hotspot_pct)
498
- trend = self._compute_trend_zscore(monthly_zscores)
499
 
500
  chart_data = self._build_seasonal_chart_data(
501
  current_stats["monthly_vv_means"], seasonal_stats, time_range, monthly_zscores,
502
  )
503
 
504
- headline = self._generate_headline(
505
- status=status,
506
- z_current=z_current,
507
- hotspot_pct=hotspot_pct,
508
- anomaly_months=anomaly_months,
509
- total_months=n_current_bands,
510
- value_phrase=f"backscatter {current_mean:.1f} dB",
511
- indicator_label="Ground surface",
512
- direction_up="brightening (drying or new structures)",
513
- direction_down="darkening (possible flooding or moisture)",
514
- )
515
-
516
  # Store raster path for map rendering — write a change map
517
  change_map_path = os.path.join(results_dir, "sar_change.tif")
518
  self._write_change_raster(current_path, baseline_path, change_map_path)
@@ -541,13 +594,24 @@ class SarProduct(BaseProduct):
541
  hotspot_pct=round(hotspot_pct, 1),
542
  confidence_factors=confidence_factors,
543
  summary=(
544
- f"Mean VV backscatter: {current_mean:.1f} dB (z-score {z_current:+.1f} vs seasonal baseline). "
545
- f"{anomaly_months} of {n_current_bands} months show significant anomalies. "
546
- f"{hotspot_pct:.0f}% of AOI has statistically significant change. "
547
- f"Mean VV change: {change_db:+.1f} dB. "
548
- f"{change_pct:.1f}% area with >{CHANGE_THRESHOLD_DB} dB change. "
549
- f"{flood_months} month(s) with potential flood signals. "
550
- f"Pixel-level analysis at {SAR_RESOLUTION_M}m resolution."
551
  ),
552
  methodology=(
553
  f"Sentinel-1 GRD IW VV/VH polarizations, ascending orbit. "
@@ -557,6 +621,8 @@ class SarProduct(BaseProduct):
557
  f"Anomaly detection via z-scores (threshold: ±{ZSCORE_THRESHOLD}). "
558
  f"Change detection: >{CHANGE_THRESHOLD_DB} dB difference vs baseline. "
559
  f"Flood mapping: VV < baseline_mean − {FLOOD_SIGMA}σ. "
 
 
560
  f"Processed via CDSE openEO."
561
  ),
562
  limitations=[
@@ -565,7 +631,10 @@ class SarProduct(BaseProduct):
565
  "Sentinel-1 coverage over East Africa can be inconsistent.",
566
  "VV decrease may indicate flooding, moisture, or vegetation change — not uniquely flood.",
567
  "Z-score anomalies assume baseline is representative of normal conditions.",
568
- ],
 
 
 
569
  )
570
 
571
  # ------------------------------------------------------------------
 
43
  BASELINE_YEARS = 5
44
  CHANGE_THRESHOLD_DB = 3.0 # dB change considered significant
45
  FLOOD_SIGMA = 2.0 # Standard deviations below baseline mean
46
+ BASELINE_DRIFT_THRESHOLD = 0.4 # If ≥40% of months are anomalous, treat as baseline drift, not a finding
47
 
48
 
49
  class SarProduct(BaseProduct):
 
227
  ))
228
  hotspot_pct = safe_float(hotspot_pct)
229
 
230
+ # Baseline drift detection: if a large share of months (at or above
231
+ # BASELINE_DRIFT_THRESHOLD) is flagged as anomalous, this is more likely a
232
+ # baseline calibration shift (S1 IPF version change, orbit geometry) or
233
+ # regime shift than a real per-month anomaly pattern.
234
+ drift_frac = (anomaly_months / n_current_bands) if n_current_bands > 0 else 0.0
235
+ baseline_drift_detected = drift_frac >= BASELINE_DRIFT_THRESHOLD
236
+
237
+ if baseline_drift_detected:
238
+ status = StatusLevel.AMBER
239
+ trend = TrendDirection.STABLE
240
+ headline = (
241
+ f"Ground surface: baseline may be unreliable — "
242
+ f"{anomaly_months} of {n_current_bands} months diverge from the "
243
+ f"{BASELINE_YEARS}-year baseline (possible sensor calibration or regime shift, "
244
+ f"not a per-month anomaly pattern)."
245
+ )
246
+ else:
247
+ status = self._classify_zscore(
248
+ z_current, hotspot_pct,
249
+ anomaly_months=anomaly_months,
250
+ total_months=n_current_bands,
251
+ )
252
+ trend = self._compute_trend_zscore(monthly_zscores)
253
+ headline = self._generate_headline(
254
+ status=status,
255
+ z_current=z_current,
256
+ hotspot_pct=hotspot_pct,
257
+ anomaly_months=anomaly_months,
258
+ total_months=n_current_bands,
259
+ value_phrase=f"backscatter {current_mean:.1f} dB",
260
+ indicator_label="Ground surface",
261
+ direction_up="brightening (drying or new structures)",
262
+ direction_down="darkening (possible flooding or moisture)",
263
+ )
264
 
265
  chart_data = self._build_seasonal_chart_data(
266
  current_stats["monthly_vv_means"], seasonal_stats, time_range, monthly_zscores,
267
  )
268
 
 
 
 
 
 
 
 
 
 
 
 
 
269
  change_map_path = os.path.join(results_dir, "sar_change.tif")
270
  self._write_change_raster(current_path, baseline_path, change_map_path)
271
 
 
280
  self._render_band = 1
281
  map_layer_path = change_map_path
282
 
283
+ if baseline_drift_detected:
284
+ summary = (
285
+ f"Baseline instability detected: {anomaly_months} of "
286
+ f"{n_current_bands} months diverge from the {BASELINE_YEARS}-year "
287
+ f"baseline. This pattern is more consistent with Sentinel-1 "
288
+ f"processor/calibration change or regime shift than a "
289
+ f"per-month anomaly signal. Per-month z-scores not reported as "
290
+ f"reliable indicators. Mean VV: {current_mean:.1f} dB "
291
+ f"(change from baseline: {change_db:+.1f} dB)."
292
+ )
293
+ extra_limitations: list[str] = [
294
+ f"Baseline instability flagged: the {BASELINE_YEARS}-year SAR baseline may include calibration changes or regime shifts. Re-check with a shorter baseline window or consistent relative-orbit filtering before interpreting.",
295
+ ]
296
+ else:
297
+ summary = (
298
+ f"Mean VV backscatter: {current_mean:.1f} dB (z-score {z_current:+.1f} vs seasonal baseline). "
299
+ f"{anomaly_months} of {n_current_bands} months show significant anomalies. "
300
+ f"{hotspot_pct:.0f}% of AOI has statistically significant change. "
301
+ f"Mean VV change: {change_db:+.1f} dB. "
302
+ f"{change_pct:.1f}% area with >{CHANGE_THRESHOLD_DB} dB change. "
303
+ f"{flood_months} month(s) with potential flood signals. "
304
+ f"Pixel-level analysis at {SAR_RESOLUTION_M}m resolution."
305
+ )
306
+ extra_limitations: list[str] = []
307
  else:
308
  # Degraded mode — no baseline
309
  z_current = 0.0
 
367
  f"Anomaly detection via z-scores (threshold: ±{ZSCORE_THRESHOLD}). "
368
  f"Change detection: >{CHANGE_THRESHOLD_DB} dB difference vs baseline. "
369
  f"Flood mapping: VV < baseline_mean − {FLOOD_SIGMA}σ. "
370
+ f"Baseline drift check: if ≥{int(BASELINE_DRIFT_THRESHOLD*100)}% of months are anomalous, "
371
+ f"the finding is flagged as likely baseline instability rather than a real signal. "
372
  f"Processed via CDSE openEO batch jobs."
373
  ),
374
  limitations=[
 
531
  ))
532
  hotspot_pct = safe_float(hotspot_pct)
533
 
534
+ drift_frac = (anomaly_months / n_current_bands) if n_current_bands > 0 else 0.0
535
+ baseline_drift_detected = drift_frac >= BASELINE_DRIFT_THRESHOLD
536
+
537
+ if baseline_drift_detected:
538
+ status = StatusLevel.AMBER
539
+ trend = TrendDirection.STABLE
540
+ headline = (
541
+ f"Ground surface: baseline may be unreliable — "
542
+ f"{anomaly_months} of {n_current_bands} months diverge from the "
543
+ f"{BASELINE_YEARS}-year baseline (possible sensor calibration "
544
+ f"or regime shift, not a per-month anomaly pattern)."
545
+ )
546
+ else:
547
+ status = self._classify_zscore(
548
+ z_current, hotspot_pct,
549
+ anomaly_months=anomaly_months,
550
+ total_months=n_current_bands,
551
+ )
552
+ trend = self._compute_trend_zscore(monthly_zscores)
553
+ headline = self._generate_headline(
554
+ status=status,
555
+ z_current=z_current,
556
+ hotspot_pct=hotspot_pct,
557
+ anomaly_months=anomaly_months,
558
+ total_months=n_current_bands,
559
+ value_phrase=f"backscatter {current_mean:.1f} dB",
560
+ indicator_label="Ground surface",
561
+ direction_up="brightening (drying or new structures)",
562
+ direction_down="darkening (possible flooding or moisture)",
563
+ )
564
 
565
  chart_data = self._build_seasonal_chart_data(
566
  current_stats["monthly_vv_means"], seasonal_stats, time_range, monthly_zscores,
567
  )
568
 
 
 
 
 
 
 
 
 
 
 
 
 
569
  # Store raster path for map rendering — write a change map
570
  change_map_path = os.path.join(results_dir, "sar_change.tif")
571
  self._write_change_raster(current_path, baseline_path, change_map_path)
 
594
  hotspot_pct=round(hotspot_pct, 1),
595
  confidence_factors=confidence_factors,
596
  summary=(
597
+ (
598
+ f"Baseline instability detected: {anomaly_months} of "
599
+ f"{n_current_bands} months diverge from the {BASELINE_YEARS}-year "
600
+ f"baseline. This pattern is more consistent with Sentinel-1 "
601
+ f"processor/calibration change or regime shift than per-month "
602
+ f"anomalies. Mean VV: {current_mean:.1f} dB "
603
+ f"(change from baseline: {change_db:+.1f} dB)."
604
+ )
605
+ if baseline_drift_detected
606
+ else (
607
+ f"Mean VV backscatter: {current_mean:.1f} dB (z-score {z_current:+.1f} vs seasonal baseline). "
608
+ f"{anomaly_months} of {n_current_bands} months show significant anomalies. "
609
+ f"{hotspot_pct:.0f}% of AOI has statistically significant change. "
610
+ f"Mean VV change: {change_db:+.1f} dB. "
611
+ f"{change_pct:.1f}% area with >{CHANGE_THRESHOLD_DB} dB change. "
612
+ f"{flood_months} month(s) with potential flood signals. "
613
+ f"Pixel-level analysis at {SAR_RESOLUTION_M}m resolution."
614
+ )
615
  ),
616
  methodology=(
617
  f"Sentinel-1 GRD IW VV/VH polarizations, ascending orbit. "
 
621
  f"Anomaly detection via z-scores (threshold: ±{ZSCORE_THRESHOLD}). "
622
  f"Change detection: >{CHANGE_THRESHOLD_DB} dB difference vs baseline. "
623
  f"Flood mapping: VV < baseline_mean − {FLOOD_SIGMA}σ. "
624
+ f"Baseline drift check: if ≥{int(BASELINE_DRIFT_THRESHOLD*100)}% of months are anomalous, "
625
+ f"the finding is flagged as likely baseline instability rather than a real signal. "
626
  f"Processed via CDSE openEO."
627
  ),
628
  limitations=[
 
631
  "Sentinel-1 coverage over East Africa can be inconsistent.",
632
  "VV decrease may indicate flooding, moisture, or vegetation change — not uniquely flood.",
633
  "Z-score anomalies assume baseline is representative of normal conditions.",
634
+ ] + (
635
+ ["Baseline instability flagged — re-check with a shorter baseline window or consistent relative-orbit filtering before drawing conclusions."]
636
+ if baseline_drift_detected else []
637
+ ),
638
  )
639
 
640
  # ------------------------------------------------------------------
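Review note: the drift check is written out twice in this file, once in the batch path and once in the non-batch path. A minimal sketch of a shared helper follows; `detect_baseline_drift` is a hypothetical name that does not appear in the commit, and the threshold value is copied from the constant added above.

```python
BASELINE_DRIFT_THRESHOLD = 0.4  # same value as the constant in the diff


def detect_baseline_drift(anomaly_months: int, total_months: int) -> tuple[bool, float]:
    """Return (drift_detected, anomalous_fraction).

    Mirrors the inline logic in the diff: the fraction is 0.0 when there are
    no months to evaluate, and drift is flagged at or above the threshold.
    """
    frac = anomaly_months / total_months if total_months > 0 else 0.0
    return frac >= BASELINE_DRIFT_THRESHOLD, frac


drifted, frac = detect_baseline_drift(10, 24)   # about 42% of months anomalous
quiet, _ = detect_baseline_drift(2, 24)         # about 8% of months anomalous
empty, empty_frac = detect_baseline_drift(0, 0)
```

Because the comparison is `>=`, a series with exactly 40% anomalous months is treated as drift. If the intent is strictly "more than 40%", the comparison would need to be `>` instead.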
app/eo_products/water.py CHANGED
@@ -205,7 +205,12 @@ class WaterProduct(BaseProduct):
205
  confidence = conf["level"]
206
  confidence_factors = conf["factors"]
207
 
208
- status = self._classify_zscore(z_current, hotspot_pct)
 
 
 
 
 
209
  trend = self._compute_trend_zscore(monthly_zscores)
210
 
211
  baseline_seasonal_fractions = self._build_seasonal_water_fractions(
@@ -410,7 +415,12 @@ class WaterProduct(BaseProduct):
410
  confidence = conf["level"]
411
  confidence_factors = conf["factors"]
412
 
413
- status = self._classify_zscore(z_current, hotspot_pct)
 
 
 
 
 
414
  trend = self._compute_trend_zscore(monthly_zscores)
415
  baseline_seasonal_fractions = self._build_seasonal_water_fractions(
416
  baseline_stats["monthly_water_fractions"], BASELINE_YEARS,
 
205
  confidence = conf["level"]
206
  confidence_factors = conf["factors"]
207
 
208
+ status = self._classify_zscore(
209
+ z_current, hotspot_pct,
210
+ anomaly_months=anomaly_months,
211
+ total_months=n_current_bands,
212
+ min_coverage_pct=current_frac * 100.0,
213
+ )
214
  trend = self._compute_trend_zscore(monthly_zscores)
215
 
216
  baseline_seasonal_fractions = self._build_seasonal_water_fractions(
 
415
  confidence = conf["level"]
416
  confidence_factors = conf["factors"]
417
 
418
+ status = self._classify_zscore(
419
+ z_current, hotspot_pct,
420
+ anomaly_months=anomaly_months,
421
+ total_months=n_current_bands,
422
+ min_coverage_pct=current_frac * 100.0,
423
+ )
424
  trend = self._compute_trend_zscore(monthly_zscores)
425
  baseline_seasonal_fractions = self._build_seasonal_water_fractions(
426
  baseline_stats["monthly_water_fractions"], BASELINE_YEARS,
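Review note: the body of `_classify_zscore` is not part of this excerpt, so the gating the new keyword arguments enable can only be sketched from the commit description. The block below is an assumption-laden illustration, not the committed implementation: `classify_zscore` and the local `StatusLevel` enum are stand-ins, "multiple anomaly months" is read as two or more, and the 0.5% coverage floor and 5% hotspot cutoff are the values quoted in the commit message.

```python
from enum import Enum
from typing import Optional


class StatusLevel(Enum):  # stand-in for the project's enum
    GREEN = "green"
    AMBER = "amber"
    RED = "red"


MIN_COVERAGE_PCT = 0.5      # assumed: coverage floor from the commit message
HOTSPOT_SUPPORT_PCT = 5.0   # assumed: hotspot share that counts as support
MIN_ANOMALY_MONTHS = 2      # assumed reading of "multiple anomaly months"


def classify_zscore(
    z_current: float,
    hotspot_pct: float,
    anomaly_months: int = 0,
    total_months: int = 0,              # accepted for call compatibility; unused here
    min_coverage_pct: Optional[float] = None,
) -> StatusLevel:
    # Coverage gate: an indicator covering almost none of the AOI stays GREEN.
    if min_coverage_pct is not None and min_coverage_pct < MIN_COVERAGE_PCT:
        return StatusLevel.GREEN

    supported = (
        anomaly_months >= MIN_ANOMALY_MONTHS or hotspot_pct >= HOTSPOT_SUPPORT_PCT
    )
    magnitude = abs(z_current)
    if magnitude > 2:
        # A strong aggregate z without supporting evidence is capped at AMBER.
        return StatusLevel.RED if supported else StatusLevel.AMBER
    if magnitude > 1:
        # A moderate z needs supporting evidence to leave GREEN.
        return StatusLevel.AMBER if supported else StatusLevel.GREEN
    return StatusLevel.GREEN


ndvi_supported = classify_zscore(1.9, 0.0, anomaly_months=2, total_months=24)
ndvi_unsupported = classify_zscore(1.9, 0.0, anomaly_months=0, total_months=24)
water_sparse = classify_zscore(2.4, 8.0, anomaly_months=3, total_months=24,
                               min_coverage_pct=0.1)
```

Under these assumptions the three calls reproduce the cases listed in the commit message: z=+1.9 with 2 of 24 anomaly months is AMBER, the same z with no anomaly months and no hotspots is GREEN, and water at 0.1% coverage stays GREEN regardless of its z-score.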
app/outputs/maps.py CHANGED
@@ -403,14 +403,15 @@ def render_overview_map(
403
  # AOI outline
404
  _draw_aoi_rect(ax, aoi, INK)
405
 
406
- # Title and date range
407
- if title:
 
 
 
 
408
  ax.set_title(title, fontsize=10, color=INK, fontweight="bold", pad=8)
409
- if date_range:
410
- ax.text(
411
- 0.5, -0.05, date_range,
412
- transform=ax.transAxes, ha="center", fontsize=7, color=INK_MUTED,
413
- )
414
 
415
  ax.set_xlim(extent[0], extent[1])
416
  ax.set_ylim(extent[2], extent[3])
 
403
  # AOI outline
404
  _draw_aoi_rect(ax, aoi, INK)
405
 
406
+ # Title (with date range on a second line if provided) — placed above the
407
+ # axes so the date no longer collides with the x-axis tick labels.
408
+ if title and date_range:
409
+ full_title = f"{title}\n{date_range}"
410
+ ax.set_title(full_title, fontsize=9, color=INK, fontweight="bold", pad=8)
411
+ elif title:
412
  ax.set_title(title, fontsize=10, color=INK, fontweight="bold", pad=8)
413
+ elif date_range:
414
+ ax.set_title(date_range, fontsize=8, color=INK_MUTED, pad=6)
 
 
 
415
 
416
  ax.set_xlim(extent[0], extent[1])
417
  ax.set_ylim(extent[2], extent[3])
app/outputs/narrative.py CHANGED
@@ -42,22 +42,66 @@ def get_verify_suggestion(product_id: str, status: StatusLevel) -> str:
42
  return ""
43
  return _VERIFY_SUGGESTIONS.get((product_id, status), "")
44
 
45
- # --- Cross-indicator pattern rules ---
46
- _CROSS_PATTERNS: list[tuple[dict[str, set[StatusLevel]], str]] = [
 
 
 
 
 
 
 
 
 
47
  (
48
- {"ndvi": {StatusLevel.RED, StatusLevel.AMBER}, "buildup": {StatusLevel.RED, StatusLevel.AMBER}},
49
- "Vegetation loss coincides with settlement expansion, indicating possible land-use conversion.",
50
  ),
51
  (
52
- {"ndvi": {StatusLevel.RED, StatusLevel.AMBER}, "sar": {StatusLevel.RED, StatusLevel.AMBER}},
53
- "Vegetation decline paired with SAR backscatter anomalies may indicate flood damage or soil saturation.",
 
 
54
  ),
55
  (
56
- {"water": {StatusLevel.RED, StatusLevel.AMBER}, "sar": {StatusLevel.RED, StatusLevel.AMBER}},
57
- "Increased water extent and SAR signal changes suggest flooding or waterlogging.",
 
 
 
 
 
 
 
 
 
 
58
  ),
59
  ]
60
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
61
  _LEAD_TEMPLATES = {
62
  StatusLevel.RED: "One or more indicators show major changes that warrant action and ground verification.",
63
  StatusLevel.AMBER: "One or more indicators show elevated change that should be monitored.",
@@ -73,10 +117,21 @@ def get_interpretation(product_id: str, status: StatusLevel) -> str:
73
  )
74
 
75
 
 
 
 
 
 
76
  def generate_narrative(results: Sequence[ProductResult]) -> str:
77
- """Generate a cross-indicator narrative paragraph from indicator results."""
 
 
 
 
 
 
78
  if not results:
79
- return "No EO product data available for narrative generation."
80
 
81
  parts: list[str] = []
82
 
@@ -85,20 +140,21 @@ def generate_narrative(results: Sequence[ProductResult]) -> str:
85
  (r.status for r in results),
86
  key=lambda s: [StatusLevel.GREEN, StatusLevel.AMBER, StatusLevel.RED].index(s),
87
  )
88
- parts.append(_LEAD_TEMPLATES[worst])
89
 
90
- # 2. Per-indicator sentences
 
91
  for r in results:
92
- parts.append(f"{r.headline}.")
93
 
94
- # 3. Cross-indicator connection
95
- result_map = {r.product_id: r.status for r in results}
96
  for required, sentence in _CROSS_PATTERNS:
97
  if all(
98
- ind_id in result_map and result_map[ind_id] in allowed_statuses
99
- for ind_id, allowed_statuses in required.items()
100
  ):
101
- parts.append(sentence)
102
  break
103
 
104
  return " ".join(parts)
 
42
  return ""
43
  return _VERIFY_SUGGESTIONS.get((product_id, status), "")
44
 
45
+ # --- Direction-aware cross-indicator pattern rules ---
46
+ #
47
+ # Each rule describes a pattern of (indicator_id, required_direction) pairs.
48
+ # Direction is derived from `z_score_current`: "down" means z<-1,
49
+ # "up" means z>+1, "any" means either "up" or "down" (status is not consulted).
50
+ #
51
+ # Only triggered when ALL constraints are met. This replaces the previous
52
+ # status-only matcher which produced sentences like "vegetation loss
53
+ # coincides with settlement expansion" even when both were actually going
54
+ # the opposite direction.
55
+ _CROSS_PATTERNS: list[tuple[list[tuple[str, str]], str]] = [
56
  (
57
+ [("ndvi", "down"), ("buildup", "up")],
58
+ "Vegetation loss coincides with built-up expansion, indicating possible land-use conversion.",
59
  ),
60
  (
61
+ [("ndvi", "up"), ("buildup", "down")],
62
+ "Vegetation recovery coincides with apparent built-up contraction; "
63
+ "in conflict or displacement contexts this can reflect abandoned land "
64
+ "returning to vegetation.",
65
  ),
66
  (
67
+ [("ndvi", "down"), ("sar", "any")],
68
+ "Vegetation decline paired with ground-surface anomalies may indicate "
69
+ "damage, soil saturation, or disturbance.",
70
+ ),
71
+ (
72
+ [("water", "up"), ("sar", "down")],
73
+ "Expanded water extent combined with radar darkening suggests "
74
+ "flooding or waterlogging.",
75
+ ),
76
+ (
77
+ [("water", "down"), ("ndvi", "down")],
78
+ "Drying water and declining vegetation together indicate drought stress.",
79
  ),
80
  ]
81
 
82
+
83
+ def _z_direction(r: "ProductResult") -> str:
84
+ """Return 'up', 'down', or 'stable'.
85
+
86
+ Uses z_score_current (which for most indicators is a z-score with
87
+ threshold 1.0, but for the buildup indicator is a change percentage
88
+ with threshold 5.0% — matching its AMBER cutoff).
89
+ """
90
+ val = float(r.z_score_current or 0.0)
91
+ threshold = 5.0 if r.product_id == "buildup" else 1.0
92
+ if val > threshold:
93
+ return "up"
94
+ if val < -threshold:
95
+ return "down"
96
+ return "stable"
97
+
98
+
99
+ def _direction_matches(actual: str, required: str) -> bool:
100
+ """Match a direction constraint. 'any' matches any non-stable direction."""
101
+ if required == "any":
102
+ return actual in ("up", "down")
103
+ return actual == required
104
+
105
  _LEAD_TEMPLATES = {
106
  StatusLevel.RED: "One or more indicators show major changes that warrant action and ground verification.",
107
  StatusLevel.AMBER: "One or more indicators show elevated change that should be monitored.",
 
117
  )
118
 
119
 
120
+ def _clean_sentence(text: str) -> str:
121
+ """Strip trailing period(s) and whitespace so we can re-add exactly one."""
122
+ return text.rstrip().rstrip(".").rstrip()
123
+
124
+
125
  def generate_narrative(results: Sequence[ProductResult]) -> str:
126
+ """Generate a cross-indicator narrative paragraph from indicator results.
127
+
128
+ The narrative is *direction-aware* — it checks z-score signs, not just
129
+ status levels, so it no longer produces sentences like "vegetation loss
130
+ coincides with settlement expansion" when aggregates actually show
131
+ greening and contraction.
132
+ """
133
  if not results:
134
+ return "No indicator data available for narrative generation."
135
 
136
  parts: list[str] = []
137
 
 
140
  (r.status for r in results),
141
  key=lambda s: [StatusLevel.GREEN, StatusLevel.AMBER, StatusLevel.RED].index(s),
142
  )
143
+ parts.append(_clean_sentence(_LEAD_TEMPLATES[worst]) + ".")
144
 
145
+ # 2. Per-indicator sentences — strip any pre-existing trailing period so we
146
+ # don't end up with ".." when concatenating.
147
  for r in results:
148
+ parts.append(_clean_sentence(r.headline) + ".")
149
 
150
+ # 3. Direction-aware cross-indicator connection
151
+ directions = {r.product_id: _z_direction(r) for r in results}
152
  for required, sentence in _CROSS_PATTERNS:
153
  if all(
154
+ ind_id in directions and _direction_matches(directions[ind_id], req_dir)
155
+ for ind_id, req_dir in required
156
  ):
157
+ parts.append(_clean_sentence(sentence) + ".")
158
  break
159
 
160
  return " ".join(parts)
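Review note: the direction-aware matching can be checked without the rest of the pipeline. The sketch below is a simplified stand-in, not the module itself: `Result`, `PATTERNS`, and `match_pattern` are hypothetical names, only three of the five rules are reproduced, each sentence is shortened to a label, and the "any" constraint is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:  # stand-in for ProductResult, reduced to the fields used here
    product_id: str
    z_score_current: float


# Subset of the rules in the diff, with each sentence shortened to a label.
PATTERNS = [
    ([("ndvi", "down"), ("buildup", "up")], "land-use conversion"),
    ([("ndvi", "up"), ("buildup", "down")], "abandoned land returning to vegetation"),
    ([("water", "down"), ("ndvi", "down")], "drought stress"),
]


def z_direction(r: Result) -> str:
    # As in the diff: buildup carries a percentage change (5% cutoff),
    # every other indicator carries a z-score (1.0 cutoff).
    threshold = 5.0 if r.product_id == "buildup" else 1.0
    if r.z_score_current > threshold:
        return "up"
    if r.z_score_current < -threshold:
        return "down"
    return "stable"


def match_pattern(results: list[Result]) -> Optional[str]:
    """Return the label of the first rule whose constraints all hold."""
    directions = {r.product_id: z_direction(r) for r in results}
    for required, label in PATTERNS:
        if all(directions.get(pid) == want for pid, want in required):
            return label
    return None


greening = match_pattern([Result("ndvi", 1.6), Result("buildup", -8.0)])
conversion = match_pattern([Result("ndvi", -1.4), Result("buildup", 12.0)])
no_match = match_pattern([Result("ndvi", -1.4), Result("buildup", 3.0)])
```

With the earlier status-only matcher, the greening-plus-contraction case would have produced the "vegetation loss coincides with settlement expansion" sentence whenever both indicators were non-GREEN. Here it maps to the recovery rule, and a built-up change below the 5% cutoff produces no cross-indicator sentence at all.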