Tabular Classification
Scikit-learn
Joblib
remote-sensing
tree-canopy
sentinel-2
philippines
metro-manila
civic-technology
Instructions to use xmpuspus/leaves-ph with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Scikit-learn
How to use xmpuspus/leaves-ph with Scikit-learn:
```python
from huggingface_hub import hf_hub_download
import joblib

model = joblib.load(
    hf_hub_download("xmpuspus/leaves-ph", "sklearn_model.joblib")
)
# only load pickle files from sources you trust
# read more about it here https://skops.readthedocs.io/en/stable/persistence.html
```
- Notebooks
- Google Colab
- Kaggle
Leaves.PH canopy classifier: RESULTS.md
RESULTS.md ADDED (+95 -0)
@@ -0,0 +1,95 @@
# Manual high-resolution labeling: reality-anchored accuracy (Claim 3 gold truth)

Done 2026-05-29. The first accuracy number for Leaves.PH measured against
**human visual labels on high-resolution imagery**, not against another satellite
product. Replaces "agreement with ESA WorldCover" as the headline accuracy.

## What was done

- **Sample.** 42 30m pixels inside the 17-LGU mask, drawn (seed 42) across **6
disjoint strata** that partition the 904,715 valid pixels, so the sample is
population-weightable (Horvitz-Thompson) and both predicted classes are present:

| Stratum | Definition | Pop pixels | Pop % | Sampled |
|---|---|---|---|---|
| D clear-canopy | NDVI > 0.65 | 77,845 | 8.6% | 10 |
| C boundary | NDVI in [0.55, 0.65] | 37,953 | 4.2% | 10 |
| A dense-urban | ESA built-up & NDVI < 0.55 | 475,429 | 52.6% | 6 |
| B reclaimed | ESA bare & NDVI < 0.55 | 6,209 | 0.7% | 6 |
| E green-fringe | ESA tree & NDVI < 0.55 | 32,660 | 3.6% | 6 |
| F other-low | other ESA & NDVI < 0.55 | 274,619 | 30.4% | 4 |

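The stratum table fixes the Horvitz-Thompson design weights: each sampled pixel in stratum h stands for N_h / n_h population pixels. A minimal sketch of that arithmetic, using only the counts from the table; the dictionary layout and function names are illustrative assumptions and are not taken from `sample_chips.py` or `compute_accuracy.py`:

```python
# Population and sample sizes per stratum, copied from the stratum table.
STRATA = {
    "D": (77_845, 10),
    "C": (37_953, 10),
    "A": (475_429, 6),
    "B": (6_209, 6),
    "E": (32_660, 6),
    "F": (274_619, 4),
}


def design_weights(strata):
    """Return N_h / n_h: population pixels represented by one sampled pixel."""
    return {name: pop / n for name, (pop, n) in strata.items()}


def weighted_fraction(flags_by_stratum, strata):
    """Horvitz-Thompson estimate of the population share of flagged pixels.

    flags_by_stratum maps a stratum name to a list of booleans,
    one per sampled pixel in that stratum (hypothetical input layout).
    """
    weights = design_weights(strata)
    total = sum(pop for pop, _ in strata.values())
    hits = sum(weights[name] * sum(flags) for name, flags in flags_by_stratum.items())
    return hits / total


WEIGHTS = design_weights(STRATA)
for name, w in WEIGHTS.items():
    print(f"{name}: one sampled pixel represents {w:,.0f} pixels")
```

The A and F weights (roughly 79,000 and 69,000 pixels per sampled cell) dominate the total, which is why the small per-stratum n in those two strata matters for the population-weighted figures.
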
- **Reference imagery.** Esri World Imagery (~0.5–1m, no-auth ArcGIS export) per
chip, bbox = target 30m cell ± 3.5 cells (~210m), with a red box drawn on the
**exact** 30m target cell. Sentinel-2 RGB crop (`s2_rgb_2021.tif`) as a second
view. Ambiguous cells re-inspected at 2.3× and 3.4× zoom.
- **Labeling.** Each chip labeled by Claude via visual inspection (read the
annotated chip, decide "is ≥25% of the marked 30m cell woody **tree** canopy?").
Labels and one-line reasons in `my_labels.csv`; chips in `chips/`, zooms in `zoom/`,
ultra-zooms in `uz/`, contact sheet `contact_sheet.png`.

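The chip geometry described under "Reference imagery" can be sketched as follows. This assumes the target cell is addressed by its centre in a projected CRS with metre units, and reads "± 3.5 cells" as a half-width measured from that centre (7 cells, about 210m across). Both that reading and the function names are assumptions; the actual export code is not shown in this file.

```python
CELL_M = 30.0            # target cell size in metres
HALF_WIDTH_CELLS = 3.5   # assumed chip half-width, in cells


def cell_bbox(cx, cy, cell=CELL_M):
    """(xmin, ymin, xmax, ymax) of the 30m target cell centred on (cx, cy)."""
    h = cell / 2
    return (cx - h, cy - h, cx + h, cy + h)


def chip_bbox(cx, cy, cell=CELL_M, half_cells=HALF_WIDTH_CELLS):
    """(xmin, ymin, xmax, ymax) of the context chip around the same centre."""
    h = cell * half_cells
    return (cx - h, cy - h, cx + h, cy + h)


def red_box_pixels(chip, target, size_px):
    """Image coordinates (left, top, right, bottom) of the target cell
    inside a square chip image of size_px pixels; image y grows downward."""
    xmin, _, xmax, ymax = chip
    scale = size_px / (xmax - xmin)
    left = (target[0] - xmin) * scale
    right = (target[2] - xmin) * scale
    top = (ymax - target[3]) * scale
    bottom = (ymax - target[1]) * scale
    return (left, top, right, bottom)


chip = chip_bbox(1000.0, 2000.0)
target = cell_bbox(1000.0, 2000.0)
print(chip, red_box_pixels(chip, target, size_px=700))
```

Under this reading the target cell occupies the central seventh of the chip in each direction, so the labeler sees three and a half cells of context on every side of the red box.
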
## Headline result: NDVI > 0.62 mask vs manual labels

Confusion (n=42): **TP 10, FP 3, FN 5, TN 24.**

| Metric | Pooled (stratified sample) | Population-weighted (H-T) |
|---|---|---|
| Precision | 0.77 (95% CI 0.50–0.92) | **0.78** (95% CI 0.54–1.0) |
| Recall | 0.67 (95% CI 0.42–0.85) | **0.73** (95% CI 0.61–0.88) |
| F1 | 0.71 | **0.76** |
| IoU | 0.56 | **0.61** |
| Accuracy | 0.81 | 0.95 |

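The pooled column follows directly from the confusion counts. The sketch below recomputes it. The interval method is not stated in this file, so the Wilson score interval used here is an assumption (it does reproduce the pooled precision and recall intervals to two decimals); the function names are illustrative.

```python
from math import sqrt


def wilson(k, n, z=1.96):
    """Wilson score interval for k successes out of n trials."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def pooled_metrics(tp, fp, fn, tn):
    """Unweighted metrics from a 2x2 confusion matrix."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision_ci": wilson(tp, tp + fp),
        "recall_ci": wilson(tp, tp + fn),
    }


# Confusion counts as reported: TP 10, FP 3, FN 5, TN 24.
M = pooled_metrics(tp=10, fp=3, fn=5, tn=24)
for key, value in M.items():
    print(key, value)
```

The population-weighted column cannot be rebuilt this way, because it needs the per-stratum breakdown of the confusion counts, which lives in `accuracy_results.json` rather than in this summary.
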
- **Implied true canopy fraction = 10.5%**, against the mask's 9.86%; the
reality-anchored canopy area lands within ~0.7pp of the published 9.79% estimate.
- Dropping the 5 cells Claude flagged as ambiguous (n=37): precision 0.86, recall 0.83,
IoU 0.73 (population-weighted precision 0.86, recall 0.83).

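One way to see where the implied fraction comes from, offered as a consistency check rather than as the method `compute_accuracy.py` necessarily uses: the true-positive share of the population equals both (mask fraction × precision) and (true fraction × recall), so true fraction = mask fraction × precision / recall. With the rounded population-weighted figures quoted in this file (hypothetical function name):

```python
def implied_true_fraction(mask_fraction, precision, recall):
    """True canopy share implied by a mask's area, precision and recall.

    Derived from: mask_fraction * precision == true_fraction * recall,
    both being the true-positive share of the population.
    """
    return mask_fraction * precision / recall


# Mask covers 9.86% of pixels; population-weighted precision 0.78, recall 0.73.
ndvi_implied = implied_true_fraction(0.0986, 0.78, 0.73)
print(f"{ndvi_implied:.1%}")  # prints 10.5%
```

Because the inputs are rounded to two decimals, agreement with the reported 10.5% is only approximate.
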
### Where the errors live (matches the prior ESA-gap analysis)

- **3 false positives, all high-NDVI non-tree vegetation:** riparian scrub on a
gravel bar (#0), a dense grass/low-veg slope (#6), a dry-grass field with a green
edge (#12). Exactly the "we over-call dense grass/scrub" failure the threshold
analysis predicted.
- **5 false negatives, all real canopy the strict 0.62 cut or 30m mixing misses:**
a tall-canopy cell at NDVI 0.619 just under the cut (#13, Meta 11m), and four
ESA-tree green-fringe cells (#32/33/35/37) where sub-5m or sparse street/yard trees
dilute the 30m NDVI below threshold.
- Per-stratum: dense-urban (A), reclaimed (B) and water/other (F) are 100% correctly
negative; recall loss is concentrated in the green-fringe (E) stratum, which holds
four of the five false negatives.

## Detection-model ceiling: Meta height ≥ 5m vs manual labels

The detection model (CLIP + gradient-boosted regression) is trained to **reproduce
Meta's 1m canopy fraction**, so the Meta target is its accuracy ceiling. Meta height
≥ 5m as a classifier vs the same manual labels:

| Metric | Pooled | Population-weighted |
|---|---|---|
| Precision | 1.00 (95% CI 0.68–1.0) | 1.00 |
| Recall | 0.53 (95% CI 0.30–0.75) | 0.59 (95% CI 0.34–0.81) |
| F1 / IoU | 0.70 / 0.53 | 0.74 / 0.59 |

Meta produces no false positives in this sample (every Meta ≥ 5m cell is real canopy by
eye) but recovers only ~59% of canopy and implies just 6.2% canopy vs the 10.5%
implied truth; it misses sub-5m and sparse urban trees. The NDVI mask trades some precision
(0.78 vs 1.00) for much higher recall (0.73 vs 0.59); the two are complementary, and
the model, reproducing Meta at R² 0.83–0.86, inherits Meta's high-precision /
moderate-recall profile.

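The F1 and IoU rows can be recovered from precision and recall alone, since F1 = 2PR/(P+R) and IoU = PR/(P+R−PR). A short check against the population-weighted Meta row, with illustrative function names; the inputs are the rounded table values, so agreement is only to about two decimals:

```python
def f1_from_pr(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


def iou_from_pr(p, r):
    """IoU = TP / (TP + FP + FN), rewritten via 1/IoU = 1/P + 1/R - 1."""
    return p * r / (p + r - p * r)


# Population-weighted Meta row: precision 1.00, recall 0.59.
meta_f1 = f1_from_pr(1.00, 0.59)
meta_iou = iou_from_pr(1.00, 0.59)
print(round(meta_f1, 2), round(meta_iou, 2))
```

With precision at 1.00, IoU equals recall, which is why both read 0.59 in the weighted column and both read 0.53 in the pooled one.
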
## Honest caveats

- n=42 manual labels → wide CIs; this is a defensible first reality-anchored number,
not a definitive accuracy. Single labeler (Claude), so no second-rater κ.
- "Canopy" = ≥25% of the 30m cell is woody tree canopy by eye on ~0.6m imagery dated
near (not exactly) 2021; canopy moves slowly so ±1–2yr basemap drift is minor.
- Population-weighting leans on small per-stratum n (A=6, F=4 carry large weights);
those strata are unambiguous (roofs / bare / water), so their TN weight is robust,
but the weighted recall depends on the 6 green-fringe chips.

## Files

`sample_metadata.csv` · `my_labels.csv` · `accuracy_results.json` ·
`strata_pop.json` · `contact_sheet.png` · `chips/` `zoom/` `uz/` `s2crops/` ·
scripts `sample_chips.py` `build_composites.py` `zoom.py` `uz.py` `uz2.py`
`compute_accuracy.py` `build_contact_sheet.py`.