---
license: cc-by-4.0
tags:
  - image-segmentation
  - image-classification
  - solar-panels
  - photovoltaic
  - remote-sensing
  - aerial-imagery
  - pytorch
datasets:
  - gabrielkasmi/bdappv
---

# BDAPPV Models

Baseline models for the [BDAPPV dataset](https://huggingface.co/datasets/gabrielkasmi/bdappv), which contains aerial images of rooftop photovoltaic installations in France and Belgium.

**Paper:** [Kasmi et al., Scientific Data, 2023](https://doi.org/10.1038/s41597-023-01951-4) (preprint: [arXiv:2209.03726](https://arxiv.org/abs/2209.03726))

---

## Models

Two architectures, trained independently on each imagery provider:

| File | Task | Architecture | Provider |
|------|------|-------------|----------|
| `deeplab_google_best.pth` | Segmentation | DeepLabV3-ResNet101 | Google |
| `deeplab_ign_best.pth` | Segmentation | DeepLabV3-ResNet101 | IGN |
| `inception_google_best.pth` | Classification | InceptionV3 | Google |
| `inception_ign_best.pth` | Classification | InceptionV3 | IGN |

**Note on training data and licensing.** Checkpoints fine-tuned on the
Google subset of BDAPPV (`*_google_*`) derive from imagery distributed
under CC-BY-NC 4.0; commercial users should prefer the IGN-trained
checkpoints (`*_ign_*`, CC-BY 4.0 imagery) or assess accordingly. All
models are initialized from Mayer et al. (2022) checkpoints; see their
repository for base model licensing.

---

## Benchmark protocol

Three evaluation tracks are defined:

**Track 1: Segmentation (single provider)**
Train and evaluate on the same provider. Report IoU and F1 on the test split.

**Track 2: Classification (single provider)**
Train and evaluate on the same provider. Report accuracy and F1 on the test split.

**Track 3: Distribution shift (cross-provider)**
Train on Google, evaluate on the IGN test split. This is the primary robustness benchmark. Report IoU.

Rules:
- The test split must not be used for model selection or hyperparameter tuning; use the validation split only.
- The spatial holdout by department must not be modified. Re-splitting invalidates comparability with published results.
- For Track 3, only the Google training split may be used for training.
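
For reference, the two segmentation metrics named in the tracks can be sketched from pixel-level confusion counts. This is a minimal illustration, not the evaluation script behind the published numbers: the function names are hypothetical, masks are plain nested lists of 0/1, and the card does not say whether scores are averaged per image or pooled over all pixels, so the sketch scores a single pair of masks.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives and false negatives
    over two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def iou_and_f1(pred, target):
    """Return (IoU, F1) for the positive class of one mask pair."""
    tp, fp, fn = confusion_counts(pred, target)
    union = tp + fp + fn
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
iou, f1 = iou_and_f1(pred, target)
print(f"IoU = {iou:.3f}, F1 = {f1:.3f}")
```

How empty masks are scored and how per-image values are aggregated both affect the final figures, so those choices should be stated when reporting results.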

---

## Results

Models were evaluated on the official test split (spatial holdout by French department; see the dataset card for details).

### Segmentation (DeepLabV3-ResNet101)

| Train | Test | IoU | F1 | n (test) |
|-------|------|-----|----|----------|
| Google | Google | 0.884 | 0.937 | 1,935 |
| IGN | IGN | 0.735 | 0.844 | 1,239 |
| Google | IGN | 0.561 | 0.709 | 1,239 |
| IGN | Google | 0.657 | 0.786 | 1,935 |

### Classification (InceptionV3)

| Train | Test | Accuracy | Precision | Recall | F1 | n (test) |
|-------|------|----------|-----------|--------|----|----------|
| Google | Google | 0.952 | 0.990 | 0.912 | 0.949 | 3,884 |
| IGN | IGN | 0.640 | 0.831 | 0.309 | 0.451 | 2,593 |
| Google | IGN | 0.592 | 0.815 | 0.188 | 0.306 | 2,593 |
| IGN | Google | 0.543 | 1.000 | 0.083 | 0.153 | 3,884 |

**Note on classification cross-provider results:** the IGN-trained model collapses on Google imagery (recall 0.083, precision 1.000): it almost never predicts the positive class, which is a degenerate operating point. This illustrates the severity of the distribution shift documented in [Kasmi et al. (2025)](https://doi.org/10.1017/eds.2025.13).
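
The F1 column can be cross-checked against the precision and recall columns, since F1 is their harmonic mean. The sketch below recomputes it from the rounded values in the classification table; the function name is illustrative, and because the inputs are already rounded to three decimals the recomputed values can differ from the reported ones by about 0.001.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (train, test) -> (precision, recall), copied from the classification table.
rows = {
    ("Google", "Google"): (0.990, 0.912),
    ("IGN", "IGN"): (0.831, 0.309),
    ("Google", "IGN"): (0.815, 0.188),
    ("IGN", "Google"): (1.000, 0.083),
}

for (train, test), (precision, recall) in rows.items():
    print(f"{train} -> {test}: F1 = {f1_score(precision, recall):.3f}")
```

The last row shows why perfect precision is not reassuring here: with a recall of 0.083, the harmonic mean is dominated by the recall term.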

---

## Usage

A `model.py` helper is included in this repo to simplify loading:

```python
from huggingface_hub import hf_hub_download
import importlib.util

path = hf_hub_download("gabrielkasmi/bdappv-models", "model.py")
spec = importlib.util.spec_from_file_location("bdappv_model", path)
mod  = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

seg = mod.load_segmentation_model("google")    # or "ign"
clf = mod.load_classification_model("google")  # or "ign"
```

Both functions return the model in `eval()` mode. An optional `device` argument is supported (`"cpu"`, `"cuda"`, `"mps"`).

---

## Training

Models were trained on the official BDAPPV splits with the following settings:

- **Optimizer:** AdamW (lr=1e-4, weight_decay=1e-4)
- **Scheduler:** Cosine annealing
- **Effective batch size:** 32 (batch 16 × gradient accumulation 2)
- **Early stopping:** patience=7 epochs on validation metric
- **Input size:** 400×400 px
- **Initialization:** checkpoints from [Mayer et al. (2022)](https://doi.org/10.1016/j.apenergy.2021.118469), who fine-tuned DeepLabV3-ResNet101 and InceptionV3 on 10 cm/px orthoimagery from North Rhine-Westphalia (Germany) for rooftop PV detection. These checkpoints were then further fine-tuned on BDAPPV using the splits above.
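
To make the schedule and stopping rule concrete, here is a minimal pure-Python sketch of cosine annealing and patience-based early stopping with the values listed above. It is not the training script: `cosine_lr` and `EarlyStopper` are hypothetical names, and the total number of epochs, the minimum learning rate, and the assumption that a higher validation metric is better are not stated in this card.

```python
import math

BASE_LR = 1e-4   # from the optimizer settings above
PATIENCE = 7     # from the early-stopping setting above
BATCH_SIZE, GRAD_ACCUM = 16, 2
EFFECTIVE_BATCH = BATCH_SIZE * GRAD_ACCUM  # 32 samples per optimizer step


def cosine_lr(epoch, total_epochs, base_lr=BASE_LR, min_lr=0.0):
    """Cosine-annealed learning rate at a given epoch.

    total_epochs and min_lr are assumptions; the card does not give them.
    """
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


class EarlyStopper:
    """Stop once the validation metric has not improved for `patience` epochs.

    Assumes a higher metric is better (e.g. IoU or F1).
    """

    def __init__(self, patience=PATIENCE):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric):
        """Record one epoch's metric; return True when training should stop."""
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy run: the metric improves once, then stalls until patience runs out.
stopper = EarlyStopper()
history = [0.50] + [0.40] * 10
stopped_at = None
for epoch, metric in enumerate(history):
    if stopper.step(metric):
        stopped_at = epoch
        break
print(f"stopped at epoch {stopped_at}, lr would be {cosine_lr(stopped_at, 50):.2e}")
```

In a PyTorch training loop the same schedule is typically provided by `torch.optim.lr_scheduler.CosineAnnealingLR`, with gradient accumulation implemented by calling `optimizer.step()` every second batch.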

Training scripts are available in the [BDAPPV dataset repository](https://huggingface.co/datasets/gabrielkasmi/bdappv).

---

## Citation

If you use these models, please cite:

```bibtex
@article{kasmi2022towards,
  title={Towards unsupervised assessment with open-source data of the accuracy of deep learning-based distributed PV mapping},
  author={Kasmi, Gabriel and Dubus, Laurent and Blanc, Philippe and Saint-Drenan, Yves-Marie},
  journal={arXiv preprint arXiv:2207.07466},
  year={2022}
}
```

## References

- Mayer et al. (2022). [3D-PV-Locator: Large-scale detection of rooftop-mounted photovoltaic systems in 3D.](https://doi.org/10.1016/j.apenergy.2021.118469) *Applied Energy*, 310, 118469. *(source of the base checkpoints)*
- Kasmi et al. (2023). [A crowdsourced dataset of aerial images with annotated solar photovoltaic arrays and installation metadata.](https://doi.org/10.1038/s41597-023-01951-4) *Scientific Data*, 10, 59. *(BDAPPV dataset)*
- Kasmi et al. (2025). [Space-scale exploration of the poor reliability of deep learning models: the case of the remote sensing of rooftop photovoltaic systems.](https://doi.org/10.1017/eds.2025.13) *Environmental Data Science*. *(cross-provider distribution shift)*