| --- |
| license: cc-by-4.0 |
| tags: |
| - image-segmentation |
| - image-classification |
| - solar-panels |
| - photovoltaic |
| - remote-sensing |
| - aerial-imagery |
| - pytorch |
| datasets: |
| - gabrielkasmi/bdappv |
| --- |
| |
| # BDAPPV Models |
|
|
| Baseline models for the [BDAPPV dataset](https://huggingface.co/datasets/gabrielkasmi/bdappv), a collection of aerial images of rooftop photovoltaic installations in France and Belgium. |
|
|
| **Paper:** [Kasmi et al., Scientific Data, 2023](https://doi.org/10.1038/s41597-023-01951-4) ([arXiv:2209.03726](https://arxiv.org/abs/2209.03726)) |
|
|
| --- |
|
|
| ## Models |
|
|
| Two architectures, each trained independently on each imagery provider (Google and IGN): |
|
|
| | File | Task | Architecture | Provider | |
| |------|------|-------------|----------| |
| | `deeplab_google_best.pth` | Segmentation | DeepLabV3-ResNet101 | Google | |
| | `deeplab_ign_best.pth` | Segmentation | DeepLabV3-ResNet101 | IGN | |
| | `inception_google_best.pth` | Classification | InceptionV3 | Google | |
| | `inception_ign_best.pth` | Classification | InceptionV3 | IGN | |
|
|
| **Note on training data and licensing.** Checkpoints fine-tuned on the |
| Google subset of BDAPPV (`*_google_*`) derive from imagery distributed |
| under CC-BY-NC 4.0; commercial users should prefer the IGN-trained |
| checkpoints (`*_ign_*`, CC-BY 4.0 imagery) or assess accordingly. All |
| models are initialized from Mayer et al. (2022) checkpoints; see their |
| repository for base model licensing. |
|
|
| --- |
|
|
| ## Benchmark protocol |
|
|
| Three evaluation tracks are defined: |
|
|
| **Track 1: Segmentation (single provider)** |
| Train and evaluate on the same provider. Report IoU and F1 on the test split. |
|
|
| **Track 2: Classification (single provider)** |
| Train and evaluate on the same provider. Report accuracy and F1 on the test split. |
|
|
| **Track 3: Distribution shift (cross-provider)** |
| Train on the Google training split and evaluate on the IGN test split. This is the primary robustness benchmark. Report IoU. |
|
|
| Rules: |
| - The test split must not be used for model selection or hyperparameter tuning; use the validation split only. |
| - The spatial holdout by department must not be modified. Re-splitting invalidates comparability with published results. |
| - For Track 3, only the Google training split may be used for training. |
|
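The segmentation metrics named in Tracks 1 and 3 can be computed from a predicted and a reference binary mask as sketched below. This is a minimal NumPy illustration, not the evaluation script behind the published numbers: the function name `iou_and_f1`, the per-image computation, and the convention of returning 1.0 when both masks are empty are all assumptions.

```python
import numpy as np


def iou_and_f1(pred, target):
    """Return (IoU, F1) for two binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = np.logical_and(pred, target).sum()    # PV pixels found
    fp = np.logical_and(pred, ~target).sum()   # false alarms
    fn = np.logical_and(~pred, target).sum()   # missed PV pixels

    union = tp + fp + fn
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0, 1.0

    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(f1)
```

How per-image scores are aggregated over the test split (mean of per-image values versus pooled pixel counts) changes the reported figure, so whichever choice is made should be stated alongside the results.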
|
| --- |
| ## Results |
|
|
| Models were evaluated on the official test split (spatial holdout by French department; see the dataset card for details). |
|
|
| ### Segmentation (DeepLabV3-ResNet101) |
|
|
| | Train | Test | IoU | F1 | n (test) | |
| |-------|------|-----|----|----------| |
| | Google | Google | 0.884 | 0.937 | 1,935 | |
| | IGN | IGN | 0.735 | 0.844 | 1,239 | |
| | Google | IGN | 0.561 | 0.709 | 1,239 | |
| | IGN | Google | 0.657 | 0.786 | 1,935 | |
|
|
| ### Classification (InceptionV3) |
|
|
| | Train | Test | Accuracy | Precision | Recall | F1 | n (test) | |
| |-------|------|----------|-----------|--------|----|----------| |
| | Google | Google | 0.952 | 0.990 | 0.912 | 0.949 | 3,884 | |
| | IGN | IGN | 0.640 | 0.831 | 0.309 | 0.451 | 2,593 | |
| | Google | IGN | 0.592 | 0.815 | 0.188 | 0.306 | 2,593 | |
| | IGN | Google | 0.543 | 1.000 | 0.083 | 0.153 | 3,884 | |
|
|
| **Note on classification cross-provider results:** the IGN-trained model collapses on Google imagery (recall 0.083, precision 1.000): it almost never predicts the positive class, which is a degenerate operating point. This illustrates the severity of the distribution shift documented in [Kasmi et al. (2025)](https://doi.org/10.1017/eds.2025.13). |
|
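The F1 column of the classification table can be cross-checked from the reported precision and recall. The short sketch below recomputes it as the harmonic mean of the two; the helper name `f1_score` is illustrative, and because the table values are rounded to three decimals the recomputed score may differ from the reported one in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (train, test, precision, recall, reported F1), copied from the table above
ROWS = [
    ("Google", "Google", 0.990, 0.912, 0.949),
    ("IGN", "IGN", 0.831, 0.309, 0.451),
    ("Google", "IGN", 0.815, 0.188, 0.306),
    ("IGN", "Google", 1.000, 0.083, 0.153),
]

for train, test, p, r, reported in ROWS:
    print(f"{train:>6} -> {test:<6}  F1 = {f1_score(p, r):.3f}  (reported {reported:.3f})")
```

The last row shows why precision alone is misleading here: a precision of 1.000 with a recall of 0.083 still gives an F1 near 0.15.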
|
| --- |
|
|
| ## Usage |
|
|
| A `model.py` helper is included in this repo to simplify loading: |
|
|
| ```python |
| from huggingface_hub import hf_hub_download |
| import importlib.util |
| |
| path = hf_hub_download("gabrielkasmi/bdappv-models", "model.py") |
| spec = importlib.util.spec_from_file_location("bdappv_model", path) |
| mod = importlib.util.module_from_spec(spec) |
| spec.loader.exec_module(mod) |
| |
| seg = mod.load_segmentation_model("google") # or "ign" |
| clf = mod.load_classification_model("google") # or "ign" |
| ``` |
|
|
| Both functions return the model in `eval()` mode. An optional `device` argument is supported (`"cpu"`, `"cuda"`, `"mps"`). |
|
|
| --- |
|
|
| ## Training |
|
|
| Models were trained on the official BDAPPV splits with the following configuration: |
|
|
| - **Optimizer:** AdamW (lr=1e-4, weight_decay=1e-4) |
| - **Scheduler:** Cosine annealing |
| - **Effective batch size:** 32 (batch 16 × grad accum 2) |
| - **Early stopping:** patience=7 epochs on validation metric |
| - **Input size:** 400×400 px |
| - **Initialization:** checkpoints from [Mayer et al. (2022)](https://doi.org/10.1016/j.apenergy.2021.118469), who fine-tuned DeepLabV3-ResNet101 and InceptionV3 on 10 cm/px orthoimagery from North Rhine-Westphalia (Germany) for rooftop PV detection. These checkpoints were then further fine-tuned on BDAPPV using the splits above. |
| |
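Two items in the list above, the cosine-annealed learning rate and patience-based early stopping, can be sketched in plain Python as follows. This is an illustration of the general techniques, not the authors' training script: only `lr=1e-4` and `patience=7` come from the list, while `total_epochs`, `min_lr`, and the assumption that a higher validation metric is better are hypothetical choices made for the example.

```python
import math

BASE_LR = 1e-4   # from the configuration list
PATIENCE = 7     # from the configuration list


def cosine_lr(epoch, total_epochs, base_lr=BASE_LR, min_lr=0.0):
    """Closed-form cosine annealing from base_lr down to min_lr.

    total_epochs and min_lr are assumptions; the card does not state them.
    """
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + cosine)


class EarlyStopping:
    """Signal a stop once the metric has not improved for `patience` epochs."""

    def __init__(self, patience=PATIENCE):
        self.patience = patience
        self.best = -math.inf   # assumption: higher metric is better
        self.bad_epochs = 0

    def step(self, metric):
        """Record one epoch's validation metric; return True to stop."""
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `cosine_lr(epoch, total_epochs)` would set the learning rate at the start of each epoch, and `stopper.step(val_metric)` would be called after each validation pass.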
| Training scripts are available in the [BDAPPV dataset repository](https://huggingface.co/datasets/gabrielkasmi/bdappv). |
| |
| --- |
| |
| ## Citation |
| |
| If you use these models, please cite: |
| |
| ```bibtex |
| @article{kasmi2022towards, |
| title={Towards unsupervised assessment with open-source data of the accuracy of deep learning-based distributed PV mapping}, |
| author={Kasmi, Gabriel and Dubus, Laurent and Blanc, Philippe and Saint-Drenan, Yves-Marie}, |
| journal={arXiv preprint arXiv:2207.07466}, |
| year={2022} |
| } |
| ``` |
| |
| ## References |
| |
| - Mayer et al. (2022). [3D-PV-Locator: Large-scale detection of rooftop-mounted photovoltaic systems in 3D.](https://doi.org/10.1016/j.apenergy.2021.118469) *Applied Energy*, 310, 118469. *(source of the base checkpoints)* |
| - Kasmi et al. (2023). [A crowdsourced dataset of aerial images with annotated solar photovoltaic arrays and installation metadata.](https://doi.org/10.1038/s41597-023-01951-4) *Scientific Data*, 10, 59. *(BDAPPV dataset)* |
| - Kasmi et al. (2025). [Space-scale exploration of the poor reliability of deep learning models: the case of the remote sensing of rooftop photovoltaic systems.](https://doi.org/10.1017/eds.2025.13) *Environmental Data Science*. *(cross-provider distribution shift)* |