Mohamed-ENNHIRI

Solar Panel Segmentation app for HF Spaces

52efd90 25 days ago


title: Solar Panel Segmentation
emoji: 📊
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.36.0
app_file: app.py
pinned: false

Solar Panel Segmentation — Model Zoo

Aerial / satellite imagery → binary mask of photovoltaic (PV) panels.

This repo trains and compares five semantic-segmentation models on the same solar-panel dataset, ships their checkpoints, runs batch inference over a large image pool, and serves an interactive HTML tool to compare model outputs side-by-side.

What's in here

seg_models/
├── final_data/                       # Training / validation dataset
│   ├── train/{images,masks}          # 5,325 image+mask pairs
│   └── val/{images,masks}            # 1,331 image+mask pairs
├── final_data.zip                    # Zipped dataset (~31 MB)
│
├── pv_panel_models/                  # 4 "lightweight" baselines + tooling
│   ├── cnn_model/                    # SegNet (encoder-decoder w/ MaxUnpool)
│   ├── unet_model/                   # Classic U-Net
│   ├── vit_model/                    # SegFormer mit-b0  (transformer)
│   ├── segformer_b5_model/           # SegFormer mit-b5  (large transformer)
│   ├── predict_all.py                # Runs all 4 models over a folder of images
│   ├── predictions/                  # Per-model PNG masks + manifest.json
│   ├── source_images -> .../filtered_images   # Symlink to inference inputs
│   ├── comparison_tool.html          # Interactive viewer (per-image, all models)
│   ├── model_comparison_dashboard.html        # Training-curve dashboard
│   ├── model_architectures.html      # Architecture explainer page
│   ├── generate_dashboard.py         # Builds the dashboard from training_logs.txt
│   └── serve.sh                      # `python -m http.server` wrapper
│
└── hybrid_20_epochs_4ep_stop/        # 5th, heavier model (separate experiment)
    ├── hybrid_solar_segmenter.py     # Hybrid Attention-UNet (ResNet34 + SE + AG)
    ├── hybrid_solar_segmenter.pth    # Trained weights (~335 MB)
    ├── README.md / WORKFLOW.md       # Detailed write-up of this model
    ├── training_curves.png
    └── prediction_*.png              # Sample qualitative outputs

Dataset

Binary segmentation masks for solar panels, stored as *.jpg images paired with *_mask.png ground truth. Splits:

| Split | Samples |
| --- | --- |
| Train | 5,325 |
| Val | 1,331 |
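
The naming convention described above (`<stem>.jpg` paired with `<stem>_mask.png`) can be checked before training. The following is a minimal sketch, assuming each mask lives in the sibling `masks/` folder and shares its image's stem; the helper name `pair_images_and_masks` is hypothetical and not part of this repo.

```python
from pathlib import Path


def pair_images_and_masks(split_dir):
    """Return (pairs, missing) for one split folder, e.g. final_data/train.

    Assumption: images/<stem>.jpg maps to masks/<stem>_mask.png.
    """
    split_dir = Path(split_dir)
    pairs, missing = [], []
    for image_path in sorted((split_dir / "images").glob("*.jpg")):
        mask_path = split_dir / "masks" / f"{image_path.stem}_mask.png"
        if mask_path.is_file():
            pairs.append((image_path, mask_path))
        else:
            # Image without ground truth: report it instead of failing later.
            missing.append(image_path)
    return pairs, missing
```

If the layout matches this assumption, `pair_images_and_masks("final_data/train")` should return 5,325 pairs and an empty `missing` list.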

The 4 baselines train on 128×128 crops; the hybrid model trains on 256×256. Augmentations: H/V flips, 90° rotations, brightness/contrast jitter.

All models use the same loss, 0.5 * BCE + 0.5 * Dice (CombinedLoss). Optimizers are Adam or AdamW, paired with ReduceLROnPlateau or CosineAnnealingWarmRestarts schedulers.
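
To make the 0.5 * BCE + 0.5 * Dice weighting concrete, here is a framework-free sketch of the arithmetic. It is not the repo's `CombinedLoss` (which presumably operates on PyTorch tensors); the function name `combined_loss`, the `eps` smoothing term, and the choice to take probabilities rather than logits are all assumptions made for illustration.

```python
import math


def combined_loss(probs, targets, bce_weight=0.5, dice_weight=0.5, eps=1e-7):
    """Weighted BCE + Dice loss over flat lists of probabilities and 0/1 targets."""
    # Binary cross-entropy averaged over pixels; clamp to avoid log(0).
    bce = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        bce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    bce /= len(probs)

    # Soft Dice coefficient: 2 * overlap / (sum of predictions + sum of targets).
    # eps (an assumed smoothing term) keeps the ratio finite for empty masks.
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + eps) / (sum(probs) + sum(targets) + eps)

    # Dice loss is 1 - Dice coefficient.
    return bce_weight * bce + dice_weight * (1.0 - dice)
```

The BCE term penalises every pixel independently, while the Dice term depends on overlap with the panel region, which is useful when panels cover a small fraction of the image.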

The five models

| Model | Architecture | Params | Best Val Dice |
| --- | --- | --- | --- |
| SegNet (CNN) | VGG-style encoder/decoder with stored MaxPool indices for unpooling | 11.8 M | 0.9161 |
| U-Net | Classic encoder/decoder with skip-concatenation | ~31 M | 0.9332 |
| SegFormer mit-b0 | HuggingFace hierarchical transformer (small) | 3.7 M | 0.9251 |
| SegFormer mit-b5 | HuggingFace hierarchical transformer (large) | 84.6 M | 0.9334 |
| Hybrid Attention-UNet | ResNet34 (ImageNet) encoder + SE blocks + Attention Gates + UNet decoder, 256×256 | ~24 M | see `hybrid_20_epochs_4ep_stop/README.md` |

Each baseline directory follows the same layout:

<model>/
├── <model>.py              # nn.Module definition
├── dataset.py              # SolarPanelDataset + augmentations + DataLoaders
├── train.py                # Training loop, metrics, best-checkpoint saving
├── requirements.txt
├── checkpoints/
│   ├── best_model.pth      # State-dict by best val Dice
│   └── training_logs.txt   # Per-epoch loss / Dice / IoU / Precision / Recall
└── venv/                   # Per-model virtualenv (not all dirs)

Metrics tracked: Loss, Accuracy, Precision, Recall, Dice, IoU.
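
For reference, the tracked metrics (other than loss) all follow from the four confusion counts of a binary mask. The sketch below uses the standard definitions; the function name `segmentation_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions, and the repo's `train.py` may compute these differently (for example on thresholded tensors).

```python
def segmentation_metrics(pred, target):
    """Compute binary-mask metrics from flat sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }
```

Dice and IoU are monotonically related (Dice = 2 * IoU / (1 + IoU)), so ranking models by either gives the same order.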

Inference & comparison tooling

`predict_all.py`

Loads all four baseline checkpoints, runs each over every JPG in pv_panel_models/source_images/ (symlink → filtered_images), and writes:

- predictions/<model_short_name>/<stem>.png: binary mask resized to original image dimensions
- predictions/manifest.json: index used by the web tool

The predictions/ directory currently holds masks for 15,423 images per model.

cd pv_panel_models
python predict_all.py
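
After a batch run it can be useful to confirm that every model produced a mask for every input. The sketch below relies only on the `predictions/<model_short_name>/<stem>.png` layout described above; the helper name `missing_predictions` is hypothetical, it does not read `manifest.json` (whose schema is not documented here), and it assumes inputs use a lowercase `.jpg` extension.

```python
from pathlib import Path


def missing_predictions(source_dir, predictions_dir):
    """Map each model folder to the image stems that have no <stem>.png mask."""
    stems = {p.stem for p in Path(source_dir).glob("*.jpg")}
    report = {}
    for model_dir in sorted(Path(predictions_dir).iterdir()):
        if not model_dir.is_dir():
            continue  # skip manifest.json and any other loose files
        done = {p.stem for p in model_dir.glob("*.png")}
        report[model_dir.name] = sorted(stems - done)
    return report
```

Run from `pv_panel_models/`, `missing_predictions("source_images", "predictions")` would return an empty list for each model when coverage is complete.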

Web viewer

./pv_panel_models/serve.sh        # default port 8000
# then open:
#   http://localhost:8000/comparison_tool.html         # per-image side-by-side viewer
#   http://localhost:8000/model_comparison_dashboard.html  # training curves
#   http://localhost:8000/model_architectures.html     # architecture explainer

The dashboard is regenerated from training logs:

python pv_panel_models/generate_dashboard.py

Reproducing a model

Each baseline is self-contained. From its directory:

pip install -r requirements.txt
python train.py

The training scripts hard-code dataset paths to the original author's ~/Desktop/seg_models/... location. Before running, update train_dir / val_dir (or train_images / train_masks in the hybrid script) to point at this repo's final_data/.
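
One way to avoid machine-specific paths is to derive them from the script's own location. This is a sketch rather than what the scripts currently do: the helper name `dataset_dirs` is hypothetical, `levels_up=2` assumes a script at `pv_panel_models/<model>/train.py`, and whether `train_dir` should point at the split folder or at its `images/` subfolder depends on each script.

```python
from pathlib import Path


def dataset_dirs(script_path, levels_up=2):
    """Return (train_dir, val_dir) under the repo's final_data/ folder.

    levels_up is the number of directories between the script's folder
    and the repo root: 2 for pv_panel_models/<model>/train.py (assumed),
    1 for hybrid_20_epochs_4ep_stop/hybrid_solar_segmenter.py.
    """
    repo_root = Path(script_path).resolve().parents[levels_up]
    data_root = repo_root / "final_data"
    return data_root / "train", data_root / "val"
```

Inside a training script this could be called as `train_dir, val_dir = dataset_dirs(__file__)`, replacing the hard-coded values.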

The hybrid model additionally requires scikit-learn and matplotlib and downloads ImageNet ResNet34 weights on first run.

Notes

- The .venv/ at the repo root and the per-model venv/ directories are local environments and are not portable; recreate them from requirements.txt.
- final_data.zip is the same dataset as final_data/, packaged for distribution.
- Hard-coded absolute paths under /home/mohamed-ennhiri/Desktop/... appear in several scripts; expect to patch them when running on a fresh machine.