---
title: Desert Semantic Segmentation Demo
emoji: 🌵
colorFrom: yellow
colorTo: orange
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - semantic-segmentation
  - segformer
  - transformers
  - desert
  - ugv
  - offroad
datasets:
  - Offroad_Segmentation_Training_Dataset
metrics:
  - mean_iou
---

# 🌵 Desert Semantic Segmentation using SegFormer (MiT-B2)

A **SegFormer** transformer model fine-tuned on the Offroad Segmentation Training Dataset for 10-class semantic segmentation of desert terrain. It is built for autonomous navigation of unmanned ground vehicles (UGVs) in off-road environments.


---

## 🧠 Model Architecture

| Component       | Detail                              |
|-----------------|-------------------------------------|
| Framework       | HuggingFace Transformers            |
| Model           | SegFormer                           |
| Backbone        | MiT-B2 (`nvidia/mit-b2`)            |
| Parameters      | 27,354,314 (all trainable)          |
| Decoder         | Lightweight MLP Head                |
| Classes         | 10                                  |
| Input Size      | 512 × 512                           |
| Training GPU    | NVIDIA A100-PCIE-40GB               |

---

## 🗂 Dataset Classes (10 Categories)

| Class ID | Raw Mask Value | Label         |
|----------|---------------|---------------|
| 0        | 100           | Trees         |
| 1        | 200           | Lush Bushes   |
| 2        | 300           | Dry Grass     |
| 3        | 500           | Dry Bushes    |
| 4        | 550           | Ground Clutter|
| 5        | 600           | Flowers       |
| 6        | 700           | Logs          |
| 7        | 800           | Rocks         |
| 8        | 7100          | Landscape     |
| 9        | 10000         | Sky           |
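The raw mask values are sparse, so they have to be remapped to the contiguous class IDs before training. The following is a minimal sketch of that remapping using a NumPy lookup table. The function name `remap_mask` and the choice of `255` as an ignore index for unlisted values are assumptions for illustration, not details taken from the training code.

```python
import numpy as np

# Raw mask value -> contiguous class ID, copied from the table above.
RAW_TO_CLASS_ID = {
    100: 0, 200: 1, 300: 2, 500: 3, 550: 4,
    600: 5, 700: 6, 800: 7, 7100: 8, 10000: 9,
}
IGNORE_INDEX = 255  # assumption: raw values not in the table are ignored


def remap_mask(raw_mask: np.ndarray) -> np.ndarray:
    """Convert a uint16 raw-value mask into a uint8 class-ID mask."""
    # One entry per possible uint16 value; unlisted values map to IGNORE_INDEX.
    lut = np.full(65536, IGNORE_INDEX, dtype=np.uint8)
    for raw_value, class_id in RAW_TO_CLASS_ID.items():
        lut[raw_value] = class_id
    return lut[raw_mask]


# Tiny stand-in mask: Trees, Ground Clutter, Sky, and one unlisted value.
raw = np.array([[100, 550], [10000, 42]], dtype=np.uint16)
class_ids = remap_mask(raw)
print(class_ids)
```

A lookup table keeps the remapping to a single indexing operation per mask, which matters when it runs inside a data loader.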

---

## 📊 Dataset Statistics

| Split      | Samples | Proportion |
|------------|---------|------------|
| Train      | 2,142   | 75%        |
| Validation | 286     | 10%        |
| Test       | 429     | 15%        |
| **Total**  | **2,857** | —        |

- Image resolution: **960 × 540** (RGB)
- Mask format: uint16 with raw class value encoding
- Total annotated instances: **16,951**

---

## 🎨 Augmentation Pipeline

Eleven augmentations chosen for desert and off-road conditions:

| Augmentation         | Purpose                                             |
|----------------------|-----------------------------------------------------|
| Color Jitter         | Handles varying sun angles and color temperatures   |
| Gamma Change         | Simulates over/under-exposed outdoor scenes         |
| Gaussian Noise       | Robustness to sensor noise in UGV cameras           |
| Motion / Gaussian / Median Blur | Motion blur from vehicle movement      |
| Random Shadows       | Shadows from rocks, vegetation, terrain             |
| Random Fog           | Dust storms and atmospheric haze                    |
| Brightness/Contrast  | Atmospheric and lighting variations                 |
| Texture Mixup        | Prevents overfitting to specific terrain patterns   |
| Horizontal Flip      | Improves directional generalization                 |
| Shift / Scale / Rotate | Spatial robustness                               |
| Coarse Dropout       | Simulates sensor occlusion                          |
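For segmentation, photometric augmentations (such as Gamma Change) apply to the image only, while geometric ones (such as Horizontal Flip) must be applied identically to the image and its mask. The sketch below illustrates that distinction with plain NumPy. It is not the pipeline used for training: the function names, `gamma_range`, and `flip_prob` are illustrative assumptions.

```python
import numpy as np


def gamma_change(image: np.ndarray, gamma: float) -> np.ndarray:
    """Photometric transform: changes pixel intensities, leaves the mask alone."""
    scaled = image.astype(np.float32) / 255.0
    return np.clip(np.round(255.0 * scaled ** gamma), 0, 255).astype(np.uint8)


def horizontal_flip(image: np.ndarray, mask: np.ndarray):
    """Geometric transform: image and mask are flipped together."""
    return image[:, ::-1].copy(), mask[:, ::-1].copy()


def augment(image, mask, rng, gamma_range=(0.7, 1.5), flip_prob=0.5):
    # gamma_range and flip_prob are illustrative values, not the trained settings.
    image = gamma_change(image, rng.uniform(*gamma_range))
    if rng.random() < flip_prob:
        image, mask = horizontal_flip(image, mask)
    return image, mask


# Random stand-ins with the dataset's 960 x 540 resolution and 10 class IDs.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(540, 960, 3), dtype=np.uint8)
mask = rng.integers(0, 10, size=(540, 960), dtype=np.uint8)
aug_image, aug_mask = augment(image, mask, rng)
print(aug_image.shape, aug_mask.shape)
```

Whatever library implements the real pipeline, the same rule holds: the mask should keep exactly the same label counts after any flip, shift, or rotation without interpolation.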

---

## ⚙️ Training Configuration

| Parameter          | Value       |
|--------------------|-------------|
| Epochs             | 50          |
| Batch Size         | 8           |
| Learning Rate      | 6e-5        |
| Optimizer          | AdamW       |
| Warmup Steps       | 500         |
| Weight Decay       | 0.01        |
| FP16               | ✅ Enabled  |
| Best Model Metric  | mean_iou    |
| Eval Strategy      | Per epoch   |
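To put the warmup setting in context, the sketch below works out how many optimizer steps these values imply and what the learning rate looks like at a few points. It assumes a single device, no gradient accumulation, and a linear warmup followed by linear decay to zero; none of these are stated above, so treat the schedule shape as an assumption.

```python
import math

TRAIN_SAMPLES = 2142  # train split size from the dataset statistics
BATCH_SIZE = 8
EPOCHS = 50
WARMUP_STEPS = 500
PEAK_LR = 6e-5

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero (assumed schedule)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - WARMUP_STEPS)


print("steps per epoch:", steps_per_epoch)
print("total steps:", total_steps)
print(f"warmup share of training: {WARMUP_STEPS / total_steps:.1%}")
for step in (0, 250, 500, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions there are 268 steps per epoch and 13,400 steps in total, so the 500 warmup steps cover a little under two epochs.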

---

## 📈 Evaluation Results

Evaluated on the **validation split** (286 images) using mean IoU, the unweighted average of the per-class IoU values listed below.

| Metric          | Value  |
|-----------------|--------|
| **Mean IoU**    | **0.6529** |
| **Mean Accuracy** | **0.7592** |

### Per-Class IoU

| Class          | IoU    |
|----------------|--------|
| Trees          | 0.8517 |
| Lush Bushes    | 0.6990 |
| Dry Grass      | 0.7007 |
| Dry Bushes     | 0.4873 |
| Ground Clutter | 0.3647 |
| Flowers        | 0.7246 |
| Logs           | 0.5591 |
| Rocks          | 0.4544 |
| Landscape      | 0.7014 |
| Sky            | 0.9860 |

**Best class:** Sky (0.9860) — large uniform regions  
**Hardest class:** Ground Clutter (0.3647) — small, heterogeneous objects
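As a reference for how these numbers relate, the following sketch computes per-class IoU from a confusion matrix and checks that the per-class table averages to the reported mean IoU. The function `per_class_iou` and the ignore index of `255` are illustrative assumptions, not the evaluation code used for this model.

```python
import numpy as np

NUM_CLASSES = 10


def per_class_iou(pred, label, num_classes=NUM_CLASSES, ignore_index=255):
    """IoU per class; classes absent from both pred and label yield NaN."""
    keep = label != ignore_index
    pred = pred[keep].astype(np.int64)
    label = label[keep].astype(np.int64)
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        num_classes * label + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(union > 0, intersection / union, np.nan)


# Toy 2 x 2 example: one Trees pixel is mispredicted as Lush Bushes.
label = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
toy_iou = per_class_iou(pred, label)
print(toy_iou[:2])

# Consistency check on the table above: the mean of the ten IoU values.
reported = [0.8517, 0.6990, 0.7007, 0.4873, 0.3647,
            0.7246, 0.5591, 0.4544, 0.7014, 0.9860]
reported_miou = round(float(np.mean(reported)), 4)
print(reported_miou)
```

In the toy example, Trees scores 1/2 and Lush Bushes scores 2/3, and the ten reported values average to 0.6529, matching the headline metric.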

---

## ⚙️ Inference

```python
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
from PIL import Image
import torch
import torch.nn.functional as F

# Load model
processor = SegformerImageProcessor.from_pretrained("PUSHPENDAR/desert-segformer")
model = SegformerForSemanticSegmentation.from_pretrained("PUSHPENDAR/desert-segformer")
model.eval()

# Load image
image = Image.open("desert_scene.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits  # (1, num_classes, H/4, W/4), where H x W is the processed input size

# Upsample to original size
upsampled = F.interpolate(
    logits,
    size=(image.height, image.width),
    mode="bilinear",
    align_corners=False
)
pred_mask = upsampled.argmax(dim=1)[0].numpy()  # (H, W)
print("Predicted class map shape:", pred_mask.shape)
```
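Once a class map is available, it is often useful to turn it into something readable. The sketch below summarizes how much of the image each class covers and maps class IDs to RGB colors for display. It runs on a small stand-in array rather than a real prediction, and the `PALETTE` colors and helper names are arbitrary choices for illustration, not part of this repository.

```python
import numpy as np

CLASS_NAMES = [
    "Trees", "Lush Bushes", "Dry Grass", "Dry Bushes", "Ground Clutter",
    "Flowers", "Logs", "Rocks", "Landscape", "Sky",
]

# Hypothetical display colors, one RGB triple per class ID.
PALETTE = np.array([
    [34, 139, 34], [50, 205, 50], [189, 183, 107], [139, 115, 85],
    [160, 82, 45], [255, 105, 180], [101, 67, 33], [128, 128, 128],
    [210, 180, 140], [135, 206, 235],
], dtype=np.uint8)


def class_coverage(pred_mask: np.ndarray) -> dict:
    """Fraction of pixels assigned to each class."""
    counts = np.bincount(pred_mask.ravel(), minlength=len(CLASS_NAMES))
    return {name: float(n) / pred_mask.size for name, n in zip(CLASS_NAMES, counts)}


def colorize(pred_mask: np.ndarray) -> np.ndarray:
    """Map an (H, W) class-ID map to an (H, W, 3) RGB image."""
    return PALETTE[pred_mask]


# Stand-in for pred_mask from the inference snippet above.
demo_mask = np.array([
    [9, 9, 9, 9],
    [9, 9, 8, 8],
    [8, 8, 8, 8],
    [7, 8, 8, 2],
])
coverage = class_coverage(demo_mask)
rgb = colorize(demo_mask)
for name, share in coverage.items():
    if share > 0:
        print(f"{name}: {share:.1%}")
```

With a real prediction, `Image.fromarray(colorize(pred_mask))` from Pillow would give an image that can be saved or overlaid on the input.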

---

## 📦 Repository Files

| File / Folder            | Description                              |
|--------------------------|------------------------------------------|
| `pytorch_model.bin`      | Fine-tuned SegFormer weights             |
| `config.json`            | Model configuration                      |
| `preprocessor_config.json` | Image processor settings              |
| `outputs/validation_metrics.json` | Saved evaluation metrics       |
| `outputs/training_curves.png`     | Loss and mIoU training curves  |
| `outputs/test_predictions/`       | Per-image prediction masks     |

---

## 🚀 Run Locally

```bash
git clone https://huggingface.co/PUSHPENDAR/desert-segformer
cd desert-segformer
pip install "gradio==4.44.0" transformers torch pillow
python app.py
```

---

## 📝 Citation

If you use this model or dataset, please cite:

```bibtex
@misc{desert-segformer-2025,
  title     = {Desert Semantic Segmentation with SegFormer (MiT-B2)},
  author    = {Pushpendar Choudhary},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/PUSHPENDAR/desert-segformer}
}
```

---

## 📄 License

Apache 2.0 — see [LICENSE](LICENSE) for details.