# SegFormer Semantic Segmentation (Road / Grass / Footpath / Water)
This repository contains a fine-tuned SegFormer model for semantic segmentation of outdoor scenes, specifically targeting:
- background
- footpath
- grass
- road
- water
The model is designed for applications such as:
- Autonomous navigation
- Robotics perception
- Scene understanding
- Smart city / mapping solutions
## Model Details
- Architecture: SegFormer
- Framework: Hugging Face Transformers
- Input Size: 224 × 224
- Task: Semantic Segmentation
- Number of Classes: 5
## Class Labels
| ID | Label |
|---|---|
| 0 | background |
| 1 | footpath |
| 2 | grass |
| 3 | road |
| 4 | water |
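For programmatic use, the table above corresponds to a simple ID-to-label mapping. The sketch below uses illustrative variable names; the checkpoint's own mapping is exposed at runtime as `model.config.id2label`:

```python
# ID-to-label mapping matching the table above (names are illustrative).
id2label = {0: "background", 1: "footpath", 2: "grass", 3: "road", 4: "water"}
label2id = {label: i for i, label in id2label.items()}

print(id2label[3])        # road
print(label2id["water"])  # 4
```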
## Quick Start

### 1. Install Dependencies

```bash
pip install transformers torch pillow
```

### 2. Run Inference
```python
from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor
from PIL import Image
import torch
import torch.nn.functional as F
import numpy as np

MODEL_ID = "Dinusharg/segformer_environment_1"
IMAGE_PATH = "input_image_path"
OUT_MASK = "segmented_mask.png"
OUT_OVERLAY = "segmented_overlay.png"

# RGB color per class ID
palette = {
    0: (0, 0, 0),        # background
    1: (255, 0, 0),      # footpath
    2: (0, 255, 0),      # grass
    3: (128, 128, 128),  # road
    4: (0, 0, 255),      # water
}

processor = SegformerImageProcessor.from_pretrained(MODEL_ID)
model = SegformerForSemanticSegmentation.from_pretrained(MODEL_ID)

image = Image.open(IMAGE_PATH).convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Upsample the logits to the original image resolution
# (PIL's .size is (W, H), so reverse it for (H, W)).
upsampled_logits = F.interpolate(
    outputs.logits,
    size=image.size[::-1],
    mode="bilinear",
    align_corners=False,
)
pred = upsampled_logits.argmax(dim=1)[0].cpu().numpy()

print("Unique predicted classes:", sorted(set(pred.flatten().tolist())))
print("Labels:", model.config.id2label)

# Colorize the class-ID mask.
color_mask = np.zeros((pred.shape[0], pred.shape[1], 3), dtype=np.uint8)
for class_id, color in palette.items():
    color_mask[pred == class_id] = color
Image.fromarray(color_mask).save(OUT_MASK)
print(f"Saved mask: {OUT_MASK}")

# Blend the colored mask over the input image.
ALPHA = 0.5
image_np = np.array(image).astype(np.float32)
mask_np = color_mask.astype(np.float32)
overlay = (image_np * (1 - ALPHA) + mask_np * ALPHA).clip(0, 255).astype(np.uint8)
Image.fromarray(overlay).save(OUT_OVERLAY)
print(f"Saved overlay: {OUT_OVERLAY}")
```
This saves the colored class mask and a semi-transparent overlay of the mask on the input image.
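Beyond saving images, the predicted mask can be summarized numerically, e.g. by the fraction of pixels assigned to each class. A minimal sketch, assuming `pred` is the NumPy class-ID array produced by the script above (a tiny synthetic mask stands in for it here):

```python
import numpy as np

id2label = {0: "background", 1: "footpath", 2: "grass", 3: "road", 4: "water"}

def class_coverage(pred: np.ndarray) -> dict:
    """Return the fraction of pixels assigned to each class in a mask."""
    ids, counts = np.unique(pred, return_counts=True)
    total = pred.size
    return {id2label[int(i)]: float(c) / total for i, c in zip(ids, counts)}

# Tiny synthetic mask standing in for `pred` from the script above.
demo = np.array([[0, 0, 3, 3],
                 [2, 2, 3, 3]])
print(class_coverage(demo))  # {'background': 0.25, 'grass': 0.25, 'road': 0.5}
```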
## Hugging Face Space (Live Demo)
Try the model interactively:
Upload an image and visualize segmentation results instantly.
## Sample Results
## Contact
For questions or collaboration: