---
base_model: deeplabv3_resnet50
model_name: offroad_segmentation
tags:
- image-segmentation
- pytorch
- computer-vision
- deeplabv3
widget:
  - src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/widget-images/image-segmentation.png
    example_title: Image segmentation example
---
# Offroad Terrain Segmentation Model

This is a semantic segmentation model trained to identify offroad terrain in images.

## Model Details
- **Model Architecture**: DeepLabV3 with ResNet50 backbone
- **Pre-training**: Initialized with weights pre-trained on the COCO dataset
- **Dataset**: 'Offroad_Segmentation_Training_Dataset'
- **Input**: RGB images (540x960 pixels)
- **Output**: Segmentation mask with 2 classes (background, offroad terrain)
- **Training Epochs**: 10
- **Batch Size**: 16
- **Learning Rate**: 0.001

## Training Metrics
- **Final Training Loss**: 0.0682
- **Final Validation Loss**: 0.0785
- **Final Training Mean IoU**: 0.2789
- **Final Validation Mean IoU**: 0.2690
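
Mean IoU as reported above is the per-class intersection-over-union averaged across classes. A minimal NumPy sketch (not the card's exact evaluation code) for integer label maps:

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union over classes for integer label maps."""
    ious = []
    for cls in range(num_classes):
        pred_c = pred == cls
        target_c = target == cls
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent in both prediction and target; skip it
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```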

## How to use

```python
import torch
import torchvision.models.segmentation as models
from torchvision import transforms
from PIL import Image
import cv2
import numpy as np

# Load the model architecture (weights=None: custom weights are loaded below)
model = models.deeplabv3_resnet50(weights=None)
model.classifier[4] = torch.nn.Conv2d(256, 2, kernel_size=1)  # 2 output classes

# Load the state dictionary
model_path = "deeplabv3_resnet50_offroad.pth"  # path to the saved weights
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
model.eval()

# Preprocessing: resize to the model's 540x960 training resolution,
# then normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((540, 960)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def predict_mask(image_path, model, transform):
    image = Image.open(image_path).convert("RGB")
    original_size = image.size  # (width, height)

    image_tensor = transform(image).unsqueeze(0)  # add batch dimension

    with torch.no_grad():
        output = model(image_tensor)['out']

    # Predicted class for each pixel
    predicted_mask = torch.argmax(output.squeeze(0), dim=0).cpu().numpy()

    # Resize the mask back to the original image size
    predicted_mask_resized = cv2.resize(predicted_mask.astype(np.uint8),
                                        original_size,
                                        interpolation=cv2.INTER_NEAREST)

    return predicted_mask_resized

# Example usage:
# import matplotlib.pyplot as plt
# mask = predict_mask('test_image.jpg', model, transform)
# plt.imshow(mask)
# plt.show()
```
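
For a quick visual check, the binary mask can be blended onto the input image. A minimal sketch (the color and opacity are arbitrary choices, and `overlay_mask` is a hypothetical helper, not part of this model):

```python
import numpy as np

def overlay_mask(image_rgb, mask, color=(0, 255, 0), alpha=0.5):
    """Blend a binary mask onto an RGB uint8 image for inspection."""
    overlay = image_rgb.copy()
    overlay[mask == 1] = color  # paint the terrain pixels
    # Weighted blend of the painted copy and the original image
    return (alpha * overlay + (1 - alpha) * image_rgb).astype(np.uint8)

# Example usage with the prediction from above:
# blended = overlay_mask(np.array(Image.open('test_image.jpg').convert("RGB")), mask)
```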