---
base_model: deeplabv3_resnet50
model_name: offroad_segmentation
tags:
- image-segmentation
- pytorch
- computer-vision
- deeplabv3
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/widget-images/image-segmentation.png
  example_title: Image segmentation example
---

# Offroad Terrain Segmentation Model

This is a semantic segmentation model trained to identify offroad terrain in images.

## Model Details

- **Model Architecture**: DeepLabV3 with ResNet50 backbone
- **Pre-training**: Initialized with weights pre-trained on the COCO dataset
- **Dataset**: 'Offroad_Segmentation_Training_Dataset'
- **Input**: RGB images (540x960 pixels)
- **Output**: Segmentation mask with 2 classes (e.g., background, offroad terrain)
- **Training Epochs**: 10
- **Batch Size**: 16
- **Learning Rate**: 0.001

## Training Metrics

- **Final Training Loss**: 0.0682
- **Final Validation Loss**: 0.0785
- **Final Training Mean IoU**: 0.2789
- **Final Validation Mean IoU**: 0.2690

## How to use

```python
import torch
import torchvision.models.segmentation as models
from torchvision import transforms
from PIL import Image
import cv2
import numpy as np

# Load the model architecture without pre-trained weights,
# since we load our own fine-tuned weights below
model = models.deeplabv3_resnet50(weights=None)
model.classifier[4] = torch.nn.Conv2d(256, 2, kernel_size=1)  # adjust output channels for 2 classes

# Load the state dictionary
model_path = "deeplabv3_resnet50_offroad.pth"  # path to your saved model
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
model.eval()

# Preprocessing transformations: resize to the model's training
# resolution (540x960), then normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((540, 960)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def predict_mask(image_path, model, transform):
    image = Image.open(image_path).convert("RGB")
    original_size = image.size  # (width, height)
    image_tensor = transform(image).unsqueeze(0)  # add batch dimension
    with torch.no_grad():
        output = model(image_tensor)['out']
    # Get the predicted class for each pixel
    predicted_mask = torch.argmax(output.squeeze(), dim=0).cpu().numpy()
    # Resize the mask back to the original image size
    predicted_mask_resized = cv2.resize(predicted_mask.astype(np.uint8), original_size,
                                        interpolation=cv2.INTER_NEAREST)
    return predicted_mask_resized

# Example usage, assuming you have an image 'test_image.jpg':
# import matplotlib.pyplot as plt
# mask = predict_mask('test_image.jpg', model, transform)
# plt.imshow(mask)
# plt.show()
```
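## Evaluating predictions

If you want to check predictions against your own labeled masks using the same metric reported above, a per-class IoU can be computed directly from two integer label arrays. This is a minimal sketch (not the exact evaluation script used during training); the `mean_iou` helper and the toy masks are illustrative.

```python
import numpy as np

def mean_iou(pred_mask, gt_mask, num_classes=2):
    """Mean intersection-over-union between two integer label masks."""
    ious = []
    for cls in range(num_classes):
        pred_cls = pred_mask == cls
        gt_cls = gt_mask == cls
        union = np.logical_or(pred_cls, gt_cls).sum()
        if union == 0:
            continue  # class absent from both masks; skip it
        intersection = np.logical_and(pred_cls, gt_cls).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else 0.0

# Toy example with two small 2x2 masks:
pred = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 1], [0, 1]])
score = mean_iou(pred, gt)  # class 0: 1/2, class 1: 2/3, mean = 7/12
```

Classes missing from both the prediction and the ground truth are skipped rather than counted as IoU 0, which is a common convention; adjust if your evaluation protocol differs.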
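## Visualizing the mask

The mask returned by `predict_mask` is an integer array, so it can be hard to interpret on its own. One simple option is to alpha-blend a solid color over the pixels predicted as offroad terrain. The sketch below uses plain NumPy (the helper name, color, and blend factor are illustrative choices, not part of the model):

```python
import numpy as np

def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.4):
    """Alpha-blend a solid color over pixels where the mask equals 1."""
    painted = image.copy()
    painted[mask == 1] = color  # paint the "offroad terrain" class
    return (alpha * painted + (1 - alpha) * image).astype(np.uint8)

# Synthetic example: a black 4x4 image with a 2x2 masked region
img = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
blended = overlay_mask(img, mask)
```

Make sure the mask and image have the same height and width before blending; `predict_mask` already resizes the mask back to the original image size.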