---
base_model: deeplabv3_resnet50
model_name: offroad_segmentation
tags:
- image-segmentation
- pytorch
- computer-vision
- deeplabv3
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/widget-images/image-segmentation.png
  example_title: Image segmentation example
---
# Offroad Terrain Segmentation Model

This is a semantic segmentation model trained to segment offroad terrain in images.

## Model Details
- **Model Architecture**: DeepLabV3 with a ResNet-50 backbone
- **Pre-training**: Initialized with weights pre-trained on the COCO dataset
- **Dataset**: `Offroad_Segmentation_Training_Dataset`
- **Input**: RGB images (540x960 pixels)
- **Output**: Segmentation mask with 2 classes (background, offroad terrain)
- **Training Epochs**: 10
- **Batch Size**: 16
- **Learning Rate**: 0.001

## Training Metrics
- **Final Training Loss**: 0.0682
- **Final Validation Loss**: 0.0785
- **Final Training Mean IoU**: 0.2789
- **Final Validation Mean IoU**: 0.2690

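Mean IoU averages per-class intersection-over-union across the classes present in a sample. A minimal NumPy sketch of the metric (illustrative only; this is not the evaluation code that produced the numbers above):

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Mean IoU over classes, given integer label masks of equal shape."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent in both masks; skip it
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Tiny demo on 2x2 masks
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
score = mean_iou(pred, target)
```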
## How to use

```python
import torch
import torchvision.models.segmentation as models
from torchvision import transforms
from PIL import Image
import cv2
import numpy as np

# Recreate the model architecture (weights=None: custom weights are loaded below)
model = models.deeplabv3_resnet50(weights=None)
model.classifier[4] = torch.nn.Conv2d(256, 2, kernel_size=1)  # Adjust output channels for 2 classes

# Load the state dictionary
model_path = "deeplabv3_resnet50_offroad.pth"  # Path to your saved model
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
model.eval()

# Preprocessing: resize to the training resolution, then normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((540, 960)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def predict_mask(image_path, model, transform):
    image = Image.open(image_path).convert("RGB")
    original_size = image.size  # (width, height)

    image_tensor = transform(image).unsqueeze(0)  # Add batch dimension

    with torch.no_grad():
        output = model(image_tensor)['out']

    # Argmax over the class dimension gives the per-pixel label
    predicted_mask = torch.argmax(output.squeeze(), dim=0).cpu().numpy()

    # Resize the mask back to the original image size;
    # cv2.resize expects (width, height), which matches PIL's image.size
    predicted_mask_resized = cv2.resize(predicted_mask.astype(np.uint8), original_size,
                                        interpolation=cv2.INTER_NEAREST)

    return predicted_mask_resized

# Example usage, assuming you have an image 'test_image.jpg':
# import matplotlib.pyplot as plt
# mask = predict_mask('test_image.jpg', model, transform)
# plt.imshow(mask)
# plt.show()
```