---
base_model: deeplabv3_resnet50
model_name: offroad_segmentation
tags:
- image-segmentation
- pytorch
- computer-vision
- deeplabv3
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/widget-images/image-segmentation.png
  example_title: Image segmentation example
---
# Offroad Terrain Segmentation Model

This is a semantic segmentation model trained to segment offroad terrain in images.

## Model Details
- **Model Architecture**: DeepLabV3 with a ResNet-50 backbone
- **Pre-training**: Initialized with weights pre-trained on the COCO dataset
- **Dataset**: `Offroad_Segmentation_Training_Dataset`
- **Input**: RGB images (540x960 pixels)
- **Output**: Segmentation mask with 2 classes (background, offroad terrain)
- **Training Epochs**: 10
- **Batch Size**: 16
- **Learning Rate**: 0.001

## Training Metrics
- **Final Training Loss**: 0.0682
- **Final Validation Loss**: 0.0785
- **Final Training Mean IoU**: 0.2789
- **Final Validation Mean IoU**: 0.2690

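Mean IoU averages per-class intersection-over-union across the classes present in a sample. A minimal NumPy sketch of the metric (illustrative only; this is not the evaluation code that produced the numbers above):

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Mean IoU over classes, given integer label masks of equal shape."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent in both masks; skip it
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Tiny demo on 2x2 masks
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
score = mean_iou(pred, target)
```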
## How to use

```python
import torch
import torchvision.models.segmentation as models
from torchvision import transforms
from PIL import Image
import cv2
import numpy as np

# Recreate the model architecture (weights=None: custom weights are loaded below)
model = models.deeplabv3_resnet50(weights=None)
model.classifier[4] = torch.nn.Conv2d(256, 2, kernel_size=1)  # Adjust output channels for 2 classes

# Load the state dictionary
model_path = "deeplabv3_resnet50_offroad.pth"  # Path to your saved model
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
model.eval()

# Preprocessing: resize to the training resolution, then normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((540, 960)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def predict_mask(image_path, model, transform):
    image = Image.open(image_path).convert("RGB")
    original_size = image.size  # (width, height)

    image_tensor = transform(image).unsqueeze(0)  # Add batch dimension

    with torch.no_grad():
        output = model(image_tensor)['out']

    # Argmax over the class dimension gives the per-pixel label
    predicted_mask = torch.argmax(output.squeeze(), dim=0).cpu().numpy()

    # Resize the mask back to the original image size;
    # cv2.resize expects (width, height), which matches PIL's image.size
    predicted_mask_resized = cv2.resize(predicted_mask.astype(np.uint8), original_size,
                                        interpolation=cv2.INTER_NEAREST)

    return predicted_mask_resized

# Example usage, assuming you have an image 'test_image.jpg':
# import matplotlib.pyplot as plt
# mask = predict_mask('test_image.jpg', model, transform)
# plt.imshow(mask)
# plt.show()
```