Exploring the Underwater World Segmentation without Extra Training
Paper • 2511.07923 • Published
Alam, M., Dhavale, S. V., and Srikanth, D.
Indian Journal of Technical Education, Vol. 48, No. 2, December 2025.
Peer-Reviewed Publication.
PIAU-Net extends the classical U-Net architecture with physics-inspired components designed for underwater image segmentation, where light scattering, turbidity, and backscatter degrade visibility in ways that challenge standard appearance-based models.
Key components:
- Turbidity (`t`) and backscatter (`b`) feature maps are estimated at the bottleneck from scene features alone, without explicit physical supervision.

The attention gate computes:
```
alpha  = Sigmoid( ReLU( W_g(g) + W_x(skip) + W_phys(t_upsampled) ) )
output = skip * alpha
```
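The gate formula can be sketched as a small PyTorch module. This is a minimal illustration, not the repository's implementation: the class name `PhysicsGuidedAttentionGate`, the use of 1×1 convolutions for `W_g`, `W_x`, and `W_phys`, the projection to the skip connection's channel count, and the bilinear resizing of `g` and `t` are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PhysicsGuidedAttentionGate(nn.Module):
    """Hypothetical gate: alpha = Sigmoid(ReLU(W_g(g) + W_x(skip) + W_phys(t_up)))."""

    def __init__(self, g_ch: int, skip_ch: int):
        super().__init__()
        # Assumption: each W_* is a 1x1 convolution projecting to skip_ch channels,
        # so that alpha can be multiplied element-wise with the skip features.
        self.w_g = nn.Conv2d(g_ch, skip_ch, kernel_size=1)
        self.w_x = nn.Conv2d(skip_ch, skip_ch, kernel_size=1)
        self.w_phys = nn.Conv2d(1, skip_ch, kernel_size=1)

    def forward(self, g: torch.Tensor, skip: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        size = skip.shape[-2:]
        # Assumption: the gating signal and the turbidity map are resized to the
        # skip connection's spatial resolution before being combined.
        g_up = F.interpolate(g, size=size, mode="bilinear", align_corners=False)
        t_up = F.interpolate(t, size=size, mode="bilinear", align_corners=False)
        alpha = torch.sigmoid(F.relu(self.w_g(g_up) + self.w_x(skip) + self.w_phys(t_up)))
        return skip * alpha


torch.manual_seed(0)
gate = PhysicsGuidedAttentionGate(g_ch=8, skip_ch=4)
g = torch.randn(2, 8, 8, 8)        # decoder gating signal (coarser resolution)
skip = torch.randn(2, 4, 16, 16)   # encoder skip features
t = torch.rand(2, 1, 4, 4)         # single-channel turbidity map from the bottleneck
out = gate(g, skip, t)             # same shape as skip
```

Because the ReLU output is non-negative, the sigmoid in this formula yields `alpha` values of at least 0.5, so under this reading the gate rescales skip features rather than suppressing them entirely.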
| Model | mIoU (%) | Dice (%) | Precision (%) | Recall (%) | Pixel Acc. (%) |
|---|---|---|---|---|---|
| U-Net | 93.48 | 94.66 | 96.50 | 96.83 | 95.70 |
| Attention U-Net | 95.23 | 96.53 | 97.60 | 97.46 | 98.06 |
| DeepLab V3+ | 95.01 | 96.04 | 96.42 | 97.67 | 96.85 |
| PIAU-Net (Ours) | 97.38 | 98.18 | 98.83 | 98.53 | 99.54 |

| Model | mIoU (%) | Dice (%) | Precision (%) | Recall (%) | Pixel Acc. (%) |
|---|---|---|---|---|---|
| U-Net (fine-tuned) | 87.79 | 90.67 | 88.96 | 92.65 | 95.10 |
| Attention U-Net (fine-tuned) | 88.05 | 90.92 | 90.29 | 91.59 | 95.38 |
| DeepLab V3+ (fine-tuned) | 90.54 | 94.91 | 95.83 | 94.04 | 97.50 |
| PIAU-Net (Ours) | 93.98 | 96.85 | 96.56 | 97.13 | 98.41 |
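For readers who want to reproduce the columns above on their own predictions, the metrics can be computed from binary masks roughly as follows. This is a sketch under stated assumptions: the function name `binary_segmentation_metrics` is hypothetical, mIoU is taken here as the mean of background and foreground IoU (the averaging convention used for the tables is not stated), and the toy masks are illustrative rather than drawn from either dataset.

```python
def _safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def binary_segmentation_metrics(pred, target):
    """Compute metrics from two equal-length flat sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    iou_fg = _safe_div(tp, tp + fp + fn)
    iou_bg = _safe_div(tn, tn + fp + fn)
    return {
        # Assumption: mIoU averages the two classes (background, foreground).
        "miou": (iou_fg + iou_bg) / 2,
        "dice": _safe_div(2 * tp, 2 * tp + fp + fn),
        "precision": _safe_div(tp, tp + fp),
        "recall": _safe_div(tp, tp + fn),
        "pixel_acc": _safe_div(tp + tn, tp + tn + fp + fn),
    }


# Toy example: 8 pixels, flattened row by row.
pred_mask = [1, 1, 0, 0, 1, 0, 0, 0]
true_mask = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = binary_segmentation_metrics(pred_mask, true_mask)
```

Multiplying each value by 100 gives percentages in the same form as the tables.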
```bash
pip install torch torchvision opencv-python albumentations tqdm
```
```python
import torch

from model.model import PhysicsInformedAttentionUNet

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = PhysicsInformedAttentionUNet(in_ch=3, out_ch=2).to(device)

# The checkpoint may be a dict wrapping the weights or a bare state dict.
checkpoint = torch.load("best_model.pth", map_location=device)
if isinstance(checkpoint, dict) and "model_state" in checkpoint:
    model.load_state_dict(checkpoint["model_state"])
else:
    model.load_state_dict(checkpoint)
model.eval()
```
```python
import torch
import torchvision.transforms as T
from PIL import Image

# Resize to the training resolution and normalize with ImageNet statistics.
transform = T.Compose([
    T.Resize((256, 256)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

image = Image.open("your_image.jpg").convert("RGB")
x = transform(image).unsqueeze(0).to(device)  # (1, 3, 256, 256)

# `model` and `device` come from the loading snippet.
with torch.no_grad():
    seg_main, seg_aux, j, t, b = model(x)

pred_mask = seg_main.argmax(dim=1)  # (1, 256, 256)
```
Outputs:
| Output | Shape | Description |
|---|---|---|
| `seg_main` | `(B, 2, H, W)` | Primary segmentation logits |
| `seg_aux` | list of 2 tensors | Deep supervision heads |
| `j` | `(B, 3, H, W)` | Estimated illumination map |
| `t` | `(B, 1, H, W)` | Turbidity map |
| `b` | `(B, 1, H, W)` | Backscatter map |
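The training settings list horizontal flip as test-time augmentation. One way this could be applied to the `seg_main` logits is sketched below. The helper name `predict_with_hflip_tta`, the choice to average logits rather than probabilities, and the `_StubModel` stand-in (used only so the snippet runs without the real checkpoint) are assumptions, not the repository's code.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def predict_with_hflip_tta(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Average seg_main logits over the input and its horizontal mirror."""
    logits = model(x)[0]                                # first output is seg_main
    flipped_logits = model(torch.flip(x, dims=[3]))[0]  # flip along the width axis
    # Undo the flip so both predictions are aligned pixel for pixel.
    flipped_logits = torch.flip(flipped_logits, dims=[3])
    return (logits + flipped_logits) / 2


class _StubModel(nn.Module):
    """Hypothetical stand-in that mimics the five-output interface."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 2, kernel_size=3, padding=1)

    def forward(self, x):
        seg = self.conv(x)
        # Returns (seg_main, seg_aux, j, t, b) with placeholder auxiliary outputs.
        return seg, [seg, seg], x, x[:, :1], x[:, :1]


torch.manual_seed(0)
stub = _StubModel().eval()
x_demo = torch.randn(1, 3, 8, 8)
tta_logits = predict_with_hflip_tta(stub, x_demo)  # (1, 2, 8, 8)
tta_mask = tta_logits.argmax(dim=1)                # (1, 8, 8)
```

With the real model, `stub` would be replaced by the loaded `PhysicsInformedAttentionUNet` and `x_demo` by the preprocessed image tensor.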
| Setting | Value |
|---|---|
| Optimizer | Adam |
| LR Scheduler | Cosine Annealing |
| Mixed Precision | AMP |
| Gradient Clipping | Yes |
| Input Size | 256 × 256 |
| Batch Size | 4 |
| Initial LR | 5e-4 |
| Epochs | 30 (stage 1) + fine-tuning |
| Test-Time Augmentation | Horizontal flip |
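The optimizer-related settings in the table could be wired together in PyTorch roughly as follows. This is a sketch, not the actual training script: the single-convolution stand-in model, the cross-entropy loss, `T_max=30`, the clipping threshold `max_norm=1.0`, and the small random batches are all assumptions chosen so the snippet runs quickly on its own.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"  # mixed precision only when a GPU is present

# Hypothetical stand-in for PhysicsInformedAttentionUNet (3 input, 2 output channels).
model = nn.Conv2d(3, 2, kernel_size=3, padding=1).to(device)
criterion = nn.CrossEntropyLoss()  # assumption: the actual loss is not stated

epochs = 30  # stage 1
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)


def train_step(images: torch.Tensor, masks: torch.Tensor) -> float:
    """One optimization step with AMP and gradient clipping."""
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device.type, enabled=use_amp):
        loss = criterion(model(images), masks)
    scaler.scale(loss).backward()
    # Unscale before clipping so the threshold applies to the true gradients.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # assumed threshold
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


lr_history = []
for epoch in range(epochs):
    lr_history.append(optimizer.param_groups[0]["lr"])
    # Random batch of 4; 32 x 32 instead of 256 x 256 to keep the demo fast.
    images = torch.randn(4, 3, 32, 32, device=device)
    masks = torch.randint(0, 2, (4, 32, 32), device=device)
    train_step(images, masks)
    scheduler.step()  # cosine decay, stepped once per epoch
```

Here `lr_history` starts at the initial learning rate of 5e-4 and decreases each epoch along the cosine curve.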
Both datasets use binary masks: 0 = background, 1 = foreground.
The estimated maps (`t`, `b`, `j`) are learned auxiliary features, not physically calibrated measurements.

```bibtex
@article{alam2025piaunet,
  title   = {Physics-Informed Attention U-Net (PIAUNet): An Enhanced U-Net Framework
             for Underwater Segmentation in Aquaculture},
  author  = {Alam, Mahboob and Dhavale, Sunita Vikram and Srikanth, D.},
  journal = {Indian Journal of Technical Education},
  volume  = {48},
  number  = {2},
  month   = {December},
  year    = {2025}
}
```
github.com/MahboobAlam0/fish-monitoring-system-using-piaunet