Kvasir-SEG U-Net
This repository contains a U-Net model trained for gastrointestinal polyp segmentation on the Angelou0516/kvasir-seg dataset.
Architecture
- U-Net implemented in PyTorch
- Encoder with convolution + max-pooling
- Bottleneck
- Decoder with transposed convolutions
- Skip connections
- Final 1x1 convolution producing a single-channel mask
Input shape: 3 x 256 x 256 (channels x height x width)
Output shape: 1 x 256 x 256
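The components listed above can be sketched as a small PyTorch module. This is an illustration only, not the repository's actual network: the class name `TinyUNet`, the two-level depth, the channel widths (`base=16`), and the double 3x3 convolution blocks are all assumptions. What it does show is how the skip connections, transposed convolutions, and the final 1x1 convolution fit together to map a 3 x 256 x 256 input to a 1 x 256 x 256 logit map.

```python
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # Two 3x3 convolutions with ReLU; padding=1 keeps the spatial size.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Illustrative two-level U-Net; depth and widths are assumptions."""

    def __init__(self, in_ch: int = 3, out_ch: int = 1, base: int = 16):
        super().__init__()
        # Encoder: convolution blocks separated by 2x2 max-pooling.
        self.enc1 = conv_block(in_ch, base)
        self.enc2 = conv_block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(base * 2, base * 4)
        # Decoder: transposed convolutions double the resolution.
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = conv_block(base * 2, base)
        # Final 1x1 convolution producing a single-channel mask (as logits).
        self.head = nn.Conv2d(base, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                    # full resolution
        e2 = self.enc2(self.pool(e1))        # 1/2 resolution
        b = self.bottleneck(self.pool(e2))   # 1/4 resolution
        # Skip connections: concatenate encoder features along channels.
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)


model = TinyUNet()
logits = model(torch.randn(1, 3, 256, 256))
print(logits.shape)  # torch.Size([1, 1, 256, 256])
```

The output is left as raw logits, which matches the use of BCEWithLogitsLoss during training; a sigmoid would be applied at inference time to obtain probabilities.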
Loss function rationale
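Polyp segmentation is often class-imbalanced: the polyp usually covers a small part of the image, so background pixels dominate any loss averaged over pixels. Binary cross-entropy alone can therefore be driven down by predicting background well, while a Dice term scores overlap with the foreground directly.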
To address this, the model was trained with a combined loss:
- BCEWithLogitsLoss
- Dice Loss
Final loss:
0.5 * BCEWithLogitsLoss + 0.5 * DiceLoss
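A minimal sketch of the combined loss follows. The 0.5 / 0.5 weights come from the formula above; the class name `BCEDiceLoss`, the smoothing constant `eps`, and the choice to compute Dice per sample and then average over the batch are assumptions, since the exact Dice formulation used in training is not stated here.

```python
import torch
import torch.nn as nn


class BCEDiceLoss(nn.Module):
    """0.5 * BCEWithLogitsLoss + 0.5 * (1 - soft Dice). Hypothetical helper."""

    def __init__(self, bce_weight: float = 0.5, dice_weight: float = 0.5,
                 eps: float = 1e-6):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.bce_weight = bce_weight
        self.dice_weight = dice_weight
        self.eps = eps  # assumed smoothing term; avoids division by zero

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # logits, targets: (N, 1, H, W); targets are float masks in {0, 1}.
        bce = self.bce(logits, targets)
        probs = torch.sigmoid(logits)
        dims = (1, 2, 3)
        intersection = (probs * targets).sum(dim=dims)
        denom = probs.sum(dim=dims) + targets.sum(dim=dims)
        dice = (2 * intersection + self.eps) / (denom + self.eps)
        dice_loss = 1 - dice.mean()
        return self.bce_weight * bce + self.dice_weight * dice_loss


loss_fn = BCEDiceLoss()
targets = torch.zeros(2, 1, 8, 8)
targets[:, :, 2:6, 2:6] = 1.0
good_logits = (targets * 2 - 1) * 20   # confident and correct
bad_logits = -good_logits              # confident and wrong
print(loss_fn(good_logits, targets).item() < loss_fn(bad_logits, targets).item())  # True
```

The Dice term uses sigmoid probabilities rather than thresholded predictions so that it stays differentiable.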
Preprocessing
- Resize to 256 x 256
- RGB normalization to [-1, 1]
- Masks resized with nearest-neighbor interpolation
- Masks binarized to {0, 1}
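The preprocessing steps above could look roughly like the following. The function name `preprocess`, the bilinear interpolation for the image, the `(x - 0.5) / 0.5` normalization, and the binarization threshold of 127 are assumptions; the source only specifies the target size, the [-1, 1] range, nearest-neighbor resizing for masks, and binary {0, 1} masks.

```python
import torch
import torch.nn.functional as F


def preprocess(image_u8: torch.Tensor, mask_u8: torch.Tensor, size: int = 256):
    """image_u8: (3, H, W) uint8 RGB; mask_u8: (H, W) uint8 in 0..255."""
    # Scale to [0, 1], resize, then map to [-1, 1].
    image = image_u8.float().unsqueeze(0) / 255.0
    image = F.interpolate(image, size=(size, size), mode="bilinear",
                          align_corners=False)  # assumed interpolation mode
    image = (image - 0.5) / 0.5

    # Nearest-neighbor keeps mask values unblended before thresholding.
    mask = mask_u8.float().unsqueeze(0).unsqueeze(0)
    mask = F.interpolate(mask, size=(size, size), mode="nearest")
    mask = (mask > 127).float()  # assumed threshold; binarize to {0, 1}
    return image.squeeze(0), mask.squeeze(0)


torch.manual_seed(0)
raw_image = torch.randint(0, 256, (3, 300, 400), dtype=torch.uint8)
raw_mask = torch.zeros(300, 400, dtype=torch.uint8)
raw_mask[100:200, 150:250] = 255
x, y = preprocess(raw_image, raw_mask)
print(x.shape, y.shape)  # torch.Size([3, 256, 256]) torch.Size([1, 256, 256])
```

Nearest-neighbor resizing matters for the mask because smoother interpolation would introduce intermediate gray values along the polyp boundary.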
Training
- Hugging Face Trainer
- Epochs: 20
- Learning rate: 1e-3
- Batch size: 2
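A plain `nn.Module` needs a small adapter before the Hugging Face `Trainer` can drive it, because `Trainer` reads the loss from the model's output. The sketch below shows one way this might be wired up. The wrapper class, the `pixel_values` / `labels` argument names, the `output_dir` value, and the reading of "batch size 2" as a per-device batch size are all assumptions rather than details taken from this repository.

```python
import torch
import torch.nn as nn


class SegmentationForTrainer(nn.Module):
    """Hypothetical wrapper: returns a dict with "loss" and "logits"."""

    def __init__(self, net: nn.Module, loss_fn: nn.Module):
        super().__init__()
        self.net = net
        self.loss_fn = loss_fn

    def forward(self, pixel_values, labels=None):
        logits = self.net(pixel_values)
        if labels is None:
            return {"logits": logits}
        return {"loss": self.loss_fn(logits, labels), "logits": logits}


def build_trainer(model, train_dataset, eval_dataset=None,
                  output_dir="unet-kvasir"):  # hypothetical output path
    # Imported lazily so the wrapper above works without transformers.
    from transformers import Trainer, TrainingArguments

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=20,
        learning_rate=1e-3,
        per_device_train_batch_size=2,  # assumes batch size is per device
    )
    return Trainer(model=model, args=args,
                   train_dataset=train_dataset, eval_dataset=eval_dataset)


# Stand-in network and loss, only to show the wrapper's output format.
wrapped = SegmentationForTrainer(nn.Conv2d(3, 1, kernel_size=1),
                                 nn.BCEWithLogitsLoss())
out = wrapped(torch.randn(2, 3, 8, 8), torch.zeros(2, 1, 8, 8))
print(sorted(out.keys()))  # ['logits', 'loss']
```

Each dataset item would then be a dict with `pixel_values` and `labels` tensors, so the batch keys line up with the wrapper's `forward` arguments.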
Results
- Test Dice: 0.7096
- Test IoU: 0.6066
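For reference, Dice and IoU on binary masks can be computed as in the sketch below. The 0.5 probability threshold, the `eps` smoothing, and averaging per image are assumptions; how the scores above were aggregated is not stated. One observation: for a single mask, IoU = Dice / (2 - Dice), which would give about 0.55 for a Dice of 0.7096, so the reported pair is consistent with per-image averaging rather than a single pooled computation.

```python
import torch


def dice_and_iou(logits: torch.Tensor, targets: torch.Tensor,
                 threshold: float = 0.5, eps: float = 1e-6):
    """Mean per-image Dice and IoU for (N, 1, H, W) logits and {0, 1} masks."""
    preds = (torch.sigmoid(logits) > threshold).float()
    dims = (1, 2, 3)
    inter = (preds * targets).sum(dim=dims)
    pred_sum = preds.sum(dim=dims)
    tgt_sum = targets.sum(dim=dims)
    union = pred_sum + tgt_sum - inter
    dice = (2 * inter + eps) / (pred_sum + tgt_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice.mean().item(), iou.mean().item()


# Toy case: target covers 8 pixels, prediction covers 4 of them.
targets = torch.zeros(1, 1, 4, 4)
targets[:, :, :2, :] = 1.0
pred_mask = torch.zeros(1, 1, 4, 4)
pred_mask[:, :, :1, :] = 1.0
logits = torch.where(pred_mask > 0, torch.tensor(10.0), torch.tensor(-10.0))

dice, iou = dice_and_iou(logits, targets)
print(f"{dice:.4f} {iou:.4f}")  # 0.6667 0.5000
```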