U-Net for Gastrointestinal Polyp Segmentation (Kvasir-SEG)

Model Description

This model is a U-Net convolutional neural network trained for binary segmentation of gastrointestinal polyps on the Kvasir-SEG dataset.

The model follows the classic U-Net architecture with the following components:

Encoder: 4 levels of DoubleConv blocks (Conv2d → BatchNorm → ReLU → Conv2d → BatchNorm → ReLU) followed by MaxPool2d, progressively increasing channels from 3 → 64 → 128 → 256 → 512 while halving spatial resolution.
Bottleneck: A DoubleConv block at the lowest resolution (512 → 1024 channels; 16×16 spatial, which corresponds to a 256×256 input after four 2× poolings).
Decoder: 4 levels of transposed convolutions (ConvTranspose2d) for upsampling, each followed by concatenation with the corresponding encoder skip connection and a DoubleConv block.
Skip Connections: Feature maps from each encoder level are concatenated with the corresponding decoder level to preserve spatial information.
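Output: A 1×1 Conv2d reducing to 1 channel, producing logits of shape [B, 1, H, W]. Because training uses BCEWithLogitsLoss, the binary segmentation mask is obtained by applying a sigmoid to these logits and thresholding.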
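The components listed above can be put together roughly as follows. This is a minimal PyTorch sketch and not the released training code: the names UNetSketch and predict_mask, the 3×3 kernels with padding=1, the default convolution biases, the kernel_size=2 / stride=2 transposed convolutions, and the 0.5 threshold are all assumptions chosen to match the description.

```python
import torch
import torch.nn as nn


class DoubleConv(nn.Module):
    """Conv2d -> BatchNorm -> ReLU, twice. padding=1 (assumed) keeps H and W unchanged."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class UNetSketch(nn.Module):
    """Hypothetical reconstruction of the described U-Net (4 encoder and 4 decoder levels)."""

    def __init__(self, in_ch=3, out_ch=1, widths=(64, 128, 256, 512)):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        self.encoders = nn.ModuleList()
        ch = in_ch
        for w in widths:  # 3 -> 64 -> 128 -> 256 -> 512
            self.encoders.append(DoubleConv(ch, w))
            ch = w
        self.bottleneck = DoubleConv(ch, 2 * ch)  # 512 -> 1024
        ch = 2 * ch
        self.ups = nn.ModuleList()
        self.decoders = nn.ModuleList()
        for w in reversed(widths):
            # Transposed convolution doubles H and W and halves the channel count.
            self.ups.append(nn.ConvTranspose2d(ch, w, kernel_size=2, stride=2))
            # After concatenation with the skip tensor the input has 2 * w channels.
            self.decoders.append(DoubleConv(2 * w, w))
            ch = w
        self.head = nn.Conv2d(ch, out_ch, kernel_size=1)

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)  # saved before pooling, at full level resolution
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.ups, self.decoders, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([skip, x], dim=1))
        return self.head(x)  # logits, shape [B, 1, H, W]


def predict_mask(model, images, threshold=0.5):
    """Sigmoid plus threshold (threshold value assumed) to turn logits into a boolean mask."""
    model.eval()
    with torch.no_grad():
        return torch.sigmoid(model(images)) > threshold
```

In this sketch the input height and width must be divisible by 16, since the four pooling steps each halve the resolution and the skip tensors must match the upsampled ones. With these assumptions the network has about 31M parameters.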

Training uses a combined BCE + Dice Loss:

BCEWithLogitsLoss: Provides stable pixel-wise binary cross-entropy.
Dice Loss: Directly optimizes overlap between prediction and ground truth, making it robust to the class imbalance present in this dataset (≈85% background, ≈15% polyp).
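One way such a combined loss might look in PyTorch is sketched below. The class name BCEDiceLoss, the unweighted sum of the two terms, the dice_weight parameter, and the smoothing constant are assumptions; the card does not state how the two terms were weighted.

```python
import torch
import torch.nn as nn


class BCEDiceLoss(nn.Module):
    """Hypothetical BCE + soft Dice loss operating on raw logits."""

    def __init__(self, dice_weight=1.0, smooth=1.0):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.dice_weight = dice_weight  # assumed: plain sum by default
        self.smooth = smooth            # assumed: avoids 0/0 on empty masks

    def forward(self, logits, targets):
        # Pixel-wise term, computed from logits for numerical stability.
        bce = self.bce(logits, targets)

        # Overlap term: soft Dice per image, then averaged over the batch.
        probs = torch.sigmoid(logits)
        dims = (1, 2, 3)
        intersection = (probs * targets).sum(dim=dims)
        total = probs.sum(dim=dims) + targets.sum(dim=dims)
        dice = (2.0 * intersection + self.smooth) / (total + self.smooth)
        dice_loss = 1.0 - dice.mean()

        return bce + self.dice_weight * dice_loss
```

The Dice term is a ratio of overlap to total foreground mass, so the large number of correctly classified background pixels does not dilute it the way it dilutes a pixel-averaged loss.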

Metric	Score
IoU	0.6895
Accuracy	0.9374
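For reference, the two metrics above could be computed along these lines. This is a sketch under assumptions: the function name iou_and_accuracy, the 0.5 threshold, and pooling all pixels of a batch together are illustrative choices, and the card does not say whether the reported scores were averaged per image or computed over the whole evaluation set.

```python
import torch


def iou_and_accuracy(logits, targets, threshold=0.5, eps=1e-7):
    """Return (IoU of the polyp class, pixel accuracy) for a batch of logits."""
    preds = torch.sigmoid(logits) > threshold  # boolean predicted mask
    truth = targets > 0.5                      # boolean ground-truth mask

    intersection = (preds & truth).sum().item()
    union = (preds | truth).sum().item()
    # eps keeps the division defined when both masks are empty (IoU is then 0).
    iou = intersection / (union + eps)

    accuracy = (preds == truth).float().mean().item()
    return iou, accuracy
```

The two numbers measure different things. With roughly 85% background pixels, a model that predicts background everywhere would already reach about 0.85 accuracy while scoring an IoU of 0, so IoU is the more informative figure for this task.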

Model size: 31M params (Safetensors, tensor type F32)
