Model Card: Nematostella Rosette Detector

Model Description

Model type: Attention U-Net (Oktay et al. 2018)
Task: Semantic segmentation — pixel-wise detection of epithelial rosette structures in Nematostella vectensis confocal microscopy images
Input: 2-channel binary boundary representation (512×512×2): thin inner cell boundary lines + morphologically thickened cell boundaries. No fluorescence intensity used.
Output: Pixel-wise probability map (0–1) of rosette likelihood
Framework: PyTorch
License: MIT
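
The two-channel input described above could be assembled roughly as follows. This is a minimal sketch, not the released preprocessing code: it assumes the thickened channel is a binary dilation of the thin boundary mask, and the helper names (`dilate`, `make_two_channel_input`) and the dilation depth (`iterations=2`) are hypothetical.

```python
import numpy as np


def dilate(mask, iterations=1):
    """4-connected binary dilation implemented with array shifts."""
    out = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(out, 1, mode="constant", constant_values=False)
        # A pixel is set if it or any of its 4 neighbours was set.
        out = (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
               | p[1:-1, :-2] | p[1:-1, 2:])
    return out


def make_two_channel_input(thin_boundaries, iterations=2):
    """Stack thin and thickened boundaries into an (H, W, 2) float array."""
    thin = thin_boundaries.astype(bool)
    thick = dilate(thin, iterations)  # assumed thickening step
    return np.stack([thin, thick], axis=-1).astype(np.float32)


# Toy example: a single horizontal boundary line in a 512x512 patch.
thin = np.zeros((512, 512), dtype=bool)
thin[256, :] = True
x = make_two_channel_input(thin)
print(x.shape)
```

For PyTorch the array would still need to be transposed to channel-first order, for example with `x.transpose(2, 0, 1)`.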

Intended Use

Primary use: Generating candidate rosette proposals for expert-reviewed human-in-the-loop annotation in napari
Out-of-scope: Direct automated quantification without expert review; application to other organisms, tissue types, or imaging modalities without retraining

Training Data

214 confocal microscopy images of Nematostella vectensis juvenile epidermis
Acquired on Olympus IX83 FV3000, 60× silicone objective, 1024×1024 px, 0.134 µm/px
Ground truth: manually annotated rosette instance masks (napari), minimum 5 cells sharing a common central axis or coalescing around an extruding cell
The dataset will be deposited on Zenodo upon publication

Evaluation

Evaluated on held-out validation set (54 images, 269 rosette instances, 20% of total dataset):

Metric	Value
Pixel-level Dice	0.54
Pixel-level Recall	0.65
Event-level Recall (≥1px, threshold 0.5)	88.8% (239/269)
Rosettes with ≥10% pixel coverage (threshold 0.4)	82.9% (223/269)
Rosettes with >80% pixel coverage (threshold 0.5)	47.2% (127/269)
Rosettes with >40% pixel coverage (threshold 0.5)	72.5% (195/269)
Completely missed (no predicted pixels at threshold 0.5)	11.2% (30/269)
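
The event-level and coverage rows above are all statistics over per-rosette pixel coverage. The sketch below shows one way to compute them. It assumes the ground truth is an integer instance mask with 0 as background, and that a pixel counts as predicted when its probability is at or above the threshold. The function names are illustrative and not taken from the released code.

```python
import numpy as np


def rosette_coverage(prob_map, instance_mask, threshold=0.5):
    """Map each rosette id to the fraction of its pixels that are predicted."""
    pred = prob_map >= threshold
    coverage = {}
    for rid in np.unique(instance_mask):
        if rid == 0:  # assumed background label
            continue
        region = instance_mask == rid
        coverage[int(rid)] = float(pred[region].mean())
    return coverage


def event_recall(coverage, min_fraction=0.0):
    """Fraction of rosettes whose coverage is strictly above min_fraction.

    min_fraction=0.0 corresponds to the ">= 1 px" criterion.
    """
    hits = sum(c > min_fraction for c in coverage.values())
    return hits / len(coverage)


# Toy example: rosette 1 is half covered, rosette 2 is missed entirely.
mask = np.zeros((4, 6), dtype=int)
mask[0:2, 0:2] = 1
mask[2:4, 3:6] = 2
prob = np.zeros((4, 6))
prob[0, 0:2] = 0.9

cov = rosette_coverage(prob, mask, threshold=0.5)
print(cov, event_recall(cov), event_recall(cov, min_fraction=0.8))
```

A "≥10%" criterion would need `>=` instead of the strict comparison used here.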

Note: Pixel-level recall (0.65) reflects boundary imprecision in detected rosettes, not missed detection events. Event-level recall (88.8%) is the operationally relevant metric for the human-in-the-loop workflow. Pixel-level metrics computed on full images using sliding window inference (512×512 patches, 256px overlap, threshold 0.5).
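
The sliding-window scheme mentioned in the note (512×512 patches, 256 px overlap, hence a stride of 256 px) can be sketched as below. The card does not say how overlapping predictions are merged, so plain averaging is an assumption here. `sliding_window_predict` and `predict_patch` are hypothetical names, and `predict_patch` stands in for a call to the trained network.

```python
import numpy as np


def sliding_window_predict(image, predict_patch, patch=512, stride=256):
    """Average overlapping patch predictions into a full-image probability map.

    image: (H, W, C) array with H and W at least `patch`.
    predict_patch: callable mapping a (patch, patch, C) array to a
        (patch, patch) probability map.
    """
    h, w = image.shape[:2]
    if h < patch or w < patch:
        raise ValueError("image must be at least one patch in size")
    ys = list(range(0, h - patch + 1, stride))
    xs = list(range(0, w - patch + 1, stride))
    # Add a final window flush with the border if the stride does not reach it.
    if ys[-1] != h - patch:
        ys.append(h - patch)
    if xs[-1] != w - patch:
        xs.append(w - patch)

    acc = np.zeros((h, w), dtype=np.float64)
    count = np.zeros((h, w), dtype=np.float64)
    for y in ys:
        for x in xs:
            tile = image[y:y + patch, x:x + patch]
            acc[y:y + patch, x:x + patch] += predict_patch(tile)
            count[y:y + patch, x:x + patch] += 1
    return acc / count


# Usage with a dummy predictor in place of the model.
dummy = lambda tile: np.full(tile.shape[:2], 0.7)
prob = sliding_window_predict(np.zeros((1024, 1024, 2), dtype=np.float32), dummy)
binary = prob >= 0.5  # threshold used for the pixel-level metrics
```

For a 1024×1024 image this visits a 3×3 grid of windows, so interior pixels are averaged over up to four predictions.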

Architecture

4-level encoder-decoder (U-Net)
Additive attention gates at 3 upsampling junctions
Feature maps: 64 → 128 → 256 → 512 → 1024 (bottleneck)
Final layer: 1×1 convolution + Sigmoid
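
For readers unfamiliar with additive attention gates, the block below is a simplified PyTorch sketch in the style of Oktay et al. (2018). It is not the released implementation: it assumes the gating signal has already been upsampled to the spatial size of the skip features, and the class name and channel counts are illustrative.

```python
import torch
import torch.nn as nn


class AttentionGate(nn.Module):
    """Simplified additive attention gate for a U-Net skip connection."""

    def __init__(self, gate_ch, skip_ch, inter_ch):
        super().__init__()
        self.w_g = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)
        self.w_x = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)
        self.relu = nn.ReLU()

    def forward(self, g, x):
        # g: decoder (gating) features, x: encoder skip features,
        # both assumed to share the same spatial size.
        alpha = torch.sigmoid(self.psi(self.relu(self.w_g(g) + self.w_x(x))))
        # alpha has one channel and is broadcast over all skip channels.
        return x * alpha


# Usage with illustrative channel counts and a small spatial size.
torch.manual_seed(0)
gate = AttentionGate(gate_ch=512, skip_ch=512, inter_ch=256)
g = torch.randn(1, 512, 8, 8)
skip = torch.randn(1, 512, 8, 8)
out = gate(g, skip)
print(out.shape)
```

The gate rescales each spatial location of the skip features by a coefficient between 0 and 1 before they are concatenated into the decoder.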

Training Configuration

Parameter	Value
Loss	0.5× BCE + 0.5× Dice
Optimizer	AdamW
Learning rate	1×10⁻⁴
Batch size	4
Early stopping	Patience 15 epochs (val loss)
Input patch size	512×512
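
The combined loss and optimizer settings in the table can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions: the Dice term is a soft Dice computed per sample with a smoothing constant `eps` (not specified in the card), AdamW uses its default weight decay, and the tiny `model` is only a stand-in for the Attention U-Net.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def bce_dice_loss(probs, target, eps=1.0):
    """0.5 * BCE + 0.5 * soft Dice loss on sigmoid probabilities (N, 1, H, W)."""
    bce = F.binary_cross_entropy(probs, target)
    dims = (1, 2, 3)
    inter = (probs * target).sum(dim=dims)
    denom = probs.sum(dim=dims) + target.sum(dim=dims)
    dice = (2 * inter + eps) / (denom + eps)  # eps is an assumed smoothing term
    return 0.5 * bce + 0.5 * (1 - dice).mean()


# Stand-in model ending in a sigmoid, like the card's final layer.
torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(2, 1, kernel_size=1), nn.Sigmoid())
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One training step on a random batch of 4 small two-channel patches.
inputs = torch.rand(4, 2, 64, 64)
targets = (torch.rand(4, 1, 64, 64) > 0.5).float()

optimizer.zero_grad()
loss = bce_dice_loss(model(inputs), targets)
loss.backward()
optimizer.step()
```

Because the network already ends in a sigmoid, the sketch uses `binary_cross_entropy` on probabilities rather than `binary_cross_entropy_with_logits`.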

Data Augmentation

Random rotation (p=0.5), elastic deformation (α=120, σ=6, p=0.4), affine transforms (p=0.6), coarse dropout (p=0.3) via Albumentations.

Limitations

Trained exclusively on a single laboratory's images (single instrument, single organism, single staining protocol)
Generalisation to other imaging setups not evaluated
11.2% of rosette events receive no predicted pixels at threshold 0.5, so expert review of the full image is required
Validation set was also used for early stopping (standard practice); the model was never trained on validation images

Hardware

Apple MacBook Pro M2 Max (64 GB unified memory), PyTorch MPS backend. Training: a few hours. Inference: <1 min/image.
