Planktoscope Phytoplankton Classifier
A 22-class phytoplankton image classifier for Planktoscope ROIs from the Santa Cruz Municipal Wharf timeseries.
Architecture
A frozen DINOv2 ViT-S/14 backbone (timm: vit_small_patch14_dinov2.lvd142m)
produces a 384-d embedding, and a single linear layer (linear probe) maps it
to 22 classes. Only the linear head is trained; the backbone is unchanged.
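As a rough illustration of what the linear probe computes, the sketch below applies a 22×384 weight matrix and a bias vector to a single 384-d embedding. It is a pure-Python stand-in rather than the released head: the random weights, the presence of a bias term, and the helper names `make_linear_head` and `linear_head` are assumptions for illustration, and the embedding is a placeholder instead of a real backbone output.

```python
import random

EMBED_DIM = 384   # embedding size of the DINOv2 ViT-S/14 backbone (from the text)
NUM_CLASSES = 22  # number of output classes (from the text)


def make_linear_head(embed_dim, num_classes, seed=0):
    """Create small random weights and zero biases (illustrative values only)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.02) for _ in range(embed_dim)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    return weights, biases


def linear_head(embedding, weights, biases):
    """Map one embedding to one logit per class: logits = W @ embedding + b."""
    return [sum(w * x for w, x in zip(row, embedding)) + b
            for row, b in zip(weights, biases)]


weights, biases = make_linear_head(EMBED_DIM, NUM_CLASSES)
embedding = [0.0] * EMBED_DIM  # placeholder for a real backbone embedding
logits = linear_head(embedding, weights, biases)

# Trainable parameter count, assuming the head includes a bias term.
n_trainable = EMBED_DIM * NUM_CLASSES + NUM_CLASSES
print(len(logits), n_trainable)  # 22 8470
```

Under that assumption the head has only 8,470 trainable parameters, which is why the probe can be trained on relatively few labelled ROIs.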
Preprocessing (must match exactly)
- Pad to square with fill (255, 255, 255) (light, matching the ROI background); this preserves aspect ratio so chains are not distorted.
- Resize to 224×224.
- Normalize with ImageNet mean [0.485, 0.456, 0.406] / std [0.229, 0.224, 0.225].
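The padding and normalization steps can be sketched in plain Python as below. This is an illustration rather than the code in inference_example.py. Two details are assumptions: the ROI is centered on the padded canvas (the text only says "pad to square"), and pixel values are scaled to 0-1 before the mean/std are applied, which the 0-1 range of the mean values suggests. The helper names are hypothetical, and the resize to 224×224 is left to an image library.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)
FILL = (255, 255, 255)  # light fill matching the ROI background


def pad_to_square(rows, fill=FILL):
    """Pad an image (list of rows of RGB tuples) onto a square canvas.

    Centering the image is an assumption made for this sketch.
    """
    height, width = len(rows), len(rows[0])
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    canvas = [[fill] * side for _ in range(side)]
    for y, row in enumerate(rows):
        canvas[top + y][left:left + width] = row
    return canvas


def normalize_pixel(rgb):
    """Scale 0-255 values to 0-1, then apply the ImageNet mean/std per channel."""
    return tuple((v / 255.0 - m) / s
                 for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


# A 2x4 dark "chain" becomes a 4x4 square with one fill row above and below.
chain = [[(10, 10, 10)] * 4 for _ in range(2)]
square = pad_to_square(chain)
white_normalized = normalize_pixel(FILL)
```

Padding before resizing is what keeps a long chain from being stretched: the resize then scales both axes by the same factor.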
Classes
Akashiwo, Asterionella, Centric, Cerataulina_Guinardia_Dactyliosen, Chaetoceros, Detritus, Dinophysis, Eucampia, Lioloma, Margalefidinium, Pennate, Phaeocystis, Pleurosigma, Polykrikos, Protoperidinium, Pseudo-nitzschia, Rhizosolenia, Thalassionema, Tiarina, Tintinnid, Tripos, Zooplankton
Usage
See inference_example.py (self-contained; does not require the training repo).
Treat predictions below recommended_min_confidence (0.7) as Unassigned.
This matters most when running over raw samples that contain detritus and
unfamiliar particles.
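A minimal sketch of the thresholding rule, assuming that "confidence" means the maximum softmax probability over the class logits (the text does not define it). The function names and the three-class toy example are hypothetical; the real model outputs 22 logits.

```python
import math

RECOMMENDED_MIN_CONFIDENCE = 0.7  # value quoted in the text


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assign_label(logits, class_names, min_confidence=RECOMMENDED_MIN_CONFIDENCE):
    """Return (label, confidence); the label is "Unassigned" below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < min_confidence:
        return "Unassigned", confidence
    return class_names[best], confidence


# Toy example with three of the 22 class names.
names = ["Akashiwo", "Chaetoceros", "Detritus"]
confident = assign_label([5.0, 0.0, 0.0], names)   # one logit dominates
ambiguous = assign_label([1.0, 0.9, 0.8], names)   # no clear winner
print(confident[0], ambiguous[0])  # Akashiwo Unassigned
```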
Per-class performance (held-out test)
| class | support | precision | recall | f1 |
|---|---|---|---|---|
| Dinophysis | 10 | 1.0 | 1.0 | 1.0 |
| Phaeocystis | 160 | 0.964 | 1.0 | 0.982 |
| Thalassionema | 55 | 0.982 | 0.982 | 0.982 |
| Tripos | 25 | 0.962 | 1.0 | 0.98 |
| Tintinnid | 142 | 0.959 | 0.993 | 0.976 |
| Akashiwo | 270 | 0.989 | 0.963 | 0.976 |
| Zooplankton | 19 | 1.0 | 0.947 | 0.973 |
| Chaetoceros | 564 | 0.97 | 0.968 | 0.969 |
| Detritus | 140 | 0.985 | 0.95 | 0.967 |
| Cerataulina_Guinardia_Dactyliosen | 244 | 0.963 | 0.967 | 0.965 |
| Margalefidinium | 78 | 0.928 | 0.987 | 0.957 |
| Tiarina | 33 | 0.917 | 1.0 | 0.957 |
| Asterionella | 61 | 0.951 | 0.951 | 0.951 |
| Centric | 8 | 1.0 | 0.875 | 0.933 |
| Pseudo-nitzschia | 339 | 0.944 | 0.903 | 0.923 |
| Lioloma | 99 | 0.873 | 0.97 | 0.919 |
| Polykrikos | 11 | 0.846 | 1.0 | 0.917 |
| Protoperidinium | 10 | 0.833 | 1.0 | 0.909 |
| Eucampia | 12 | 0.846 | 0.917 | 0.88 |
| Rhizosolenia | 92 | 0.907 | 0.848 | 0.876 |
| Pennate | 16 | 0.786 | 0.688 | 0.733 |
| Pleurosigma | 8 | 0.667 | 0.75 | 0.706 |
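The f1 column is the harmonic mean of precision and recall. The short check below recomputes it for two rows from the rounded table values. Because the table appears to have been computed from unrounded precision and recall, recomputing other rows this way can differ in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision/recall values copied from the table above.
pleurosigma_f1 = round(f1_score(0.667, 0.75), 3)
lioloma_f1 = round(f1_score(0.873, 0.97), 3)
print(pleurosigma_f1, lioloma_f1)  # 0.706 0.919
```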
Training data
patcdaniel/planktoscope-phytoplankton: expert-verified ROIs curated via deep-feature clustering and DINOv2 embedding-similarity mining. This model predicts the 22 classes with ≥50 training images; the dataset additionally includes rarer curated classes not yet covered by the model.
Caveats
- Abundances derived from this model are counts of particles, not biovolume: a single cell and a long chain each count once.
- A few low-support classes remain weak, notably Pennate (F1 0.733) and Pleurosigma (F1 0.706); see the per-class table.
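To make the first caveat concrete, the sketch below tallies per-class abundance from a list of predicted labels. Each ROI contributes exactly one count regardless of how many cells it contains. The label list is made up for illustration, and `particle_counts` is a hypothetical helper.

```python
from collections import Counter


def particle_counts(labels):
    """Count ROIs per predicted label; one ROI is one particle."""
    return Counter(labels)


# Made-up predictions: a 20-cell Chaetoceros chain and a single cell
# would each appear here as one "Chaetoceros" entry.
labels = ["Chaetoceros", "Chaetoceros", "Unassigned", "Akashiwo", "Chaetoceros"]
counts = particle_counts(labels)
print(counts["Chaetoceros"], counts["Akashiwo"], counts["Unassigned"])  # 3 1 1
```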
License
cc-by-4.0