
# 🫧 Microbubble Distillation Pipeline (v3 - Fixed)

**Cellpose-SAM-FT → Pseudo-labels → TinyBubbleNet**

A 3-stage pipeline for fast, lightweight microbubble sizing and counting via knowledge distillation.

> ⚠️ **IMPORTANT BUG FIX:** The original `train_student.py` in this repo uses `BCEWithLogitsLoss` on binary foreground/background masks. This fails catastrophically: microbubble foreground is only ~0.2% of pixels, so the model learns to predict ALL background and achieves 99.8% accuracy while detecting zero bubbles. The fixed script `train_mse_distill.py` instead uses MSE distillation on the teacher's raw `cell_prob` logits, which gives gradients on ALL pixels (background pixels have informative negative logits around -6). See `train_mse_distill.py` for the corrected implementation.

## The Problem

Cellpose-SAM is excellent for cell/bubble segmentation, but at ~300M params (1.1 GB) it's expensive at inference. For lab settings where your slides look similar and you're "just detecting circles", this is massive overkill. You're paying for the ability to also segment dogs, neurons, and a thousand other things: capacity you don't need.

## The Solution: Distill Into a Tiny Specialist

| Model | Params | Size | 256×256 GPU | 256×256 CPU | FPS (GPU) |
| --- | --- | --- | --- | --- | --- |
| Cellpose-SAM | ~300M | 1.1 GB | ~100 ms | seconds | ~10 |
| TinyBubbleNet (`base_ch=16`) | 389K | 1.5 MB | 3 ms | 45 ms | 337 |
| TinyBubbleNet (`base_ch=32`) | 1.5M | 5.8 MB | ~5 ms | ~80 ms | ~200 |

~33× faster, ~750× smaller. And when your domain is narrow (similar-looking lab slides), the accuracy loss is minimal because the student only needs to learn one visual distribution.

## Architecture

TinyBubbleNet is a depthwise-separable U-Net (inspired by PicoSAM2) with a 4-channel output:

| Channel | Name | What it encodes |
| --- | --- | --- |
| 0 | `dY` | Vertical gradient flow (Cellpose-compatible) |
| 1 | `dX` | Horizontal gradient flow (Cellpose-compatible) |
| 2 | `cell_prob` | Foreground/background probability |
| 3 | `dist_transform` | Distance transform (peak = bubble radius) |

Instance masks are reconstructed via Euler integration of the flow field (identical to Cellpose post-processing). This means the student is fully compatible with the Cellpose ecosystem.

The distance transform head is the key addition for sizing: the peak value within each detected instance directly gives you the bubble radius.
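The "DT peak = radius" property is easy to sanity-check on a synthetic disk with `scipy.ndimage` (a toy illustration, not part of the pipeline):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Synthetic bubble: a filled disk of radius 10 px in a 64x64 frame.
r = 10
yy, xx = np.mgrid[0:64, 0:64]
disk = ((yy - 32) ** 2 + (xx - 32) ** 2) <= r ** 2

# Distance transform: each foreground pixel -> distance to nearest background pixel.
dt = distance_transform_edt(disk)

# For a (near-)circular instance, the peak of the DT sits at the center
# and equals the radius, up to pixel discretization.
print(f"DT peak = {dt.max():.2f}  (true radius = {r})")
```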

## The Bug and The Fix

### The Bug (original `train_student.py` / `losses.py`)

```python
# BAD: BCE on binary masks
prob_loss = nn.BCEWithLogitsLoss()(pred_prob, binary_mask)
```

With foreground at only ~0.2% of pixels, the model's dominant gradient signal is "predict all background". Even after 300 epochs with "best val loss 0.0008", the model predicts zero bubbles everywhere.
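The accuracy trap is easy to reproduce on synthetic numbers (a toy illustration, not the repo's data):

```python
import numpy as np

# Synthetic frame with ~0.2% foreground, mimicking the class imbalance.
mask = np.zeros((256, 256), dtype=np.uint8)
mask[100:106, 100:122] = 1          # 132 fg pixels out of 65,536 (~0.2%)

pred = np.zeros_like(mask)          # degenerate model: predict all background
acc = (pred == mask).mean()
print(f"pixel accuracy = {acc:.3%}, bubbles detected = 0")
```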

### The Fix (`train_mse_distill.py`)

```python
# GOOD: MSE on the teacher's raw logits
prob_loss = nn.MSELoss()(pred_prob_logits, teacher_cell_prob_logits)
```

The teacher outputs `cell_prob` as raw logits (range roughly -9 to +5). Every pixel has an informative target: background pixels should reproduce ~-6, foreground pixels ~+5. MSE on logits gives strong gradients everywhere, and the student successfully learns to segment bubbles.

### Why this works

| Loss | Target | Gradient on bg pixels? | Result |
| --- | --- | --- | --- |
| BCE + binary mask | {0, 1} | No (bg is "correct" at 0) | Predicts all background |
| MSE + teacher logits | Real numbers (~-6 to +5) | Yes (bg must match ~-6) | Learns proper segmentation |
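The asymmetry can be made concrete with a small numpy comparison, using the approximate teacher logit values quoted above (synthetic tensors, not real model outputs):

```python
import numpy as np

def bce_with_logits(z, t):
    # Numerically stable binary cross-entropy on logits (mean over pixels).
    return np.mean(np.maximum(z, 0) - z * t + np.log1p(np.exp(-np.abs(z))))

mask = np.zeros((256, 256))
mask[100:106, 100:122] = 1.0                 # ~0.2% foreground

student = np.full((256, 256), -6.0)          # degenerate "all background" output
teacher = np.where(mask > 0, 5.0, -6.0)      # teacher-style logits

bce = bce_with_logits(student, mask)         # tiny: near a bad local minimum
mse = np.mean((student - teacher) ** 2)      # much larger: strong pull away
print(f"BCE vs binary mask:    {bce:.4f}")
print(f"MSE vs teacher logits: {mse:.4f}")
```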

## Pipeline Overview

```
┌─────────────────────┐     ┌──────────────────────────┐     ┌──────────────────────┐
│  Stage 1: Teacher   │     │  Stage 2: Distillation   │     │  Stage 3: Inference  │
│                     │     │                          │     │                      │
│  Cellpose-SAM-FT    │────▶│  Train TinyBubbleNet     │────▶│  Fast bubble sizing  │
│  generates pseudo-  │     │  on raw teacher logits   │     │  (~3 ms/image GPU)   │
│  labels on 100s of  │     │  (~400 epochs)           │     │                      │
│  lab images         │     │                          │     │                      │
└─────────────────────┘     └──────────────────────────┘     └──────────────────────┘
```

## Quick Start

### Install

```bash
pip install cellpose torch torchvision scipy scikit-image huggingface_hub numpy
```

### Stage 1: Generate Pseudo-labels

```bash
python generate_pseudolabels.py \
    --image_dir /path/to/lab_images/ \
    --model_path /path/to/your/cellpose_sam_ft_model \
    --output_dir /path/to/pseudolabels/ \
    --diameter 30 \
    --channels 0 0
```

This runs your fine-tuned Cellpose-SAM on all images and saves:

- Instance masks, flow fields, distance transforms (`.npy`)
- Bubble statistics (count, size distribution)

### Stage 2: Train Student (FIXED - use this!)

```bash
# ✅ FIXED: uses MSE distillation on teacher logits
python train_mse_distill.py \
    --image_dir /path/to/lab_images/ \
    --label_dir /path/to/pseudolabels/ \
    --output_dir ./checkpoints/ \
    --base_ch 16 \
    --epochs 400 \
    --batch_size 4 \
    --lr 1e-3

# ❌ DEPRECATED (has the class-imbalance bug)
# python train_student.py ...
```

The fixed script (`train_mse_distill.py`):

- Uses MSE distillation on all 4 output channels (including the teacher's raw `cell_prob` logits)
- Uses a `RawPseudoLabelDataset` that loads teacher logits directly (no binarization)
- Achieves proper foreground segmentation instead of all-background predictions
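Stripped of bookkeeping, the Stage-2 objective boils down to a plain MSE over all four raw channels. A rough sketch on random tensors (variable names here are illustrative, not the script's actual API):

```python
import numpy as np

# Student and teacher both emit 4 channels per pixel:
# dY, dX, cell_prob logits, dist_transform (C, H, W).
rng = np.random.default_rng(0)
student = rng.normal(size=(4, 256, 256))
teacher = rng.normal(size=(4, 256, 256))   # raw teacher outputs, no binarization

loss = ((student - teacher) ** 2).mean()           # MSE over all 4 channels
grad = 2.0 * (student - teacher) / student.size    # dLoss/dStudent

# Unlike BCE on a sparse binary mask, essentially every pixel carries signal.
print(f"loss = {loss:.4f}, pixels with nonzero gradient: {(grad != 0).mean():.1%}")
```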

### Stage 3: Fast Inference

```bash
python inference.py \
    --model_path ./checkpoints/best_model.pt \
    --image_path /path/to/image.png \
    --output_dir ./results/
```
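Under the hood, instances are recovered from the flow channels by Euler integration, as described in the Architecture section. A minimal self-contained sketch of that idea on synthetic flows (a toy re-implementation, not Cellpose's or this repo's actual code):

```python
import numpy as np

def follow_flows(dy, dx, fg, n_iter=50):
    """Euler-integrate each foreground pixel along the flow field (dy, dx);
    pixels whose trajectories converge to the same sink form one instance."""
    H, W = fg.shape
    ys, xs = np.nonzero(fg)
    py, px = ys.astype(float), xs.astype(float)
    for _ in range(n_iter):
        iy = np.clip(np.round(py).astype(int), 0, H - 1)
        ix = np.clip(np.round(px).astype(int), 0, W - 1)
        py = np.clip(py + dy[iy, ix], 0, H - 1)
        px = np.clip(px + dx[iy, ix], 0, W - 1)
    # Group pixels by their final (rounded) position: one sink = one instance.
    sinks = np.round(py).astype(int) * W + np.round(px).astype(int)
    _, labels = np.unique(sinks, return_inverse=True)
    out = np.zeros((H, W), dtype=int)
    out[ys, xs] = labels + 1
    return out

# Two synthetic "bubbles": flows step each pixel halfway toward its centre.
H = W = 64
yy, xx = np.mgrid[0:H, 0:W].astype(float)
dy, dx = np.zeros((H, W)), np.zeros((H, W))
fg = np.zeros((H, W), bool)
for cy, cx in [(16, 16), (48, 48)]:
    disk = (yy - cy) ** 2 + (xx - cx) ** 2 <= 8 ** 2
    fg |= disk
    dy[disk] = 0.5 * (cy - yy[disk])
    dx[disk] = 0.5 * (cx - xx[disk])

inst = follow_flows(dy, dx, fg)
print(f"instances found: {inst.max()}")
```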

## Files

| File | Description | Status |
| --- | --- | --- |
| `model.py` | TinyBubbleNet architecture (depthwise-separable U-Net) | ✅ |
| `losses.py` | Original distillation loss (BCE+Dice, has bug) | ⚠️ See `train_mse_distill.py` for fix |
| `dataset.py` | Original dataset (binary masks, has bug) | ⚠️ See `train_mse_distill.py` for fix |
| `train_student.py` | Original training (BCE-based, has bug) | ⚠️ Deprecated |
| `train_mse_distill.py` | Fixed training with MSE on teacher logits | ✅ Use this! |
| `generate_pseudolabels.py` | Stage 1: teacher → pseudo-labels | ✅ |
| `inference.py` | Stage 3: fast inference + bubble measurements | ✅ |

## Model Variants

| `base_ch` | Params | Size | GPU Speed | Use Case |
| --- | --- | --- | --- | --- |
| 16 | 389K | 1.5 MB | 3 ms @ 256² | Default: fast & tiny |
| 32 | 1.5M | 5.8 MB | 5 ms @ 256² | More capacity if needed |

Use `--no_depthwise` for standard convolutions (more params, possibly better accuracy on complex images).

## Key Design Decisions

1. **Why Cellpose flows instead of direct mask prediction?** Flows handle overlapping/touching bubbles via convergence: each pixel flows toward its instance center. Direct mask prediction can't separate touching instances.

2. **Why a distance-transform head?** For circles, the DT peak equals the radius. This gives you sizing "for free" without post-processing the mask.

3. **Why depthwise-separable convs?** ~8× fewer params than standard convs. For a narrow domain (your lab slides), this compression is essentially lossless.

4. **Why MSE on logits instead of BCE on masks?** See "The Bug and The Fix" section above. BCE on sparse binary masks fails due to extreme class imbalance; MSE on teacher logits gives gradients everywhere.
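The ~8× figure in decision 3 is easy to verify by counting weights. A quick check, ignoring biases, with illustrative channel counts:

```python
# Standard 3x3 conv vs depthwise-separable (3x3 depthwise + 1x1 pointwise),
# for C_in = C_out = 64 (channel counts here are just an example).
c_in, c_out, k = 64, 64, 3
standard = c_in * c_out * k * k            # full 3x3 kernel per channel pair
separable = c_in * k * k + c_in * c_out    # depthwise 3x3 + pointwise 1x1
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.1f}x")
```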

## When to Re-train

The student is specialized to your current lab setup. Re-train when:

- Microscope/camera settings change significantly
- Bubble preparation protocol changes
- Image resolution changes

Re-training is fast: ~30 min for 400 epochs on 50 images with a GPU.
