---
datasets:
- Elsafty
- Chula
- DSE
library_name: timm
license: cc-by-nc-4.0
pipeline_tag: image-feature-extraction
tags:
- red-blood-cells
- hematology
- medical-imaging
- vision-transformer
- dino
- dinov2
- feature-extraction
- foundation-model
model-index:
- name: RedDino-large
  results:
  - task:
      type: image-classification
      name: RBC Shape Classification
    dataset:
      name: Elsafty
      type: Elsafty
    metrics:
    - type: Weighted F1
      value: 88.5
    - type: Balanced Accuracy
      value: 89.1
    - type: Accuracy
      value: 88.4
  - task:
      type: image-classification
      name: RBC Shape Classification
    dataset:
      name: Chula
      type: Chula
    metrics:
    - type: Weighted F1
      value: 83.9
    - type: Balanced Accuracy
      value: 79.0
    - type: Accuracy
      value: 85.0
  - task:
      type: image-classification
      name: RBC Shape Classification
    dataset:
      name: DSE
      type: DSE
    metrics:
    - type: Weighted F1
      value: 86.6
    - type: Balanced Accuracy
      value: 60.1
    - type: Accuracy
      value: 86.6
---
# RedDino: A Foundation Model for Red Blood Cell Analysis
**RedDino** is a self-supervised Vision Transformer foundation model specifically designed for **red blood cell (RBC)** image analysis, as presented in the paper [RedDino: A foundation model for red blood cell analysis](https://arxiv.org/abs/2508.08180).
It leverages a tailored version of the **DINOv2** framework, trained on a meticulously curated dataset of **1.25 million RBC images** from diverse acquisition modalities and sources. This model excels at extracting robust, general-purpose features for downstream hematology tasks such as **shape classification**, **morphological subtype recognition**, and **batch-effect–robust analysis**.
Unlike general-purpose models pretrained on natural images, RedDino incorporates hematology-specific augmentations, architectural tweaks, and RBC-tailored data preprocessing, enabling **state-of-the-art performance** on multiple RBC benchmarks.
> 🧠 Developed by [Luca Zedda](https://orcid.org/0009-0001-8488-1612), [Andrea Loddo](https://orcid.org/0000-0002-6571-3816), [Cecilia Di Ruberto](https://orcid.org/0000-0003-4641-0307), and [Carsten Marr](https://orcid.org/0000-0003-2154-4552)
> 🏥 University of Cagliari & Helmholtz Munich
> 📄 Preprint: [arXiv:2508.08180](https://arxiv.org/abs/2508.08180)
> 💻 Code: [https://github.com/Snarci/RedDino](https://github.com/Snarci/RedDino)
---
## Model Details
- **Architecture:** ViT-large, patch size 14
- **SSL framework:** DINOv2 (customized for RBC morphology)
- **Pretraining dataset:** Curated RBC images from 18 datasets (multiple modalities and sources)
- **Embedding size:** 1024
- **Intended use:** RBC morphology classification, feature extraction, batch-effect–robust analysis
Notes:
- RBC-specific training strategy: the KoLeo regularizer is removed and Sinkhorn-Knopp centering is adopted.
- Training on smear patches (not only single cells) to enhance cross-source generalization.
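For intuition, Sinkhorn-Knopp centering balances how often each prototype is used across a batch while keeping each sample's assignment a valid distribution. A minimal NumPy sketch of the generic normalization (illustrative only, not the exact DINOv2 implementation, which operates on teacher logits with temperature scaling):

```python
import numpy as np

def sinkhorn_knopp(logits: np.ndarray, n_iters: int = 3) -> np.ndarray:
    """Alternately normalize columns (prototype usage) and rows
    (per-sample assignments) of exp(logits)."""
    q = np.exp(logits)                        # positive (samples, prototypes)
    for _ in range(n_iters):
        q = q / q.sum(axis=0, keepdims=True)  # balance prototype usage
        q = q / q.sum(axis=1, keepdims=True)  # each row sums to 1
    return q

rng = np.random.default_rng(0)
q = sinkhorn_knopp(rng.normal(size=(8, 4)))
print(q.sum(axis=1))  # each row is a valid probability distribution
```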
## Example Usage
```python
from PIL import Image
from torchvision import transforms
import timm
import torch

# Load model from Hugging Face Hub
model = timm.create_model("hf_hub:Snarcy/RedDino-large", pretrained=True)
model.eval()
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Load and preprocess image (ImageNet normalization statistics)
image = Image.open("path/to/rbc_image.jpg").convert("RGB")
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
input_tensor = transform(image).unsqueeze(0).to(device)

# Extract a 1024-dimensional feature embedding
with torch.no_grad():
    embedding = model(input_tensor)
```
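The extracted embeddings can then be compared across images, for example with cosine similarity for retrieval or clustering. A minimal sketch using synthetic stand-in vectors (real embeddings would come from the model call above):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic vectors standing in for 1024-d RedDino-large embeddings
rng = np.random.default_rng(42)
emb_a = rng.normal(size=1024)
emb_b = emb_a + 0.1 * rng.normal(size=1024)  # a visually similar cell
emb_c = rng.normal(size=1024)                # an unrelated image

print(cosine_similarity(emb_a, emb_b))  # close to 1
print(cosine_similarity(emb_a, emb_c))  # near 0
```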
## Model Variants
RedDino comes in three sizes to suit different computational requirements and performance needs:
| Model Variant | Embedding Size | Parameters | Usage |
|---------------|----------------|------------|--------|
| **RedDino-small** | 384 | 22M | `timm.create_model("hf_hub:Snarcy/RedDino-small", pretrained=True)` |
| **RedDino-base** | 768 | 86M | `timm.create_model("hf_hub:Snarcy/RedDino-base", pretrained=True)` |
| **RedDino-large** | 1024 | 304M | `timm.create_model("hf_hub:Snarcy/RedDino-large", pretrained=True)` |
Choose the variant that best fits your computational budget and performance requirements. Larger models generally provide richer feature representations at the cost of increased computational overhead.
---
## Benchmark Results
RedDino was benchmarked on major RBC classification datasets—including Elsafty, Chula, and DSE—outperforming state-of-the-art baselines such as ResNet50, DinoBloom, and DINOv2.
| Model | Dataset | Linear probing wF1 | 1-NN wF1 | 20-NN wF1 |
|-------------------|-----------|----------------------|------------|-------------|
| ResNet50 | Elsafty | 77.6 ± 8.1 | 64.3 ± 4.8 | 66.2 ± 4.9 |
| DinoBloom-S | Elsafty | 83.2 ± 8.2 | 73.1 ± 5.1 | 76.5 ± 4.2 |
| DINOv2 (small) | Elsafty | 82.1 ± 8.2 | 73.5 ± 4.8 | 77.2 ± 4.6 |
| RedDino small | Elsafty | 86.0 ± 7.0 | 76.8 ± 4.9 | 80.0 ± 4.5 |
| RedDino base | Elsafty | 88.1 ± 4.9 | 78.8 ± 3.6 | 82.6 ± 2.8 |
| RedDino large | Elsafty | 88.5 ± 5.5 | 78.5 ± 4.6 | 81.6 ± 4.7 |
On Chula and DSE datasets, RedDino consistently surpassed all other models in feature quality (linear probing) with average improvements of 2–4% over prior approaches in key metrics.
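The weighted F1 (wF1) reported above averages per-class F1 scores weighted by class support, so frequent classes contribute more. A minimal NumPy sketch (in practice `sklearn.metrics.f1_score(..., average="weighted")` computes the same quantity):

```python
import numpy as np

def weighted_f1(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Per-class F1 averaged with weights proportional to class support."""
    classes, support = np.unique(y_true, return_counts=True)
    f1s = []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return float(np.average(f1s, weights=support))

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])
print(weighted_f1(y_true, y_pred))  # 11/15 ≈ 0.733
```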
---
## Highlights
- **Foundation model** for RBC analysis trained on the largest available multi-source RBC image set: 1.25M+ images, using advanced CellPose-based instance segmentation and patch extraction.
- **DINOv2-based self-supervised learning** for label-efficient pretraining and robust, transferable features.
- **Model architecture and key innovations**:
- Patch-based training (224×224 px) shown to outperform single-cell training.
- Novel data augmentation via Albumentations (32 pixel-level strategies).
- Removal of the KoLeo regularizer and adoption of Sinkhorn-Knopp centering for improved representation in RBC-specific domains.
- Suite of models (small, base, large) covering 22M–304M parameters.
- **Generalization**: Strong adaptation across varied protocols, microscopes, and imaging sites. Demonstrated resistance to batch effects and out-of-domain variance.
- **Interpretability tools**: PCA/UMAP visualizations reveal clustering by phenotype and batch, distinguishing abnormal cells (e.g., malaria, echinocytes).
- **Easy deployment**: Models and code are available on [GitHub](https://github.com/Snarci/RedDino) and [Hugging Face](https://huggingface.co/collections/Snarcy/reddino-689a13e29241d2e5690202fc).
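The PCA projections mentioned above can be reproduced on any embedding matrix; a minimal SVD-based sketch using a synthetic stand-in for a RedDino feature matrix (UMAP would additionally require the `umap-learn` package):

```python
import numpy as np

def pca_project(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project embeddings onto their top principal components via SVD."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Stand-in for an (n_cells, 1024) RedDino-large embedding matrix
rng = np.random.default_rng(0)
coords = pca_project(rng.normal(size=(100, 1024)))
print(coords.shape)  # 2-D coordinates, ready for a scatter plot
```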
---
## 📝 Citation
If you use this model, please cite the following paper:
**RedDino: A foundation model for red blood cell analysis**
Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Carsten Marr — 2025
Preprint: arXiv:2508.08180. https://arxiv.org/abs/2508.08180
```bibtex
@misc{zedda2025reddinofoundationmodelred,
title={RedDino: A foundation model for red blood cell analysis},
author={Luca Zedda and Andrea Loddo and Cecilia Di Ruberto and Carsten Marr},
year={2025},
eprint={2508.08180},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.08180},
}
```
---
## Summary
RedDino is the first family of foundation models tailored for comprehensive red blood cell image analysis, using large-scale self-supervised learning to set new performance benchmarks and generalization standards for computational hematology. Models and pretrained weights are available for research and practical deployment.