---
tags:
  - vision
  - dinov2
  - hematology
  - cytomorphology
  - foundation-model
license: apache-2.0
citation: |
  @inproceedings{koch2024dinobloom,
    title={DinoBloom: a foundation model for generalizable cell embeddings in hematology},
    author={Koch, Valentin and Wagner, Sophia J and Kazeminia, Salome and Sancar, Ece and Hehr, Matthias and Schnabel, Julia A and Peng, Tingying and Marr, Carsten},
    booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
    pages={520--530},
    year={2024},
    organization={Springer}
  }
---

# DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology

<div align="center">
  <img src="https://raw.githubusercontent.com/MarrLab/DinoBloom/main/media/logo.png" width="160" alt="DinoBloom logo"/>
  <br><br>

  **DinoBloom** builds upon [DINOv2](https://arxiv.org/abs/2304.07193) (Meta AI) and is trained on **13 diverse publicly available datasets** of single cells from peripheral blood and bone marrow.

  <br>
  <a href="https://arxiv.org/abs/2404.05022">πŸ“„ Paper</a> β€’
  <a href="https://github.com/MarrLab/DinoBloom">πŸ’» GitHub</a> β€’
  <a href="https://zenodo.org/records/10908163">πŸ“¦ Zenodo</a>
</div>

---

## 🧠 Model Variants

DinoBloom is available in **four sizes**:

| Model | Feature Dim | Parameters | Checkpoint |
|-------|-------------|------------|------------|
| **DinoBloom-S** | 384 | 22M | `pytorch_model_s.bin` |
| **DinoBloom-B** | 768 | 86M | `pytorch_model_b.bin` |
| **DinoBloom-L** | 1024 | 304M | `pytorch_model_l.bin` |
| **DinoBloom-G** | 1536 | 1136M | `pytorch_model_g.bin` |

---

## πŸš€ Usage

```python
from huggingface_hub import hf_hub_download
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Choose variant: "s", "b", "l", or "g"
variant = "b"

# Configuration
variant_config = {
    "s": ("dinov2_vits14", 384),
    "b": ("dinov2_vitb14", 768),
    "l": ("dinov2_vitl14", 1024),
    "g": ("dinov2_vitg14", 1536),
}

dinov2_model, embed_dim = variant_config[variant]

# Load base DINOv2 model
model = torch.hub.load("facebookresearch/dinov2", dinov2_model)

# Download DinoBloom weights
ckpt_path = hf_hub_download(
    repo_id="MarrLab/DinoBloom",
    filename=f"pytorch_model_{variant}.bin"
)
ckpt = torch.load(ckpt_path, map_location="cpu")

# 224x224 input with 14x14 patches -> 256 patch tokens + 1 CLS token = 257
num_tokens = int(1 + (224 / 14) ** 2)
model.pos_embed = nn.Parameter(torch.zeros(1, num_tokens, embed_dim))
model.load_state_dict(ckpt, strict=True)
model.to(device)
model.eval()

# Get transforms
from torchvision import transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Apply to image
from PIL import Image
img = Image.open("path/to/cell_image").convert("RGB")  # ensure a 3-channel input
img_tensor = transform(img).unsqueeze(0).to(device)

# Get features
with torch.no_grad():
    features = model(img_tensor)

print(f"Features shape: {features.shape}")  # [1, 768] for DinoBloom-B
```
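
To embed many cells at once, the single-image snippet above can be wrapped in a small batched loop. The sketch below is illustrative only: it reuses the `model`, `transform`, and `device` objects defined above, and the `extract_embeddings` helper and `cell_images/` directory of PNG crops are hypothetical names; adapt the glob pattern and batch size to your data.

```python
from pathlib import Path
from PIL import Image
import torch

@torch.no_grad()
def extract_embeddings(image_dir, model, transform, device, batch_size=64):
    """Embed every PNG in a directory and return an (N, embed_dim) tensor plus the file paths."""
    paths = sorted(Path(image_dir).glob("*.png"))
    feats = []
    for i in range(0, len(paths), batch_size):
        # Load and preprocess one batch of cell crops
        batch = [transform(Image.open(p).convert("RGB")) for p in paths[i:i + batch_size]]
        batch = torch.stack(batch).to(device)
        feats.append(model(batch).cpu())  # DINOv2 hub models return the CLS embedding
    return torch.cat(feats), paths

# embeddings, image_paths = extract_embeddings("cell_images", model, transform, device)
```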

---

## πŸ“Š Model Performance

DinoBloom outperforms existing medical and non-medical vision models in two evaluation settings:

1. **Linear probing** and **k-nearest neighbor (kNN)** evaluation for cell-type classification
2. **Weakly supervised multiple-instance learning (MIL)** for acute myeloid leukemia (AML) subtyping

See our [paper](https://arxiv.org/abs/2404.05022) for detailed benchmarks.
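
As a rough illustration of the first setting, the sketch below fits a linear probe and a kNN classifier on frozen DinoBloom embeddings with scikit-learn. It assumes features and integer cell-type labels have already been extracted (e.g. with a loop like the one above); the choice of `n_neighbors=20` and weighted F1 here is arbitrary, and the exact protocol, splits, and metrics used in the paper may differ.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import f1_score

def evaluate_frozen_features(X_train, y_train, X_test, y_test):
    """X_*: (N, embed_dim) frozen DinoBloom features; y_*: cell-type labels."""
    # Linear probe: a single logistic-regression layer on top of frozen embeddings
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    # kNN classification directly in embedding space
    knn = KNeighborsClassifier(n_neighbors=20).fit(X_train, y_train)
    return {
        "linear_probe_f1": f1_score(y_test, probe.predict(X_test), average="weighted"),
        "knn_f1": f1_score(y_test, knn.predict(X_test), average="weighted"),
    }
```

For the MIL benchmark, per-cell embeddings from a patient are aggregated into a single bag-level prediction (e.g. with an attention-based pooling model); see the paper and the GitHub repository for the exact setup.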

---

## πŸ”§ Requirements

```bash
pip install torch torchvision huggingface_hub
```

---

## πŸ“š Citation

If you use DinoBloom in your research, please cite:

```bibtex
@inproceedings{koch2024dinobloom,
  title={DinoBloom: a foundation model for generalizable cell embeddings in hematology},
  author={Koch, Valentin and Wagner, Sophia J and Kazeminia, Salome and Sancar, Ece and Hehr, Matthias and Schnabel, Julia A and Peng, Tingying and Marr, Carsten},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={520--530},
  year={2024},
  organization={Springer}
}
```

---

## πŸ“– Related Work

DinoBloom builds upon:
- [DINOv2](https://arxiv.org/abs/2304.07193) - Self-supervised vision transformers
- [Original DinoBloom Paper](https://arxiv.org/abs/2404.05022) - MICCAI 2024

---

## πŸ“„ License

Apache 2.0. See the [LICENSE](LICENSE) file for details.

---


For questions or issues, please open an issue on [GitHub](https://github.com/MarrLab/DinoBloom) or contact the authors.