---
tags:
- vision
- dinov2
- hematology
- cytomorphology
- foundation-model
license: apache-2.0
citation: |
  @inproceedings{koch2024dinobloom,
    title={DinoBloom: a foundation model for generalizable cell embeddings in hematology},
    author={Koch, Valentin and Wagner, Sophia J and Kazeminia, Salome and Sancar, Ece and Hehr, Matthias and Schnabel, Julia A and Peng, Tingying and Marr, Carsten},
    booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
    pages={520--530},
    year={2024},
    organization={Springer}
  }
---

# DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology

<div align="center">
<img src="https://raw.githubusercontent.com/MarrLab/DinoBloom/main/media/logo.png" width="160" alt="DinoBloom logo"/>
<br><br>

**DinoBloom** builds upon [DINOv2](https://arxiv.org/abs/2304.07193) (Meta AI) and is trained on **13 diverse publicly available datasets** of single cells from peripheral blood and bone marrow.

<br>
<a href="https://arxiv.org/abs/2404.05022">Paper</a> •
<a href="https://github.com/MarrLab/DinoBloom">GitHub</a> •
<a href="https://zenodo.org/records/10908163">Zenodo</a>
</div>

---

## Model Variants

DinoBloom is available in **four sizes**:

| Model | Feature Dim | Parameters | Checkpoint |
|-------|-------------|------------|------------|
| **DinoBloom-S** | 384 | 22M | `pytorch_model_s.bin` |
| **DinoBloom-B** | 768 | 86M | `pytorch_model_b.bin` |
| **DinoBloom-L** | 1024 | 304M | `pytorch_model_l.bin` |
| **DinoBloom-G** | 1536 | 1136M | `pytorch_model_g.bin` |

---

## Usage

```python
from huggingface_hub import hf_hub_download
from PIL import Image
import torch
import torch.nn as nn
from torchvision import transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Choose variant: "s", "b", "l", or "g"
variant = "b"

# Map each variant to its DINOv2 backbone name and embedding dimension
variant_config = {
    "s": ("dinov2_vits14", 384),
    "b": ("dinov2_vitb14", 768),
    "l": ("dinov2_vitl14", 1024),
    "g": ("dinov2_vitg14", 1536),
}
dinov2_model, embed_dim = variant_config[variant]

# Load base DINOv2 model
model = torch.hub.load("facebookresearch/dinov2", dinov2_model)

# Download DinoBloom weights
ckpt_path = hf_hub_download(
    repo_id="MarrLab/DinoBloom",
    filename=f"pytorch_model_{variant}.bin",
)
ckpt = torch.load(ckpt_path, map_location="cpu")

# Positional embedding for 224x224 inputs with patch size 14:
# 16 * 16 patch tokens + 1 class token. Resize it before loading
# so its shape matches the checkpoint.
num_tokens = 1 + (224 // 14) ** 2
model.pos_embed = nn.Parameter(torch.zeros(1, num_tokens, embed_dim))
model.load_state_dict(ckpt, strict=True)
model.to(device)
model.eval()

# Preprocessing: resize to 224x224 and normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load an image; convert to RGB so grayscale or RGBA files yield 3 channels
img = Image.open("path/to/cell_image").convert("RGB")
img_tensor = transform(img).unsqueeze(0).to(device)

# Extract features
with torch.no_grad():
    features = model(img_tensor)

print(f"Features shape: {features.shape}")  # [1, 768] for DinoBloom-B
```
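
The `num_tokens` value in the snippet above follows from simple patch arithmetic, which the short sketch below makes explicit. `num_position_tokens` is a hypothetical helper introduced only for illustration. The 518-pixel case assumes that the stock DINOv2 hub models ship a positional embedding sized for 518x518 inputs, which would explain why `pos_embed` has to be replaced before loading the DinoBloom checkpoint with `strict=True`.

```python
def num_position_tokens(image_size: int, patch_size: int = 14) -> int:
    """Return the number of ViT position tokens: one per patch plus one class token."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    return 1 + patches_per_side ** 2


print(num_position_tokens(224))  # 257, the size used in the usage snippet
print(num_position_tokens(518))  # 1370, assumed size of the stock DINOv2 embedding
```
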

---

## Model Performance

DinoBloom outperforms existing medical and non-medical vision models in:

1. **Linear probing** and **k-nearest neighbor** evaluations for cell-type classification
2. **Weakly supervised multiple-instance learning (MIL)** for acute myeloid leukemia subtyping

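
To give a sense of what a k-nearest neighbor evaluation on frozen embeddings looks like, here is a minimal, dependency-free sketch. It is not the evaluation protocol from the paper: the cosine similarity metric, the value of `k`, the helper names `cosine_similarity` and `knn_predict`, and the toy 3-dimensional vectors and labels are all assumptions chosen for illustration. In practice the vectors would be DinoBloom features and you would likely use PyTorch or scikit-learn instead of plain lists.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict(query, train_embeddings, train_labels, k=3):
    """Predict a label by majority vote over the k most similar training embeddings."""
    ranked = sorted(
        zip(train_embeddings, train_labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Toy 3-dimensional stand-ins for real embeddings (hypothetical values and labels)
train_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
]
train_labels = ["lymphocyte", "lymphocyte", "monocyte", "monocyte"]

print(knn_predict([0.8, 0.3, 0.0], train_embeddings, train_labels, k=3))  # lymphocyte
```
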

See our [paper](https://arxiv.org/abs/2404.05022) for detailed benchmarks.

---

## Requirements

```bash
pip install torch torchvision huggingface_hub
```

---

## Citation

If you use DinoBloom in your research, please cite:

```bibtex
@inproceedings{koch2024dinobloom,
  title={DinoBloom: a foundation model for generalizable cell embeddings in hematology},
  author={Koch, Valentin and Wagner, Sophia J and Kazeminia, Salome and Sancar, Ece and Hehr, Matthias and Schnabel, Julia A and Peng, Tingying and Marr, Carsten},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={520--530},
  year={2024},
  organization={Springer}
}
```

---

## Related Work

DinoBloom builds upon:

- [DINOv2](https://arxiv.org/abs/2304.07193) - Self-supervised vision transformers
- [Original DinoBloom Paper](https://arxiv.org/abs/2404.05022) - MICCAI 2024

---

## License

Apache 2.0 - See [LICENSE](LICENSE) file for details.

---

For questions or issues, please open an issue on [GitHub](https://github.com/MarrLab/DinoBloom) or contact the authors.