---
language:
- en
license: apache-2.0
pipeline_tag: image-feature-extraction
tags:
- cytology
- hematology
- pathology
- vision
- pytorch
- self-supervised
- vit
---

# GenBloom

[GenBloom](https://huggingface.co/papers/2605.29980) is a genetically-aligned foundation model for peripheral blood smears. It aligns single white blood cell images with chromosomal aberrations (karyotype) and somatic mutations from targeted gene panels.

For the source code, setup, and evaluation scripts, see the [GenBloom GitHub repository](https://github.com/marrlab/GenBloom).

## Model Description

GenBloom is a patient-level encoder trained using a two-stage approach:

1. **GenBloom-V (Self-supervised Pretraining)**: Vision-only pretraining of a transformer aggregator using an iBOT head on a cohort of over 1,500 patients.
2. **GenBloom-G (Genetic Alignment)**: Further alignment of visual features with chromosomal aberrations and somatic mutations via supervised contrastive loss on acute myeloid leukemia patients.

The model yields improved representations for hematological diagnostic tasks and supports off-the-shelf retrieval of diseases and genetic alterations.

## Checkpoints

This repository contains the model weights used for the public visual downstream reproduction:

```text
checkpoints/
  genbloom_v/
    genbloom_v.pth
  genbloom_g/
    genbloom_g_fold0.pth
    genbloom_g_fold1.pth
    genbloom_g_fold2.pth
    genbloom_g_fold3.pth
    genbloom_g_fold4.pth
```

- The `genbloom_v` checkpoint corresponds to image-only pretraining.
- The `genbloom_g` checkpoints were further genetically aligned.

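As a rough sketch of how the checkpoint layout listed above could be addressed from code, the helper below builds the expected file paths and reports any that are missing on disk. The function names `expected_checkpoints` and `missing_checkpoints` and the `root` argument are hypothetical and not part of the GenBloom codebase. The sketch also assumes that the `.pth` files sit inside the `genbloom_v/` and `genbloom_g/` folders, and that `root` points at wherever the `checkpoints/` folder ends up on your machine.

```python
from pathlib import Path

# Number of genetically aligned checkpoints in the listing (fold0 to fold4).
NUM_FOLDS = 5


def expected_checkpoints(root="checkpoints"):
    """Return a name -> path mapping for the six listed checkpoint files.

    `root` is an assumption: adjust it to the local location of the
    `checkpoints/` folder.
    """
    root = Path(root)
    paths = {"genbloom_v": root / "genbloom_v" / "genbloom_v.pth"}
    for fold in range(NUM_FOLDS):
        name = f"genbloom_g_fold{fold}"
        paths[name] = root / "genbloom_g" / f"{name}.pth"
    return paths


def missing_checkpoints(root="checkpoints"):
    """List the names of expected checkpoint files that are not on disk."""
    return [
        name
        for name, path in expected_checkpoints(root).items()
        if not path.is_file()
    ]
```

Once all six files are in place under `root`, `missing_checkpoints(root)` returns an empty list, which makes it a cheap sanity check before loading any weights.
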
## Usage

### Download Checkpoints

You can download the checkpoints using the `huggingface_hub` library:

```python
from huggingface_hub import snapshot_download

snapshot_download("MarrLab/GenBloom", local_dir="checkpoints")
```

### Inference

A minimal end-to-end inference example is available in the [`inference_genbloom.ipynb`](https://github.com/marrlab/GenBloom/blob/main/inference_genbloom.ipynb) notebook in the official repository.

## Citation

If you use GenBloom in your research, please cite:

```bibtex
@article{genbloom2024,
  title={Genetically Aligned Patient Representations Improve Hematological Diagnosis},
  author={Adelpantidis, Georgios and others},
  journal={arXiv preprint arXiv:2605.29980},
  year={2024}
}
```