metadata
language:
  - en
license: apache-2.0
pipeline_tag: image-feature-extraction
tags:
  - cytology
  - hematology
  - pathology
  - vision
  - pytorch
  - self-supervised
  - vit

GenBloom

GenBloom is a genetically aligned foundation model for peripheral blood smears. It aligns single white blood cell images with chromosomal aberrations (karyotype) and somatic mutations from targeted gene panels.

For the source code, setup, and evaluation scripts, see the GenBloom GitHub repository.

Model Description

GenBloom is a patient-level encoder trained using a two-stage approach:

  1. GenBloom-V (Self-supervised Pretraining): Vision-only pretraining of a transformer aggregator using an iBOT head on a cohort of over 1,500 patients.
  2. GenBloom-G (Genetic Alignment): Further alignment of visual features with chromosomal aberrations and somatic mutations via supervised contrastive loss on acute myeloid leukemia patients.
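To make the second stage more concrete, here is a minimal sketch of a generic supervised contrastive loss in plain Python. It is not the GenBloom implementation: this card does not specify the exact formulation, and the single integer label per sample, the `temperature` value, and the function names are all assumptions for illustration. Genetic alignment against karyotype and mutation panels likely needs a different definition of which samples count as positives.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch.

    embeddings: list of equal-length float lists, one per sample.
    labels: one hashable label per sample (an assumption; the real
            GenBloom targets are not described in this card).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    per_anchor = []
    for i in range(n):
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # anchors without a positive do not contribute
        # Log-sum-exp over all other samples, shifted for stability.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        log_probs = [sims[p] - log_denom for p in positives]
        per_anchor.append(-sum(log_probs) / len(positives))
    if not per_anchor:
        raise ValueError("no anchor has a positive sample in this batch")
    return sum(per_anchor) / len(per_anchor)
```

The loss is lower when samples sharing a label sit close together in embedding space, which is the property a contrastive alignment stage optimizes for.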

The model yields improved representations for hematological diagnostic tasks and supports off-the-shelf retrieval of diseases and genetic alterations.

Checkpoints

This repository contains the model weights used for the public visual downstream reproduction:

checkpoints/
  genbloom_v/
    genbloom_v.pth
  genbloom_g/
    genbloom_g_fold0.pth
    genbloom_g_fold1.pth
    genbloom_g_fold2.pth
    genbloom_g_fold3.pth
    genbloom_g_fold4.pth
  • The genbloom_v checkpoint corresponds to image-only pretraining.
  • The genbloom_g checkpoints were further genetically aligned.
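The layout above can be checked programmatically before running any evaluation. The following is a small sketch that rebuilds the expected file paths from that listing and reports which ones are missing locally. The helper names and the default root directory are hypothetical, not part of the GenBloom code base.

```python
from pathlib import Path


def expected_checkpoints(root="checkpoints", n_folds=5):
    """Return the checkpoint paths listed in this model card.

    root and n_folds are illustrative defaults taken from the listing
    above (one genbloom_v file and five genbloom_g folds).
    """
    root = Path(root)
    paths = [root / "genbloom_v" / "genbloom_v.pth"]
    paths += [
        root / "genbloom_g" / f"genbloom_g_fold{i}.pth" for i in range(n_folds)
    ]
    return paths


def missing_checkpoints(root="checkpoints"):
    """Return the expected checkpoint paths that do not exist on disk."""
    return [p for p in expected_checkpoints(root) if not p.is_file()]
```

Calling `missing_checkpoints()` after a download returns an empty list when all six files are present under the given root.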

Usage

Download Checkpoints

You can download the checkpoints using the huggingface_hub library:

from huggingface_hub import snapshot_download

# Download every file in the MarrLab/GenBloom repository into ./checkpoints
snapshot_download(repo_id="MarrLab/GenBloom", local_dir="checkpoints")
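If only one variant is needed, `snapshot_download` accepts an `allow_patterns` argument that restricts the download to matching files. The sketch below builds such patterns from the file names in the Checkpoints section. The helper names are hypothetical, and the patterns start with a wildcard because the exact directory prefix inside the repository is an assumption.

```python
def checkpoint_patterns(variant, folds=None):
    """Build glob patterns for one GenBloom variant.

    variant: "v" for the image-only checkpoint, "g" for the genetically
             aligned folds. File names follow the listing in this card.
    folds:   iterable of fold indices for variant "g" (default: 0 to 4).
    """
    if variant == "v":
        return ["*genbloom_v.pth"]
    if variant == "g":
        folds = range(5) if folds is None else folds
        return [f"*genbloom_g_fold{i}.pth" for i in folds]
    raise ValueError(f"unknown variant: {variant!r}")


def download_checkpoints(variant, folds=None, local_dir="checkpoints"):
    """Download only the files matching the selected variant."""
    # Imported here so the pattern helper stays usable without the library.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="MarrLab/GenBloom",
        allow_patterns=checkpoint_patterns(variant, folds),
        local_dir=local_dir,
    )
```

For example, `download_checkpoints("g", folds=[0])` would fetch only the first genetically aligned fold, assuming the file names in the repository match the listing.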

Inference

A minimal end-to-end inference example is available in the inference_genbloom.ipynb notebook in the official repository.
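The notebook remains the reference for building the model and running inference. Before opening it, a downloaded checkpoint can be inspected with a short sketch like the one below, which only lists parameter names and shapes. The internal structure of the `.pth` files is not documented in this card, so the wrapper keys tried in `unwrap_state_dict` are guesses, and both function names are hypothetical.

```python
def unwrap_state_dict(checkpoint):
    """Return the parameter mapping from a loaded checkpoint object.

    Some training scripts nest the weights under a wrapper key. The keys
    tried here are assumptions, not documented GenBloom conventions.
    """
    if isinstance(checkpoint, dict):
        for key in ("state_dict", "model", "teacher"):
            if isinstance(checkpoint.get(key), dict):
                return checkpoint[key]
    return checkpoint


def summarize_checkpoint(path, max_entries=10):
    """Print the first few parameter names and shapes of a checkpoint."""
    # Imported here so unwrap_state_dict can be used without PyTorch.
    import torch

    # weights_only=True avoids unpickling arbitrary objects; it may fail
    # if the file stores more than tensors and basic containers.
    checkpoint = torch.load(path, map_location="cpu", weights_only=True)
    state_dict = unwrap_state_dict(checkpoint)
    for name, value in list(state_dict.items())[:max_entries]:
        shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
        print(f"{name}: {shape}")
    return state_dict
```

A call such as `summarize_checkpoint("checkpoints/genbloom_v/genbloom_v.pth")` shows which layers the file contains, which helps when matching it to the model definition in the official repository.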

Citation

If you use GenBloom in your research, please cite:

@article{genbloom2024,
  title={Genetically Aligned Patient Representations Improve Hematological Diagnosis},
  author={Adelpantidis, Georgios and others},
  journal={arXiv preprint arXiv:2605.29980},
  year={2024}
}