These weights are released under CC BY-NC 4.0 — strictly non-commercial,
research and educational use only. By requesting access you agree to:

  1. Use the weights only for non-commercial research.
  2. Cite the SPARC paper in any derived publication.
  3. Not redistribute the weights to third parties.

SPARC

Gene-program-aware survival modelling from H&E whole-slide images.

This repository hosts the trained model weights for the SPARC paper (Ayed, Cohn, et al.). Code, configs, training scripts, and figure-regeneration notebooks live at github.com/aziz-ayed/SPARC.

SPARC is a two-stage pipeline:

  1. SPARC-Map predicts 40 hallmark gene-expression-program (GEP) scores per H&E patch, recovering a spatial molecular map of each slide.
  2. SPARC-Risk fuses those per-patch GEP scores with the same H&E features through a signature-query attention head and a cancer-aware gate, producing a single per-patient risk score.

These weights cover the SPARC-Risk model and the image-only baseline used for ablations.

[Figure: SPARC pipeline]

What you get

| Folder | Model | Description |
|---|---|---|
| sparc_risk/ | SPARC-Risk (canonical) | Signature-query fusion + H&E. The model reported throughout the paper. |
| image_only/ | Image-only baseline | Same backbone, GEP pathway disabled. Use for direct ablation against SPARC-Risk. |

Each folder contains 5 checkpoints — fold_0_best.pt through fold_4_best.pt — corresponding to the 5-fold cross-validation splits described in the paper and in data/mmp_hybrid_splits_v2_20cancer.csv.

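Every .pt file carries both model_state_dict and the original training config, so the model can be rebuilt in a few lines. Because weights_only=False lets torch.load unpickle arbitrary Python objects, load only checkpoints you obtained from this repository: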

import torch
from sparc.models.factory import build_model

ckpt = torch.load("sparc_risk/fold_0_best.pt", map_location="cpu", weights_only=False)
model = build_model(ckpt["config"])
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

Quick start

# 1. Install the SPARC package
git clone https://github.com/aziz-ayed/SPARC.git && cd SPARC
conda env create -f environment.yml
conda activate sparc

# 2. Accept the license on https://huggingface.co/azizayed/SPARC, then:
pip install -U "huggingface_hub[cli]"
hf auth login
hf download azizayed/SPARC --local-dir checkpoints

# 3. Inference on an external cohort (e.g. NLST lung)
python -m inference.run \
    --cohort nlst \
    --checkpoint_dir checkpoints/sparc_risk \
    --gpus 0,1,2,3

The download produces:

checkpoints/
├── sparc_risk/   fold_{0..4}_best.pt
└── image_only/   fold_{0..4}_best.pt
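
Before running inference it can help to confirm that the download is complete. The following is a minimal sketch, assuming the layout shown above (two folders, five fold checkpoints each); the helper names `expected_checkpoints` and `missing_checkpoints` are illustrative and are not part of the SPARC package.

```python
from pathlib import Path

# Folder names and fold count are taken from the layout described in this card.
FOLDERS = ("sparc_risk", "image_only")
N_FOLDS = 5


def expected_checkpoints(root):
    """Return every checkpoint path the download is expected to contain."""
    root = Path(root)
    return [
        root / folder / f"fold_{k}_best.pt"
        for folder in FOLDERS
        for k in range(N_FOLDS)
    ]


def missing_checkpoints(root):
    """Return the expected checkpoint paths that are not present on disk."""
    return [p for p in expected_checkpoints(root) if not p.is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints("checkpoints")
    if missing:
        print("Missing checkpoints:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All 10 checkpoints found.")
```

If anything is reported missing, re-running the `hf download` step is usually enough, since the command skips files that are already present.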

Architecture (SPARC-Risk)

| Component | Setting |
|---|---|
| Image backbone | H-optimus-1 (1536-dim) |
| Patch size / magnification | 224 px @ 20× |
| Max patches per slide | 4096 |
| Fusion | Signature-query cross-attention (64-NN, 4 heads) |
| Aggregator | Gated attention MIL |
| Head | Discrete-time NLL survival, 4 bins |
| Cancer conditioning | Per-cancer learned gate |
| Hidden dim | 256 |
| Trainable params | ≈ 2.6 M |
| Optimiser / schedule | Adam, lr 1 × 10⁻⁴, cosine T_max 20 |
| Random seed | 1337 |

Full config + reproduction recipe: configs/sparc_risk.yaml.
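
To make the "discrete-time NLL survival, 4 bins" head more concrete, here is a minimal sketch of one common formulation: the head emits one hazard logit per time bin, the survival curve is the running product of (1 - hazard), and a scalar risk is the negative sum of that curve. This is an assumption about the general technique, not the SPARC implementation; the function names are illustrative, and the exact loss and risk definition used in the paper are in configs/sparc_risk.yaml and the code repository.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a hazard in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def survival_curve(logits):
    """S(k) = prod_{j <= k} (1 - h_j), with h_j = sigmoid(logit_j)."""
    curve, s = [], 1.0
    for z in logits:
        s *= 1.0 - sigmoid(z)
        curve.append(s)
    return curve


def risk_score(logits):
    """Scalar risk: negative sum of the survival curve (higher = worse)."""
    return -sum(survival_curve(logits))


def discrete_nll(logits, bin_index, event, eps=1e-7):
    """Negative log-likelihood for one patient.

    event=True : event observed in bin k -> -log S(k-1) - log h_k
    event=False: censored in bin k       -> -log S(k)
    """
    curve = survival_curve(logits)
    if event:
        s_prev = curve[bin_index - 1] if bin_index > 0 else 1.0
        hazard = sigmoid(logits[bin_index])
        return -(math.log(s_prev + eps) + math.log(hazard + eps))
    return -math.log(curve[bin_index] + eps)


if __name__ == "__main__":
    logits = [0.0, 0.0, 0.0, 0.0]  # 4 bins, hazard 0.5 in each
    print(survival_curve(logits))  # [0.5, 0.25, 0.125, 0.0625]
    print(risk_score(logits))      # -0.9375
```

With this convention a patient whose hazards are uniformly higher receives a larger (less negative) risk score, which is the ordering a concordance index would evaluate.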

Training data

5-fold patient-level cross-validation over 20 TCGA cancer types (BLCA, BRCA, CESC, COAD, ESCA, GBM, HNSC, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, PAAD, READ, SARC, SKCM, STAD, UCEC, plus a held-out evaluation split). Splits derive from the MMP hybrid scheme of Mahmood et al. and are released alongside the code at data/mmp_hybrid_splits_v2_20cancer.csv.
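
"Patient-level" means that all slides from one patient fall in the same fold, so no patient contributes to both training and validation data. The sketch below shows one way to check that property on a splits table. The column names `patient_id` and `fold` and the toy rows are hypothetical; the actual schema of data/mmp_hybrid_splits_v2_20cancer.csv may differ and should be checked in the code repository.

```python
import csv
import io
from collections import defaultdict


def folds_per_patient(rows, patient_col="patient_id", fold_col="fold"):
    """Map each patient to the set of folds its rows are assigned to."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[patient_col]].add(row[fold_col])
    return seen


def leaking_patients(rows, patient_col="patient_id", fold_col="fold"):
    """Return patients that appear in more than one fold (should be empty)."""
    mapping = folds_per_patient(rows, patient_col, fold_col)
    return sorted(p for p, folds in mapping.items() if len(folds) > 1)


if __name__ == "__main__":
    # Toy table with hypothetical column names; P2 is deliberately split
    # across two folds to show what a violation looks like.
    toy_csv = (
        "patient_id,slide_id,fold\n"
        "P1,S1,0\n"
        "P1,S2,0\n"
        "P2,S3,1\n"
        "P2,S4,2\n"
    )
    rows = list(csv.DictReader(io.StringIO(toy_csv)))
    print(leaking_patients(rows))  # ['P2']
```

To run the check on the released file, replace the toy table with `csv.DictReader(open(path))` and pass the real column names.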

External validation cohorts (not used for training) — NLST lung, SurGen CRC, Yale breast, ovarian — are described in the paper.

Intended use

These weights are intended for non-commercial biomedical research and education only. Acceptable uses include:

  • Reproducing the SPARC paper's results.
  • Benchmarking against SPARC-Risk in computational-pathology research.
  • Methodological extensions (new fusion designs, additional cohorts, ablation studies).

Citation

The SPARC paper is currently under review. Once a preprint or accepted version is available, a BibTeX entry will be added here. In the meantime, if you use these weights, please link back to github.com/aziz-ayed/SPARC and contact the corresponding author at azizayed@mit.edu.

License

These weights are released under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). For commercial licensing, please contact the authors through the issues page of the GitHub repository (github.com/aziz-ayed/SPARC).
