metadata

language: en
license: cc-by-nc-sa-4.0
tags:
  - survival-analysis
  - multiple-instance-learning
  - optimal-transport
  - medical-imaging
  - deep-learning
  - pytorch
pipeline_tag: image-classification
library_name: pytorch
model-index:
  - name: OTSurv
    results:
      - task:
          type: survival-analysis
          name: Survival Prediction
        dataset:
          name: TCGA (BLCA, BRCA, LUAD, STAD, COADREAD, KIRC)
          type: TCGA
        metrics:
          - type: c-index
            value: 0.646

OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport

🌟 MICCAI 2025 🌟

Qin Ren¹★, Yifan Wang¹, Ruogu Fang², Haibin Ling¹, Chenyu You¹★

¹ Stony Brook University ² University of Florida
★ Corresponding authors

Abstract

Survival prediction using whole slide images (WSIs) can be formulated as a multiple instance learning (MIL) problem. However, existing MIL methods often fail to explicitly capture pathological heterogeneity within WSIs, both globally, through long-tailed morphological distributions, and locally, through tile-level prediction uncertainty. Optimal transport (OT) provides a principled way of modeling such heterogeneity by incorporating marginal distribution constraints. Building on this insight, we propose OTSurv, a novel MIL framework from an optimal transport perspective. Specifically, OTSurv formulates survival prediction as a heterogeneity-aware OT problem with two constraints: (1) a global long-tail constraint that models prior morphological distributions to avert both mode collapse and excessive uniformity by regulating transport mass allocation, and (2) a local uncertainty-aware constraint that prioritizes high-confidence patches while suppressing noise by progressively raising the total transport mass. We then recast the initial OT problem, augmented by these constraints, into an unbalanced OT formulation that can be solved with an efficient, hardware-friendly matrix scaling algorithm. Empirically, OTSurv sets new state-of-the-art results across six popular benchmarks, achieving an absolute 3.6% improvement in average C-index. In addition, OTSurv achieves statistical significance in log-rank tests and offers high interpretability, making it a powerful tool for survival prediction in digital pathology. Our code is available at https://github.com/Y-Research-SBU/OTSurv.
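To make the phrase "unbalanced OT solved by matrix scaling" more concrete, here is a minimal sketch of the generic Sinkhorn-style scaling iteration for entropic OT with KL-relaxed marginals. It is not taken from the OTSurv code and is not claimed to match its exact formulation: the function name `scaling_ot`, the parameters `eps` and `tau`, and the toy cost matrix are all assumptions made for illustration.

```python
import math


def scaling_ot(cost, a, b, eps=0.5, tau=None, n_iters=200):
    """Entropic OT via alternating matrix scaling (illustrative sketch).

    cost : n x m list of lists with transport costs.
    a, b : target row and column marginals (lengths n and m).
    eps  : entropic regularization strength.
    tau  : weight of the KL penalty on marginal violations.
           None enforces the marginals exactly (balanced OT).
    Returns the n x m transport plan as a list of lists.
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    # Exponent 1 gives classic Sinkhorn; an exponent below 1 softens the
    # marginal constraints, which is the unbalanced case.
    power = 1.0 if tau is None else tau / (tau + eps)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        u = [(a[i] / sum(K[i][j] * v[j] for j in range(m))) ** power
             for i in range(n)]
        v = [(b[j] / sum(K[i][j] * u[i] for i in range(n))) ** power
             for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example (hypothetical numbers): 3 patches, 2 morphological groups,
# with a non-uniform (long-tailed) prior over the groups.
cost = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
patch_mass = [1 / 3, 1 / 3, 1 / 3]
group_prior = [0.7, 0.3]

balanced = scaling_ot(cost, patch_mass, group_prior)
relaxed = scaling_ot(cost, patch_mass, group_prior, tau=1.0)
```

In the balanced call the plan's row and column sums match `patch_mass` and `group_prior`. In the relaxed call they are only encouraged to, which is what lets an unbalanced formulation transport less than the full mass. Each iteration is a pair of matrix-vector products, which is why this family of solvers maps well to GPUs.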

🧠 TL;DR

Welcome to the official repository of OTSurv, a novel framework that integrates Multiple Instance Learning (MIL) with Heterogeneity-aware Optimal Transport (OT) to tackle the challenges of survival prediction in medical imaging and clinical data.

📍 To be presented at MICCAI 2025
🧠 Focus: Survival Analysis · Multiple Instance Learning · Optimal Transport

📁 Data Organization

Project Structure

OTSurv/
├── checkpoints/
│   ├── model_blca_fold0.pth
│   ├── model_blca_fold1.pth
│   └── ...
│ 
├── data/
│   ├── tcga_blca/
│   ├── tcga_brca/
│   ├── tcga_coadread/
│   ├── tcga_kirc/
│   ├── tcga_luad/
│   └── tcga_stad/
│ 
├── result/
│   ├── exp_otsurv_test/
│   ├── exp_otsurv_train/
│   └── visualization/
│
├── src/
│   ├── scripts/
│   ├── analysis/
│   └── ...
│
└── docs/
    ├── OTSurv_main.png
    └── OTSurv_heatmap.png

Feature Format

H5 Format: Features are stored in .h5 files (directories ending with feats_h5/)

For patch feature extraction, please refer to CLAM.

You can download the preprocessed features from this link.
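If you want to check what a downloaded feature file contains before training, the sketch below lists every dataset in an `.h5` file with its shape and dtype. It uses `h5py` and makes no assumption about dataset names inside the file. The helper name `describe_h5` and the example path are hypothetical.

```python
import h5py


def describe_h5(path):
    """Return {dataset_name: (shape, dtype)} for every dataset in an HDF5 file."""
    summary = {}

    def visit(name, obj):
        # Groups are skipped; only datasets carry arrays.
        if isinstance(obj, h5py.Dataset):
            summary[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return summary
```

For example, `describe_h5("path/to/slide.h5")` (placeholder path) returns a dictionary you can print to confirm the number of patches and the feature dimension.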

🚀 Quick Start

Prerequisites

Python 3.8+
GPU (optional; CPU-only execution is also supported)
Conda package manager

Installation

# Clone the repository
git clone https://github.com/Y-Research-SBU/OTSurv.git
cd OTSurv

# Create conda environment
conda env create -f env.yaml
conda activate otsurv

Training

# Training results will be saved under result/exp_otsurv_train

cd src
# Train on all datasets
bash scripts/train_otsurv.sh

# Train on TCGA-BLCA dataset specifically
bash scripts/train_blca.sh

Evaluation

You can download all trained checkpoints from this link.

# Test results will be saved under result/exp_otsurv_test

cd src
# Test on all datasets
bash scripts/test_otsurv.sh

# Test on TCGA-BLCA dataset specifically
bash scripts/test_blca.sh

cd src  # run from the repository root; skip if you are already in src/
# Calculate performance metrics
python analysis/calculate_CIndex_mean_std.py

# Generated figures will be saved under result/visualization

cd src  # run from the repository root; skip if you are already in src/
# Generate survival curves
python analysis/plot_survival_curv.py

The survival curve for TCGA-BLCA looks like this:

[Figure: survival curve for TCGA-BLCA]

📊 Performance Results

Below are the C-Index performance results of OTSurv across different cancer types:

| Cancer Type | Mean C-Index | Std Dev |
|-------------|--------------|---------|
| BRCA        | 0.621        | ±0.071  |
| BLCA        | 0.637        | ±0.065  |
| LUAD        | 0.638        | ±0.077  |
| STAD        | 0.565        | ±0.057  |
| COADREAD    | 0.667        | ±0.111  |
| KIRC        | 0.750        | ±0.149  |

Overall Performance: Average C-Index across all datasets is 0.646
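The overall figure is the unweighted mean of the six per-dataset means reported above. This short check reproduces it; the variable names are illustrative and the snippet is not part of the repository.

```python
from statistics import mean

# Per-dataset mean C-Index values copied from the results table.
mean_c_index = {
    "BRCA": 0.621,
    "BLCA": 0.637,
    "LUAD": 0.638,
    "STAD": 0.565,
    "COADREAD": 0.667,
    "KIRC": 0.750,
}

overall = round(mean(list(mean_c_index.values())), 3)
print(overall)  # 0.646
```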

💡 Note: The C-Index (Concordance Index) is a commonly used performance metric in survival analysis. A value of 0.5 corresponds to random ranking, and values closer to 1.0 indicate better prediction performance.
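For readers new to the metric, here is a minimal pure-Python sketch of Harrell's concordance index. It is written for illustration only and may differ from how `analysis/calculate_CIndex_mean_std.py` computes it. The function name, argument names, and toy data are assumptions. A pair of patients is comparable when the one with the shorter time had an observed event, and it is concordant when that patient also has the higher predicted risk.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : observed follow-up times.
    events : 1 if the event was observed, 0 if the patient was censored.
    risks  : predicted risk scores; higher should mean shorter survival.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier one in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the second patient is censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
perfect = concordance_index(times, events, [0.9, 0.4, 0.6, 0.1])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

Here `perfect` evaluates to 1.0 because every comparable pair is ranked correctly, and `uninformative` evaluates to 0.5 because a constant prediction ties on every pair.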

📚 Citation

If you find this work useful, please cite our paper:

@misc{ren2025otsurvnovelmultipleinstance,
      title={OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport}, 
      author={Qin Ren and Yifan Wang and Ruogu Fang and Haibin Ling and Chenyu You},
      year={2025},
      eprint={2506.20741},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.20741}, 
}

📝 Note: This paper has been accepted at MICCAI 2025. The citation details will be updated once the paper is officially published.

🙏 Acknowledgements

This work builds upon the excellent research from:

📄 License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License - see the LICENSE.md file for details.

🤝 Contributing

We welcome contributions to OTSurv! If you have suggestions, bug reports, or want to add features or experiments, feel free to:

🐞 Submit an issue
🔧 Open a pull request
💬 Start a discussion

⭐ If you find this repository helpful, please consider starring it! ⭐