language: en
license: cc-by-nc-sa-4.0
tags:
- survival-analysis
- multiple-instance-learning
- optimal-transport
- medical-imaging
- deep-learning
- pytorch
pipeline_tag: image-classification
library_name: pytorch
model-index:
- name: OTSurv
  results:
  - task:
      type: survival-analysis
      name: Survival Prediction
    dataset:
      name: TCGA (BLCA, BRCA, LUAD, STAD, COADREAD, KIRC)
      type: TCGA
    metrics:
    - type: c-index
      value: 0.646
OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport
MICCAI 2025
Qin Ren1*, Yifan Wang1, Ruogu Fang2, Haibin Ling1, Chenyu You1*
1 Stony Brook University
2 University of Florida
* Corresponding authors
Abstract
Survival prediction using whole slide images (WSIs) can be formulated as a multiple instance learning (MIL) problem. However, existing MIL methods often fail to explicitly capture pathological heterogeneity within WSIs, both globally, through long-tailed morphological distributions, and locally, through tile-level prediction uncertainty. Optimal transport (OT) provides a principled way of modeling such heterogeneity by incorporating marginal distribution constraints. Building on this insight, we propose OTSurv, a novel MIL framework from an optimal transport perspective. Specifically, OTSurv formulates survival prediction as a heterogeneity-aware OT problem with two constraints: (1) a global long-tail constraint that models prior morphological distributions to avert both mode collapse and excessive uniformity by regulating transport mass allocation, and (2) a local uncertainty-aware constraint that prioritizes high-confidence patches while suppressing noise by progressively raising the total transport mass. We then recast the initial OT problem, augmented by these constraints, into an unbalanced OT formulation that can be solved with an efficient, hardware-friendly matrix scaling algorithm. Empirically, OTSurv sets new state-of-the-art results across six popular benchmarks, achieving an absolute 3.6% improvement in average C-index. In addition, OTSurv achieves statistical significance in log-rank tests and offers high interpretability, making it a powerful tool for survival prediction in digital pathology. Our code is available at https://github.com/Y-Research-SBU/OTSurv.
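The abstract refers to solving an unbalanced OT problem with a matrix scaling algorithm. As a rough illustration of that family of solvers, the sketch below runs generic entropic unbalanced Sinkhorn iterations in plain Python. It is not the OTSurv solver: the function name, the parameters `eps` and `rho`, and the toy cost matrix are assumptions made for illustration, and the paper's long-tail and uncertainty-aware constraints are not modeled here.

```python
import math


def unbalanced_sinkhorn(cost, a, b, eps=0.1, rho=1.0, n_iters=200):
    """Generic entropic unbalanced OT via matrix scaling (illustrative only).

    cost: n x m list of lists; a, b: target marginals of length n and m.
    eps:  entropic regularization strength (assumed name).
    rho:  weight of the KL penalty on marginal violations (assumed name).
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    # The exponent softens each update; it tends to 1 (balanced Sinkhorn) as rho grows
    power = rho / (rho + eps)
    for _ in range(n_iters):
        u = [(a[i] / sum(K[i][j] * v[j] for j in range(m))) ** power
             for i in range(n)]
        v = [(b[j] / sum(K[i][j] * u[i] for i in range(n))) ** power
             for j in range(m)]
    # Transport plan P = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy problem: two instances, two prototypes, cheap transport on the diagonal
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = unbalanced_sinkhorn(cost, a=[0.5, 0.5], b=[0.5, 0.5], eps=0.5, rho=1.0)
print(sum(map(sum, plan)))  # total transported mass; not constrained to equal 1
```

Each iteration only needs matrix-vector products and elementwise powers, which is why this style of solver maps well to GPUs. With a very large `rho` the updates reduce to ordinary Sinkhorn and the plan's marginals match `a` and `b`.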
TL;DR
Welcome to the official repository of OTSurv, a novel framework that integrates Multiple Instance Learning (MIL) with Heterogeneity-aware Optimal Transport (OT) to tackle the challenges of survival prediction in medical imaging and clinical data.
To be presented at MICCAI 2025
Focus: Survival Analysis · Multiple Instance Learning · Optimal Transport
Data Organization
Project Structure
OTSurv/
├── checkpoints/
│   ├── model_blca_fold0.pth
│   ├── model_blca_fold1.pth
│   └── ...
│
├── data/
│   ├── tcga_blca/
│   ├── tcga_brca/
│   ├── tcga_coadread/
│   ├── tcga_kirc/
│   ├── tcga_luad/
│   └── tcga_stad/
│
├── result/
│   ├── exp_otsurv_test/
│   ├── exp_otsurv_train/
│   └── visualization/
│
├── src/
│   ├── scripts/
│   ├── analysis/
│   └── ...
│
└── docs/
    ├── OTSurv_main.png
    └── OTSurv_heatmap.png
Feature Format
- H5 Format: Features are stored in .h5 files (directories ending with feats_h5/)
For patch feature extraction, please refer to CLAM.
You can download the preprocessed features from this link.
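Before training, it can help to check what a feature file actually contains. The sketch below lists every dataset in an HDF5 file with its shape and dtype using `h5py`. The helper name `describe_h5`, the demo file, the dataset keys `features` and `coords`, and the feature dimension of 1024 are all assumptions for illustration; inspect your own files to see the real layout.

```python
import os
import tempfile

import h5py
import numpy as np


def describe_h5(path):
    """Return {dataset_name: (shape, dtype)} for every dataset in an HDF5 file."""
    summary = {}

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only
        if isinstance(obj, h5py.Dataset):
            summary[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return summary


# Demo on a synthetic file. The keys and shapes here are assumed, not confirmed
# by the repository.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_slide.h5")
    with h5py.File(demo_path, "w") as f:
        f.create_dataset("features", data=np.zeros((5, 1024), dtype=np.float32))
        f.create_dataset("coords", data=np.zeros((5, 2), dtype=np.int64))
    info = describe_h5(demo_path)
    print(info)
```

Pointing `describe_h5` at one file under a `feats_h5/` directory shows the number of patches per slide and the embedding size without loading the arrays into memory.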
Quick Start
Prerequisites
- Python 3.8+
- GPU or CPU-only
- Conda package manager
Installation
# Clone the repository
git clone https://github.com/Y-Research-SBU/OTSurv.git
cd OTSurv
# Create conda environment
conda env create -f env.yaml
conda activate otsurv
Training
# Training results will be saved under result/exp_otsurv_train
cd src
# Train on all datasets
bash scripts/train_otsurv.sh
# Train on TCGA-BLCA dataset specifically
bash scripts/train_blca.sh
Evaluation
You can download all trained checkpoints from this link.
# Test results will be saved under result/exp_otsurv_test
cd src
# Test on all datasets
bash scripts/test_otsurv.sh
# Test on TCGA-BLCA dataset specifically
bash scripts/test_blca.sh
cd src
# Calculate performance metrics
python analysis/calculate_CIndex_mean_std.py
# Generated figures will be saved under result/visualization
cd src
# Generate survival curves
python analysis/plot_survival_curv.py
The survival curve for TCGA-BLCA looks like this:
Performance Results
Below are the C-Index performance results of OTSurv across different cancer types:
| Cancer Type | Mean C-Index | Std Dev |
|---|---|---|
| BRCA | 0.621 | ±0.071 |
| BLCA | 0.637 | ±0.065 |
| LUAD | 0.638 | ±0.077 |
| STAD | 0.565 | ±0.057 |
| COADREAD | 0.667 | ±0.111 |
| KIRC | 0.750 | ±0.149 |
Overall Performance: Average C-Index across all datasets is 0.646
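As a quick consistency check, the overall figure can be reproduced as the unweighted mean of the six per-dataset means in the table. The snippet below is only a sanity check on the reported numbers, not the repository's analysis script.

```python
from statistics import mean

# Mean C-Index per cancer type, copied from the table above
c_index = {
    "BRCA": 0.621,
    "BLCA": 0.637,
    "LUAD": 0.638,
    "STAD": 0.565,
    "COADREAD": 0.667,
    "KIRC": 0.750,
}

# Unweighted average across datasets, rounded to three decimals
average = round(mean(c_index.values()), 3)
print(average)  # 0.646
```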
Note: The C-Index (concordance index) is a standard performance metric in survival analysis. A value of 0.5 corresponds to random ranking, and values closer to 1.0 indicate better prediction performance.
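To make the metric concrete, here is a minimal sketch of Harrell's C-index in plain Python. It assumes that a higher risk score should mean shorter survival and it ignores pairs with tied survival times. The function name and the toy data are illustrative, and the repository's own scripts may compute the metric differently.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index (simplified sketch).

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if the subject was censored
    risks:  predicted risk scores (higher should mean shorter survival)
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event has higher risk: correct order
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third subject is censored at t=12
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
result = concordance_index(times, events, risks)
print(result)  # 0.8
```

In the toy cohort, five pairs are comparable and four of them are ranked correctly, giving 4/5 = 0.8. The one discordant pair is the subject with an event at t=8 (risk 0.4) versus the subject censored at t=12 (risk 0.5).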
Citation
If you find this work useful, please cite our paper:
@misc{ren2025otsurvnovelmultipleinstance,
title={OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport},
author={Qin Ren and Yifan Wang and Ruogu Fang and Haibin Ling and Chenyu You},
year={2025},
eprint={2506.20741},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2506.20741},
}
Note: This paper has been accepted at MICCAI 2025. The citation details will be updated once the paper is officially published.
Acknowledgements
This work builds upon the excellent research from:
License
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License - see the LICENSE.md file for details.
Contributing
We welcome contributions to OTSurv! If you have suggestions, bug reports, or want to add features or experiments, feel free to:
- Submit an issue
- Open a pull request
- Start a discussion
If you find this repository helpful, please consider starring it!