---
language: en
license: cc-by-nc-sa-4.0
tags:
- survival-analysis
- multiple-instance-learning
- optimal-transport
- medical-imaging
- deep-learning
- pytorch
pipeline_tag: image-classification
library_name: pytorch
model-index:
- name: OTSurv
  results:
  - task:
      type: survival-analysis
      name: Survival Prediction
    dataset:
      name: TCGA (BLCA, BRCA, LUAD, STAD, COADREAD, KIRC)
      type: TCGA
    metrics:
    - type: c-index
      value: 0.646
---
OTSurv Logo

# OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport

🌟 MICCAI 2025 🌟


Qin Ren1 ★  Yifan Wang1  Ruogu Fang2  Haibin Ling1  Chenyu You1 ★

1 Stony Brook University    2 University of Florida   
★ Corresponding authors

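[Paper](https://arxiv.org/abs/2506.20741) · [Hugging Face Model](https://huggingface.co/Y-Research-Group/OTSurv) · [Hugging Face Dataset](https://huggingface.co/datasets/Y-Research-Group/OTSurv_Dataset) · [GitHub Code](https://github.com/Y-Research-SBU/OTSurv) · PyTorch 2.3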

## Abstract

Survival prediction using whole slide images (WSIs) can be formulated as a multiple instance learning (MIL) problem. However, existing MIL methods often fail to explicitly capture pathological heterogeneity within WSIs, both globally, through long-tailed morphological distributions, and locally, through tile-level prediction uncertainty. Optimal transport (OT) provides a principled way of modeling such heterogeneity by incorporating marginal distribution constraints. Building on this insight, we propose OTSurv, a novel MIL framework from an optimal transport perspective. Specifically, OTSurv formulates survival prediction as a heterogeneity-aware OT problem with two constraints: (1) a global long-tail constraint that models prior morphological distributions to avert both mode collapse and excessive uniformity by regulating transport mass allocation, and (2) a local uncertainty-aware constraint that prioritizes high-confidence patches while suppressing noise by progressively raising the total transport mass. We then recast the initial OT problem, augmented by these constraints, into an unbalanced OT formulation that can be solved with an efficient, hardware-friendly matrix scaling algorithm. Empirically, OTSurv sets new state-of-the-art results across six popular benchmarks, achieving an absolute 3.6% improvement in average C-index. In addition, OTSurv achieves statistical significance in log-rank tests and offers high interpretability, making it a powerful tool for survival prediction in digital pathology. Our code is available at https://github.com/Y-Research-SBU/OTSurv.

## 🧠 TL;DR

Welcome to the official repository of OTSurv, a novel framework that integrates Multiple Instance Learning (MIL) with Heterogeneity-aware Optimal Transport (OT) to tackle the challenges of survival prediction in medical imaging and clinical data.

📍 To be presented at MICCAI 2025
🧠 Focus: Survival Analysis · Multiple Instance Learning · Optimal Transport
OTSurv Framework Overview
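
The abstract describes recasting the constrained problem as an unbalanced OT formulation solved by a matrix scaling algorithm. As a rough illustration of what such an iteration looks like, here is a generic entropic unbalanced Sinkhorn sketch in NumPy. It is the textbook KL-relaxed scaling scheme, not the OTSurv solver. The function name `unbalanced_sinkhorn`, the parameters `eps` and `tau`, the random cost matrix, and the long-tailed prior `b` are all assumptions made for this example.

```python
import numpy as np


def unbalanced_sinkhorn(cost, a, b, eps=0.1, tau=1.0, n_iters=200):
    """Generic entropic unbalanced OT with KL-relaxed marginals.

    cost: (n, m) cost matrix, a: (n,) row prior, b: (m,) column prior.
    eps is the entropic regularization, tau the marginal relaxation strength.
    """
    kernel = np.exp(-cost / eps)      # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    power = tau / (tau + eps)         # power < 1 softens the marginal constraints
    for _ in range(n_iters):
        u = (a / (kernel @ v)) ** power
        v = (b / (kernel.T @ u)) ** power
    # Transport plan: diag(u) @ kernel @ diag(v)
    return u[:, None] * kernel * v[None, :]


# Toy problem (all values hypothetical): 64 patches, 4 prototypes.
rng = np.random.default_rng(0)
cost = rng.random((64, 4))
a = np.full(64, 1.0 / 64)                # uniform mass over patches
b = np.array([0.55, 0.25, 0.15, 0.05])   # long-tailed prior over prototypes
plan = unbalanced_sinkhorn(cost, a, b)
print(plan.shape, plan.sum())
```

Each iteration uses only matrix-vector products and elementwise operations, which is why schemes of this kind map well onto GPUs. As `tau` grows, the exponent approaches 1 and the updates reduce to standard balanced Sinkhorn.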
## 📁 Data Organization

### Project Structure

```
OTSurv/
├── checkpoints/
│   ├── model_blca_fold0.pth
│   ├── model_blca_fold1.pth
│   └── ...
│
├── data/
│   ├── tcga_blca/
│   ├── tcga_brca/
│   ├── tcga_coadread/
│   ├── tcga_kirc/
│   ├── tcga_luad/
│   └── tcga_stad/
│
├── result/
│   ├── exp_otsurv_test/
│   ├── exp_otsurv_train/
│   └── visualization/
│
├── src/
│   ├── scripts/
│   ├── analysis/
│   └── ...
│
└── docs/
    ├── OTSurv_main.png
    └── OTSurv_heatmap.png
```

### Feature Format

- **H5 Format**: Features are stored in `.h5` files (directories ending with `feats_h5/`)

For patch feature extraction, please refer to [CLAM](https://github.com/mahmoodlab/CLAM).

You can download the preprocessed features from [this link](https://huggingface.co/datasets/Y-Research-Group/OTSurv_Dataset).
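
To check what a downloaded `.h5` feature file contains before training, a small inspection helper can be useful. The sketch below assumes the `h5py` package is installed. It does not assume any particular dataset name inside the files: it simply lists whatever datasets are present. The helper name `describe_h5` and the demo file (including its `features` dataset and shape) are made up for illustration.

```python
import os
import tempfile

import h5py
import numpy as np


def describe_h5(path):
    """Return {dataset_name: (shape, dtype)} for every dataset in an HDF5 file."""
    summary = {}

    def visit(name, obj):
        # visititems walks groups and datasets; keep only datasets.
        if isinstance(obj, h5py.Dataset):
            summary[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return summary


# Demo on a throwaway file; the dataset name and shape here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.h5")
    with h5py.File(demo_path, "w") as f:
        f.create_dataset("features", data=np.zeros((5, 8), dtype="float32"))
    demo_summary = describe_h5(demo_path)

print(demo_summary)
```

Pointing `describe_h5` at one of the files under a `feats_h5/` directory should show the actual dataset names and feature dimensions used by the preprocessed data.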
## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- GPU or CPU-only
- Conda package manager

### Installation

```bash
# Clone the repository
git clone https://github.com/Y-Research-SBU/OTSurv.git
cd OTSurv

# Create conda environment
conda env create -f env.yaml
conda activate otsurv
```

### Training

```bash
# Training results will be saved under result/exp_otsurv_train
cd src

# Train on all datasets
bash scripts/train_otsurv.sh

# Train on TCGA-BLCA dataset specifically
bash scripts/train_blca.sh
```

### Evaluation

You can download all trained checkpoints from [this link](https://huggingface.co/Y-Research-Group/OTSurv).

```bash
# Test results will be saved under result/exp_otsurv_test
cd src

# Test on all datasets
bash scripts/test_otsurv.sh

# Test on TCGA-BLCA dataset specifically
bash scripts/test_blca.sh
```

```bash
cd src

# Calculate performance metrics
python analysis/calculate_CIndex_mean_std.py
```

```bash
# Generated figures will be saved under result/visualization
cd src

# Generate survival curves
python analysis/plot_survival_curv.py
```

The survival curve for TCGA-BLCA looks like this:
TCGA-BLCA Survival Curve
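
Survival curves of this kind are commonly Kaplan-Meier estimates, drawn separately for predicted high-risk and low-risk groups. Whether `plot_survival_curv.py` computes them exactly this way is an assumption. The sketch below only illustrates the standard estimator on made-up data, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(t, S(t))] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= removed
    return curve


# Toy follow-up times (months) and event indicators, invented for this example.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored patients do not produce a step in the curve, but they still shrink the risk set for later time points, which is why the step sizes grow as follow-up gets longer.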

## 📊 Performance Results

Below are the C-Index performance results of OTSurv across different cancer types:

| Cancer Type | Mean C-Index | Std Dev |
|-------------|--------------|---------|
| **BRCA** | 0.621 | ±0.071 |
| **BLCA** | 0.637 | ±0.065 |
| **LUAD** | 0.638 | ±0.077 |
| **STAD** | 0.565 | ±0.057 |
| **COADREAD** | 0.667 | ±0.111 |
| **KIRC** | 0.750 | ±0.149 |

**Overall Performance**: Average C-Index across all datasets is **0.646**

> 💡 **Note**: C-Index (Concordance Index) is a commonly used performance metric in survival analysis, where values closer to 1.0 indicate better prediction performance.
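
For readers unfamiliar with the metric, the sketch below shows one common definition, Harrell's C-index, together with the averaging behind the overall figure. It is an illustrative pure-Python version and may differ in details (for example tie handling) from `analysis/calculate_CIndex_mean_std.py`. The function name `concordance_index` and the toy patient data are assumptions; the per-cancer means are copied from the table above.

```python
from statistics import mean


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event
    before subject j's follow-up time. Risk ties count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort, invented for this example: higher risk should mean earlier event.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
toy_cindex = concordance_index(times, events, risks)
print(toy_cindex)

# Averaging the per-cancer means from the table above.
per_cancer = {"BRCA": 0.621, "BLCA": 0.637, "LUAD": 0.638,
              "STAD": 0.565, "COADREAD": 0.667, "KIRC": 0.750}
overall = round(mean(per_cancer.values()), 3)
print(overall)
```

A value of 0.5 corresponds to random ranking, so the reported scores should be read against that baseline rather than against 0.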
## 📚 Citation

If you find this work useful, please cite our paper:

```bibtex
@misc{ren2025otsurvnovelmultipleinstance,
  title={OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport},
  author={Qin Ren and Yifan Wang and Ruogu Fang and Haibin Ling and Chenyu You},
  year={2025},
  eprint={2506.20741},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.20741},
}
```

> 📝 **Note**: This paper has been accepted at MICCAI 2025. The citation details will be updated once the paper is officially published.
## 🙏 Acknowledgements

This work builds upon the excellent research from:

- [PANTHER](https://openaccess.thecvf.com/content/CVPR2024/html/Song_Morphological_Prototyping_for_Unsupervised_Slide_Representation_Learning_in_Computational_Pathology_CVPR_2024_paper.html)
- [MMP](https://github.com/mahmoodlab/MMP)
- [CLAM](https://github.com/mahmoodlab/CLAM)
- [PPOT](https://github.com/rhfeiyang/PPOT)
## 📄 License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. See the [LICENSE.md](LICENSE.md) file for details.
## 🤝 Contributing

We welcome contributions to **OTSurv**! If you have suggestions, bug reports, or want to add features or experiments, feel free to:

- 🐞 Submit an issue
- 🔧 Open a pull request
- 💬 Start a discussion

---

⭐ If you find this repository helpful, please consider starring it! ⭐