MCP-MedSAM

PyTorch implementation of the paper: "MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day"

MCP-MedSAM Architecture

📄 Overview

This work proposes a lightweight variant of MedSAM by integrating:

  • A pre-trained TinyViT as the vision backbone
  • Two novel prompt types:
    • Modality Prompt
    • Content Prompt
  • A modified mask decoder adapted to these prompts

To further improve performance across imaging modalities, we introduce a modality-aware data sampling strategy that ensures better balance and generalization.
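One common way to realize such modality-aware sampling is to weight each training sample inversely to its modality's frequency, so under-represented modalities are drawn as often as dominant ones. The sketch below illustrates this idea with PyTorch's `WeightedRandomSampler`; the modality labels are hypothetical and the paper's exact sampling scheme may differ.

```python
from collections import Counter

from torch.utils.data import WeightedRandomSampler

# Hypothetical per-sample modality labels (for illustration only).
modalities = ["CT", "CT", "CT", "CT", "MRI", "MRI", "US", "Xray"]

# Weight each sample by the inverse frequency of its modality, so that
# every modality contributes roughly equally per epoch in expectation.
counts = Counter(modalities)
weights = [1.0 / counts[m] for m in modalities]

sampler = WeightedRandomSampler(weights, num_samples=len(modalities), replacement=True)

# The sampler yields dataset indices; pass it to a DataLoader via sampler=sampler.
drawn = [modalities[i] for i in sampler]
```

In a real training loop, `sampler` would be handed to `torch.utils.data.DataLoader` in place of `shuffle=True`.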

With these enhancements, our model achieves strong multi-modality segmentation performance, and can be trained in approximately 1 day on a single A100 (40GB) GPU.

Requirements

  • Python==3.10.14
  • torch==2.0.0
  • torchvision==0.15.0
  • transformers==4.49.0
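Assuming a Python 3.10 environment is already set up, the pinned versions above can be installed with pip:

```shell
pip install torch==2.0.0 torchvision==0.15.0 transformers==4.49.0
```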

Training and Inference

Training and inference can be run with train.py and infer.py, respectively. The trained model weights are stored in pytorch_model.bin, which can be loaded for inference.
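Loading the weights follows the standard PyTorch state-dict pattern. The sketch below uses a small stand-in `nn.Linear` module to keep it self-contained; in practice, you would build the MCP-MedSAM model as defined in this repository and load `pytorch_model.bin` into it.

```python
import torch
import torch.nn as nn

# Stand-in module for illustration; replace with the MCP-MedSAM model class.
model = nn.Linear(4, 2)

# Save weights in the same format as pytorch_model.bin, then reload them.
torch.save(model.state_dict(), "pytorch_model.bin")

state_dict = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # switch to inference mode (disables dropout, freezes batch norm)

with torch.no_grad():
    out = model(torch.zeros(1, 4))
```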

Citation

@article{lyu2024mcp,
  title={MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day},
  author={Lyu, Donghang and Gao, Ruochen and Staring, Marius},
  journal={arXiv preprint arXiv:2412.05888},
  year={2024}
}