SegMoTE: Token-Level Mixture of Experts for Medical Image Segmentation

Yujie Lu¹, Jingwen Li², Sibo Ju³, Yanzhou Su⁴, He Yao¹, Yisong Liu¹, Min Zhu¹†, Junlong Cheng¹†

¹Sichuan University ²Xinjiang University ³Fuzhou University ⁴Alibaba DAMO Academy

CVPR 2026 (Oral)

Abstract

Medical image segmentation requires robust adaptation across heterogeneous modalities and anatomical structures, while pixel-level annotation remains expensive. SegMoTE is an efficient adaptation framework built on the Segment Anything Model (SAM). It introduces a token-level Mixture of Experts (MoTE) mechanism that dynamically selects modality-adaptive expert tokens, and a Progressive Prompt Tokenization (PPT) module that learns feature-conditioned prompts for prompt-free segmentation on suitable foreground-background tasks. Trained on the curated MedSeg-HQ dataset, SegMoTE aims to retain the flexible prompt interface and generalization ability of SAM while providing lightweight adaptation for multimodal medical image segmentation.
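To make the idea of token-level expert selection concrete, here is a minimal sketch in plain Python. It is not the SegMoTE implementation: the function name `select_expert_tokens`, the dot-product router, the top-k selection with a renormalized softmax gate, and all shapes are assumptions chosen for illustration, since the abstract does not specify how experts are scored or combined.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_expert_tokens(image_feature, router_weights, expert_tokens, k=2):
    """Hypothetical top-k routing over a pool of expert tokens.

    image_feature:  pooled image embedding, length d
    router_weights: one length-d weight vector per expert
    expert_tokens:  one length-d token per expert
    Returns (chosen expert indices, gate weights, gate-scaled tokens).
    """
    # Router logit for each expert: dot product with the image feature.
    logits = [sum(w * x for w, x in zip(row, image_feature))
              for row in router_weights]
    # Keep the k highest-scoring experts.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalize the gate over the selected experts only.
    gates = softmax([logits[i] for i in top])
    # Scale each selected expert token by its gate weight.
    tokens = [[g * v for v in expert_tokens[i]] for i, g in zip(top, gates)]
    return top, gates, tokens


# Toy example with random values (assumed sizes, not from the paper).
random.seed(0)
d, num_experts = 4, 3
router = [[random.gauss(0, 1) for _ in range(d)] for _ in range(num_experts)]
experts = [[random.gauss(0, 1) for _ in range(d)] for _ in range(num_experts)]
feature = [random.gauss(0, 1) for _ in range(d)]

idx, gates, tokens = select_expert_tokens(feature, router, experts, k=2)
print(idx, [round(g, 3) for g in gates])
```

In a SAM-style pipeline, the gate-scaled tokens could plausibly be concatenated with the prompt tokens before mask decoding. Where SegMoTE actually injects them is not stated in the abstract, so treat that placement as an assumption as well.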

Citation

The BibTeX entry will be updated after the public paper record is available:

@article{lu2026segmote,
  title={SegMoTE: Token-Level Mixture of Experts for Medical Image Segmentation},
  author={Lu, Yujie and Li, Jingwen and Ju, Sibo and Su, Yanzhou and Yao, He and Liu, Yisong and Zhu, Min and Cheng, Junlong and others},
  journal={arXiv preprint arXiv:2602.19213},
  year={2026}
}

Paper for yujielu/SegMoTE

Paper • 2602.19213 • Published Feb 22