# SegMoTE: Token-Level Mixture of Experts for Medical Image Segmentation

**Yujie Lu<sup>1\*</sup>, Jingwen Li<sup>2\*</sup>, Sibo Ju<sup>3</sup>, Yanzhou Su<sup>4</sup>, He Yao<sup>1</sup>, Yisong Liu<sup>1</sup>, Min Zhu<sup>1†</sup>, Junlong Cheng<sup>1†</sup>**

<sup>1</sup>Sichuan University    <sup>2</sup>Xinjiang University    <sup>3</sup>Fuzhou University    <sup>4</sup>Alibaba DAMO Academy

**CVPR 2026 (Oral)**

[![Paper](https://img.shields.io/badge/Paper-arXiv-b31b1b.svg)](https://arxiv.org/abs/2602.19213) [![Code](https://img.shields.io/badge/Code-PyTorch-blue.svg)](#installation) [![Model](https://img.shields.io/badge/Model-Download-orange.svg)](#checkpoint)

## Abstract

Medical image segmentation requires robust adaptation across heterogeneous modalities and anatomical structures, while pixel-level annotation remains expensive. SegMoTE is an efficient adaptation framework built on the Segment Anything Model (SAM). It introduces a **token-level Mixture of Experts (MoTE)** mechanism that dynamically selects modality-adaptive expert tokens, and a **Progressive Prompt Tokenization (PPT)** module that learns feature-conditioned prompts for prompt-free segmentation on suitable foreground-background tasks. Trained on the curated **MedSeg-HQ** dataset, SegMoTE aims to retain the flexible prompt interface and generalization ability of SAM while providing lightweight adaptation for multimodal medical image segmentation.

## Citation

The BibTeX entry will be updated after the public paper record is available:

```bibtex
@article{lu2026segmote,
  title={SegMoTE: Token-Level Mixture of Experts for Medical Image Segmentation},
  author={Lu, Yujie and Li, Jingwen and Ju, Sibo and Su, Yanzhou and Liu, Yisong and Zhu, Min and Cheng, Junlong and others},
  journal={arXiv preprint arXiv:2602.19213},
  year={2026}
}
```