Improve model card: Add pipeline tag, license, and update content (#1)


- Improve model card: Add pipeline tag, license, and update content (0977f68f19054421eade42b44f195cd6245794d5)

Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)

README.md +57 -70

README.md CHANGED

@@ -1,100 +1,87 @@
-# Medal-S: Spatio-Textual Prompt Model for Medical Segmentation
 [![Paper](https://img.shields.io/badge/Paper-Arxiv-b31b1b.svg)](https://arxiv.org/abs/2511.13001)
 [![OpenReview](https://img.shields.io/badge/OpenReview-Discussion-4CAF50.svg)](https://openreview.net/forum?id=9vCx66pnLn#discussion)
 [![Docker](https://img.shields.io/badge/Docker-Image-2496ED.svg)](https://huggingface.co/spc819/Medal-S-V1.0/resolve/main/teamx.tar.gz)
-Official repository for **Medal-S**, a spatio-textual prompt model for medical image segmentation, developed for the CVPR 2025 Foundation Models for Text-Guided 3D Biomedical Image Segmentation challenge.
-## Paper
-**Medal-S: Spatio-Textual Prompt Model for Medical Segmentation**
-*CVPR 2025 Workshop MedSegFM*
-[arXiv Paper](https://arxiv.org/abs/2511.13001) | [OpenReview Discussion](https://openreview.net/forum?id=9vCx66pnLn#discussion)
-## Quick Start
-### Docker Image
-Download the pre-built Docker image for testing submission (2025/05/30):
-```bash
-# Download from Hugging Face
-wget https://huggingface.co/spc819/Medal-S-V1.0/resolve/main/teamx.tar.gz
-```
-### Installation
-1. **Install nnU-Net v2.4.1:**
 ```bash
 wget https://github.com/MIC-DKFZ/nnUNet/archive/refs/tags/v2.4.1.tar.gz
 tar -xvf v2.4.1.tar.gz
 pip install -e nnUNet-2.4.1
-```
-2. **Install customized dynamic-network-architectures:**
-```bash
 cd model
 pip install -e dynamic-network-architectures-main
-```
-### Requirements
-- **Python:** 3.10.16
-- **Key Packages:**
-  ```
-  torch==2.2.0
-  transformers==4.51.3
-  monai==1.4.0
-  nibabel==5.3.2
-  tensorboard
-  einops
-  positional_encodings
-  scipy
-  pandas
-  scikit-learn
-  scikit-image
-  batchgenerators
-  acvl_utils
-  ```
-## Dataset
-The model is trained on the [CVPR-BiomedSegFM](https://huggingface.co/datasets/junma/CVPR-BiomedSegFM) dataset available on Hugging Face:
-```python
-from datasets import load_dataset
-dataset = load_dataset("junma/CVPR-BiomedSegFM")
 ```
-## Training
-1. **Data Preparation:** Preprocess training data using `data/challenge_data/get_train_jsonl.py` to generate `train_all.jsonl`
-2. **Knowledge Enhancement:** Use pre-trained text encoder from [SAT](https://github.com/zhaoziheng/SAT/tree/cvpr2025challenge) available on [Hugging Face](https://huggingface.co/zzh99/SAT/tree/main/Pretrain)
-3. **Segmentation Training:** Run the training script:
-```bash
-sh/cvpr2025_Blosc2_pretrain_1.0_1.0_1.0_UNET_ps192.sh
-```
-**Training Requirements:**
-- **224×224×128 (1.5,1.5,3.0) spacing:** 2× H100-80GB GPUs, ~7 days, batch size 2/GPU
-- **192×192×192 (1.0,1.0,1.0) spacing:** 4× H100-80GB GPUs, batch size 2/GPU
-## Inference
-Run inference on test data:
 ```bash
 python inference.py
 ```
-## Acknowledgements
-This project builds upon and significantly improves:
-- **[nnU-Net](https://github.com/MIC-DKFZ/nnUNet/tree/master)**
-- **[SAT](https://github.com/zhaoziheng/SAT/tree/cvpr2025challenge)**
-## Maintainers
-**Medal-S** is developed and maintained by **Medical Image Insights**.

+---
+pipeline_tag: image-segmentation
+license: apache-2.0
+---
+# Medal S: Spatio-Textual Prompt Model for Medical Segmentation
 [![Paper](https://img.shields.io/badge/Paper-Arxiv-b31b1b.svg)](https://arxiv.org/abs/2511.13001)
 [![OpenReview](https://img.shields.io/badge/OpenReview-Discussion-4CAF50.svg)](https://openreview.net/forum?id=9vCx66pnLn#discussion)
+[![HuggingFace](https://img.shields.io/badge/🤗%20HuggingFace-Model-yellow.svg)](https://huggingface.co/spc819/Medal-S-V1.0)
+[![GitHub](https://img.shields.io/badge/GitHub-Code-blue.svg?logo=github)](https://github.com/yinghemedical/Medal-S)
 [![Docker](https://img.shields.io/badge/Docker-Image-2496ED.svg)](https://huggingface.co/spc819/Medal-S-V1.0/resolve/main/teamx.tar.gz)
+This repository provides guidance for training and inference of Medal S within the [CVPR 2025: Foundation Models for Text-Guided 3D Biomedical Image Segmentation](https://www.codabench.org/competitions/5651/) challenge.
+Docker link for the 2025/05/30 testing submission: [Medal S](https://drive.google.com/file/d/1HRJqYUXajptGsKaXEhn-s3rGcnKIwGs7/view)
+## Requirements
+The U-Net implementation relies on nnU-Net v2.4.1 and on a customized version of [dynamic-network-architectures](https://github.com/MIC-DKFZ/dynamic-network-architectures) located in the `model` directory. Install both with:
 ```bash
+# Install nnU-Net v2.4.1:
 wget https://github.com/MIC-DKFZ/nnUNet/archive/refs/tags/v2.4.1.tar.gz
 tar -xvf v2.4.1.tar.gz
 pip install -e nnUNet-2.4.1
 cd model
 pip install -e dynamic-network-architectures-main
+```
+**Python Version:** 3.10.16
+**Key Python Packages:**
+```
+torch==2.2.0
+transformers==4.51.3
+monai==1.4.0
+nibabel==5.3.2
+tensorboard
+einops
+positional_encodings
+scipy
+pandas
+scikit-learn
+scikit-image
+batchgenerators
+acvl_utils
 ```
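A quick way to confirm that an environment matches the pinned versions above is to query the installed package metadata. The following is a minimal sketch, not part of the Medal S repository: the helper names `installed_version` and `check_pins` are hypothetical, and it assumes the distribution names on PyPI match the names in the list.

```python
from importlib import metadata

# Pinned versions copied from the requirements list above.
PINNED = {
    "torch": "2.2.0",
    "transformers": "4.51.3",
    "monai": "1.4.0",
    "nibabel": "5.3.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins):
    """Map each package to (expected, found, ok) without importing it."""
    report = {}
    for package, expected in pins.items():
        found = installed_version(package)
        # Local build tags such as "2.2.0+cu121" still count as a match.
        ok = found is not None and found.split("+")[0] == expected
        report[package] = (expected, found, ok)
    return report


if __name__ == "__main__":
    for package, (expected, found, ok) in check_pins(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{package}: expected {expected}, found {found} [{status}]")
```

Reading metadata instead of importing the packages keeps the check fast and avoids initializing CUDA just to read a version string.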
+## Training Guidance
+First, download the dataset from [Hugging Face: junma/CVPR-BiomedSegFM](https://huggingface.co/datasets/junma/CVPR-BiomedSegFM).
+*   **Data Preparation**: Preprocess and organize all training data into a `train_all.jsonl` file using the provided script: `data/challenge_data/get_train_jsonl.py`.
+*   **Knowledge Enhancement**: Either use the pre-trained text encoder from [SAT](https://github.com/zhaoziheng/SAT/tree/cvpr2025challenge), available on [Hugging Face](https://huggingface.co/zzh99/SAT/tree/main/Pretrain), or pre-train it yourself following the guidance in the [SAT-Pretrain repository](https://github.com/zhaoziheng/SAT-Pretrain/tree/master). As recommended by SAT, we **freeze** the text encoder when training the segmentation model.
+*   **Segmentation**: The training script is located at `sh/cvpr2025_Blosc2_pretrain_1.0_1.0_1.0_UNET_ps192.sh`. Before training starts, the NPZ files are converted to the Blosc2 compressed format used by the nnU-Net framework.
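Before launching a multi-day run, it can help to confirm that the generated `train_all.jsonl` parses cleanly. The sketch below is a hypothetical helper, not part of the repository. It assumes only that the file follows the JSON Lines convention with one JSON object per line, and makes no assumption about the fields that `get_train_jsonl.py` writes.

```python
import json


def count_jsonl_records(path):
    """Count records in a JSON Lines file, failing loudly on a malformed line."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({error})"
                ) from error
            # Assumption: every record is a JSON object, not a bare list or scalar.
            if not isinstance(record, dict):
                raise ValueError(f"{path}:{line_number}: expected a JSON object")
            count += 1
    return count
```

For example, `count_jsonl_records("train_all.jsonl")` returns the number of training records, or raises a `ValueError` naming the first offending line.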
+Training a model with a 224x224x128 patch size at (1.5, 1.5, 3.0) spacing takes approximately 7 days on 2x H100-80GB GPUs with a batch size of 2 per GPU. A model with a 192x192x192 patch size at (1.0, 1.0, 1.0) spacing requires 4x H100-80GB GPUs, also with a batch size of 2 per GPU. To train on GPUs with less memory, you may reduce the patch size and batch size.
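To compare the two configurations, or to estimate the effect of a smaller patch, the voxel counts can be worked out directly. This is an illustrative sketch under two assumptions not stated in the source: that the triples 224x224x128 and 192x192x192 are patch sizes in voxels, and that training is data-parallel, so the global batch size is the number of GPUs times the per-GPU batch size.

```python
from math import prod

# Configurations as stated in the text: (patch size, GPUs, batch size per GPU).
CONFIGS = {
    "224x224x128 @ (1.5, 1.5, 3.0)": ((224, 224, 128), 2, 2),
    "192x192x192 @ (1.0, 1.0, 1.0)": ((192, 192, 192), 4, 2),
}


def summarize(patch_size, num_gpus, batch_per_gpu):
    """Return voxels per patch, global batch size, and voxels per training step."""
    voxels = prod(patch_size)
    global_batch = num_gpus * batch_per_gpu  # assumes data-parallel training
    return voxels, global_batch, voxels * global_batch


for name, config in CONFIGS.items():
    voxels, global_batch, per_step = summarize(*config)
    print(f"{name}: {voxels:,} voxels/patch, "
          f"global batch {global_batch}, {per_step:,} voxels/step")
```

Under these assumptions the 192x192x192 patch holds about 10% more voxels than the 224x224x128 patch (7,077,888 versus 6,422,528), so activation memory per sample should be in the same range for both. The larger hardware requirement comes mainly from doubling the GPU count.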
+## Inference Guidance
+We provide inference code for test data:
 ```bash
 python inference.py
 ```
+## Citation
+```
+@misc{shi2025medalsspatiotextualprompt,
+      title={Medal S: Spatio-Textual Prompt Model for Medical Segmentation},
+      author={Pengcheng Shi and Jiawei Chen and Jiaqi Liu and Xinglin Zhang and Tao Chen and Lei Li},
+      year={2025},
+      eprint={2511.13001},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2511.13001},
+}
+```
+## Acknowledgements
+This project builds on and significantly extends [nnU-Net](https://github.com/MIC-DKFZ/nnUNet/tree/master) and [SAT](https://github.com/zhaoziheng/SAT/tree/cvpr2025challenge). We thank the authors of both projects.
+Medal S is developed and maintained by Medical Image Insights.
+<img src="https://github.com/yinghemedical/Medal-S/raw/main/assets/yh_logo.png" height="100px" />