---
title: LoGoSAM_demo
app_file: app.py
sdk: gradio
sdk_version: 5.29.0
---
# ProtoSAM - One-shot segmentation with foundational models
Link to our paper [here](https://arxiv.org/abs/2407.07042). \
This work is the successor of [DINOv2-based-Self-Supervised-Learning](https://github.com/levayz/DINOv2-based-Self-Supervised-Learning) (link to [paper](https://arxiv.org/abs/2403.03273)).
## Demo Application
A Gradio-based demo application is now available for interactive inference with ProtoSAM. You can upload your own images and masks to test the model. See [README_DEMO.md](README_DEMO.md) for instructions on running the demo.
## Abstract
This work introduces ProtoSAM, a new framework for one-shot image segmentation. It combines DINOv2, a vision transformer that extracts features from images, with an Adaptive Local Prototype Pooling (ALP) layer, which generates prototypes from a support image and its mask. These prototypes are used to create an initial coarse segmentation mask by comparing the query image's features with the prototypes.
Following the extraction of an initial mask, we use numerical methods to generate prompts, such as points and bounding boxes, which are then fed into the Segment Anything Model (SAM), a prompt-based segmentation model trained on natural images. This allows new classes to be segmented automatically and effectively, without any additional training.
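As a minimal illustration of this prompt-generation step (a sketch, not the repository's implementation), the snippet below derives a bounding-box prompt and a point prompt from a coarse binary mask. It assumes the mask is a NumPy array and uses SciPy for connected-component analysis and distance transforms:

```python
import numpy as np
from scipy import ndimage

def prompts_from_coarse_mask(mask: np.ndarray):
    """Turn a coarse binary mask (H, W) into SAM-style prompts."""
    # Connected-component analysis: keep only the largest foreground blob.
    labels, num = ndimage.label(mask > 0)
    if num == 0:
        return None  # the coarse stage found nothing
    sizes = ndimage.sum(mask > 0, labels, range(1, num + 1))
    largest = labels == (np.argmax(sizes) + 1)

    # Bounding-box prompt in [x_min, y_min, x_max, y_max] order.
    ys, xs = np.nonzero(largest)
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])

    # Point prompt: the most interior foreground pixel
    # (maximum Euclidean distance from the background).
    dist = ndimage.distance_transform_edt(largest)
    y, x = np.unravel_index(np.argmax(dist), dist.shape)
    point_coords = np.array([[x, y]])  # (N, 2) in (x, y) order
    point_labels = np.array([1])       # 1 marks a foreground point

    return box, point_coords, point_labels
```

Outputs in this shape can be passed to SAM's predictor as its box and point inputs.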
## How To Run
### 1. Data Preprocessing
#### 1.1 CT and MRI Datasets
Please see the notebook `data/data_processing.ipynb` for instructions.
For convenience, I've compiled the data processing instructions from https://github.com/cheng-01037/Self-supervised-Fewshot-Medical-Image-Segmentation into a single notebook. \
The CT dataset is available here: https://www.synapse.org/Synapse:syn3553734 \
The MRI dataset is available here: https://chaos.grand-challenge.org
Run `./data/CHAOST2/dcm_img_to_nii.sh` to convert the DICOM images to NIfTI files.
#### 1.2 Polyp Dataset
The data is available here: https://www.kaggle.com/datasets/hngphmv/polypdataset?select=train.csv
Put the dataset in `data/PolypDataset/`.
### 2. Running
#### 2.1 (Optional) Training and Validation of the Coarse Segmentation Networks
```
./backbone.sh [MODE] [MODALITY] [LABEL_SET]
```
MODE - validation or training \
MODALITY - ct or mri \
LABEL_SET - 0 (kidneys), 1 (liver and spleen)
For example:
```
./backbone.sh training mri 1
```
Please refer to `backbone.sh` for further configurations.
#### 2.2 Running ProtoSAM
Put all SAM checkpoints, such as `sam_vit_b.pth`, `sam_vit_h.pth`, and `medsam_vit_b.pth`, into the `pretrained_model` directory. \
Checkpoints are available from [SAM](https://github.com/facebookresearch/segment-anything) and [MedSAM](https://github.com/bowang-lab/MedSAM).
```
./run_protosam.sh [MODALITY] [LABEL_SET]
```
MODALITY - ct, mri or polyp \
LABEL_SET (only relevant for ct or mri) - 0 (kidneys), 1 (liver and spleen)
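For example, to run on CT with the kidney label set:
```
./run_protosam.sh ct 0
```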
Please refer to the `run_protosam.sh` script for further configurations.
## Acknowledgements
This work is largely based on [ALPNet](https://github.com/cheng-01037/Self-supervised-Fewshot-Medical-Image-Segmentation), [DINOv2](https://github.com/facebookresearch/dinov2), and [SAM](https://github.com/facebookresearch/segment-anything), and is a continuation of [DINOv2-based-Self-Supervised-Learning](https://github.com/levayz/DINOv2-based-Self-Supervised-Learning).
## Cite
If you found this repo useful, please consider giving us a citation and a star!
```bibtex
@article{ayzenberg2024protosam,
  title={ProtoSAM - One Shot Medical Image Segmentation With Foundational Models},
  author={Ayzenberg, Lev and Giryes, Raja and Greenspan, Hayit},
  journal={arXiv preprint arXiv:2407.07042},
  year={2024}
}

@misc{ayzenberg2024dinov2,
  title={DINOv2 based Self Supervised Learning For Few Shot Medical Image Segmentation},
  author={Lev Ayzenberg and Raja Giryes and Hayit Greenspan},
  year={2024},
  eprint={2403.03273},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
# ProtoSAM Segmentation Demo
This Streamlit application demonstrates the capabilities of the ProtoSAM model for few-shot segmentation. Users can upload a query image, support image, and support mask to generate a segmentation prediction.
## Requirements
- Python 3.8 or higher
- CUDA-compatible GPU
- Required Python packages (see `requirements.txt`)
## Setup Instructions
1. Clone this repository:
   ```bash
   git clone <your-repository-url>
   cd <repository-name>
   ```
2. Create and activate a virtual environment (optional but recommended):
   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```
3. Install the required dependencies:
   ```bash
   pip install -r requirements.txt
   ```
4. Download the pretrained models:
   ```bash
   mkdir -p pretrained_model
   # Download the SAM ViT-H checkpoint
   wget -P pretrained_model https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
   mv pretrained_model/sam_vit_h_4b8939.pth pretrained_model/sam_vit_h.pth
   ```
5. Update the model path in `app.py`:
   - Set `reload_model_path` in the config dictionary to the path of your trained ProtoSAM model.
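For illustration, the edit might look like the sketch below; the checkpoint filename is a placeholder, so substitute the path to your own trained model:

```python
# In app.py -- illustrative snippet only; the config dictionary itself
# already exists in the app, only reload_model_path needs to change.
config = {
    "reload_model_path": "pretrained_model/protosam_checkpoint.pth",  # placeholder path
    # ... the remaining config keys stay as they are
}
```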
## Running the App
Start the Streamlit app with:
```bash
streamlit run app.py
```
This will open a browser window with the interface for the segmentation demo.
## Usage
1. Upload a query image (the image you want to segment)
2. Upload a support image (an example image with a similar object)
3. Upload a support mask (the segmentation mask for the support image)
4. Use the sidebar to configure the model parameters if needed
5. Click "Run Inference" to generate the segmentation result
## Model Configuration
The app allows you to configure several model parameters via the sidebar:
- Use Bounding Box: enable/disable bounding box input
- Use Points: enable/disable point input
- Use Mask: enable/disable mask input
- Use CCA: enable/disable Connected Component Analysis
- Coarse Prediction Only: use only the coarse segmentation model, without SAM refinement
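As a rough sketch of how these toggles might map onto an inference configuration (the key names below are illustrative assumptions, not the app's actual variable names; check `app.py` for the real keys):

```python
# Hypothetical option names -- see app.py for the actual configuration keys.
inference_options = {
    "use_bbox": True,           # pass a bounding-box prompt to SAM
    "use_points": True,         # pass point prompts to SAM
    "use_mask": False,          # pass the coarse mask as a mask prompt
    "use_cca": True,            # keep only the largest connected component
    "coarse_pred_only": False,  # return the coarse mask without SAM refinement
}
```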
## Notes
- This demo requires a GPU with CUDA support
- Large images may require more GPU memory
- For optimal results, use high-quality support images and masks