---
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
---
This repository contains the official implementation of the paper: [Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis](https://huggingface.co/papers/2503.20047).
Med3DVLM is a 3D vision-language model (VLM) for 3D medical image analysis. It combines efficient encoding, image-text alignment trained with a pairwise sigmoid loss, and a dual-stream MLP-Mixer projector for richer multi-modal representations. The paper reports superior performance across multiple benchmarks, including image-text retrieval, report generation, and open- and closed-ended visual question answering.
Code: https://github.com/mirthAI/Med3DVLM

## Installation
First, clone the repository to your local machine:
```bash
git clone https://github.com/mirthAI/Med3DVLM.git
cd Med3DVLM
```
To install the required packages, create and activate the conda environment:
```bash
conda env create -n Med3DVLM -f env.yaml
conda activate Med3DVLM
```
Alternatively, install the dependencies with pip:
```bash
pip install -r requirements.txt
```
Set the `PYTHONPATH` environment variable to the project root by running the following command from the repository directory:
```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
```
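The export above applies only to the current shell session, and it leaves a trailing colon in `PYTHONPATH` when the variable starts out empty. As a minimal sketch, assuming a bash-compatible shell, the helper below prepends a directory without that trailing colon. The function name `add_repo_to_pythonpath` is hypothetical and is not part of the repository.

```bash
# Hypothetical helper, not part of Med3DVLM: prepend a directory to PYTHONPATH.
add_repo_to_pythonpath() {
    # Default to the current directory, which should be the repository root.
    local repo_root="${1:-$(pwd)}"
    # ${PYTHONPATH:+...} expands to ":$PYTHONPATH" only when PYTHONPATH is non-empty.
    export PYTHONPATH="${repo_root}${PYTHONPATH:+:$PYTHONPATH}"
}

add_repo_to_pythonpath   # run from the Med3DVLM root
echo "$PYTHONPATH"       # inspect the result
```

To make the setting persistent, the same function and call could be placed in your shell startup file with the repository path passed explicitly as the first argument.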
## Sample Usage
To run a demo in the terminal, use the following command (replace `path_to_model` and `path_to_image` with your actual paths):
```bash
python src/demo/demo.py --model_name_or_path path_to_model --image_path path_to_image --question "Describe the findings of the medical image you see."
```
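To ask the same question about several volumes, the command above can be wrapped in a loop. The following is only a sketch: the function name `run_demo_batch`, the `DRY_RUN` variable, and the `*.nii.gz` file pattern are assumptions for illustration and are not part of the repository. Only the three flags come from the command above, and the demo script path is passed in as an argument so it can match your checkout.

```bash
# Hypothetical wrapper (not part of Med3DVLM): run the demo once per image file.
# Usage: run_demo_batch <demo_script> <model_path> <image_dir> [question]
run_demo_batch() {
    local demo_script="$1" model_path="$2" image_dir="$3"
    local question="${4:-Describe the findings of the medical image you see.}"
    local image
    # The *.nii.gz pattern is an assumption; change it to match your data.
    for image in "$image_dir"/*.nii.gz; do
        # Skip the literal pattern when the directory has no matching files.
        [ -e "$image" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of running it.
            echo python "$demo_script" --model_name_or_path "$model_path" \
                --image_path "$image" --question "\"$question\""
        else
            python "$demo_script" --model_name_or_path "$model_path" \
                --image_path "$image" --question "$question"
        fi
    done
}
```

Setting `DRY_RUN=1` prints each command instead of running it, which is a cheap way to check the paths before loading the model. Note that each iteration starts a new Python process and therefore reloads the model.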
## Citation
If you use our code or find our work helpful, please consider citing our paper:
```bibtex
@article{xin2025med3dvlm,
title={Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis},
author={Xin, Yu and Ates, Gorkem Can and Gong, Kuang and Shao, Wei},
journal={IEEE Journal of Biomedical and Health Informatics},
year={2025}
}
```