---
license: apache-2.0
---

# [ 🎯 NeurIPS 2025 ] 3D-RAD 🩻: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks

<div align="center">
  <a href="https://github.com/Tang-xiaoxiao/3D-RAD/stargazers">
    <img src="https://img.shields.io/github/stars/Tang-xiaoxiao/3D-RAD?style=social" />
  </a>
  <a href="https://arxiv.org/abs/2506.11147">
    <img src="https://img.shields.io/badge/arXiv-Paper-b31b1b.svg?logo=arxiv" />
  </a>
  <a href="https://GitHub.com/Naereen/StrapDown.js/graphs/commit-activity">
    <img src="https://img.shields.io/badge/Maintained%3F-yes-green.svg" />
  </a>
</div>

## 📢 News

<details open>
<summary><strong>What's New in This Update 🌟</strong></summary>

- **2025.10.23**: 🔥 Updated **the latest version** of the paper!
- **2025.09.19**: 🔥 Paper accepted to **NeurIPS 2025**! 🎯
- **2025.05.16**: 🔥 Set up the repository and committed the dataset!

</details>

## 📖 Overview

💡 In this repository, we present the dataset, models, and evaluation code for **["3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks"](https://arxiv.org/pdf/2506.11147)**.

In this project, we collect **3D-RAD**, a large-scale dataset designed to advance 3D Med-VQA using radiology CT scans. It encompasses six diverse VQA tasks: **Anomaly Detection** (task 1), **Image Observation** (task 2), **Medical Computation** (task 3), **Existence Detection** (task 4), **Static Temporal Diagnosis** (task 5), and **Longitudinal Temporal Diagnosis** (task 6).



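As a rough illustration of how the six task types partition the benchmark, the sketch below routes a VQA record by task. The field names and values are hypothetical assumptions for illustration only, not the dataset's actual schema:

```python
# Hypothetical sketch of a 3D-RAD-style VQA record; the field names and
# the file path are illustrative assumptions, NOT the dataset's real schema.
sample = {
    "image_path": "volumes/example_0001.npy",  # preprocessed CT volume (assumed path)
    "task": "Anomaly Detection",               # one of the six task types
    "question": "Is there an abnormality in the lung parenchyma?",
    "answer": "Yes",
}

def is_temporal(record: dict) -> bool:
    """Tasks 5 and 6 reason over time; the other four are single-scan tasks."""
    return "Temporal" in record["task"]

print(is_temporal(sample))  # -> False
```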

## 🤗 M3D-RAD Model

To assess the utility of 3D-RAD, we **fine-tuned two M3D model variants** with different parameter scales, thereby constructing the M3D-RAD models. You can find our fine-tuned models in [M3D-RAD_Models](https://huggingface.co/Tang-xiaoxiao/M3D-RAD).




## 📊 Evaluation

### Zero-Shot Evaluation

We conducted **zero-shot evaluation** of several state-of-the-art 3D medical vision-language models on our benchmark to assess their generalization capabilities.



The `RadFM` and `M3D` directories contain code for evaluating the RadFM and M3D models on our 3D-RAD benchmark. The base code comes from [RadFM](https://github.com/chaoyi-wu/RadFM) and [M3D](https://github.com/BAAI-DCAI/M3D), respectively. To run our evaluation, you should first satisfy the requirements and download the models according to the base code of these models.

Compared to the base code, we made the following modifications: in the `RadFM` directory, we added a new dataset class in `RadFM/src/Dataset/dataset/rad_dataset.py`, modified the test dataset in `RadFM/src/Dataset/multi_dataset_test.py`, and added a new Python script, `RadFM/src/eval_3DRAD.py`, to evaluate our benchmark. In the `M3D` directory, we added a new dataset class in `M3D/Bench/dataset/multi_dataset.py` and a new Python script, `M3D/Bench/eval/eval_3DRAD.py`, to evaluate our benchmark.

You can evaluate RadFM on our 3D-RAD benchmark by running:

```bash
cd 3D-RAD/RadFM/src
python eval_3DRAD.py \
    --file_path={your test file_path} \
    --output_path={your saved output_path}
```

You can evaluate M3D on our 3D-RAD benchmark by running:

```bash
cd 3D-RAD/M3D
python Bench/eval/eval_3DRAD.py \
    --model_name_or_path={your model_name} \
    --vqa_data_test_path={your test file_path} \
    --output_dir={your saved output_dir}
```

### Scaling with Varying Training Set Sizes

To further investigate the impact of dataset scale on model performance, we randomly **sampled 1%, 10%, and 100%** of the training data per task and fine-tuned M3D accordingly.



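The per-task subsampling above amounts to drawing a fixed random fraction of each task's training records. A minimal sketch, assuming the training set can be held as a list of records (the record layout here is a toy stand-in, not the repository's actual format):

```python
import random

def sample_fraction(records, fraction, seed=42):
    """Randomly keep `fraction` of the training records (e.g. 0.01, 0.1, 1.0).

    A fixed seed makes the subset reproducible across runs.
    """
    rng = random.Random(seed)
    k = max(1, int(len(records) * fraction))
    return rng.sample(records, k)

# Toy records standing in for one task's training set.
train = [{"id": i} for i in range(1000)]
subset_10 = sample_fraction(train, 0.10)
print(len(subset_10))  # -> 100
```

Seeding per task keeps the 1% subset reproducible even for the smallest split.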

## 📚 3D-RAD Dataset

The `3DRAD` directory contains the QA data without 3D images.
You can find the full dataset with 3D images in [3D-RAD_Dataset](https://huggingface.co/datasets/Tang-xiaoxiao/3D-RAD). For efficient model input, the original CT images were preprocessed and converted into `.npy` format.




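Because the released volumes are stored as `.npy` arrays, they can be loaded directly with NumPy. A minimal sketch, where the file name and the volume shape are placeholders (the actual 3D-RAD shapes may differ):

```python
import numpy as np

# Create a stand-in volume shaped the way a preprocessed CT scan might be
# (depth x height x width); this is an assumed shape, not the dataset's.
volume = np.random.rand(32, 256, 256).astype(np.float32)
np.save("example_volume.npy", volume)

# Loading a released .npy volume for model input.
loaded = np.load("example_volume.npy")
print(loaded.shape, loaded.dtype)  # -> (32, 256, 256) float32
```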

## 🌐 Data Source

The original CT scans in our dataset are derived from [CT-RATE](https://huggingface.co/datasets/ibrahimhamamci/CT-RATE), which is released under a CC-BY-NC-SA license. We fully comply with the license terms by using the data for non-commercial academic research and providing proper attribution.

## 🔗 Model Links

| Model | Paper |
| ----- | ------------------------------------------------------------ |
| [RadFM](https://github.com/chaoyi-wu/RadFM) | Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data |
| [M3D](https://github.com/BAAI-DCAI/M3D) | M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models |
| OmniV (not open) | OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding |