Tang-xiaoxiao committed on
Commit 102aee2 · verified · 1 Parent(s): aa25112

Update README.md

Files changed (1):
  1. README.md +75 -10
README.md CHANGED
@@ -1,23 +1,88 @@
  ---
  license: apache-2.0
  ---
- # M3D-RAD Model
- The official Model for the paper "3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks".

- In our project, we collect a large-scale dataset designed to advance 3D Med-VQA using radiology CT scans, 3D-RAD, encompasses six diverse VQA tasks: anomaly detection (task 1), image observation (task 2), medical computation (task 3), existence detection (task 4), static temporal diagnosis (task 5), and longitudinal temporal diagnosis (task 6).

- ![Main Figure](https://github.com/Tang-xiaoxiao/M3D-RAD/blob/main/Figures/main.png?raw=true)

- ## Code
- You can find our code in [M3D-RAD_Code](https://github.com/Tang-xiaoxiao/M3D-RAD).

- ## 3D-RAD Dataset
- You can find our dataset in [3D-RAD_Dataset](https://huggingface.co/datasets/Tang-xiaoxiao/3D-RAD).

- ## Model Links

  | Model | Paper |
  | ----- | ------------------------------------------------------------ |
  | [RadFM](https://github.com/chaoyi-wu/RadFM) | Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data |
  | [M3D](https://github.com/BAAI-DCAI/M3D) | M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models |
- | OmniV (not open) | OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding |

  ---
  license: apache-2.0
  ---
+ # [ 🎯 NeurIPS 2025 ] 3D-RAD 🩻: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
+ <div align="center">
+ <a href="https://github.com/Tang-xiaoxiao/3D-RAD/stargazers">
+ <img src="https://img.shields.io/github/stars/Tang-xiaoxiao/3D-RAD?style=social" />
+ </a>
+ <a href="https://arxiv.org/abs/2506.11147">
+ <img src="https://img.shields.io/badge/arXiv-Paper-b31b1b.svg?logo=arxiv" />
+ </a>
+ <a href="https://GitHub.com/Naereen/StrapDown.js/graphs/commit-activity">
+ <img src="https://img.shields.io/badge/Maintained%3F-yes-green.svg" />
+ </a>
+ </div>
+ ## 📢 News

+ <details open>
+ <summary><strong>What's New in This Update 🚀</strong></summary>
+
+ - **2025.10.23**: 🔥 Updated **the latest version** of the paper!
+ - **2025.09.19**: 🔥 Paper accepted to **NeurIPS 2025**! 🎯
+ - **2025.05.16**: 🔥 Set up the repository and committed the dataset!
+
+ </details>

+ ## 🔍 Overview
+ 💡 In this repository, we present the model for **["3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks"](https://arxiv.org/pdf/2506.11147)**.
+
+ In our project, we collect 3D-RAD, a large-scale dataset designed to advance 3D Med-VQA using radiology CT scans. It encompasses six diverse VQA tasks: **Anomaly Detection** (task 1), **Image Observation** (task 2), **Medical Computation** (task 3), **Existence Detection** (task 4), **Static Temporal Diagnosis** (task 5), and **Longitudinal Temporal Diagnosis** (task 6).
+
+ ![overview](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/overview.png?raw=true)
+ ![main](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/main.png?raw=true)
+
+ ## 🤖 M3D-RAD Model
+ To assess the utility of 3D-RAD, we **fine-tuned two M3D model variants** with different parameter scales, thereby constructing the M3D-RAD models. You can find our fine-tuned models in [M3D-RAD_Models](https://huggingface.co/Tang-xiaoxiao/M3D-RAD).
+
+ ![finetuned](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/finetuned.png?raw=true)
+
+ ## 📈 Evaluation
+
+ ### Zero-Shot Evaluation
+ We conducted a **zero-shot evaluation** of several state-of-the-art 3D medical vision-language models on our benchmark to assess their generalization capabilities.
+
+ ![zeroshot](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/zeroshot.png?raw=true)
+
+ The `RadFM` and `M3D` directories contain the code for evaluating the RadFM and M3D models on our 3D-RAD benchmark. Our evaluation builds on the official [RadFM](https://github.com/chaoyi-wu/RadFM) and [M3D](https://github.com/BAAI-DCAI/M3D) codebases; to run our evaluation, first install the requirements and download the models as described in those base repositories.
+
+ Compared to the base code, we make the following modifications: in the `RadFM` directory, we add a new dataset class in `RadFM/src/Dataset/dataset/rad_dataset.py`, modify the test dataset in `RadFM/src/Dataset/multi_dataset_test.py`, and add a new Python file, `RadFM/src/eval_3DRAD.py`, to evaluate our benchmark. In the `M3D` directory, we add a new dataset class in `M3D/Bench/dataset/multi_dataset.py` and a new evaluation file, `M3D/Bench/eval/eval_3DRAD.py`.
+
+ You can evaluate RadFM on our 3D-RAD benchmark by running:
+
+ ```shell
+ cd 3D-RAD/RadFM/src
+ python eval_3DRAD.py \
+ --file_path={your test file_path} \
+ --output_path={your saved output_path}
+ ```
+
+ You can evaluate M3D on our 3D-RAD benchmark by running:
+
+ ```shell
+ cd 3D-RAD/M3D
+ python Bench/eval/eval_3DRAD.py \
+ --model_name_or_path={your model_name} \
+ --vqa_data_test_path={your test file_path} \
+ --output_dir={your saved output_dir}
+ ```
+
+ ### Scaling with Varying Training Set Sizes
+ To further investigate the impact of dataset scale on model performance, we randomly **sampled 1%, 10%, and 100%** of the training data per task and fine-tuned M3D accordingly.
+
+ ![varysizes](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/varysizes.png?raw=true)
+
+ ## 📊 3D-RAD Dataset
+ The `3DRAD` directory contains the QA data without 3D images.
+ You can find the full dataset with 3D images in [3D-RAD_Dataset](https://huggingface.co/datasets/Tang-xiaoxiao/3D-RAD); for efficient model input, the original CT images were preprocessed and converted to `.npy` format.
+
+ ![distribution](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/distribution.png?raw=true)
+ ![construction](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/Construction.png?raw=true)
+
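Since the preprocessed CT volumes are distributed as `.npy` arrays, they can be read directly with NumPy. A minimal sketch, where the array shape and dtype are illustrative assumptions rather than the dataset's actual specification:

```python
import numpy as np

# Illustrative only: build a CT-like volume with an assumed
# (depth, height, width) layout and float32 dtype, save it the way
# the preprocessed 3D-RAD volumes are stored, then reload it.
dummy_ct = np.random.rand(32, 256, 256).astype(np.float32)
np.save("dummy_ct.npy", dummy_ct)

volume = np.load("dummy_ct.npy")  # returns the ndarray exactly as saved
print(volume.shape, volume.dtype)  # (32, 256, 256) float32
```

For the real dataset, replace the dummy path with a volume file downloaded from the Hugging Face dataset page.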
+ ## 📁 Data Source
+ The original CT scans in our dataset are derived from [CT-RATE](https://huggingface.co/datasets/ibrahimhamamci/CT-RATE), which is released under a CC-BY-NC-SA license. We fully comply with the license terms by using the data for non-commercial academic research and providing proper attribution.
+
+ ## 🔗 Model Links

  | Model | Paper |
  | ----- | ------------------------------------------------------------ |
  | [RadFM](https://github.com/chaoyi-wu/RadFM) | Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data |
  | [M3D](https://github.com/BAAI-DCAI/M3D) | M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models |
+ | OmniV (not open) | OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding |