Tang-xiaoxiao committed on
Commit 102aee2 · verified · 1 Parent(s): aa25112

Update README.md

Files changed (1):
  1. README.md +75 -10
README.md CHANGED
@@ -1,23 +1,88 @@
  ---
  license: apache-2.0
  ---
- # M3D-RAD Model
- The official Model for the paper "3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks".

- In our project, we collect a large-scale dataset designed to advance 3D Med-VQA using radiology CT scans, 3D-RAD, encompasses six diverse VQA tasks: anomaly detection (task 1), image observation (task 2), medical computation (task 3), existence detection (task 4), static temporal diagnosis (task 5), and longitudinal temporal diagnosis (task 6).

- ![Main Figure](https://github.com/Tang-xiaoxiao/M3D-RAD/blob/main/Figures/main.png?raw=true)

- ## Code
- You can find our code in [M3D-RAD_Code](https://github.com/Tang-xiaoxiao/M3D-RAD).

- ## 3D-RAD Dataset
- You can find our dataset in [3D-RAD_Dataset](https://huggingface.co/datasets/Tang-xiaoxiao/3D-RAD).

- ## Model Links

  | Model | Paper |
  | ----- | ------------------------------------------------------------ |
  | [RadFM](https://github.com/chaoyi-wu/RadFM) | Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data |
  | [M3D](https://github.com/BAAI-DCAI/M3D) | M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models |
- | OmniV (not open) | OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding |

  ---
  license: apache-2.0
  ---
+ # [ 🎯 NeurIPS 2025 ] 3D-RAD 🩻: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
+ <div align="center">
+ <a href="https://github.com/Tang-xiaoxiao/3D-RAD/stargazers">
+ <img src="https://img.shields.io/github/stars/Tang-xiaoxiao/3D-RAD?style=social" />
+ </a>
+ <a href="https://arxiv.org/abs/2506.11147">
+ <img src="https://img.shields.io/badge/arXiv-Paper-b31b1b.svg?logo=arxiv" />
+ </a>
+ <a href="https://GitHub.com/Naereen/StrapDown.js/graphs/commit-activity">
+ <img src="https://img.shields.io/badge/Maintained%3F-yes-green.svg" />
+ </a>
+ </div>
+ ## 📢 News

+ <details open>
+ <summary><strong>What's New in This Update 🚀</strong></summary>
+
+ - **2025.10.23**: 🔥 Updated **the latest version** of the paper!
+ - **2025.09.19**: 🔥 Paper accepted to **NeurIPS 2025**! 🎯
+ - **2025.05.16**: 🔥 Set up the repository and committed the dataset!
+
+ </details>

+ ## 🔍 Overview
+ 💡 In this repository, we present the model for **["3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks"](https://arxiv.org/pdf/2506.11147)**.
+
+ In our project, we collect 3D-RAD, a large-scale dataset designed to advance 3D Med-VQA using radiology CT scans. It encompasses six diverse VQA tasks: **Anomaly Detection** (task 1), **Image Observation** (task 2), **Medical Computation** (task 3), **Existence Detection** (task 4), **Static Temporal Diagnosis** (task 5), and **Longitudinal Temporal Diagnosis** (task 6).
+
+ ![overview](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/overview.png?raw=true)
+ ![main](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/main.png?raw=true)
+
+ ## 🤖 M3D-RAD Model
+ To assess the utility of 3D-RAD, we **fine-tuned two M3D model variants** with different parameter scales, thereby constructing the M3D-RAD models. You can find our fine-tuned models in [M3D-RAD_Models](https://huggingface.co/Tang-xiaoxiao/M3D-RAD).
+
+ ![finetuned](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/finetuned.png?raw=true)
+
+ ## 📈 Evaluation
+
+ ### Zero-Shot Evaluation
+ We conducted a **zero-shot evaluation** of several state-of-the-art 3D medical vision-language models on our benchmark to assess their generalization capabilities.
+
+ ![zeroshot](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/zeroshot.png?raw=true)
+
+ The `RadFM` and `M3D` directories contain the code for evaluating the RadFM and M3D models on our 3D-RAD benchmark. Our evaluation builds on the official [RadFM](https://github.com/chaoyi-wu/RadFM) and [M3D](https://github.com/BAAI-DCAI/M3D) codebases; to run our evaluation, first install the requirements and download the models as described in those base repositories.
+
+ Compared to the base code, we make the following modifications: in the `RadFM` directory, we add a new dataset class in `RadFM/src/Dataset/dataset/rad_dataset.py`, modify the test dataset in `RadFM/src/Dataset/multi_dataset_test.py`, and add a new Python file, `RadFM/src/eval_3DRAD.py`, to evaluate our benchmark. In the `M3D` directory, we add a new dataset class in `M3D/Bench/dataset/multi_dataset.py` and a new evaluation file, `M3D/Bench/eval/eval_3DRAD.py`.
+
+ You can evaluate RadFM on our 3D-RAD benchmark by running:
+
+ ```shell
+ cd 3D-RAD/RadFM/src
+ python eval_3DRAD.py \
+ --file_path={your test file_path} \
+ --output_path={your saved output_path}
+ ```
+
+ You can evaluate M3D on our 3D-RAD benchmark by running:
+
+ ```shell
+ cd 3D-RAD/M3D
+ python Bench/eval/eval_3DRAD.py \
+ --model_name_or_path={your model_name} \
+ --vqa_data_test_path={your test file_path} \
+ --output_dir={your saved output_dir}
+ ```
+
+ ### Scaling with Varying Training Set Sizes
+ To further investigate the impact of dataset scale on model performance, we randomly **sampled 1%, 10%, and 100%** of the training data per task and fine-tuned M3D accordingly.
+
+ ![varysizes](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/varysizes.png?raw=true)
+
+ ## 📊 3D-RAD Dataset
+ The `3DRAD` directory contains the QA data without 3D images.
+ You can find the full dataset with 3D images in [3D-RAD_Dataset](https://huggingface.co/datasets/Tang-xiaoxiao/3D-RAD); for efficient model input, the original CT images were preprocessed and converted to `.npy` format.
+
+ ![distribution](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/distribution.png?raw=true)
+ ![construction](https://github.com/Tang-xiaoxiao/3D-RAD/blob/main/Figures/Construction.png?raw=true)
+
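Since the preprocessed CT volumes are distributed as `.npy` arrays, they can be read directly with NumPy. A minimal sketch, where the array shape and dtype are illustrative assumptions rather than the dataset's actual specification:

```python
import numpy as np

# Illustrative only: build a CT-like volume with an assumed
# (depth, height, width) layout and float32 dtype, save it the way
# the preprocessed 3D-RAD volumes are stored, then reload it.
dummy_ct = np.random.rand(32, 256, 256).astype(np.float32)
np.save("dummy_ct.npy", dummy_ct)

volume = np.load("dummy_ct.npy")  # returns the ndarray exactly as saved
print(volume.shape, volume.dtype)  # (32, 256, 256) float32
```

For the real dataset, replace the dummy path with a volume file downloaded from the Hugging Face dataset page.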
+ ## 📁 Data Source
+ The original CT scans in our dataset are derived from [CT-RATE](https://huggingface.co/datasets/ibrahimhamamci/CT-RATE), which is released under a CC-BY-NC-SA license. We fully comply with the license terms by using the data for non-commercial academic research and providing proper attribution.
+
+ ## 🔗 Model Links

  | Model | Paper |
  | ----- | ------------------------------------------------------------ |
  | [RadFM](https://github.com/chaoyi-wu/RadFM) | Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data |
  | [M3D](https://github.com/BAAI-DCAI/M3D) | M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models |
+ | OmniV (not open) | OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding |