# SegFormer3D

SegFormer3D is a novel, efficient transformer-based architecture designed specifically for 3D medical image segmentation. It extends the successful 2D SegFormer architecture to volumetric medical data while maintaining computational efficiency and strong segmentation performance.

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/segformer3d-an-efficient-transformer-for-3d/medical-image-segmentation-on-acdc)](https://paperswithcode.com/sota/medical-image-segmentation-on-acdc?p=segformer3d-an-efficient-transformer-for-3d)

## Model Description

SegFormer3D introduces several key innovations for efficient 3D medical image segmentation:

- **Hierarchical 3D Feature Learning**: Uses a multi-scale transformer encoder with progressively reduced sequence lengths to process volumetric data efficiently
- **Efficient Self-Attention**: Implements a spatially-reduced attention mechanism adapted for 3D, cutting computational complexity while maintaining performance
- **All-MLP 3D Decoder**: A lightweight decoder that fuses multi-scale features through simple MLP layers
- **Memory-Efficient Design**: An optimized architecture that processes full 3D volumes without patch-based inference

The model achieves state-of-the-art performance on multiple 3D medical segmentation benchmarks while remaining computationally efficient.
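The spatially-reduced self-attention can be sketched roughly as follows. This is a minimal PyTorch illustration of the idea on a 3D token grid, not the released implementation; the class name, argument names, and the strided-convolution reduction are assumptions:

```python
import torch
import torch.nn as nn

class SpatiallyReducedAttention3D(nn.Module):
    """Self-attention whose keys/values come from a spatially downsampled
    copy of the 3D token grid, shrinking the K/V sequence by sr_ratio**3."""

    def __init__(self, dim: int, num_heads: int, sr_ratio: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.sr_ratio = sr_ratio
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        if sr_ratio > 1:
            # Strided 3D conv shrinks (D, H, W) before computing keys/values
            self.sr = nn.Conv3d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
            self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, d: int, h: int, w: int) -> torch.Tensor:
        b, n, c = x.shape  # n == d * h * w
        q = self.q(x).reshape(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        if self.sr_ratio > 1:
            x_ = x.transpose(1, 2).reshape(b, c, d, h, w)
            x_ = self.sr(x_).flatten(2).transpose(1, 2)  # shorter sequence
            x_ = self.norm(x_)
        else:
            x_ = x
        kv = self.kv(x_).reshape(b, -1, 2, self.num_heads, self.head_dim)
        k, v = kv.permute(2, 0, 3, 1, 4).unbind(0)
        attn = (q @ k.transpose(-2, -1)) * self.head_dim ** -0.5
        out = attn.softmax(dim=-1) @ v  # (b, heads, n, head_dim)
        return self.proj(out.transpose(1, 2).reshape(b, n, c))

# An 8x8x8 token grid with sr_ratio=4: keys/values shrink from 512 to 8 tokens
attn = SpatiallyReducedAttention3D(dim=32, num_heads=1, sr_ratio=4)
tokens = torch.randn(2, 8 * 8 * 8, 32)
out = attn(tokens, 8, 8, 8)  # same shape as the input tokens
```

Because the attention matrix is `n × (n / sr³)` rather than `n × n`, early high-resolution stages with large `sr_ratio` stay tractable on full volumes.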
<p align="center">
  <img src="https://raw.githubusercontent.com/OSUPCVLab/SegFormer3D/main/resources/segformer_3D.png" alt="SegFormer3D Architecture" width="500"/>
</p>
## Training and Evaluation

The model was trained and evaluated on several medical imaging datasets:

### ACDC Dataset
- Task: Cardiac MRI segmentation
- Classes: Left ventricle, right ventricle, and myocardium
- Performance:
  - Dice score: 90.96%

<p align="center">
  <img src="https://raw.githubusercontent.com/OSUPCVLab/SegFormer3D/main/resources/acdc_segformer_3D.png" alt="ACDC Results" width="400"/>
</p>

### BraTS 2017
- Task: Brain tumor segmentation
- Classes: Enhancing tumor, tumor core, and whole tumor
- Performance:
  - Average Dice: 82.1%

<p align="center">
  <img src="https://raw.githubusercontent.com/OSUPCVLab/SegFormer3D/main/resources/brats_segformer_3D.png" alt="BraTS Results" width="400"/>
</p>

### Synapse Dataset
- Task: Multi-organ segmentation
- Classes: 8 abdominal organs
- Performance:
  - Average Dice: 82.15%

<p align="center">
  <img src="https://raw.githubusercontent.com/OSUPCVLab/SegFormer3D/main/resources/synapse_segformer_3D.png" alt="Synapse Results" width="400"/>
</p>
## Usage

```python
from transformers import SegFormer3DConfig, SegFormer3DModel
import torch

# Initialize the configuration
config = SegFormer3DConfig(
    in_channels=4,   # Number of input channels
    num_classes=3,   # Number of segmentation classes
    # Model architecture parameters
    embed_dims=[32, 64, 160, 256],
    num_heads=[1, 2, 5, 8],
    depths=[2, 2, 2, 2],
    sr_ratios=[4, 2, 1, 1]
)

# Initialize the model
model = SegFormer3DModel(config)

# Example forward pass
batch_size = 1
depth, height, width = 128, 128, 128  # Example input dimensions
x = torch.randn(batch_size, config.in_channels, depth, height, width)
outputs = model(x)

# Get segmentation logits
logits = outputs.logits  # Shape: (batch_size, num_classes, D, H, W)
```
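A common follow-up is reducing the logits to a discrete per-voxel label map with an argmax over the class axis. In this sketch, random logits stand in for the model output above:

```python
import torch

# Stand-in for outputs.logits: (batch_size, num_classes, D, H, W)
logits = torch.randn(1, 3, 128, 128, 128)

# Per-voxel class prediction
pred = logits.argmax(dim=1)  # (1, 128, 128, 128), integer class labels

# Per-voxel class probabilities, if needed
probs = torch.softmax(logits, dim=1)  # sums to 1 over the class axis
```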
## Limitations

- Input dimensions must be properly configured to ensure valid spatial dimensions after each stage
- Memory requirements increase with input volume size
- Performance may vary across medical imaging modalities
- Assumes consistent voxel spacing in input volumes
## Training Tips

1. **Input Preprocessing**:
   - Normalize input volumes to the [0, 1] range
   - Consider per-modality standardization
   - Use data augmentation appropriate for medical volumes
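A minimal sketch of the two normalization options above (the helper names are illustrative, not part of the repository):

```python
import torch

def minmax_normalize(volume: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Rescale a single volume to the [0, 1] range."""
    v_min, v_max = volume.min(), volume.max()
    return (volume - v_min) / (v_max - v_min + eps)

def standardize(volume: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Zero-mean, unit-variance standardization (e.g. per MRI modality)."""
    return (volume - volume.mean()) / (volume.std() + eps)
```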
2. **Training Strategy**:
   - Start with a small learning rate (1e-4 recommended)
   - Use gradient clipping to stabilize training
   - Consider mixed-precision training for memory efficiency
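The three points above combine naturally into one training step. This is a sketch for a generic PyTorch setup; `model`, the inputs, and the cross-entropy loss are placeholders, not this repository's pipeline:

```python
import torch
from torch import nn

def train_step(model, optimizer, scaler, x, y, max_norm=1.0):
    """One step with optional mixed precision and gradient clipping."""
    optimizer.zero_grad(set_to_none=True)
    use_cuda = torch.cuda.is_available()
    # Autocast only on GPU; on CPU this runs in full precision
    with torch.autocast(device_type="cuda" if use_cuda else "cpu", enabled=use_cuda):
        logits = model(x)
        loss = nn.functional.cross_entropy(logits, y)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)  # clip in unscaled-gradient space
    nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    scaler.step(optimizer)
    scaler.update()
    return loss.item()

# Tiny stand-in segmentation head, just to exercise the step
model = nn.Conv3d(1, 2, kernel_size=3, padding=1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())
x = torch.randn(2, 1, 8, 8, 8)
y = torch.randint(0, 2, (2, 8, 8, 8))
loss = train_step(model, optimizer, scaler, x, y)
```

Unscaling before `clip_grad_norm_` matters: clipping scaled gradients would apply the threshold to the wrong magnitudes.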
3. **Memory Management**:
   - Adjust batch size to the available GPU memory
   - Use gradient checkpointing if needed
   - Consider trade-offs between input volume size and model depth
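Gradient checkpointing trades compute for memory by recomputing a stage's activations during the backward pass instead of storing them. A sketch with `torch.utils.checkpoint` (the stage module here is a stand-in, not an actual encoder stage):

```python
import torch
from torch.utils.checkpoint import checkpoint

# Stand-in for one encoder stage
stage = torch.nn.Sequential(
    torch.nn.Conv3d(8, 8, kernel_size=3, padding=1),
    torch.nn.GELU(),
)
x = torch.randn(1, 8, 16, 16, 16, requires_grad=True)

# Activations inside `stage` are recomputed on backward instead of stored
y = checkpoint(stage, x, use_reentrant=False)
y.sum().backward()
```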
## Citation

```bibtex
@InProceedings{Perera_2024_CVPR,
  title     = {{SegFormer3D}: An Efficient Transformer for 3D Medical Image Segmentation},
  author    = {Perera, Shehan and Navard, Pouyan and Yilmaz, Alper},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages     = {4981--4988},
  year      = {2024}
}
```

## Acknowledgements

This implementation is based on the original SegFormer architecture by Xie et al., extended to efficient 3D medical image segmentation. We thank the authors for their valuable contributions to the field.