Aviroy committed on
Commit
2c29544
·
verified ·
1 Parent(s): d5b9ea2

Create README.md

  1. README.md +232 -0
README.md ADDED
@@ -0,0 +1,232 @@
---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- microsoft/resnet-50
- timm/vgg19.tv_in1k
- google/vit-base-patch16-224
- xai-org/grok-1
pipeline_tag: image-classification
tags:
- Ocular-Toxoplasmosis(FundusImages)
- Retinal-images(Diabetics,Cataract,Glaucoma,Healthy)
- Pytorch
- Transformers
- Image-Classification
- Image_feature_extraction
- Grad-CAM
- XAI-Visualization
---

# Model Card: ROYXAI [Vision Transformer + VGG19 + ResNet50 Ensemble with Grad-CAM]

## Model Description
ROYXAI is an ensemble of three deep learning architectures: **Vision Transformer (ViT), VGG19, and ResNet50**. Combining the three models is intended to improve classification performance on medical image datasets of ocular diseases. The model also integrates **Grad-CAM** visualization, which highlights the image regions that most influence a prediction, for better interpretability.

## Model Details
- **Model Name**: ROYXAI
- **Developed by**: Avishek Roy Sparsho
- **Framework**: PyTorch
- **Ensemble Method**: Bagging
- **Backbone Models**: Vision Transformer, VGG19, ResNet50
- **Target Task**: Medical Image Classification
- **Supported Classes**:
  - OT (ocular toxoplasmosis)
  - Healthy
  - SC_diabetes
  - SC_cataract
  - SC_glucoma
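
The card does not spell out how the outputs of the three backbones are combined. A common choice for an ensemble like this is soft voting, where each model's logits are converted to probabilities and then averaged. The sketch below illustrates that idea in plain Python so it runs without any dependencies; `softmax`, `soft_vote`, and the example logits are hypothetical and are not taken from the repository.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def soft_vote(per_model_logits):
    """Average the class probabilities predicted by each ensemble member."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_models for c in range(n_classes)]

# Hypothetical logits from ViT, VGG19, and ResNet50 for one image (5 classes)
logits = [
    [2.0, 0.1, 0.3, 0.2, 0.1],
    [1.5, 0.4, 0.2, 0.9, 0.0],
    [0.2, 0.3, 0.1, 1.1, 0.4],
]
avg_probs = soft_vote(logits)
predicted_index = max(range(len(avg_probs)), key=avg_probs.__getitem__)
print(predicted_index, [round(p, 3) for p in avg_probs])
```

Averaging probabilities rather than raw logits keeps one over-confident model from dominating the vote, since every member contributes a distribution on the same scale.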

## Model Sources

- **Repository**: [ROYXAI on Hugging Face](https://huggingface.co/Aviroy/ROYXAI)
- **Paper [optional]**: [More Information Needed]
- **Demo [optional]**: [More Information Needed]

## Uses

### Direct Use
This model is designed for medical image classification to detect ocular diseases.

### Downstream Use
The model can be fine-tuned on other medical image datasets to improve performance for specific conditions.

### Out-of-Scope Use
The model is not suitable for non-medical image classification tasks, and it must not be used as a standalone medical diagnostic tool.

## Bias, Risks, and Limitations

- The model was trained on one specific dataset and may not generalize well to other medical image datasets without fine-tuning.
- It is **not a substitute for professional medical diagnosis**.
- The Vision Transformer model is computationally expensive compared to CNNs.

## Training Details

### Dataset
- **Dataset Name**: Custom Ocular Disease and its Secondary Complications Dataset
- **Dataset Source**: Private dataset (medical images)
- **Dataset Structure**: Images stored in folders named after their class labels
- **Preprocessing**:
  - Resized images to 224x224 pixels
  - Normalized using ImageNet mean and standard deviation
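
A folder-per-class layout like the one described above can be indexed with the standard library alone. The following is a minimal sketch; the function name `index_image_folders` and the accepted file extensions are assumptions, since the private dataset's actual layout is not published.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed file types

def index_image_folders(root):
    """Return (samples, class_names) for a root directory with one sub-folder per class.

    samples is a list of (image_path, class_index) pairs; class indices follow
    the alphabetical order of the folder names.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for index, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            if path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, index))
    return samples, class_names
```

`torchvision.datasets.ImageFolder` assigns class indices the same way, from the sorted folder names, so it is a natural drop-in if the data follows this layout.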

### Training Procedure

- **Optimizer**: Adam with weight decay
- **Learning Rate Scheduler**: Cosine Annealing LR
- **Loss Function**: Cross-Entropy Loss
- **Batch Size**: 32
- **Training Epochs**: 20
- **Hardware Used**: T4 GPU x2
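
Cosine annealing lowers the learning rate from its initial value to a minimum along half a cosine wave. The sketch below evaluates the closed-form schedule, which is the form documented for PyTorch's `CosineAnnealingLR`. The initial rate of `1e-4`, `eta_min=0.0`, and `t_max=20` (chosen to match the 20 epochs listed above) are assumptions; the card does not state these values.

```python
import math

def cosine_annealing_lr(epoch, base_lr, t_max, eta_min=0.0):
    """Learning rate at `epoch` for a cosine schedule that decays over `t_max` epochs.

    lr = eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * epoch / t_max))
    """
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# Assumed values: the card lists 20 epochs but not the learning rate
schedule = [cosine_annealing_lr(epoch, base_lr=1e-4, t_max=20) for epoch in range(21)]
for epoch in (0, 5, 10, 15, 20):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.2e}")
```

The rate changes slowly at the start and end of training and fastest in the middle, which is the usual motivation for preferring this schedule over a step decay.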

## Model Performance
- **Accuracy**: 98% on the test dataset
- **Precision/Recall/F1-score**: Evaluated and optimized for medical diagnosis
- **Overfitting Prevention**: Implemented **data augmentation, dropout, and weight regularization**
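
Accuracy alone can hide weak performance on a rare class, which is why per-class precision, recall, and F1 are usually reported for medical classifiers. The sketch below computes them from label lists in plain Python; the function name `per_class_metrics` and the toy labels are hypothetical and are not results from this model.

```python
def per_class_metrics(y_true, y_pred, num_classes):
    """Return a list of (precision, recall, f1) tuples, one per class index."""
    metrics = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics.append((precision, recall, f1))
    return metrics

# Toy labels for illustration only (not results from ROYXAI)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
for c, (p, r, f) in enumerate(per_class_metrics(y_true, y_pred, num_classes=3)):
    print(f"class {c}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```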

## Installation and Usage

### Clone the Repository

```bash
git clone https://huggingface.co/Aviroy/ROYXAI
cd ROYXAI
```

### Install Dependencies

```bash
pip install -r requirements.txt
```

### Training the Model

To train the model from scratch, run the command below. It uses 50 epochs; pass `--epochs 20` to match the configuration listed under Training Procedure.

```bash
python train.py --epochs 50 --batch_size 32
```

### Load Pretrained Model

To directly use the trained model:

```python
import torch
from PIL import Image
import torchvision.transforms as transforms
from model import ensemble_model  # Load the trained ensemble model

# Run on the GPU when one is available, otherwise on the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Define image transformations
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Load and preprocess an image
image_path = "path/to/image.jpg"
image = Image.open(image_path).convert('RGB')
image = transform(image).unsqueeze(0).to(device)  # add a batch dimension

# Perform inference with the model on the same device as the input
ensemble_model = ensemble_model.to(device)
ensemble_model.eval()
with torch.no_grad():
    output = ensemble_model(image)
    predicted_class = torch.argmax(output, dim=1).item()

# Print classification result
print("Predicted Class:", predicted_class)
```
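
`predicted_class` above is an integer index. To report a label, the index has to be mapped back to a class name in the same order that was used during training. The order below is an assumption (alphabetically sorted folder names, as `torchvision.datasets.ImageFolder` would assign them), and `label_for` is a hypothetical helper; check both against the training code before relying on them.

```python
# Assumed class order: sorted folder names; verify against the training code
CLASS_NAMES = sorted(["OT", "Healthy", "SC_diabetes", "SC_cataract", "SC_glucoma"])

def label_for(index, class_names=CLASS_NAMES):
    """Translate a predicted class index into its label, rejecting invalid indices."""
    if not 0 <= index < len(class_names):
        raise ValueError(f"class index {index} is outside 0..{len(class_names) - 1}")
    return class_names[index]

print(label_for(1))
```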

## Grad-CAM Visualization

### Visualizing Attention Maps for Interpretability

The examples below reuse `ensemble_model`, `image`, and `predicted_class` from the inference example above.

#### Vision Transformer (ViT)

```python
import matplotlib.pyplot as plt
from visualization import visualize_gradcam_vit  # Function for ViT Grad-CAM

# Generate Grad-CAM visualization (models[0] is the ViT member of the ensemble)
overlay = visualize_gradcam_vit(ensemble_model.models[0], image, target_class=predicted_class)

# Display the Grad-CAM output
plt.imshow(overlay)
plt.axis('off')
plt.title("Grad-CAM for Vision Transformer")
plt.show()
```

#### ResNet50

```python
import matplotlib.pyplot as plt
from visualization import visualize_gradcam  # General Grad-CAM function

# Generate Grad-CAM visualization for ResNet50 (models[2] in the ensemble)
overlay = visualize_gradcam(ensemble_model.models[2], image, target_class=predicted_class)

# Display the Grad-CAM output
plt.imshow(overlay)
plt.axis('off')
plt.title("Grad-CAM for ResNet50")
plt.show()
```

#### VGG19

```python
import matplotlib.pyplot as plt
from visualization import visualize_gradcam  # General Grad-CAM function

# Generate Grad-CAM visualization for VGG19 (models[1] in the ensemble)
overlay = visualize_gradcam(ensemble_model.models[1], image, target_class=predicted_class)

# Display the Grad-CAM output
plt.imshow(overlay)
plt.axis('off')
plt.title("Grad-CAM for VGG19")
plt.show()
```
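
The `visualize_gradcam` helpers are imported from the repository, and their internals are not shown in this card. For orientation, the core Grad-CAM computation is: average the gradient of the target class score over each feature map's spatial positions to get one weight per channel, take the weighted sum of the feature maps, and apply ReLU. The sketch below reproduces that arithmetic on tiny hand-written arrays in plain Python; `grad_cam`, the activations, and the gradients are hypothetical stand-ins for tensors that would normally be captured with hooks on a convolutional layer.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations, gradients: lists shaped [channels][height][width], holding a
    layer's feature maps and the gradient of the target class score with
    respect to them.
    """
    height, width = len(activations[0]), len(activations[0][0])
    # One weight per channel: the spatial mean of that channel's gradient
    weights = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]
    # Weighted sum over channels, then ReLU to keep positive evidence only
    cam = [
        [max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
         for j in range(width)]
        for i in range(height)
    ]
    # Scale to [0, 1] so the map can be overlaid on the image
    peak = max(max(row) for row in cam)
    return [[v / peak for v in row] for row in cam] if peak > 0 else cam

# Two 2x2 feature maps with made-up values
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.4, 0.4], [0.4, 0.4]], [[-0.2, -0.2], [-0.2, -0.2]]]
heatmap = grad_cam(activations, gradients)
print(heatmap)
```

For a ViT, the patch-token activations first have to be reshaped into a spatial grid (14x14 for a 224x224 input with 16x16 patches, leaving out the class token) before the same weighting applies, which is presumably why the repository provides a separate `visualize_gradcam_vit` function.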

## Environmental Impact

- **Hardware Type**: T4 GPU x2
- **Hours used**: 50
- **Cloud Provider**: Google Cloud (GCP)
- **Compute Region**: US-Central1
- **Carbon Emitted**: Estimated using [Machine Learning Impact Calculator](https://mlco2.github.io/impact#compute)
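
Estimates of this kind are typically computed as the energy drawn by the hardware multiplied by the carbon intensity of the local grid. A rough sketch of that arithmetic follows. The 70 W figure is the NVIDIA T4's rated board power; reading "50 hours" as wall-clock time with both GPUs at full power, and the carbon intensity value, are assumptions chosen only to illustrate the calculation. `estimate_emissions_kg` is a hypothetical helper, not part of the calculator.

```python
def estimate_emissions_kg(num_gpus, gpu_power_watts, hours, kg_co2_per_kwh):
    """Rough upper-bound estimate: energy drawn by the GPUs times grid carbon intensity."""
    energy_kwh = num_gpus * gpu_power_watts * hours / 1000
    return energy_kwh, energy_kwh * kg_co2_per_kwh

# 2 x T4 at 70 W for 50 hours; 0.5 kg CO2e/kWh is a placeholder, not a measured value
energy_kwh, emissions_kg = estimate_emissions_kg(2, 70, 50, kg_co2_per_kwh=0.5)
print(f"{energy_kwh:.1f} kWh, about {emissions_kg:.1f} kg CO2e")
```

Replace the placeholder intensity with the figure for the actual compute region to get a usable number.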

## Citation

If you use this model in your research, please cite:

```bibtex
@article{Sparsho2025,
  author = {Avishek Roy Sparsho},
  title = {ROYXAI Model For Proper Visualization of Classified Medical Image},
  journal = {Medical AI Research},
  year = {2025}
}
```

## Acknowledgments

Special thanks to the open-source community and Kaggle for providing medical datasets for deep learning research.

## Contact

For inquiries, please contact: Avishek Roy Sparsho

## License

This model is released under the **Apache 2.0 License**. Use it responsibly.