mselmangokmen committed on
Commit 1c7f366 · verified · 1 Parent(s): f7f7730

Update README.md

  1. README.md +38 -37
README.md CHANGED
@@ -18,7 +18,7 @@ co2_emissions:
  ---
 
 
- # Model Card for Neuropathology Vision Transformer: NP-TEST-0
 
  This model is a Vision Transformer adapted for neuropathology tasks, developed using data from the University of Kentucky. It leverages principles from self-supervised learning models like DINOv2.
 
@@ -29,8 +29,8 @@ This model is a Vision Transformer adapted for neuropathology tasks, developed u
 
  * **Model Type:** Vision Transformer (ViT) for neuropathology.
  * **Developed by:** Center for Applied Artificial Intelligence (CAAI)
- * **Model Date:** 05/2025
- * **Base Model Architecture:** Dinov2-giant (https://huggingface.co/facebook/dinov2-giant)
  * **Input:** Image (224x224).
  * **Output:** Class token and patch tokens. These can be used for various downstream tasks (e.g., classification, segmentation, similarity search).
  * **Embedding Dimension:** 1536
@@ -61,7 +61,7 @@ This model is intended for research purposes in the field of neuropathology.
 
  * **Training System/Framework:** DINO-MX (Modular & Flexible Self-Supervised Training Framework)
  * **Training Infrastructure:** 4 x DGS H100 nodes (32 x H100 GPUs)
- * **Base Model (if fine-tuning):** Pretrained `facebook/dinov2-giant` loaded from Hugging Face Hub.
  * **Training Objective(s):** Self-supervised learning using DINO loss, iBOT masked-image modeling loss.
  * **Key Hyperparameters (example):**
  * Batch size: 32
@@ -79,48 +79,49 @@ This model is intended for research purposes in the field of neuropathology.
  The model achieved strong performance across multiple evaluation methods using the Neuro Path dataset.
 
  **Linear Probe Performance:**
- - Accuracy: 80.17%
- - Precision: 79.20%
- - Recall: 79.60%
- - F1 Score: 77.88%
 
  **K-Nearest Neighbors Classification:**
- - Accuracy: 83.76%
- - Precision: 83.34%
- - Recall: 83.76%
- - F1 Score: 83.40%
-
- **Clustering Quality:**
- - Silhouette Score: 0.267
- - Adjusted Mutual Information: 0.473
-
- **Robustness Score:** 0.574
-
- **Overall Performance Score:** 0.646
 
  ### Model Comparison
 
  #### Models Evaluated
- * **NP-TEST-0:** Our model
- * **dinov2-giant:** Pretrained [Dinov2 Giant](https://huggingface.co/facebook/dinov2-giant)
- * **dinov2-giant_distilled_prov:** [Dinov2 Giant](https://huggingface.co/facebook/dinov2-giant) distilled from [provo-gigapath](https://huggingface.co/prov-gigapath/prov-gigapath)
- * **dinov2-large_distilled_prov:** [Dinov2 Large](https://huggingface.co/facebook/dinov2-large) distilled from [provo-gigapath](https://huggingface.co/prov-gigapath/prov-gigapath)
- * **distilled_prov_finetuned:** dinov2-giant_distilled_prov was used as a base with additional finetuning without freezing teacher model.
  * **prov-gigapath:** [prov-gigapath/prov-gigapath](https://huggingface.co/prov-gigapath/prov-gigapath)
- * **UNI:** [MahmoodLab/UNI](https://huggingface.co/MahmoodLab/UNI)
- * **UNI2-h:** [MahmoodLab/UNI2-h](https://huggingface.co/MahmoodLab/UNI2-h)
 
  #### Linear Probe Comparison
  | Model | Accuracy | F1 | Precision | Recall |
  |---|---|---|---|---|
- | NP-TEST-0 | **0.802** | **0.779** | **0.792** | **0.796** |
- | dinov2-giant | 0.667 | 0.648 | 0.669 | 0.667 |
- | dinov2-giant_distilled_prov | 0.769 | 0.756 | 0.755 | 0.769 |
- | dinov2-large_distilled_prov | 0.772 | 0.758 | 0.758 | 0.772 |
- | distilled_prov_finetuned | 0.779 | 0.762 | 0.770 | 0.779 |
- | prov-gigapath | 0.776 | 0.762 | 0.764 | 0.776 |
- | UNI | 0.741 | 0.731 | 0.734 | 0.741 |
- | UNI2-h | 0.768 | 0.750 | 0.753 | 0.768 |
 
  *While the evaluation dataset was distinct from the training set, they were from the same institution, using the same staining, and obtained from the same scanner. It is not unexpected that a model fine-tuned on such a closely associated dataset would perform better. An evaluation dataset with broader representation is needed for a proper evaluation of generalized performance.*
 
@@ -248,7 +249,7 @@ def get_embeddings_direct(image_path, model_path, mean=[0.83800817, 0.6516568, 0
 
      return embeddings
 
- def get_embeddings_resized(image_path, model_path, size=(224, 224), mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]):
      """
      Extract embeddings with explicit resizing to 224x224.
      This approach ensures consistent input size regardless of original image dimensions.
@@ -286,7 +287,7 @@ def get_embeddings_resized(image_path, model_path, size=(224, 224), mean=[0.485,
  # Example usage
  if __name__ == "__main__":
      image_path = "test.jpg"
-     model_path = "IBI-CAAI/NP-TEST-0"
 
      # Method 1: Using image processor (recommended for consistency)
      embeddings1 = get_embeddings_with_processor(image_path, model_path)
 
  ---
 
 
+ # Model Card for Neuropathology Vision Transformer: NP-GIANT
 
  This model is a Vision Transformer adapted for neuropathology tasks, developed using data from the University of Kentucky. It leverages principles from self-supervised learning models like DINOv2.
 
 
 
  * **Model Type:** Vision Transformer (ViT) for neuropathology.
  * **Developed by:** Center for Applied Artificial Intelligence (CAAI)
+ * **Model Date:** 10/2025
+ * **Base Model Architecture:** Dinov2-with-registers-giant (https://huggingface.co/facebook/dinov2-with-registers-giant)
  * **Input:** Image (224x224).
  * **Output:** Class token and patch tokens. These can be used for various downstream tasks (e.g., classification, segmentation, similarity search).
  * **Embedding Dimension:** 1536
 
 
  * **Training System/Framework:** DINO-MX (Modular & Flexible Self-Supervised Training Framework)
  * **Training Infrastructure:** 4 x DGS H100 nodes (32 x H100 GPUs)
+ * **Base Model (if fine-tuning):** Pretrained `facebook/dinov2-with-registers-giant` loaded from Hugging Face Hub.
  * **Training Objective(s):** Self-supervised learning using DINO loss, iBOT masked-image modeling loss.
  * **Key Hyperparameters (example):**
  * Batch size: 32
 
  The model achieved strong performance across multiple evaluation methods using the Neuro Path dataset.
 
  **Linear Probe Performance:**
+ - Accuracy: 84.51%
+ - Precision: 83.83%
+ - Recall: 84.51%
+ - F1 Score: 83.68%
 
  **K-Nearest Neighbors Classification:**
+ - Accuracy: 87.20%
+ - Precision: 86.91%
+ - Recall: 87.20%
+ - F1 Score: 86.76%
 
  ### Model Comparison
 
  #### Models Evaluated
+ * **NP-GIANT:** Our model
+ * **dinov2-with-registers-giant:** [facebook/dinov2-with-registers-giant](https://huggingface.co/facebook/dinov2-with-registers-giant)
+ * **dinov3-base:** [facebook/dinov3-vitb16-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vitb16-pretrain-lvd1689m)
+ * **dinov3-7b:** [facebook/dinov3-vit7b16-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vit7b16-pretrain-lvd1689m)
+ * **dinov3-small:** [facebook/dinov3-vits16-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vits16-pretrain-lvd1689m)
+ * **dinov3-small-plus:** [facebook/dinov3-vits16plus-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vith16plus-pretrain-lvd1689m)
+ * **dinov3-huge:** [facebook/dinov3-vith16plus-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vith16plus-pretrain-lvd1689m)
+ * **dinov3-large:** [facebook/dinov3-vitl16-pretrain-sat493m](https://huggingface.co/facebook/dinov3-vitl16-pretrain-sat493m)
+ * **uni:** [MahmoodLab/UNI](https://huggingface.co/MahmoodLab/UNI)
+ * **uni2:** [MahmoodLab/UNI2](https://huggingface.co/MahmoodLab/UNI2)
  * **prov-gigapath:** [prov-gigapath/prov-gigapath](https://huggingface.co/prov-gigapath/prov-gigapath)
+
 
  #### Linear Probe Comparison
  | Model | Accuracy | F1 | Precision | Recall |
  |---|---|---|---|---|
+ | dinov3-base | 0.643 | 0.620 | 0.640 | 0.643 |
+ | dinov2-with-registers-giant (Hierarchical) | **0.845** | **0.837** | **0.838** | **0.845** |
+ | uni | 0.767 | 0.762 | 0.760 | 0.767 |
+ | dinov3-7b | 0.664 | 0.631 | 0.711 | 0.664 |
+ | dinov3-small | 0.508 | 0.449 | 0.626 | 0.508 |
+ | dinov3-small-plus | 0.586 | 0.534 | 0.635 | 0.586 |
+ | dinov3-huge | 0.493 | 0.448 | 0.583 | 0.493 |
+ | uni2 | 0.775 | 0.763 | 0.764 | 0.775 |
+ | prov-gigapath | 0.770 | 0.767 | 0.767 | 0.770 |
+ | dinov2-with-registers-giant | 0.646 | 0.633 | 0.638 | 0.646 |
+ | dinov3-large | 0.690 | 0.669 | 0.683 | 0.690 |
+
+
 
  *While the evaluation dataset was distinct from the training set, they were from the same institution, using the same staining, and obtained from the same scanner. It is not unexpected that a model fine-tuned on such a closely associated dataset would perform better. An evaluation dataset with broader representation is needed for a proper evaluation of generalized performance.*
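
Reviewer note: in the updated figures every recall value equals the corresponding accuracy value (0.845 and 0.845 in the top row, for instance). That is what support-weighted averaging over classes produces, although the model card does not state which averaging was used, so treating it as weighted is an assumption. A minimal pure-Python sketch of such metrics, using a hypothetical `weighted_metrics` helper and made-up labels rather than the Neuro Path data:

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1 (hypothetical helper)."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n  # class weight = share of true samples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "b", "c"]
metrics = weighted_metrics(y_true, y_pred)
```

With weighted averaging, the per-class recalls are weighted by class frequency, so their sum collapses to the fraction of correctly classified samples, which is the accuracy.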
 
 
 
      return embeddings
 
+ def get_embeddings_resized(image_path, model_path, size=(224, 224), mean=[0.874, 0.805, 0.775], std=[0.087, 0.095, 0.102]):
      """
      Extract embeddings with explicit resizing to 224x224.
      This approach ensures consistent input size regardless of original image dimensions.
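
Reviewer note: this hunk swaps the ImageNet normalization defaults for new `mean` and `std` values. The diff does not show the function body or say how the new values were derived, so the sketch below assumes the usual per-channel standardization `(x / 255 - mean) / std`. The `normalize_pixel` helper and the sample pixel are hypothetical.

```python
def normalize_pixel(rgb, mean, std):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return [((value / 255.0) - m) / s for value, m, s in zip(rgb, mean, std)]


# Defaults before and after this commit, copied from the diff.
OLD_MEAN, OLD_STD = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
NEW_MEAN, NEW_STD = [0.874, 0.805, 0.775], [0.087, 0.095, 0.102]

pixel = (223, 205, 198)  # hypothetical pale pixel, chosen close to the new mean

old = normalize_pixel(pixel, OLD_MEAN, OLD_STD)
new = normalize_pixel(pixel, NEW_MEAN, NEW_STD)
```

For this pixel the old defaults give values around 1.5 to 1.7 in every channel, while the new defaults give values near zero. The same image therefore reaches the model on a very different scale depending on which defaults are used, so callers should pass the statistics that match the checkpoint they load.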
 
  # Example usage
  if __name__ == "__main__":
      image_path = "test.jpg"
+     model_path = "IBI-CAAI/NP-GIANT"
 
      # Method 1: Using image processor (recommended for consistency)
      embeddings1 = get_embeddings_with_processor(image_path, model_path)
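
Reviewer note: embeddings returned by helpers such as `get_embeddings_with_processor` can feed a nearest-neighbor classifier like the one reported in the evaluation section. The card gives neither `k` nor the distance metric, so the sketch below assumes `k=3` and cosine similarity, and it uses toy 3-dimensional vectors in place of real 1536-dimensional embeddings. The function names and class labels are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_predict(query, bank, labels, k=3):
    """Majority vote among the k reference embeddings most similar to the query."""
    ranked = sorted(
        range(len(bank)),
        key=lambda i: cosine_similarity(query, bank[i]),
        reverse=True,
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy reference set standing in for labeled embeddings.
bank = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [0.0, 1.0, 0.2],
    [0.1, 0.9, 0.0],
    [0.8, 0.0, 0.3],
]
labels = ["class_a", "class_a", "class_b", "class_b", "class_a"]

prediction_a = knn_predict([0.95, 0.15, 0.05], bank, labels, k=3)
prediction_b = knn_predict([0.05, 1.0, 0.1], bank, labels, k=3)
```

The exhaustive sort keeps the example short. A larger reference set would normally use a vectorized or approximate nearest-neighbor search instead.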