Update README.md
README.md
CHANGED
|
@@ -18,7 +18,7 @@ co2_emissions:
|
|
| 18 |
---
|
| 19 |
|
| 20 |
|
| 21 |
-
# Model Card for Neuropathology Vision Transformer: NP-
|
| 22 |
|
| 23 |
This model is a Vision Transformer adapted for neuropathology tasks, developed using data from the University of Kentucky. It leverages principles from self-supervised learning models like DINOv2.
|
| 24 |
|
|
@@ -29,8 +29,8 @@ This model is a Vision Transformer adapted for neuropathology tasks, developed u
|
|
| 29 |
|
| 30 |
* **Model Type:** Vision Transformer (ViT) for neuropathology.
|
| 31 |
* **Developed by:** Center for Applied Artificial Intelligence (CAAI)
|
| 32 |
-
* **Model Date:**
|
| 33 |
-
* **Base Model Architecture:** Dinov2-giant (https://huggingface.co/facebook/dinov2-giant)
|
| 34 |
* **Input:** Image (224x224).
|
| 35 |
* **Output:** Class token and patch tokens. These can be used for various downstream tasks (e.g., classification, segmentation, similarity search).
|
| 36 |
* **Embedding Dimension:** 1536
|
|
@@ -61,7 +61,7 @@ This model is intended for research purposes in the field of neuropathology.
|
|
| 61 |
|
| 62 |
* **Training System/Framework:** DINO-MX (Modular & Flexible Self-Supervised Training Framework)
|
| 63 |
* **Training Infrastructure:** 4 x DGX H100 nodes (32 x H100 GPUs)
|
| 64 |
-
* **Base Model (if fine-tuning):** Pretrained `facebook/dinov2-giant` loaded from Hugging Face Hub.
|
| 65 |
* **Training Objective(s):** Self-supervised learning using DINO loss, iBOT masked-image modeling loss.
|
| 66 |
* **Key Hyperparameters (example):**
|
| 67 |
* Batch size: 32
|
|
@@ -79,48 +79,49 @@ This model is intended for research purposes in the field of neuropathology.
|
|
| 79 |
The model achieved strong performance across multiple evaluation methods using the Neuro Path dataset.
|
| 80 |
|
| 81 |
**Linear Probe Performance:**
|
| 82 |
-
- Accuracy:
|
| 83 |
-
- Precision:
|
| 84 |
-
- Recall:
|
| 85 |
-
- F1 Score:
|
| 86 |
|
| 87 |
**K-Nearest Neighbors Classification:**
|
| 88 |
-
- Accuracy:
|
| 89 |
-
- Precision:
|
| 90 |
-
- Recall:
|
| 91 |
-
- F1 Score:
|
| 92 |
-
|
| 93 |
-
**Clustering Quality:**
|
| 94 |
-
- Silhouette Score: 0.267
|
| 95 |
-
- Adjusted Mutual Information: 0.473
|
| 96 |
-
|
| 97 |
-
**Robustness Score:** 0.574
|
| 98 |
-
|
| 99 |
-
**Overall Performance Score:** 0.646
|
| 100 |
|
| 101 |
### Model Comparison
|
| 102 |
|
| 103 |
#### Models Evaluated
|
| 104 |
-
* **NP-
|
| 105 |
-
* **dinov2-giant:**
|
| 106 |
-
* **
|
| 107 |
-
* **
|
| 108 |
-
* **
|
| 109 |
* **prov-gigapath:** [prov-gigapath/prov-gigapath](https://huggingface.co/prov-gigapath/prov-gigapath)
|
| 110 |
-
|
| 111 |
-
* **UNI2-h:** [MahmoodLab/UNI2-h](https://huggingface.co/MahmoodLab/UNI2-h)
|
| 112 |
|
| 113 |
#### Linear Probe Comparison
|
| 114 |
| Model | Accuracy | F1 | Precision | Recall |
|
| 115 |
|---|---|---|---|---|
|
| 116 |
-
|
|
| 117 |
-
| dinov2-giant | 0.
|
| 118 |
-
|
|
| 119 |
-
|
|
| 120 |
-
|
|
| 121 |
-
|
|
| 122 |
-
|
|
| 123 |
-
|
| 124 |
|
| 125 |
*Although the evaluation dataset was distinct from the training set, both came from the same institution, used the same staining, and were digitized on the same scanner. A model fine-tuned on such closely related data is therefore expected to perform better. An evaluation dataset with broader representation is needed to properly assess generalization performance.*
|
| 126 |
|
|
@@ -248,7 +249,7 @@ def get_embeddings_direct(image_path, model_path, mean=[0.83800817, 0.6516568, 0
|
|
| 248 |
|
| 249 |
return embeddings
|
| 250 |
|
| 251 |
-
def get_embeddings_resized(image_path, model_path, size=(224, 224), mean=[0.
|
| 252 |
"""
|
| 253 |
Extract embeddings with explicit resizing to 224x224.
|
| 254 |
This approach ensures consistent input size regardless of original image dimensions.
|
|
@@ -286,7 +287,7 @@ def get_embeddings_resized(image_path, model_path, size=(224, 224), mean=[0.485,
|
|
| 286 |
# Example usage
|
| 287 |
if __name__ == "__main__":
|
| 288 |
image_path = "test.jpg"
|
| 289 |
-
model_path = "IBI-CAAI/NP-
|
| 290 |
|
| 291 |
# Method 1: Using image processor (recommended for consistency)
|
| 292 |
embeddings1 = get_embeddings_with_processor(image_path, model_path)
|
|
|
|
| 18 |
---
|
| 19 |
|
| 20 |
|
| 21 |
+
# Model Card for Neuropathology Vision Transformer: NP-GIANT
|
| 22 |
|
| 23 |
This model is a Vision Transformer adapted for neuropathology tasks, developed using data from the University of Kentucky. It leverages principles from self-supervised learning models like DINOv2.
|
| 24 |
|
|
|
|
| 29 |
|
| 30 |
* **Model Type:** Vision Transformer (ViT) for neuropathology.
|
| 31 |
* **Developed by:** Center for Applied Artificial Intelligence (CAAI)
|
| 32 |
+
* **Model Date:** 10/2025
|
| 33 |
+
* **Base Model Architecture:** Dinov2-with-registers-giant (https://huggingface.co/facebook/dinov2-with-registers-giant)
|
| 34 |
* **Input:** Image (224x224).
|
| 35 |
* **Output:** Class token and patch tokens. These can be used for various downstream tasks (e.g., classification, segmentation, similarity search).
|
| 36 |
* **Embedding Dimension:** 1536
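
The output layout described above can be sketched with plain NumPy. This is a minimal illustration, not the model's own code. It assumes a patch size of 14 and 4 register tokens (the values associated with the `facebook/dinov2-with-registers-giant` base model) and assumes the token sequence is ordered as class token, register tokens, then patch tokens. `split_tokens` is a hypothetical helper, and the random array stands in for a real model output.

```python
import numpy as np

PATCH_SIZE = 14      # assumed: DINOv2 patch size
NUM_REGISTERS = 4    # assumed: register tokens in the base model
EMBED_DIM = 1536     # embedding dimension stated in the model card


def split_tokens(hidden, num_registers=NUM_REGISTERS):
    """Split a (batch, tokens, dim) array into class, register, and patch tokens."""
    cls_token = hidden[:, 0]                      # (batch, dim)
    registers = hidden[:, 1:1 + num_registers]    # (batch, num_registers, dim)
    patches = hidden[:, 1 + num_registers:]       # (batch, num_patches, dim)
    return cls_token, registers, patches


# A 224x224 input yields a 16x16 grid of 14x14 patches.
grid = 224 // PATCH_SIZE
num_tokens = 1 + NUM_REGISTERS + grid * grid

# Random stand-in for the model's last hidden state.
hidden = np.random.default_rng(0).normal(size=(2, num_tokens, EMBED_DIM))
cls_token, registers, patches = split_tokens(hidden)

# Patch tokens can be reshaped into a spatial feature map for dense tasks.
feature_map = patches.reshape(2, grid, grid, EMBED_DIM)
print(cls_token.shape, registers.shape, feature_map.shape)
```

The class token is the usual choice for classification and similarity search, while the reshaped patch tokens suit dense tasks such as segmentation.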
|
|
|
|
| 61 |
|
| 62 |
* **Training System/Framework:** DINO-MX (Modular & Flexible Self-Supervised Training Framework)
|
| 63 |
* **Training Infrastructure:** 4 x DGX H100 nodes (32 x H100 GPUs)
|
| 64 |
+
* **Base Model (if fine-tuning):** Pretrained `facebook/dinov2-with-registers-giant` loaded from Hugging Face Hub.
|
| 65 |
* **Training Objective(s):** Self-supervised learning using DINO loss, iBOT masked-image modeling loss.
|
| 66 |
* **Key Hyperparameters (example):**
|
| 67 |
* Batch size: 32
|
|
|
|
| 79 |
The model achieved strong performance across multiple evaluation methods using the Neuro Path dataset.
|
| 80 |
|
| 81 |
**Linear Probe Performance:**
|
| 82 |
+
- Accuracy: 84.51%
|
| 83 |
+
- Precision: 83.83%
|
| 84 |
+
- Recall: 84.51%
|
| 85 |
+
- F1 Score: 83.68%
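
The identical accuracy and recall values above (84.51%) are what support-weighted averaging produces, since weighted recall reduces algebraically to accuracy. Whether the evaluation actually used weighted averaging is an assumption. The sketch below only demonstrates the identity on made-up labels, with `weighted_scores` as a hypothetical helper.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n  # each class contributes in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return accuracy, precision, recall, f1


# Made-up labels for illustration only.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} recall={recall:.4f}")
```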
|
| 86 |
|
| 87 |
**K-Nearest Neighbors Classification:**
|
| 88 |
+
- Accuracy: 87.20%
|
| 89 |
+
- Precision: 86.91%
|
| 90 |
+
- Recall: 87.20%
|
| 91 |
+
- F1 Score: 86.76%
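
A k-nearest-neighbors evaluation of frozen embeddings can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the evaluation code behind the numbers above: cosine similarity, majority voting, and `k=3` are assumed choices, `knn_predict` is a hypothetical helper, and the toy 2-D vectors stand in for 1536-dimensional embeddings.

```python
import numpy as np


def knn_predict(train_emb, train_labels, query_emb, k=3):
    """Classify each query by majority vote among its k most cosine-similar training embeddings."""
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    query = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    sims = query @ train.T                        # (num_query, num_train)
    nearest = np.argsort(-sims, axis=1)[:, :k]    # indices of the k closest
    preds = []
    for row in nearest:
        labels, counts = np.unique(train_labels[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)


# Toy embeddings: two well-separated classes.
train_emb = np.array([[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
                      [0.1, 1.0], [0.2, 0.9], [0.0, 1.0]])
train_labels = np.array([0, 0, 0, 1, 1, 1])
query_emb = np.array([[0.8, 0.1], [0.1, 0.7]])

preds = knn_predict(train_emb, train_labels, query_emb, k=3)
print(preds)
```

Because no classifier is trained, this kind of evaluation reflects the geometry of the embedding space directly.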
|
| 92 |
|
| 93 |
### Model Comparison
|
| 94 |
|
| 95 |
#### Models Evaluated
|
| 96 |
+
* **NP-GIANT:** Our model
|
| 97 |
+
* **dinov2-with-registers-giant:** [facebook/dinov2-with-registers-giant](https://huggingface.co/facebook/dinov2-with-registers-giant)
|
| 98 |
+
* **dinov3-base:** [facebook/dinov3-vitb16-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vitb16-pretrain-lvd1689m)
|
| 99 |
+
* **dinov3-7b:** [facebook/dinov3-vit7b16-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vit7b16-pretrain-lvd1689m)
|
| 100 |
+
* **dinov3-small:** [facebook/dinov3-vits16-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vits16-pretrain-lvd1689m)
|
| 101 |
+
* **dinov3-small-plus:** [facebook/dinov3-vits16plus-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vits16plus-pretrain-lvd1689m)
|
| 102 |
+
* **dinov3-huge:** [facebook/dinov3-vith16plus-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vith16plus-pretrain-lvd1689m)
|
| 103 |
+
* **dinov3-large:** [facebook/dinov3-vitl16-pretrain-sat493m](https://huggingface.co/facebook/dinov3-vitl16-pretrain-sat493m)
|
| 104 |
+
* **uni:** [MahmoodLab/UNI](https://huggingface.co/MahmoodLab/UNI)
|
| 105 |
+
* **uni2:** [MahmoodLab/UNI2-h](https://huggingface.co/MahmoodLab/UNI2-h)
|
| 106 |
* **prov-gigapath:** [prov-gigapath/prov-gigapath](https://huggingface.co/prov-gigapath/prov-gigapath)
|
| 107 |
+
|
|
|
|
| 108 |
|
| 109 |
#### Linear Probe Comparison
|
| 110 |
| Model | Accuracy | F1 | Precision | Recall |
|
| 111 |
|---|---|---|---|---|
|
| 112 |
+
| dinov3-base | 0.643 | 0.620 | 0.640 | 0.643 |
|
| 113 |
+
| dinov2-with-registers-giant (Hierarchical) | **0.845** | **0.837** | **0.838** | **0.845** |
|
| 114 |
+
| uni | 0.767 | 0.762 | 0.760 | 0.767 |
|
| 115 |
+
| dinov3-7b | 0.664 | 0.631 | 0.711 | 0.664 |
|
| 116 |
+
| dinov3-small | 0.508 | 0.449 | 0.626 | 0.508 |
|
| 117 |
+
| dinov3-small-plus | 0.586 | 0.534 | 0.635 | 0.586 |
|
| 118 |
+
| dinov3-huge | 0.493 | 0.448 | 0.583 | 0.493 |
|
| 119 |
+
| uni2 | 0.775 | 0.763 | 0.764 | 0.775 |
|
| 120 |
+
| prov-gigapath | 0.770 | 0.767 | 0.767 | 0.770 |
|
| 121 |
+
| dinov2-with-registers-giant | 0.646 | 0.633 | 0.638 | 0.646 |
|
| 122 |
+
| dinov3-large | 0.690 | 0.669 | 0.683 | 0.690 |
|
| 123 |
+
|
| 124 |
+
|
| 125 |
|
| 126 |
*Although the evaluation dataset was distinct from the training set, both came from the same institution, used the same staining, and were digitized on the same scanner. A model fine-tuned on such closely related data is therefore expected to perform better. An evaluation dataset with broader representation is needed to properly assess generalization performance.*
|
| 127 |
|
|
|
|
| 249 |
|
| 250 |
return embeddings
|
| 251 |
|
| 252 |
+
def get_embeddings_resized(image_path, model_path, size=(224, 224), mean=[0.874, 0.805, 0.775], std=[0.087, 0.095, 0.102]):
|
| 253 |
"""
|
| 254 |
Extract embeddings with explicit resizing to 224x224.
|
| 255 |
This approach ensures consistent input size regardless of original image dimensions.
|
|
|
|
| 287 |
# Example usage
|
| 288 |
if __name__ == "__main__":
|
| 289 |
image_path = "test.jpg"
|
| 290 |
+
model_path = "IBI-CAAI/NP-GIANT"
|
| 291 |
|
| 292 |
# Method 1: Using image processor (recommended for consistency)
|
| 293 |
embeddings1 = get_embeddings_with_processor(image_path, model_path)
|