groffo committed on
Commit ·
070887a
1 Parent(s): d7087a8
Improve Hugging Face model card formatting and content
README.md
CHANGED
|
@@ -1,127 +1,101 @@
|
| 1 |
# 🔬 Feature Selection Gates (FSG) for Vision Transformers (ViT)
|
| 2 |
|
| 3 |
-
This repository
|
| 4 |
|
| 5 |
> **Feature Selection Gates with Gradient Routing for Endoscopic Image Computing**
|
| 6 |
> Giorgio Roffo, Carlo Biffi, Pietro Salvagnini, Andrea Cherubini
|
| 7 |
-
>
|
| 8 |
-
> [Paper](https://papers.miccai.org/miccai-2024/316-Paper0410.html) | [arXiv](https://arxiv.org/abs/2407.04400) | [Code](https://github.com/cosmoimd/feature-selection-gates)
|
| 9 |
|
| 10 |
---
|
| 11 |
|
| 12 |
-
##
|
| 13 |
|
| 14 |
-
**FSG** introduces
|
| 15 |
|
| 16 |
-
**Gradient Routing (GR)**
|
| 17 |
-
- One
|
| 18 |
-
- A
|
| 19 |
-
This separation allows **task-specific tuning** and ensures stable learning.
|
| 20 |
|
| 21 |
---
|
| 22 |
|
| 23 |
-
## 💡
|
| 24 |
|
| 25 |
-
✅ **
|
| 26 |
-
✅
|
| 27 |
-
✅
|
| 28 |
-
✅
|
| 29 |
-
✅ Compatible with **multi-stream CNNs** and hybrid models
|
| 30 |
|
| 31 |
-
|
| 32 |
|
| 33 |
---
|
| 34 |
|
| 35 |
-
## 🧪
|
| 36 |
-
|
| 37 |
-
Use the `vit_with_fsg.py` script to augment a pretrained ViT from `torchvision`.
|
| 38 |
|
| 39 |
```python
|
| 40 |
from torchvision.models import vit_b_16, ViT_B_16_Weights
|
| 41 |
from vit_with_fsg import vit_with_fsg
|
| 42 |
import torch
|
| 43 |
|
| 44 |
-
print("📥 Loading pretrained
|
| 45 |
backbone = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
|
| 46 |
|
| 47 |
-
print("🔧
|
| 48 |
model = vit_with_fsg(vit_backbone=backbone)
|
| 49 |
|
| 50 |
-
print("🧪 Running dummy input...")
|
| 51 |
dummy_input = torch.randn(1, 3, 224, 224)
|
| 52 |
output = model(dummy_input)
|
| 53 |
-
|
| 54 |
-
print("✅ Done. Output shape:", output.shape)
|
| 55 |
```
|
| 56 |
|
| 57 |
---
|
| 58 |
|
| 59 |
-
##
|
| 60 |
-
|
| 61 |
-
We provide full working training and inference examples:
|
| 62 |
-
|
| 63 |
-
| Dataset | Training Script | Inference Script | Checkpoint Path |
|
| 64 |
-
|-------------|-----------------------------|------------------------------|----------------------------------------------|
|
| 65 |
-
| MNIST | `demo_training_mnist.py` | `demo_inference_mnist.py` | `./checkpoints/fsg_vit_mnist_demo.pth` |
|
| 66 |
-
| Imagenette | `demo_training_imnet.py` | `demo_inference_imnet.py` | `./checkpoints/fsg_vit_imagenette_demo.pth` |
|
| 67 |
|
| 68 |
-
|
| 69 |
-
-
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
- Saves checkpoints for reproducible inference.
|
| 73 |
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
```bash
|
| 77 |
-
# Train on Imagenette
|
| 78 |
-
python demo_training_imnet.py
|
| 79 |
-
|
| 80 |
-
# Inference on Imagenette
|
| 81 |
-
python demo_inference_imnet.py --checkpoint ./checkpoints/fsg_vit_imagenette_demo.pth
|
| 82 |
-
```
|
| 83 |
-
|
| 84 |
-
```bash
|
| 85 |
-
# Train on MNIST
|
| 86 |
-
python demo_training_mnist.py
|
| 87 |
-
|
| 88 |
-
# Inference on MNIST
|
| 89 |
-
python demo_inference_mnist.py --checkpoint ./checkpoints/fsg_vit_mnist_demo.pth
|
| 90 |
-
```
|
| 91 |
-
|
| 92 |
-
> ⚠️ These demos use reduced test sets and train for only a few iterations so that they run quickly. They are not meant for benchmarking, only for showcasing FSG integration.
|
| 93 |
-
|
| 94 |
-
---
|
| 95 |
-
|
| 96 |
-
## 🧠 Applicability Beyond Endoscopy
|
| 97 |
-
|
| 98 |
-
Although designed for **polyp size estimation in colonoscopy**, FSG is a **general mechanism** for:
|
| 99 |
-
- **Image classification**
|
| 100 |
-
- **Medical image analysis**
|
| 101 |
-
- **Multimodal fusion**
|
| 102 |
-
- **NLP Transformers** (e.g., GPTs, BERT): apply FSG over token embeddings
|
| 103 |
-
|
| 104 |
-
We strongly encourage researchers to test FSG in **non-medical** domains.
|
| 105 |
|
| 106 |
---
|
| 107 |
|
| 108 |
-
## 📦
|
| 109 |
|
| 110 |
```
|
| 111 |
.
|
| 112 |
-
├── vit_with_fsg.py # ViT
|
| 113 |
├── demo_training_mnist.py
|
| 114 |
├── demo_inference_mnist.py
|
| 115 |
├── demo_training_imnet.py
|
| 116 |
├── demo_inference_imnet.py
|
| 117 |
-
└── checkpoints/ #
|
|
|
|
| 118 |
```
|
| 119 |
|
| 120 |
---
|
| 121 |
|
| 122 |
## Citation
|
| 123 |
|
| 124 |
-
|
| 125 |
|
| 126 |
```bibtex
|
| 127 |
@inproceedings{roffo2024FSG,
|
|
@@ -137,20 +111,7 @@ Please cite our work if you use this repository:
|
|
| 137 |
|
| 138 |
## 📬 Contact
|
| 139 |
|
| 140 |
-
|
| 141 |
📧 giorgio.roffo@gmail.com
|
| 142 |
-
🏢 Cosmo Intelligent Medical Devices (IMD), Lainate, Italy
|
| 143 |
-
|
| 144 |
-
For more: [github.com/cosmoimd/feature-selection-gates](https://github.com/cosmoimd/feature-selection-gates)
|
| 145 |
-
|
| 146 |
-
---
|
| 147 |
-
tags:
|
| 148 |
-
- vision
|
| 149 |
-
- transformers
|
| 150 |
-
- vit
|
| 151 |
-
- feature-selection
|
| 152 |
-
- miccai2024
|
| 153 |
-
license: mit
|
| 154 |
-
library_name: PyTorch
|
| 155 |
-
inference: false
|
| 156 |
-
---
|
|
|
|
| 1 |
+
---
|
| 2 |
+
tags:
|
| 3 |
+
- vision
|
| 4 |
+
- transformers
|
| 5 |
+
- vit
|
| 6 |
+
- feature-selection
|
| 7 |
+
- miccai2024
|
| 8 |
+
license: mit
|
| 9 |
+
library_name: PyTorch
|
| 10 |
+
inference: false
|
| 11 |
+
---
|
| 12 |
+
|
| 13 |
# 🔬 Feature Selection Gates (FSG) for Vision Transformers (ViT)
|
| 14 |
|
| 15 |
+
This repository implements **Feature Selection Gates (FSG)** and **Gradient Routing (GR)** as a modular extension to Vision Transformers. It is based on our paper presented at **MICCAI 2024**:
|
| 16 |
|
| 17 |
> **Feature Selection Gates with Gradient Routing for Endoscopic Image Computing**
|
| 18 |
> Giorgio Roffo, Carlo Biffi, Pietro Salvagnini, Andrea Cherubini
|
| 19 |
+
> [MICCAI 2024](https://papers.miccai.org/miccai-2024/316-Paper0410.html), [arXiv](https://arxiv.org/abs/2407.04400), [GitHub](https://github.com/cosmoimd/feature-selection-gates)
|
|
|
|
| 20 |
|
| 21 |
---
|
| 22 |
|
| 23 |
+
## 🧠 What Is FSG?
|
| 24 |
|
| 25 |
+
**FSG** introduces learnable gates on residual branches within Transformer layers. These gates:
|
| 26 |
+
- Dynamically select relevant features
|
| 27 |
+
- Promote **sparse connectivity** during training
|
| 28 |
+
- Serve as a form of **architectural regularization**
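A gate on a residual branch can be pictured with a small PyTorch sketch. This is only an illustration of the idea, not the code in `vit_with_fsg.py`: the class names `FeatureGate` and `GatedResidual`, the sigmoid parameterization, and the zero initialization are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class FeatureGate(nn.Module):
    """Hypothetical gate: scales each feature by a learnable sigmoid weight."""

    def __init__(self, dim: int):
        super().__init__()
        # Assumed initialization: zeros, so every gate starts at sigmoid(0) = 0.5.
        self.logits = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Broadcasts over all leading dimensions (batch, tokens).
        return x * torch.sigmoid(self.logits)


class GatedResidual(nn.Module):
    """Hypothetical wrapper: gates the branch output before the skip connection."""

    def __init__(self, branch: nn.Module, dim: int):
        super().__init__()
        self.branch = branch
        self.gate = FeatureGate(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.gate(self.branch(x))


torch.manual_seed(0)
block = GatedResidual(nn.Linear(8, 8), dim=8)  # nn.Linear stands in for attention/MLP
tokens = torch.randn(2, 5, 8)                  # (batch, tokens, features)
out = block(tokens)
print(out.shape)
```

During training, gate logits that drift strongly negative suppress their feature, which is one way a gate of this kind can push the network toward sparser connectivity.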
|
| 29 |
|
| 30 |
+
To stabilize learning, **Gradient Routing (GR)** uses a **dual-pass** strategy:
|
| 31 |
+
- One forward pass to compute gradients for the base model
|
| 32 |
+
- A separate route to update FSG parameters independently
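One possible reading of the dual-pass description is sketched below: the gate parameters and the base parameters get separate optimizers, and each training step runs two forward/backward passes, stepping only one group per pass. The toy network `TinyGatedNet`, the optimizers, and the learning rates are assumptions for illustration; the paper and repository define the actual procedure.

```python
import torch
import torch.nn as nn


class TinyGatedNet(nn.Module):
    """Toy stand-in for a gated backbone (not the repository's model)."""

    def __init__(self):
        super().__init__()
        self.body = nn.Linear(4, 4)
        self.gate_logits = nn.Parameter(torch.zeros(4))  # hypothetical gate parameters
        self.head = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.body(x))
        return self.head(h * torch.sigmoid(self.gate_logits))


torch.manual_seed(0)
model = TinyGatedNet()

# Split parameters into two groups, each with its own optimizer and learning rate.
gate_params = [model.gate_logits]
base_params = [p for name, p in model.named_parameters() if name != "gate_logits"]
opt_base = torch.optim.SGD(base_params, lr=0.1)   # assumed value
opt_gate = torch.optim.SGD(gate_params, lr=0.01)  # assumed value

loss_fn = nn.CrossEntropyLoss()
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))


def routed_step(x: torch.Tensor, y: torch.Tensor) -> float:
    # Pass 1: gradients reach every parameter, but only the base model is stepped.
    model.zero_grad()
    loss_fn(model(x), y).backward()
    opt_base.step()

    # Pass 2: a fresh forward pass; only the gate parameters are stepped.
    model.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt_gate.step()
    return loss.item()
```

Keeping two optimizers lets the gates use their own learning rate and schedule, which is the kind of task-specific tuning that separating the two updates makes possible.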
|
|
|
|
| 33 |
|
| 34 |
---
|
| 35 |
|
| 36 |
+
## 💡 Key Features
|
| 37 |
|
| 38 |
+
- ✅ **Drop-in**: Easily wraps any `torchvision` ViT model (e.g. `vit_b_16`, `vit_l_16`)
|
| 39 |
+
- ✅ **General-purpose**: Use on **natural images**, **medical data**, and even **token sequences in NLP**
|
| 40 |
+
- ✅ **Regularizes ViTs** for low-data regimes (tested on CIFAR-100, endoscopic videos, etc.)
|
| 41 |
+
- ✅ No ViT surgery: FSG wraps Transformer layers directly
|
|
|
|
| 42 |
|
| 43 |
+
While this method was originally proposed for **polyp size estimation in colonoscopy**, it is designed to generalize across:
|
| 44 |
+
- 𧬠Medical image analysis
|
| 45 |
+
- πΌοΈ General image classification
|
| 46 |
+
- π NLP Transformers (e.g. GPT, BERT)
|
| 47 |
|
| 48 |
---
|
| 49 |
|
| 50 |
+
## 🧪 Minimal Example
|
| 51 |
|
| 52 |
```python
|
| 53 |
from torchvision.models import vit_b_16, ViT_B_16_Weights
|
| 54 |
from vit_with_fsg import vit_with_fsg
|
| 55 |
import torch
|
| 56 |
|
| 57 |
+
print("📥 Loading pretrained ViT...")
|
| 58 |
backbone = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
|
| 59 |
|
| 60 |
+
print("🔧 Injecting FSG into backbone...")
|
| 61 |
model = vit_with_fsg(vit_backbone=backbone)
|
| 62 |
|
|
|
|
| 63 |
dummy_input = torch.randn(1, 3, 224, 224)
|
| 64 |
output = model(dummy_input)
|
| 65 |
+
print("✅ Output shape:", output.shape)
|
|
|
|
| 66 |
```
|
| 67 |
|
| 68 |
---
|
| 69 |
|
| 70 |
+
## 🧪 Demos (Quick Training + Inference)
|
| 71 |
|
| 72 |
+
| Dataset | Training Script | Inference Script | Checkpoint Path |
|
| 73 |
+
|-------------|-----------------------------|------------------------------|-----------------------------------------------|
|
| 74 |
+
| MNIST | `demo_training_mnist.py` | `demo_inference_mnist.py` | `./checkpoints/fsg_vit_mnist_demo.pth` |
|
| 75 |
+
| Imagenette | `demo_training_imnet.py` | `demo_inference_imnet.py` | `./checkpoints/fsg_vit_imagenette_demo.pth` |
|
|
|
|
| 76 |
|
| 77 |
+
> ⚠️ These demos use reduced datasets and few epochs so that they run quickly; they demonstrate the API and are not meant for benchmarking.
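The checkpoint paths in the table suggest a save-then-reload workflow. A minimal round-trip sketch follows, assuming the checkpoints hold a plain `state_dict`; the demo scripts may store something different. The stand-in model `build_model` and the file name `demo.pth` are hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model() -> nn.Module:
    # Stand-in for the FSG-augmented ViT that the demos build via vit_with_fsg.
    return nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 10))


torch.manual_seed(0)
trained = build_model()

with tempfile.TemporaryDirectory() as ckpt_dir:
    # Hypothetical file name; the demos write under ./checkpoints/ instead.
    ckpt_path = os.path.join(ckpt_dir, "demo.pth")
    torch.save(trained.state_dict(), ckpt_path)

    # Rebuild the same architecture, then load the saved weights into it.
    restored = build_model()
    restored.load_state_dict(torch.load(ckpt_path, map_location="cpu"))

restored.eval()
sample = torch.randn(2, 16)
with torch.no_grad():
    same = torch.allclose(trained(sample), restored(sample))
print("Outputs match after reload:", same)
```

Loading a `state_dict` requires rebuilding the same architecture first, so an inference script would need to apply the same FSG wrapping before restoring the weights.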
|
| 78 |
|
| 79 |
---
|
| 80 |
|
| 81 |
+
## 📦 Project Structure
|
| 82 |
|
| 83 |
```
|
| 84 |
.
|
| 85 |
+
├── vit_with_fsg.py # FSG-ViT integration
|
| 86 |
├── demo_training_mnist.py
|
| 87 |
├── demo_inference_mnist.py
|
| 88 |
├── demo_training_imnet.py
|
| 89 |
├── demo_inference_imnet.py
|
| 90 |
+
├── checkpoints/ # Model weights (optional)
|
| 91 |
+
└── README.md # This model card
|
| 92 |
```
|
| 93 |
|
| 94 |
---
|
| 95 |
|
| 96 |
## Citation
|
| 97 |
|
| 98 |
+
If you use this project, please cite our work:
|
| 99 |
|
| 100 |
```bibtex
|
| 101 |
@inproceedings{roffo2024FSG,
|
|
|
|
| 111 |
|
| 112 |
## 📬 Contact
|
| 113 |
|
| 114 |
+
**Giorgio Roffo**
|
| 115 |
📧 giorgio.roffo@gmail.com
|
| 116 |
+
🏢 Cosmo Intelligent Medical Devices (IMD), Lainate, Italy
|
| 117 |
+
[github.com/cosmoimd/feature-selection-gates](https://github.com/cosmoimd/feature-selection-gates)
|