---
license: mit
pipeline_tag: image-classification
---
## Model Details
### Model Description
MorphEm is a self-supervised model trained with the DINO Bag of Channels recipe on the entire CHAMMI-75 dataset.
It serves as a performance benchmark for self-supervised models.
- **Developed by:** Vidit Agrawal, John Peters, Juan Caicedo
- **Shared by:** [Caicedo Lab](https://morgridge.org/research/labs/caicedo/)
- **Model type:** Vision Transformer Small
- **License:** MIT License
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** https://github.com/CaicedoLab/CHAMMI-75
<!-- - **Paper** -->
- **Demo:** https://github.com/CaicedoLab/CHAMMI-75/tree/main/aws-tutorials
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
[More Information Needed]
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
[More Information Needed]
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoModel
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms as v2
import numpy as np


# Noise Injector transformation
class SaturationNoiseInjector(nn.Module):
    def __init__(self, low=200, high=255):
        super().__init__()
        self.low = low
        self.high = high

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Operates on the first slice along dim 0 and writes the result back into x.
        channel = x[0].clone()
        noise = torch.empty_like(channel).uniform_(self.low, self.high)
        # Replace saturated pixels (value 255) with uniform noise in [low, high).
        mask = (channel == 255).float()
        noise_masked = noise * mask
        channel[channel == 255] = 0
        channel = channel + noise_masked
        x[0] = channel
        return x


# Self Normalize transformation
class PerImageNormalize(nn.Module):
    def __init__(self, eps=1e-7):
        super().__init__()
        self.eps = eps
        self.instance_norm = nn.InstanceNorm2d(
            num_features=1,
            affine=False,
            track_running_stats=False,
            eps=self.eps,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # InstanceNorm2d expects a 4D input, so add a leading dimension if needed.
        if x.dim() == 3:
            x = x.unsqueeze(0)
        x = self.instance_norm(x)
        if x.shape[0] == 1:
            x = x.squeeze(0)
        return x


# Load model (falls back to CPU when no GPU is available)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModel.from_pretrained("CaicedoLab/MorphEm", trust_remote_code=True)
model.to(device).eval()

# Define transforms
transform = v2.Compose([
    SaturationNoiseInjector(),
    PerImageNormalize(),
    v2.Resize(size=(224, 224), antialias=True),
])

# Generate random batch (N, C, H, W)
batch_size = 2
num_channels = 3
images = torch.randint(0, 256, (batch_size, num_channels, 512, 512), dtype=torch.float32)
print(f"Input shape: {images.shape} (N={batch_size}, C={num_channels}, H=512, W=512)")
print()

# Bag of Channels (BoC) - process each channel independently
with torch.no_grad():
    batch_feat = []
    images = images.to(device)
    for c in range(images.shape[1]):
        # Extract single channel: (N, C, H, W) -> (N, 1, H, W)
        single_channel = images[:, c, :, :].unsqueeze(1)
        # Apply transforms on (N, H, W), then restore the channel axis
        single_channel = transform(single_channel.squeeze(1)).unsqueeze(1)
        # Extract features
        output = model.forward_features(single_channel)
        feat_temp = output["x_norm_clstoken"].cpu().detach().numpy()
        batch_feat.append(feat_temp)

# Concatenate features from all channels
features = np.concatenate(batch_feat, axis=1)
print(f"Output shape: {features.shape}")
print(f" - Batch size (N): {features.shape[0]}")
print(f" - Feature dimension (C * feature_dim): {features.shape[1]}")
```
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
MorphEm was pre-trained on the entire CHAMMI-75 pre-training set.
The CHAMMI-75 dataset consists of 75 heterogeneous studies and 2.8 million multi-channel images.
### Training Procedure
We used the DINO self-supervised learning framework. The model was pre-trained to take a single channel at a time as input. For evaluation, extract features for each channel independently and concatenate them into one feature vector per image.
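To make the per-channel evaluation concrete, the sketch below shows only the shape bookkeeping of the Bag of Channels idea, using NumPy. The `fake_encoder` function is a hypothetical stand-in for the real model, and the 384-dimensional output is an assumption based on the usual ViT Small embedding width, so check the actual model output before relying on it.

```python
import numpy as np

FEATURE_DIM = 384  # assumed ViT Small embedding width; verify against the real model


def fake_encoder(single_channel: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the model: (N, 1, H, W) -> (N, FEATURE_DIM)."""
    n = single_channel.shape[0]
    # Use the per-image mean so the output depends on the input.
    mean = single_channel.reshape(n, -1).mean(axis=1, keepdims=True)
    return np.repeat(mean, FEATURE_DIM, axis=1)


def bag_of_channels(images: np.ndarray) -> np.ndarray:
    """Encode each channel independently, then concatenate along the feature axis."""
    per_channel = [
        fake_encoder(images[:, c : c + 1, :, :])  # keep the channel axis: (N, 1, H, W)
        for c in range(images.shape[1])
    ]
    return np.concatenate(per_channel, axis=1)  # (N, C * FEATURE_DIM)


images = np.random.default_rng(0).random((2, 3, 64, 64))
features = bag_of_channels(images)
print(features.shape)
```

Under these assumptions, the feature vector length grows linearly with the number of channels, so images with different channel counts produce vectors of different lengths.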
#### Preprocessing
Preprocessing uses three transforms, applied in order: `SaturationNoiseInjector()`, `PerImageNormalize()`, and `Resize((224, 224))`.
```python
import torch
import torch.nn as nn


# Noise Injector transformation
class SaturationNoiseInjector(nn.Module):
    def __init__(self, low=200, high=255):
        super().__init__()
        self.low = low
        self.high = high

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Operates on the first slice along dim 0 and writes the result back into x.
        channel = x[0].clone()
        noise = torch.empty_like(channel).uniform_(self.low, self.high)
        # Replace saturated pixels (value 255) with uniform noise in [low, high).
        mask = (channel == 255).float()
        noise_masked = noise * mask
        channel[channel == 255] = 0
        channel = channel + noise_masked
        x[0] = channel
        return x


# Self Normalize transformation
class PerImageNormalize(nn.Module):
    def __init__(self, eps=1e-7):
        super().__init__()
        self.eps = eps
        self.instance_norm = nn.InstanceNorm2d(
            num_features=1,
            affine=False,
            track_running_stats=False,
            eps=self.eps,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # InstanceNorm2d expects a 4D input, so add a leading dimension if needed.
        if x.dim() == 3:
            x = x.unsqueeze(0)
        x = self.instance_norm(x)
        if x.shape[0] == 1:
            x = x.squeeze(0)
        return x
```
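As a reading aid, the two custom transforms can be restated in plain NumPy for a single 2D channel. This is a sketch under stated assumptions and not the code used for training: the function names `inject_saturation_noise` and `per_image_normalize` are hypothetical, and the normalization assumes the biased variance that instance normalization uses.

```python
import numpy as np


def inject_saturation_noise(channel, low=200, high=255, rng=None):
    """Replace pixels equal to 255 with uniform noise drawn from [low, high)."""
    rng = np.random.default_rng() if rng is None else rng
    out = channel.astype(np.float32, copy=True)
    saturated = out == 255
    out[saturated] = rng.uniform(low, high, size=int(saturated.sum()))
    return out


def per_image_normalize(channel, eps=1e-7):
    """Scale one image to zero mean and unit variance (biased variance)."""
    return (channel - channel.mean()) / np.sqrt(channel.var() + eps)


rng = np.random.default_rng(0)
channel = rng.integers(0, 256, size=(64, 64)).astype(np.float32)
channel[:4, :4] = 255  # force a saturated patch for illustration

noised = inject_saturation_noise(channel, rng=rng)
normalized = per_image_normalize(noised)
print(noised[:4, :4].min(), normalized.mean(), normalized.std())
```

The noise step breaks up flat saturated regions before normalization, and the normalization step removes per-image differences in brightness and contrast.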
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
We evaluated this model on six benchmarks, and it is highly competitive on most of them. The benchmarks are listed below:
1. CHAMMI
2. HPAv23
3. JUMP-CP
4. IDR0017
5. CELLPHIE
6. RBC-MC
More details can be found in the paper:
#### Summary
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
- **Hardware Type:** Nvidia RTX A6000
- **Hours used:** 2352
- **Cloud Provider:** Private Infrastructure
- **Compute Region:** Private Infrastructure
- **Carbon Emitted:** 304 kg CO2
## Technical Specifications
The model is a ViT Small trained for approximately 2,500 Nvidia RTX A6000 GPU hours on a multi-node system with 2 nodes, each containing 7 GPUs.
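The compute figures can be related with a short calculation. This is only a rough estimate: it assumes that the 2352 "Hours used" listed under Environmental Impact are GPU hours and that all GPUs ran for the full duration of the job, neither of which is stated explicitly.

```python
gpu_hours = 2352            # "Hours used" from Environmental Impact, assumed to be GPU hours
nodes, gpus_per_node = 2, 7

total_gpus = nodes * gpus_per_node
wall_clock_hours = gpu_hours / total_gpus  # assumes every GPU ran for the whole job
wall_clock_days = wall_clock_hours / 24

print(total_gpus, wall_clock_hours, wall_clock_days)  # prints: 14 168.0 7.0
```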
## Citation
Can be cited as the following:
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
<!-- **BibTeX:** -->
<!-- **APA:** -->
## Model Card Authors
Vidit Agrawal, John Peters, Juan C. Caicedo
## Model Card Contact
vagrawal22@wisc.edu, jgpeters3@wisc.edu, juan.caicedo@wisc.edu