---
license: mit
pipeline_tag: image-classification
---
## Model Details

### Model Description

MorphEm is a self-supervised model trained with the DINO Bag of Channels (BoC) recipe on the entire CHAMMI-75 dataset.
It serves as a performance baseline for self-supervised models.

- **Developed by:** Vidit Agrawal, John Peters, Juan Caicedo
- **Shared by:** [Caicedo Lab](https://morgridge.org/research/labs/caicedo/)
- **Model type:** Vision Transformer (ViT-Small)
- **License:** MIT License
### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/CaicedoLab/CHAMMI-75
<!-- - **Paper** -->
- **Demo:** https://github.com/CaicedoLab/CHAMMI-75/tree/main/aws-tutorials
## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]
## How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import AutoModel
import torch
import torch.nn as nn
from torchvision.transforms import v2
import numpy as np


# Noise Injector transformation: replaces saturated (255-valued) pixels
# with uniform noise in [low, high]
class SaturationNoiseInjector(nn.Module):
    def __init__(self, low=200, high=255):
        super().__init__()
        self.low = low
        self.high = high

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        channel = x[0].clone()
        noise = torch.empty_like(channel).uniform_(self.low, self.high)
        mask = (channel == 255).float()
        noise_masked = noise * mask
        channel[channel == 255] = 0
        channel = channel + noise_masked
        x[0] = channel
        return x


# Self Normalize transformation: zero-mean, unit-variance normalization per image
class PerImageNormalize(nn.Module):
    def __init__(self, eps=1e-7):
        super().__init__()
        self.eps = eps
        self.instance_norm = nn.InstanceNorm2d(
            num_features=1,
            affine=False,
            track_running_stats=False,
            eps=self.eps,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.dim() == 3:
            x = x.unsqueeze(0)
        x = self.instance_norm(x)
        if x.shape[0] == 1:
            x = x.squeeze(0)
        return x


# Load model
device = "cuda"
model = AutoModel.from_pretrained("CaicedoLab/MorphEm", trust_remote_code=True)
model.to(device).eval()

# Define transforms
transform = v2.Compose([
    SaturationNoiseInjector(),
    PerImageNormalize(),
    v2.Resize(size=(224, 224), antialias=True),
])

# Generate random batch (N, C, H, W)
batch_size = 2
num_channels = 3
images = torch.randint(0, 256, (batch_size, num_channels, 512, 512), dtype=torch.float32)

print(f"Input shape: {images.shape} (N={batch_size}, C={num_channels}, H=512, W=512)")
print()

# Bag of Channels (BoC) - process each channel independently
with torch.no_grad():
    batch_feat = []
    images = images.to(device)

    for c in range(images.shape[1]):
        # Extract single channel: (N, C, H, W) -> (N, 1, H, W)
        single_channel = images[:, c, :, :].unsqueeze(1)

        # Apply transforms
        single_channel = transform(single_channel.squeeze(1)).unsqueeze(1)

        # Extract features (CLS token of the normalized output)
        output = model.forward_features(single_channel)
        feat_temp = output["x_norm_clstoken"].cpu().detach().numpy()
        batch_feat.append(feat_temp)

    # Concatenate features from all channels
    features = np.concatenate(batch_feat, axis=1)

print(f"Output shape: {features.shape}")
print(f" - Batch size (N): {features.shape[0]}")
print(f" - Feature dimension (C * feature_dim): {features.shape[1]}")
```
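For a ViT-Small backbone, assuming the standard embedding dimension of 384, the three-channel example above yields a feature matrix of shape (2, 1152): one 384-dimensional CLS token per channel, concatenated.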
## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

MorphEm was pre-trained on the full CHAMMI-75 pre-training set.
The CHAMMI-75 dataset consists of 75 heterogeneous studies and 2.8 million multi-channel images.
### Training Procedure

We used the self-supervised learning framework DINO to pre-train a model that takes one image channel as input at a time. At evaluation time, the features of each channel are concatenated to form the final image representation, as sketched below.
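A minimal sketch of this evaluation scheme, assuming (as in the quick-start example above) a model whose `forward_features` returns an `x_norm_clstoken` entry:

```python
import torch

def bag_of_channels_features(model, images: torch.Tensor) -> torch.Tensor:
    """Embed each channel independently and concatenate the CLS tokens.

    images: preprocessed (N, C, H, W) batch.
    Returns an (N, C * feature_dim) feature matrix.
    """
    feats = []
    with torch.no_grad():
        for c in range(images.shape[1]):
            single = images[:, c : c + 1, :, :]    # (N, 1, H, W)
            out = model.forward_features(single)   # per-channel forward pass
            feats.append(out["x_norm_clstoken"])   # (N, feature_dim) CLS token
    return torch.cat(feats, dim=1)                 # (N, C * feature_dim)
```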
#### Preprocessing

We mainly used three transforms for preprocessing: SaturationNoiseInjector(), PerImageNormalize(), and Resize(224, 224).

```python
# Noise Injector transformation: replaces saturated (255-valued) pixels
# with uniform noise in [low, high]
class SaturationNoiseInjector(nn.Module):
    def __init__(self, low=200, high=255):
        super().__init__()
        self.low = low
        self.high = high

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        channel = x[0].clone()
        noise = torch.empty_like(channel).uniform_(self.low, self.high)
        mask = (channel == 255).float()
        noise_masked = noise * mask
        channel[channel == 255] = 0
        channel = channel + noise_masked
        x[0] = channel
        return x


# Self Normalize transformation: zero-mean, unit-variance normalization per image
class PerImageNormalize(nn.Module):
    def __init__(self, eps=1e-7):
        super().__init__()
        self.eps = eps
        self.instance_norm = nn.InstanceNorm2d(
            num_features=1,
            affine=False,
            track_running_stats=False,
            eps=self.eps,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.dim() == 3:
            x = x.unsqueeze(0)
        x = self.instance_norm(x)
        if x.shape[0] == 1:
            x = x.squeeze(0)
        return x
```
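For reference, these transforms compose into the same preprocessing pipeline used in the quick-start example above (`v2` here is `torchvision.transforms.v2`):

```python
transform = v2.Compose([
    SaturationNoiseInjector(),
    PerImageNormalize(),
    v2.Resize(size=(224, 224), antialias=True),
])
```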
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

We evaluated this model on six different benchmarks, and it is highly competitive on most of them:

1. CHAMMI
2. HPAv23
3. Jump-CP
4. IDR0017
5. CELLPHIE
6. RBC-MC

More details can be found in the paper.
## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

- **Hardware Type:** Nvidia RTX A6000
- **Hours used:** 2352
- **Cloud Provider:** Private Infrastructure
- **Compute Region:** Private Infrastructure
- **Carbon Emitted:** 304 kg CO2
## Technical Specifications

The model is a ViT-Small trained on a multi-node system with 2 nodes, each containing 7 Nvidia A6000 GPUs. Training consumed roughly 2,500 GPU hours (the 2,352 hours reported under Environmental Impact correspond to about 168 wall-clock hours across the 14 GPUs).
## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

<!-- **BibTeX:** -->

<!-- **APA:** -->

[More Information Needed]
## Model Card Authors

Vidit Agrawal, John Peters, Juan C. Caicedo

## Model Card Contact

vagrawal22@wisc.edu, jgpeters3@wisc.edu, juan.caicedo@wisc.edu