---
library_name: timm
license: cc-by-4.0
pipeline_tag: image-feature-extraction
tags:
- radiology
- medical-imaging
- xray
- ct
- mri
- ultrasound
- foundation-model
- vision-transformer
- self-supervised
- dino
- dinov2
model-index:
- name: OmniRad-base
results:
- task:
type: image-feature-extraction
dataset:
name: RadImageNet
type: radimagenet
metrics:
- name: Representation learning
type: other
value: "Self-supervised pretrained encoder"
---
# OmniRad: A General-Purpose Radiological Foundation Model
<!--
[📄 Paper](https://arxiv.org/abs/XXXX.XXXXX) |
-->
[💻 Code](https://github.com/unica-visual-intelligence-lab/OmniRad)
**OmniRad** is a **self-supervised radiological foundation model** designed to learn **stable, transferable, and task-agnostic visual representations** for medical imaging. It is pretrained on large-scale, heterogeneous radiological data and intended for reuse across **classification**, **segmentation**, and **exploratory vision–language** tasks without task-specific retraining.
This repository provides the **OmniRad-base** variant, a compact Vision Transformer encoder that offers an excellent trade-off between computational efficiency and representational power.
---
## Key Features
- **Radiology-focused foundation model** pretrained on >1M radiological images
- **Self-supervised learning** based on a customized DINOv2 framework
- **Task-agnostic encoder** reusable across classification, segmentation, and multimodal pipelines
- **Strong transferability** across modalities (CT, MRI, X-ray, ultrasound)
- **Radiomics-oriented design**, emphasizing representation stability and reuse
---
## Example Usage: Feature Extraction
```python
from PIL import Image
from torchvision import transforms
import timm
import torch

# Load OmniRad-base from the Hugging Face Hub
model = timm.create_model(
    "hf_hub:Snarcy/OmniRad-base",
    pretrained=True,
    num_classes=0,  # return embeddings instead of logits
)
model.eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Preprocessing: resize to the pretraining resolution and normalize
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

# Load an image and build a batch of size 1
image = Image.open("path/to/radiology_image.png").convert("RGB")
x = transform(image).unsqueeze(0).to(device)

# Extract a pooled image embedding
with torch.no_grad():
    embedding = model(x)  # shape: [1, 768]
```
---
## Available Downstream Code
The **official OmniRad repository** provides **end-to-end implementations** for all evaluated downstream tasks:
👉 **https://github.com/unica-visual-intelligence-lab/OmniRad**
These include:
- **Image-level classification** (MedMNIST v2 benchmarks)
- **Dense medical image segmentation** (MedSegBench, frozen encoder + lightweight decoders)
- **Radiological image captioning** (BART-based vision–language framework)
- Full training, evaluation, and ablation scripts
- Reproducible experimental configurations matching the paper
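As a rough illustration of the frozen-encoder classification setup (the actual training scripts live in the repository above), a linear head can be attached to the backbone like this. `LinearProbe` is a hypothetical helper for this sketch, not part of the released code:

```python
import torch
import torch.nn as nn


class LinearProbe(nn.Module):
    """Frozen encoder + trainable linear head (illustrative sketch)."""

    def __init__(self, encoder: nn.Module, embed_dim: int, num_classes: int):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # keep the backbone frozen
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():
            feats = self.encoder(x)  # [B, embed_dim] pooled embeddings
        return self.head(feats)      # [B, num_classes] logits
```

With OmniRad-base, `encoder` would be the `timm.create_model(..., num_classes=0)` model from the usage example and `embed_dim=768`; only the linear head is optimized during training.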
---
## Model Details
- **Architecture:** Vision Transformer (ViT-B)
- **Patch size:** 14
- **Embedding dimension:** 768
- **Pretraining framework:** Modified DINOv2 (global crops only)
- **Pretraining dataset:** RadImageNet (~1.2M radiological images)
- **Input resolution:** 224 × 224
- **Backbone type:** Encoder-only (no task-specific heads)
### Pretraining Notes
- Local crops are removed to improve training stability and downstream transferability
- No feature collapse observed during training
- Same hyperparameter configuration used across small and base variants
- Designed to support frozen-backbone adaptation and lightweight fine-tuning
---
## Intended Use
OmniRad is intended as a **general-purpose radiological image encoder** for:
- Image-level classification (e.g., disease or organ recognition)
- Dense prediction (e.g., medical image segmentation via adapters or decoders)
- Radiomics feature extraction
- Representation transfer across datasets, modalities, and institutions
- Exploratory vision–language research (e.g., radiological image captioning)
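For radiomics-style analysis and cross-dataset transfer, extracted embeddings can be compared directly. A minimal cosine-similarity retrieval sketch; `nearest_neighbors` is an illustrative helper, not part of the release:

```python
import torch
import torch.nn.functional as F


def nearest_neighbors(query: torch.Tensor, gallery: torch.Tensor, k: int = 5):
    """Indices of the k gallery embeddings most similar to the query.

    query:   [D] embedding, e.g. an OmniRad-base output (D = 768)
    gallery: [N, D] embeddings of a reference image set
    """
    q = F.normalize(query.unsqueeze(0), dim=-1)
    g = F.normalize(gallery, dim=-1)
    sims = (q @ g.T).squeeze(0)  # cosine similarities, shape [N]
    return sims.topk(min(k, gallery.shape[0])).indices
```

In practice, `gallery` would be built by running the feature-extraction example above over a reference cohort, enabling case retrieval or nearest-neighbor classification without any fine-tuning.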
**Not intended for direct clinical deployment without task-specific validation.**
---
## License
This project and the released model weights are licensed under the Creative Commons
Attribution 4.0 International (CC BY 4.0) license.
<div align="center">
**Made with ❤️ by [UNICA Visual Intelligence Lab](https://github.com/unica-visual-intelligence-lab)**
</div>