---
library_name: timm
license: cc-by-4.0
pipeline_tag: image-feature-extraction
tags:
- radiology
- medical-imaging
- xray
- ct
- mri
- ultrasound
- foundation-model
- vision-transformer
- self-supervised
- dino
- dinov2
model-index:
- name: OmniRad-base
  results:
  - task:
      type: image-feature-extraction
    dataset:
      name: RadImageNet
      type: radimagenet
    metrics:
    - name: Representation learning
      type: other
      value: "Self-supervised pretrained encoder"
---

# OmniRad: A General-Purpose Radiological Foundation Model

[💻 Code](https://github.com/unica-visual-intelligence-lab/OmniRad)

**OmniRad** is a **self-supervised radiological foundation model** designed to learn **stable, transferable, and task-agnostic visual representations** for medical imaging. It is pretrained on large-scale, heterogeneous radiological data and intended for reuse across **classification**, **segmentation**, and **exploratory vision–language** tasks without task-specific pretraining.

This repository provides the **OmniRad-base** variant, a compact Vision Transformer encoder that offers an excellent trade-off between computational efficiency and representational power.
---

## Key Features

- **Radiology-focused foundation model** pretrained on >1M radiological images
- **Self-supervised learning** based on a customized DINOv2 framework
- **Task-agnostic encoder** reusable across classification, segmentation, and multimodal pipelines
- **Strong transferability** across modalities (CT, MRI, X-ray, ultrasound)
- **Radiomics-oriented design**, emphasizing representation stability and reuse

---

## Example Usage: Feature Extraction

```python
from PIL import Image
from torchvision import transforms
import timm
import torch

# Load OmniRad-base from the Hugging Face Hub
model = timm.create_model(
    "hf_hub:Snarcy/OmniRad-base",
    pretrained=True,
    num_classes=0  # return embeddings instead of class logits
)
model.eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

# Load image
image = Image.open("path/to/radiology_image.png").convert("RGB")
x = transform(image).unsqueeze(0).to(device)

# Extract features
with torch.no_grad():
    embedding = model(x)  # shape: [1, 768]
```

---

## Available Downstream Code

The **official OmniRad repository** provides **end-to-end implementations** for all evaluated downstream tasks:

👉 **https://github.com/unica-visual-intelligence-lab/OmniRad**

It includes:

- **Image-level classification** (MedMNIST v2 benchmarks)
- **Dense medical image segmentation** (MedSegBench, frozen encoder + lightweight decoders)
- **Radiological image captioning** (BART-based vision–language framework)
- Full training, evaluation, and ablation scripts
- Reproducible experimental configurations matching the paper

---

## Model Details

- **Architecture:** Vision Transformer (ViT-B)
- **Patch size:** 14
- **Embedding dimension:** 768
- **Pretraining framework:** Modified DINOv2 (global crops only)
- **Pretraining dataset:** RadImageNet (~1.2M radiological images)
- **Input resolution:** 224 × 224
- **Backbone type:** Encoder-only (no task-specific heads)

### Pretraining Notes

- Local crops are removed to improve training stability and downstream transferability
- No feature collapse was observed during training
- The same hyperparameter configuration is used across the small and base variants
- Designed to support frozen-backbone adaptation and lightweight fine-tuning

---

## Intended Use

OmniRad is intended as a **general-purpose radiological image encoder** for:

- Image-level classification (e.g., disease or organ recognition)
- Dense prediction (e.g., medical image segmentation via adapters or decoders)
- Radiomics feature extraction
- Representation transfer across datasets, modalities, and institutions
- Exploratory vision–language research (e.g., radiological image captioning)

**Not intended for direct clinical deployment without task-specific validation.**

---

## License

This project and the released model weights are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
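---

The pretraining notes mention that the encoder is designed for frozen-backbone adaptation. A minimal sketch of that pattern is a linear probe trained on precomputed embeddings: the backbone stays frozen (or features are extracted once, as in the snippet above), and only a lightweight head is optimized. The random tensors below are illustrative stand-ins for real 768-dimensional OmniRad-base features; the head size and hyperparameters are arbitrary choices, not values from the paper.

```python
import torch
import torch.nn as nn

# Stand-ins for embeddings produced by the frozen OmniRad-base encoder.
# In practice these would come from the feature-extraction snippet above.
num_classes = 4
embeddings = torch.randn(256, 768)                 # [N, 768] precomputed features
labels = torch.randint(0, num_classes, (256,))     # [N] illustrative labels

# Lightweight task head; the backbone is never updated.
probe = nn.Linear(768, num_classes)
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

probe.train()
for epoch in range(5):
    optimizer.zero_grad()
    logits = probe(embeddings)                     # [N, num_classes]
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()

probe.eval()
with torch.no_grad():
    preds = probe(embeddings).argmax(dim=1)        # [N] predicted class ids
```

Because only the linear head carries gradients, this adaptation style is cheap enough to run per dataset or per institution while the shared encoder remains fixed.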
**Made with ❤️ by [UNICA Visual Intelligence Lab](https://github.com/unica-visual-intelligence-lab)**