---
library_name: timm
license: cc-by-4.0
pipeline_tag: image-feature-extraction
tags:
- radiology
- medical-imaging
- xray
- ct
- mri
- ultrasound
- foundation-model
- vision-transformer
- self-supervised
- dino
- dinov2
model-index:
- name: OmniRad-base
  results:
  - task:
      type: image-feature-extraction
    dataset:
      name: RadImageNet
      type: radimagenet
    metrics:
    - name: Representation learning
      type: other
      value: "Self-supervised pretrained encoder"
---

# OmniRad: A General-Purpose Radiological Foundation Model

[💻 Code](https://github.com/unica-visual-intelligence-lab/OmniRad)

**OmniRad** is a **self-supervised radiological foundation model** designed to learn **stable, transferable, and task-agnostic visual representations** for medical imaging. It is pretrained on large-scale, heterogeneous radiological data and intended for reuse across **classification**, **segmentation**, and **exploratory vision–language** tasks without task-specific pretraining.

This repository provides the **OmniRad-base** variant, a compact Vision Transformer encoder that offers an excellent trade-off between computational efficiency and representational power.
---

## Key Features

- **Radiology-focused foundation model** pretrained on >1M radiological images
- **Self-supervised learning** based on a customized DINOv2 framework
- **Task-agnostic encoder** reusable across classification, segmentation, and multimodal pipelines
- **Strong transferability** across modalities (CT, MRI, X-ray, ultrasound)
- **Radiomics-oriented design**, emphasizing representation stability and reuse

---

## Example Usage: Feature Extraction

```python
from PIL import Image
from torchvision import transforms
import timm
import torch

# Load OmniRad-base from the Hugging Face Hub
model = timm.create_model(
    "hf_hub:Snarcy/OmniRad-base",
    pretrained=True,
    num_classes=0  # return embeddings instead of class logits
)
model.eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

# Load image
image = Image.open("path/to/radiology_image.png").convert("RGB")
x = transform(image).unsqueeze(0).to(device)

# Extract features
with torch.no_grad():
    embedding = model(x)  # shape: [1, 768]
```

---

## Available Downstream Code

The **official OmniRad repository** provides **end-to-end implementations** for all evaluated downstream tasks:

👉 **https://github.com/unica-visual-intelligence-lab/OmniRad**

It includes:

- **Image-level classification** (MedMNIST v2 benchmarks)
- **Dense medical image segmentation** (MedSegBench, frozen encoder + lightweight decoders)
- **Radiological image captioning** (BART-based vision–language framework)
- Full training, evaluation, and ablation scripts
- Reproducible experimental configurations matching the paper

---

## Model Details

- **Architecture:** Vision Transformer (ViT-B)
- **Patch size:** 14
- **Embedding dimension:** 768
- **Pretraining framework:** Modified DINOv2 (global crops only)
- **Pretraining dataset:** RadImageNet (~1.2M radiological images)
- **Input resolution:** 224 × 224
- **Backbone type:** Encoder-only (no task-specific heads)

### Pretraining Notes

- Local crops are removed to improve training stability and downstream transferability
- No feature collapse was observed during training
- The same hyperparameter configuration is used across the small and base variants
- Designed to support frozen-backbone adaptation and lightweight fine-tuning

---

## Intended Use

OmniRad is intended as a **general-purpose radiological image encoder** for:

- Image-level classification (e.g., disease or organ recognition)
- Dense prediction (e.g., medical image segmentation via adapters or decoders)
- Radiomics feature extraction
- Representation transfer across datasets, modalities, and institutions
- Exploratory vision–language research (e.g., radiological image captioning)

**Not intended for direct clinical deployment without task-specific validation.**

---

## License

This project and the released model weights are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
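---

The pretraining notes mention that the encoder is designed for frozen-backbone adaptation. A minimal sketch of that pattern is a linear probe trained on precomputed embeddings: the backbone stays frozen (or features are extracted once, as in the snippet above), and only a lightweight head is optimized. The random tensors below are illustrative stand-ins for real 768-dimensional OmniRad-base features; the head size and hyperparameters are arbitrary choices, not values from the paper.

```python
import torch
import torch.nn as nn

# Stand-ins for embeddings produced by the frozen OmniRad-base encoder.
# In practice these would come from the feature-extraction snippet above.
num_classes = 4
embeddings = torch.randn(256, 768)                 # [N, 768] precomputed features
labels = torch.randint(0, num_classes, (256,))     # [N] illustrative labels

# Lightweight task head; the backbone is never updated.
probe = nn.Linear(768, num_classes)
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

probe.train()
for epoch in range(5):
    optimizer.zero_grad()
    logits = probe(embeddings)                     # [N, num_classes]
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()

probe.eval()
with torch.no_grad():
    preds = probe(embeddings).argmax(dim=1)        # [N] predicted class ids
```

Because only the linear head carries gradients, this adaptation style is cheap enough to run per dataset or per institution while the shared encoder remains fixed.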
**Made with ❤️ by [UNICA Visual Intelligence Lab](https://github.com/unica-visual-intelligence-lab)**