RetFiner-VisionFM

This repo contains the weights of RetFiner-VisionFM from the paper RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models.

Project page: RetFiner

Required Libraries

This model requires specific Python libraries:

torch==2.4.1+cu118
timm==0.4.12
torchvision==0.19.1+cu118
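
One way to confirm that an environment matches these pins is a small check like the sketch below. It is not part of the RetFiner repo: `REQUIRED` and `check_versions` are hypothetical names chosen for illustration, and only the standard-library `importlib.metadata` module is assumed.

```python
from importlib import metadata

# Version pins copied from the list above.
REQUIRED = {
    "torch": "2.4.1+cu118",
    "timm": "0.4.12",
    "torchvision": "0.19.1+cu118",
}


def check_versions(required=REQUIRED, get_version=metadata.version):
    """Return {package: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for name, expected in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_versions().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package is installed at exactly the listed version. Other versions may also work; the check only reports differences from the list.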

To use the model, download visionfm_hf.py, which contains the ViT implementation, and place it where Python can import it (for example, next to your script).

Note: for downstream inference, set num_classes to the number of classes in the target dataset.

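import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin
from visionfm_hf import VisionTransformer  # local file, must be importable


class RetFiner(nn.Module, PyTorchModelHubMixin):
    """RetFiner: Fine-tuned ViT models for retinal image analysis"""

    def __init__(self, model_name: str = "RetFiner-VisionFM", num_classes: int = 2, **kwargs):
        super().__init__()

        # ViT backbone defined in visionfm_hf.py.
        # As written, model_name and num_classes are accepted here
        # but are not passed on to VisionTransformer.
        self.model = VisionTransformer(return_all_tokens=True, qkv_bias=True)

        # Any extra keyword arguments are kept as the model config.
        self.config = {**kwargs}

    def forward(self, x):
        return self.model(x)


# Download the pretrained weights from the Hugging Face Hub and load them.
model = RetFiner.from_pretrained('ronnief1/RetFiner-VisionFM')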
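
For downstream classification, one common pattern is a linear probe: freeze the backbone and train a small head sized to the target dataset. The sketch below illustrates that pattern under assumptions the model card does not confirm: that the backbone returns a tensor of shape (batch, tokens, dim) and that the class token comes first. `DummyBackbone` and `LinearProbe` are hypothetical names; `DummyBackbone` only stands in for the loaded RetFiner model so that the example runs on its own.

```python
import torch
import torch.nn as nn


class DummyBackbone(nn.Module):
    """Hypothetical stand-in for the loaded model: returns (batch, tokens, dim)."""

    def __init__(self, dim: int = 16, num_tokens: int = 5):
        super().__init__()
        self.num_tokens = num_tokens
        self.proj = nn.Linear(3, dim)

    def forward(self, x):
        pooled = x.mean(dim=(2, 3))  # (batch, 3): per-channel spatial mean
        return self.proj(pooled).unsqueeze(1).repeat(1, self.num_tokens, 1)


class LinearProbe(nn.Module):
    """Frozen backbone plus a trainable linear head on the first token."""

    def __init__(self, backbone: nn.Module, embed_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        for param in self.backbone.parameters():
            param.requires_grad = False  # keep the pretrained weights fixed
        self.head = nn.Linear(embed_dim, num_classes)

    def train(self, mode: bool = True):
        # Only the head switches mode; the frozen backbone stays in eval mode.
        super().train(mode)
        self.backbone.eval()
        return self

    def forward(self, x):
        with torch.no_grad():
            tokens = self.backbone(x)  # assumed shape: (batch, tokens, dim)
        cls_token = tokens[:, 0]       # assumed: class token is at index 0
        return self.head(cls_token)


# Example with the stand-in backbone and a 5-class target dataset.
probe = LinearProbe(DummyBackbone(dim=16), embed_dim=16, num_classes=5)
logits = probe(torch.randn(2, 3, 224, 224))
print(logits.shape)
```

With the real model, `DummyBackbone(dim=16)` would be replaced by the loaded `model`, and `embed_dim` by the backbone's actual embedding size, which should be read from visionfm_hf.py rather than guessed. The 224×224 input size is likewise a placeholder.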

Citation

Please cite the original paper if you use this model:

@misc{fecso2025retfinervisionlanguagerefinementscheme,
      title={RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models}, 
      author={Ronald Fecso and José Morano and Ursula Schmidt-Erfurth and Hrvoje Bogunović},
      year={2025},
      eprint={2506.22149},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.22149}, 
}

