---
pipeline_tag: image-classification
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
- OCT
- classification
- retinal-imaging
---
## RetFiner-VisionFM
This repo contains the weights of RetFiner-VisionFM from the paper [RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models](https://arxiv.org/abs/2506.22149).
Project page: [RetFiner](https://github.com/ronnief1/RetFiner)
### Required Libraries
This model requires the following Python library versions (the `+cu118` suffix denotes CUDA 11.8 builds):
```bash
torch==2.4.1+cu118
timm==0.4.12
torchvision==0.19.1+cu118
```
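To confirm that an environment matches these pins before loading the weights, a small check along the following lines may help. This is only a sketch: the helper names `base_version` and `check_pins` are hypothetical, and it assumes that ignoring the local build tag (such as `+cu118`) is acceptable for your setup.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the list above.
PINNED = {
    "torch": "2.4.1+cu118",
    "timm": "0.4.12",
    "torchvision": "0.19.1+cu118",
}


def base_version(v: str) -> str:
    """Drop a local build tag such as '+cu118' (assumption: build tag may differ)."""
    return v.split("+", 1)[0]


def check_pins(pins: dict) -> dict:
    """Return {package: message} for each package that is missing or off-pin."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = f"not installed (expected {wanted})"
            continue
        if base_version(installed) != base_version(wanted):
            problems[name] = f"found {installed}, expected {wanted}"
    return problems


if __name__ == "__main__":
    # Print one line per mismatch; no output means every pin is satisfied.
    for name, message in check_pins(PINNED).items():
        print(f"{name}: {message}")
```

An empty result from `check_pins(PINNED)` means all three packages are installed at the pinned base versions.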
To use the model, download the ViT implementation from [visionfm_hf.py](https://github.com/ronnief1/RetFiner/blob/main/RetFiner/visionfm_hf.py) and place it where Python can import it, for example next to your script.
Note: if you are using this model for downstream inference, set `num_classes` to match the target dataset.
```python
import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin
from visionfm_hf import VisionTransformer


class RetFiner(nn.Module, PyTorchModelHubMixin):
    """RetFiner: Fine-tuned ViT models for retinal image analysis"""

    def __init__(self, model_name: str = "RetFiner-VisionFM", num_classes: int = 2, **kwargs):
        super().__init__()
        self.model = VisionTransformer(return_all_tokens=True, qkv_bias=True)
        self.config = {**kwargs}

    def forward(self, x):
        return self.model(x)


model = RetFiner.from_pretrained('ronnief1/RetFiner-VisionFM')
```
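For downstream classification, one possible way to use `num_classes` is to put a linear head on top of the encoder output. The sketch below is illustrative only and is not taken from the RetFiner code: `DummyBackbone` is a hypothetical stand-in so the example runs without downloading weights, and it assumes the encoder returns a tensor of shape `(batch, tokens, embed_dim)` with the class token first, an embedding size of 768, and 3-channel 224x224 inputs. Check these assumptions against `visionfm_hf.py` before swapping in the real model.

```python
import torch
import torch.nn as nn


class DummyBackbone(nn.Module):
    """Hypothetical stand-in for the encoder; returns (batch, tokens, embed_dim)."""

    def __init__(self, embed_dim: int = 768, num_tokens: int = 197):
        super().__init__()
        self.num_tokens = num_tokens
        self.proj = nn.Linear(3, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = x.mean(dim=(2, 3))              # (batch, 3): per-channel spatial mean
        tokens = self.proj(pooled).unsqueeze(1)  # (batch, 1, embed_dim)
        return tokens.expand(-1, self.num_tokens, -1)


class LinearProbe(nn.Module):
    """Frozen backbone plus a trainable linear head on the first token."""

    def __init__(self, backbone: nn.Module, embed_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # keep the pretrained encoder fixed
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            tokens = self.backbone(x)
        # Assumption: index 0 along the token axis is the class token.
        return self.head(tokens[:, 0])


# num_classes=5 is an arbitrary example value for a five-class target dataset.
probe = LinearProbe(DummyBackbone(), embed_dim=768, num_classes=5)
logits = probe(torch.randn(2, 3, 224, 224))
```

Here `logits` has one row per image and one column per class; replacing `DummyBackbone()` with the loaded `RetFiner` model is the intended substitution, provided its output matches the assumed shape.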
## Citation
Please cite the original paper if you use this model:
```bibtex
@misc{fecso2025retfinervisionlanguagerefinementscheme,
title={RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models},
author={Ronald Fecso and José Morano and Ursula Schmidt-Erfurth and Hrvoje Bogunović},
year={2025},
eprint={2506.22149},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2506.22149},
}
```