---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
license: mit
---
## A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text?
- Code: [DLILP](https://github.com/jusiro/DLILP)
- Paper: [IPMI 2025](https://link.springer.com/chapter/10.1007/978-3-031-96625-5_20) - [ArXiv](https://arxiv.org/abs/2504.05227)
- Docs: [Documentation](https://github.com/jusiro/DLILP)
- Tutorial: [Notebook](https://colab.research.google.com/drive/1_8Ysd8mCKuLX_Q86e-7pOAHFbSR9F4aZ?usp=sharing)
### About "CONVIRT" weights:
- Pre-trained using a vanilla CLIP contrastive loss, very similar to the pre-training originally proposed in the [CONVIRT](https://arxiv.org/abs/2010.00747) paper (2020).
- Pre-trained on MIMIC.
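For reference, the vanilla CLIP objective mentioned above is a symmetric InfoNCE loss over paired image/report embeddings. Below is a minimal numpy sketch of that loss; the temperature value and embedding shapes are illustrative assumptions, not the exact configuration used for these weights.

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    Illustrative sketch of the vanilla CLIP objective; the temperature
    (0.07) is an assumed default, not this repository's setting.
    """
    # L2-normalize so the dot product becomes a cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    # Pairwise similarity logits, scaled by the temperature. Shape (N, N).
    logits = img @ txt.T / temperature

    # The i-th image matches the i-th report: targets are the diagonal.
    n = logits.shape[0]

    def cross_entropy(l):
        # Row-wise softmax cross-entropy against the diagonal targets.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

With correctly paired embeddings the diagonal similarities dominate and the loss is small; shuffling the pairing increases it.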
If you find this repository useful, please consider citing these papers:
```
@inproceedings{convirt,
  author    = {Yuhao Zhang and others},
  booktitle = {Machine Learning for Healthcare Conference (MLHC)},
  pages     = {1-24},
  title     = {Contrastive Learning of Medical Visual Representations from Paired Images and Text},
  year      = {2022},
}
@inproceedings{dlilp,
  title     = {A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text?},
  author    = {Julio Silva-Rodríguez and Jose Dolz and Ismail {Ben Ayed}},
  booktitle = {Information Processing in Medical Imaging (IPMI)},
  year      = {2025},
}
```