---
license: mit
tags:
- clip
- feature-extraction
- remote-sensing
---
# Remote-CLIP-ViT-L-14
This model is a mirror/redistribution of the original RemoteCLIP model.
## Original Repository and Links

- Original Hugging Face model: [chendelong/RemoteCLIP](https://huggingface.co/chendelong/RemoteCLIP)
- Official GitHub repository: [ChenDelong1999/RemoteCLIP](https://github.com/ChenDelong1999/RemoteCLIP)
## Description
RemoteCLIP is a vision-language foundation model for remote sensing, trained on a large-scale dataset of remote sensing image-text pairs. It is based on the CLIP architecture and is designed to handle the unique characteristics of remote sensing imagery.
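Because RemoteCLIP keeps the CLIP architecture, its weights can typically be loaded into a standard `open_clip` ViT-L/14 backbone. The sketch below shows zero-shot classification of a remote sensing image; the checkpoint filename (`RemoteCLIP-ViT-L-14.pt`), the download from the original `chendelong/RemoteCLIP` repository, and the sample image path are assumptions, so adjust them to match the files hosted in this mirror.

```python
# Minimal usage sketch (assumptions: checkpoint filename and repo id below,
# and a local remote-sensing image "airport.jpg").
import torch
import open_clip
from huggingface_hub import hf_hub_download
from PIL import Image

model_name = "ViT-L-14"
model, _, preprocess = open_clip.create_model_and_transforms(model_name)
tokenizer = open_clip.get_tokenizer(model_name)

# Download the RemoteCLIP weights and load them into the CLIP ViT-L/14 backbone.
ckpt_path = hf_hub_download("chendelong/RemoteCLIP", f"RemoteCLIP-{model_name}.pt")
state_dict = torch.load(ckpt_path, map_location="cpu")
model.load_state_dict(state_dict)
model.eval()

# Encode one image and a few candidate captions.
image = preprocess(Image.open("airport.jpg")).unsqueeze(0)
text = tokenizer([
    "an aerial photo of an airport",
    "a satellite image of farmland",
    "a photo of a residential area",
])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # Cosine similarities turned into per-caption probabilities.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)
```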
## Citation
If you use this model in your research, please cite the original work:
```bibtex
@article{remoteclip,
  author  = {Fan Liu and Delong Chen and Zhangqingyun Guan and Xiaocong Zhou and
             Jiale Zhu and Qiaolin Ye and Liyong Fu and Jun Zhou},
  title   = {RemoteCLIP: {A} Vision Language Foundation Model for Remote Sensing},
  journal = {{IEEE} Transactions on Geoscience and Remote Sensing},
  volume  = {62},
  pages   = {1--16},
  year    = {2024},
  url     = {https://doi.org/10.1109/TGRS.2024.3390838},
  doi     = {10.1109/TGRS.2024.3390838},
}
```