This model is a mirror/redistribution of the original lcybuaa/Git-RSCLIP model.
Git-RSCLIP is pre-trained on Git-10M, a global-scale remote sensing dataset of 10 million image-text pairs. Its architecture is similar to SigLIP, and it is designed for tasks such as image-text retrieval and zero-shot classification in the remote sensing domain.
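To illustrate what zero-shot classification looks like with a SigLIP-style model, here is a minimal sketch of the scoring step only. It assumes the model has already produced one image embedding and one text embedding per candidate label. The embeddings, the labels, and the `logit_scale` and `logit_bias` values below are hypothetical placeholders, not values taken from Git-RSCLIP. SigLIP-style models score each image-text pair independently with a sigmoid, so the scores need not sum to 1.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def sigmoid_scores(image_emb, text_embs, logit_scale=10.0, logit_bias=-5.0):
    """Score one image against several text prompts, SigLIP-style.

    logit_scale and logit_bias are placeholder values (assumptions);
    a trained model would supply its own learned parameters.
    """
    img = l2_normalize(image_emb)
    scores = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logit = logit_scale * cosine + logit_bias
        # Each pair gets its own independent probability via a sigmoid.
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


# Toy stand-ins for model outputs (hypothetical, for illustration only).
labels = ["an airport", "a forest", "a harbor"]
image_embedding = [0.9, 0.1, 0.0]
text_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

scores = sigmoid_scores(image_embedding, text_embeddings)
best_label = labels[scores.index(max(scores))]
for label, score in zip(labels, scores):
    print(f"{label}: {score:.3f}")
print("predicted:", best_label)
```

In practice the embeddings would come from the model's image and text encoders, and the predicted class is the label whose prompt receives the highest score.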
If you use this model in your research, please cite the original work:
@misc{liu2025text2earthunlockingtextdrivenremote,
title={Text2Earth: Unlocking Text-driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model},
author={Chenyang Liu and Keyan Chen and Rui Zhao and Zhengxia Zou and Zhenwei Shi},
year={2025},
eprint={2501.00895},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2501.00895},
}